CN112270184B - Natural language processing method, device and storage medium - Google Patents

Natural language processing method, device and storage medium

Info

Publication number
CN112270184B
Authority
CN
China
Prior art keywords
word
text
network model
medical
corrected
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011152152.5A
Other languages
Chinese (zh)
Other versions
CN112270184A (en)
Inventor
朱威
李恬静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011152152.5A priority Critical patent/CN112270184B/en
Publication of CN112270184A publication Critical patent/CN112270184A/en
Application granted granted Critical
Publication of CN112270184B publication Critical patent/CN112270184B/en


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to the technical field of medical science and technology, and in particular discloses a natural language processing method, apparatus and storage medium. The method comprises the following steps: acquiring a text sample; performing word segmentation on the text sample to obtain at least one word; obtaining, from a pre-constructed semantic knowledge base, the morpheme corresponding to each of the at least one word, and taking the morpheme corresponding to each word as the supervision tag of that word; inputting the text sample into a network model to obtain the first morpheme of each word in the text sample; adjusting the network parameters of the network model according to the supervision tag and the first morpheme of each word in the text sample to obtain a pre-training network model; and performing natural language processing using the pre-training network model. The application helps to improve the precision of natural language processing.

Description

Natural language processing method, device and storage medium
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a natural language processing method, a natural language processing device and a storage medium.
Background
With the development of artificial intelligence technology, good results have been achieved in the field of natural language processing, bringing great convenience to people's lives. For example, a trained language processing model is used to correct the text a user inputs in a dialog box, so that the user's intention is expressed accurately; in human-machine dialogue, such as Siri voice, a trained language processing model understands the user's spoken language and the user's intention is then executed.
Although the trained language processing models in the prior art can execute different natural language processing tasks, their training relies only on the literal semantics of the language; the latent semantics of the language are not mined, so the processing precision of natural language processing is low.
Disclosure of Invention
The embodiments of the application provide a natural language processing method, apparatus and storage medium. By fusing the morpheme information of each word, the latent semantics of the language can be mined, improving the precision of natural language processing.
In a first aspect, an embodiment of the present application provides a natural language processing method, including:
acquiring a text sample;
word segmentation is carried out on the text sample to obtain at least one word;
obtaining, from a pre-constructed semantic knowledge base, the morpheme corresponding to each word in the at least one word, and taking the morpheme corresponding to each word as the supervision tag of that word;
inputting the text sample into a network model to obtain a first morpheme of each word in the text sample;
according to the supervision labels and the first morphemes of each word in the text sample, adjusting the network parameters of the network model to obtain a pre-training network model;
and performing natural language processing by using the pre-training network model.
In a second aspect, an embodiment of the present application provides a natural language processing apparatus, including:
the acquisition unit is used for acquiring a text sample;
the processing unit is used for word segmentation of the text sample to obtain at least one word;
the processing unit is further configured to obtain, from a pre-constructed semantic knowledge base, the morpheme corresponding to each word in the at least one word, and take the morpheme corresponding to each word as the supervision tag of that word;
the processing unit is further used for inputting the text sample into a network model to obtain a first morpheme of each word in the text sample;
the processing unit is further used for adjusting network parameters of the network model according to the supervision labels and the first morphemes of each word in the text sample to obtain a pre-trained network model;
the processing unit is further used for performing natural language processing by using the pre-training network model.
In a third aspect, an embodiment of the present application provides a natural language processing device, including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs including instructions for performing the steps in the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium storing a computer program that causes a computer to perform the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method according to the first aspect.
The embodiment of the application has the following beneficial effects:
It can be seen that, in pre-training the network model, the morpheme information of each word is fused, i.e., the model is trained using the underlying semantics of each word. After the network model has been iterated several times, when the resulting pre-training network model subsequently performs natural language processing, the word vector obtained by encoding each word contains the morpheme information (latent semantic information) corresponding to that word; the word vector therefore carries more semantic information, which improves the precision of natural language processing.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required for describing the embodiments are briefly introduced below. It is obvious that the drawings in the following description show only some embodiments of the application, and that a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic flow chart of a natural language processing method according to an embodiment of the present application;
fig. 2 is a schematic structural diagram of a network model according to an embodiment of the present application;
FIG. 3 is a flowchart of a method for constructing a text sample according to an embodiment of the present application;
FIG. 4 is a schematic diagram of medical text error correction according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a natural language processing device according to an embodiment of the present application;
fig. 6 is a functional unit composition block diagram of a natural language processing device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The terms "first," "second," "third," and "fourth" and the like in the description and in the claims and drawings are used for distinguishing between different objects and not necessarily for describing a particular sequential or chronological order. Furthermore, the terms "comprise" and "have," as well as any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those listed steps or elements but may include other steps or elements not listed or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art will understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
Referring to fig. 1, fig. 1 is a flow chart of a natural language processing method according to an embodiment of the application. The method is applied to a natural language processing device. The method comprises the following steps:
101: the natural language processing device obtains a text sample.
The text sample is pre-constructed; the construction process of the text sample is described in detail below and is not repeated here.
102: and the natural language processing device divides words from the text sample to obtain at least one word.
For example, the text sample may be segmented through an existing word-segmentation network, such as a recurrent network or a long short-term memory network. The word segmentation process is prior art and is not described here.
103: the natural language processing device acquires the morpheme corresponding to each word in the at least one word from a pre-constructed semantic knowledge base, and takes the morpheme corresponding to each word as a supervision tag of each word in the each word.
Illustratively, the semantic knowledge base is a pre-constructed knowledge base of the morpheme composition of individual terms. Therefore, each of the at least one word in the text sample can be dictionary-matched against the semantic knowledge base to obtain the morpheme corresponding to each word in the text sample, and the morpheme corresponding to each word is taken as the supervision tag of that word, thereby obtaining the supervision tag corresponding to each word in the text sample.
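As an illustration only, this dictionary matching can be sketched as a direct lookup against the knowledge base. The knowledge base contents, words and morpheme labels below are hypothetical toy data, not the actual semantic knowledge base:

```python
# A minimal sketch of building supervision tags by dictionary matching.
# The knowledge base, words, and morpheme inventory are hypothetical;
# a real semantic knowledge base would be far larger.
semantic_kb = {
    "metformin": ["medicine", "chemical"],
    "take":      ["consume"],
    "fever":     ["symptom", "temperature"],
}

def supervision_tags(words, kb, unknown=("<no_morpheme>",)):
    """Return the morpheme list (supervision tag) for each segmented word."""
    return {w: list(kb.get(w, unknown)) for w in words}

print(supervision_tags(["take", "metformin", "daily"], semantic_kb))
# {'take': ['consume'], 'metformin': ['medicine', 'chemical'], 'daily': ['<no_morpheme>']}
```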
104: the natural language processing device inputs the text sample into a network model to obtain a first morpheme of each word in the text sample.
Illustratively, the text sample is input into the network model, and each of the at least one word in the text sample is encoded, e.g., by word embedding, to obtain the word vector corresponding to each word; morpheme prediction is then performed on the word vector corresponding to each word to obtain the first morpheme corresponding to each word, thereby obtaining the first morpheme of each word in the text sample.
105: and the natural language processing device adjusts network parameters of the network model according to the supervision labels and the first morphemes of each word in the text sample to obtain a pre-training network model.
For example, the first loss of each word is determined according to the supervision tag and the first morpheme corresponding to that word; for instance, the Euclidean distance between the supervision tag of each word and the first morpheme may be taken as the first loss. The network parameters of the network model are then adjusted based on the first loss of each word and the gradient descent method. For example, the cross-entropy loss of each word is determined to obtain the first loss corresponding to each word, the average of the first losses of all words in the text sample is taken as the target loss, and the network parameters of the network model are adjusted according to the target loss and the gradient descent method until the network model converges, obtaining the pre-training network model.
Illustratively, the target loss may be represented by equation (1):

\( L_m = \frac{1}{N} \sum_{i=1}^{N} \mathrm{Cross\_Entropy}(\theta_i, \hat{\theta}_i) \)  (1)

wherein \(L_m\) is the target loss, Cross_Entropy is the cross-entropy loss, N is the number of words in the text sample and is an integer greater than or equal to 1, \(\theta_i\) is the supervision tag of the i-th word, and \(\hat{\theta}_i\) is the first morpheme of the i-th word.
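A minimal PyTorch sketch of equation (1), under the assumption that each word carries exactly one morpheme class (a multi-morpheme word would need a multi-label variant); the shapes and values are illustrative:

```python
import torch
import torch.nn.functional as F

def target_loss(logits, tags):
    """Equation (1): mean cross-entropy over the N words of a text sample.

    logits: (N, num_morphemes) predicted morpheme scores per word
    tags:   (N,) index of the supervision-tag morpheme for each word
    """
    return F.cross_entropy(logits, tags)  # already averages over the N words

# Toy shapes: a 4-word sample, 10 possible morphemes (both illustrative).
logits = torch.randn(4, 10, requires_grad=True)
tags = torch.tensor([2, 7, 0, 7])
loss = target_loss(logits, tags)
loss.backward()  # gradients drive the gradient-descent parameter update
```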
106: the natural language processing device uses the pre-training network model to perform natural language processing.
For example, the pre-trained network model may be used for text correction, intent recognition, spoken understanding, human-machine interaction, and so forth.
It can be seen that, in pre-training the network model, the morpheme information of each word is fused, i.e., the model is trained using the underlying semantics of each word. After the network model has been iterated several times, when the resulting pre-training network model subsequently performs natural language processing, the word vector obtained by encoding each word contains the morpheme information (latent semantic information) corresponding to that word; the word vector therefore carries more semantic information, which improves the precision of natural language processing.
In one embodiment of the present application, the natural language processing method of the present application may be applied to the field of smart medicine. For example, the pre-training model can be fine-tuned; when a doctor searches for historical cases, the medical text input by the doctor can be corrected using the fine-tuned network model. This ensures that the medical text input by the doctor is correct, so that historical cases can be retrieved accurately, providing case references for the doctor's current diagnosis, improving the doctor's diagnostic efficiency, and further promoting the development of medical science and technology.
The above process of training the network model to obtain a pre-trained network model is illustrated below in conjunction with a schematic diagram of the network model.
As shown in fig. 2, the network model includes an embedding layer, an encoding layer and a classification layer, where the encoding layer may be an Albert encoder.
Word segmentation is performed on the text sample to obtain at least one word [X1, X2, X3, …, Xn]; the morpheme corresponding to each word in the at least one word is matched in the semantic knowledge base, and the morpheme corresponding to each word is taken as the supervision tag of that word.
Then, word embedding is performed on each word through the embedding layer to obtain the word vector of each word. Next, the word vector of each word is encoded through the encoding layer to obtain the target feature vector of each word; for example, the word vectors can be fused through an attention mechanism. The classification layer then performs classification prediction on the target feature vector of each word to obtain the first morpheme corresponding to each word. Finally, the loss is obtained from the first morpheme and the supervision tag of each word, and the network parameters of the network model are adjusted according to the loss and the gradient descent method until the model converges, obtaining the pre-training network model.
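The following is a minimal sketch of this three-layer structure. A plain nn.TransformerEncoder stands in for the Albert encoder named above (a real implementation could substitute an ALBERT encoder, e.g. from the transformers library); the vocabulary size, hidden size and morpheme inventory are assumptions:

```python
import torch
import torch.nn as nn

class MorphemePretrainModel(nn.Module):
    """Embedding layer -> encoding layer -> classification layer, as in fig. 2.

    A generic Transformer encoder is used here in place of the Albert
    encoder named in the text; all sizes are illustrative.
    """
    def __init__(self, vocab_size=30000, hidden=256, num_morphemes=2000):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, hidden)           # word embedding
        layer = nn.TransformerEncoderLayer(hidden, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)   # attention fuses word vectors
        self.classifier = nn.Linear(hidden, num_morphemes)          # morpheme prediction

    def forward(self, word_ids):                  # (batch, seq) word indices X1..Xn
        word_vecs = self.embedding(word_ids)      # word vector of each word
        features = self.encoder(word_vecs)        # target feature vector of each word
        return self.classifier(features)          # per-word morpheme logits

model = MorphemePretrainModel()
logits = model(torch.randint(0, 30000, (2, 8)))   # 2 samples of 8 words each
```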
Referring to fig. 3, fig. 3 is a flowchart illustrating a method for constructing a text sample according to an embodiment of the present application. The method comprises the following steps:
301: a first text sequence is acquired.
Wherein the first text sequence is an original text sequence.
302: and replacing the target word in the first text sequence to obtain at least one second text sequence.
The target word may be any word in the first text sequence other than stop words, entity words and vertical-domain keywords. For example, given the first text sequence "I want to take a drug, a piece of metformin", the target word "want" can be replaced with a synonym such as "wish" or "need". Replacing the target word does not change the intention of the first text sequence but changes its expression, so the first text sequence can be expanded into multiple second text sequences with the same intention, obtaining a rich corpus corresponding to a single intention. This enables the network model to recognize the morphemes of some words, for example the words left unchanged in the first text sequence, under different expressions, thereby improving the generalization capability of the network model.
303: and replacing the entity of each second text sequence in the at least one second text sequence to obtain at least one third text sequence corresponding to each second text sequence.
For example, the entity of each second text sequence in the at least one second text sequence may be determined, where determining the entity of each second text sequence may be implemented by a recurrent neural network or a long short-term memory network and is not described again. Then, at least one candidate entity corresponding to the entity of each second text sequence is obtained, where the entity type of each candidate entity in the at least one candidate entity is the same as the entity type of the entity of that second text sequence. Finally, each entity in the at least one candidate entity is used to replace the entity in the second text sequence, obtaining at least one third text sequence corresponding to each second text sequence.
It should be understood that the replacement of the entities in each second text sequence mainly expands the richness of the text samples in each entity field, so that after the network model is trained by using such text samples, the morphemes of each entity in each field can be encoded, and the generalization capability of the network model is further improved.
304: and taking each third text sequence in at least one third text sequence corresponding to each second text sequence as the training text.
The application scenario of the pre-training network is illustrated below.
Scenario 1: medical text correction using the pre-training network model.
For example, medical text may first be acquired; the pre-training network model is fine-tuned using the medical text; and the fine-tuned network model is used to correct the medical text to be corrected.
For example, medical text may be read from a medical database and used as the correct medical text, i.e., the supervision tag. Then, a first word is randomly selected from the medical text as the word to be replaced, and a candidate word corresponding to the first word is obtained from a dictionary database, where the candidate word is an error-prone word corresponding to the first word, such as the word closest to the first word. The candidate word then replaces the first word in the medical text to obtain a training sample. Finally, the training sample is input into the pre-training network to obtain an error-correction result, the loss is obtained from the error-correction result and the medical text (supervision tag), and the pre-training network is fine-tuned using the loss and the gradient descent method. After fine-tuning is complete, the fine-tuned network model may be used to correct the medical text to be corrected.
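A sketch of building such fine-tuning pairs, assuming a hypothetical dictionary of error-prone words (a real dictionary database would map each word to near-homophones or visually similar words):

```python
import random

# Hypothetical error-prone-word dictionary (toy data).
CONFUSIONS = {"metformin": ["metfornin"], "dose": ["doze"]}

def make_correction_pair(medical_text):
    """Return (training_sample, supervision_tag) for fine-tuning."""
    words = medical_text.split()
    replaceable = [i for i, w in enumerate(words) if w in CONFUSIONS]
    if not replaceable:
        return medical_text, medical_text               # nothing to corrupt
    i = random.choice(replaceable)                      # randomly selected first word
    corrupted = words[:]
    corrupted[i] = random.choice(CONFUSIONS[words[i]])  # error-prone candidate word
    return " ".join(corrupted), medical_text            # label is the correct text

# Output is random over the replaceable positions, e.g.
# ('take one doze of metformin daily', 'take one dose of metformin daily')
print(make_correction_pair("take one dose of metformin daily"))
```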
Illustratively, as shown in fig. 4, the entity in the medical text to be corrected is determined; for example, the entity may be identified through a recurrent neural network (RNN) or a long short-term memory network (LSTM). Then, the medical knowledge graph corresponding to the entity is acquired from a pre-constructed medical knowledge graph library, and the medical knowledge graph is encoded to obtain a graph vector. Each word in the medical text to be corrected is encoded to obtain the word vector corresponding to each word, and the word vector of each word is spliced with the graph vector to obtain the target feature vector corresponding to each word. The medical text to be corrected is then corrected according to the target feature vector corresponding to each word, obtaining the corrected medical text.
For example, the score corresponding to each word can be determined according to the target feature vector corresponding to that word, and words whose score is smaller than a threshold are taken as the words to be corrected. Then, at least one candidate word corresponding to the word to be corrected is obtained from a dictionary library. Finally, the score of each candidate word in the at least one candidate word is determined, and the candidate word with the largest score replaces the word to be corrected, obtaining the corrected medical text. The score of each candidate word is obtained in a similar way to the score of each word in the medical text to be corrected: for example, each candidate word in turn replaces the word to be corrected to obtain a replaced medical text, and the score of that candidate word is obtained through the replaced medical text; this is not described further.
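A condensed sketch of this correction loop; the scorer, embedding lookup and candidate dictionary below are illustrative stand-ins for the fine-tuned network model's components, not the actual implementation:

```python
import torch

def score(vec):
    """Hypothetical scorer; a real model would score the target feature
    vector with its classification head."""
    return torch.sigmoid(vec.mean()).item()

def correct_text(words, word_vectors, graph_vector, candidates, embed,
                 threshold=0.5):
    """Splice each word vector with the knowledge-graph vector, score it,
    and replace words scoring below the threshold with the largest-scoring
    candidate word. `candidates` and `embed` are assumed stand-ins."""
    corrected = list(words)
    for i, wv in enumerate(word_vectors):
        target = torch.cat([wv, graph_vector])   # target feature vector
        if score(target) >= threshold:
            continue                             # word judged correct
        cands = candidates.get(words[i], [])
        if cands:                                # pick the best candidate word
            corrected[i] = max(
                cands,
                key=lambda c: score(torch.cat([embed(c), graph_vector])))
    return corrected

# Toy usage with random vectors and a hypothetical embedding lookup.
emb = {w: torch.randn(8) for w in ["take", "doze", "dose", "daily"]}
embed = lambda w: emb.get(w, torch.zeros(8))
words = ["take", "doze", "daily"]
print(correct_text(words, [embed(w) for w in words], torch.randn(4),
                   {"doze": ["dose"]}, embed))
```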
It can be seen that, because the fine-tuned network model encodes the morpheme information of each word into the word vector corresponding to that word when encoding each word in the text to be corrected, the morpheme information of each word can be used during correction, improving correction accuracy. Medical knowledge-graph information is also incorporated in the correction process, which further improves correction accuracy.
Scenario 2: intention recognition in spoken language understanding.
A training sample and a training label are acquired, where the training label is the intention annotation result of the training sample. The training sample is used to fine-tune the pre-training model to obtain a fine-tuned network model; that is, fine-tuning uses the loss between the prediction result for the training sample and the training label, and the detailed description is omitted. The fine-tuned network model then performs intention recognition on the text to be recognized to obtain the intention corresponding to the text to be recognized, where the text to be recognized is obtained by converting the user's speech to text. Specifically, each word in the text to be recognized is encoded to obtain the word vector of each word, slot filling is performed according to the word vector of each word, and the intention of the text to be recognized is determined according to the slot-filling result of each word.
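A minimal sketch of the slot-filling and intention heads that would sit on top of the fine-tuned encoder; the hidden size and the slot and intent inventories are assumptions:

```python
import torch
import torch.nn as nn

class SluHead(nn.Module):
    """Sketch of spoken-language-understanding heads: per-word slot filling
    plus a sentence-level intention prediction over pooled word vectors.
    All sizes are illustrative."""
    def __init__(self, hidden=256, num_slots=20, num_intents=10):
        super().__init__()
        self.slot_head = nn.Linear(hidden, num_slots)      # slot tag per word
        self.intent_head = nn.Linear(hidden, num_intents)  # intention per sentence

    def forward(self, word_vectors):                  # (batch, seq, hidden)
        slot_logits = self.slot_head(word_vectors)    # slot filling per word vector
        intent_logits = self.intent_head(word_vectors.mean(dim=1))  # pooled intention
        return slot_logits, intent_logits

slots, intent = SluHead()(torch.randn(2, 6, 256))  # 2 texts of 6 words each
```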
It can be seen that, in this embodiment, during spoken language understanding, when intention recognition is performed on the text to be recognized, the morpheme information of each word can be encoded into the word vector of that word, which improves the slot-filling accuracy of the words, further improves the accuracy of intention recognition, and enables better spoken language understanding.
Referring to fig. 5, fig. 5 is a schematic structural diagram of a natural language processing device according to an embodiment of the present application. As shown in fig. 5, the natural language processing apparatus 400 includes a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs including instructions for performing the following steps:
acquiring a text sample;
word segmentation is carried out on the text sample to obtain at least one word;
obtaining, from a pre-constructed semantic knowledge base, the morpheme corresponding to each word in the at least one word, and taking the morpheme corresponding to each word as the supervision tag of that word;
inputting the text sample into a network model to obtain a first morpheme of each word in the text sample;
according to the supervision labels and the first morphemes of each word in the text sample, adjusting the network parameters of the network model to obtain a pre-training network model;
and performing natural language processing by using the pre-training network model.
In some possible embodiments, in adjusting the network parameters of the network model according to the supervision labels and the first morphemes of each word in the text sample to obtain a pre-trained network model, the above program is specifically configured to execute the following instructions:
determining a first loss corresponding to each word in the text sample according to the supervision tag and the first morpheme of each word in the text sample;
and adjusting the network parameters of the network model according to the first loss corresponding to each word in the text sample and the gradient descent method, to obtain a pre-training network model.
In some possible embodiments, the above program is further configured to, prior to obtaining the text sample, execute instructions for:
acquiring a first text sequence;
replacing target words in the first text sequence to obtain at least one second text sequence;
replacing an entity of each second text sequence in the at least one second text sequence to obtain at least one third text sequence corresponding to each second text sequence;
and taking each third text sequence in the at least one third text sequence corresponding to each second text sequence as the text sample.
In some possible embodiments, in replacing the target word in the first text sequence to obtain at least one second text sequence, the above program is specifically configured to execute instructions for:
shielding a target word in the first text sequence;
predicting the target word by using a Bert model to obtain at least one word to be replaced;
and replacing the target word by using each word to be replaced in the at least one word to be replaced to obtain at least one second text sequence.
In some possible embodiments, where the pre-trained network model is used for medical text correction, in terms of performing natural language processing using the pre-trained network model, the above program is specifically configured to execute instructions for performing the following steps:
acquiring a medical text;
fine tuning the pre-trained network model using the medical text;
and correcting the medical text to be corrected by using the fine-tuned network model.
In some possible embodiments, in correcting the medical text to be corrected using the fine-tuned network model, the program is specifically configured to execute instructions for:
determining an entity in the medical text to be corrected;
acquiring a medical knowledge graph corresponding to the entity from a medical knowledge graph library which is constructed in advance;
and correcting the medical text to be corrected according to the medical knowledge graph corresponding to the entity to obtain corrected medical text.
In some possible embodiments, in correcting the medical text to be corrected according to the medical knowledge graph corresponding to the entity to obtain corrected medical text, the program is specifically configured to execute the following instructions:
encoding each word in the medical text to be corrected to obtain a word vector corresponding to each word in the medical text to be corrected;
encoding the medical knowledge graph corresponding to the entity to obtain a graph vector;
splicing the word vector corresponding to each word in the medical text to be corrected with the graph vector to obtain the target feature vector corresponding to each word in the medical text to be corrected;
and correcting the medical text to be corrected according to the target feature vector corresponding to each word in the medical text to be corrected, so as to obtain the corrected medical text.
Referring to fig. 6, fig. 6 is a functional unit block diagram of a natural language processing device according to an embodiment of the present application. The natural language processing apparatus 600 includes: an acquisition unit 601 and a processing unit 602, wherein:
an obtaining unit 601, configured to obtain a text sample;
a processing unit 602, configured to segment the text sample to obtain at least one word;
the processing unit 602 is further configured to obtain, from a pre-constructed semantic knowledge base, the morpheme corresponding to each word in the at least one word, and take the morpheme corresponding to each word as the supervision tag of that word;
the processing unit 602 is further configured to input the text sample into a network model, so as to obtain a first morpheme of each word in the text sample;
the processing unit 602 is further configured to adjust network parameters of the network model according to the supervision tag and the first morpheme of each word in the text sample, so as to obtain a pre-trained network model;
the processing unit 602 is further configured to perform natural language processing using the pre-trained network model.
In some possible embodiments, the processing unit 602 is specifically configured to, in adjusting the network parameters of the network model according to the supervision tag and the first morpheme of each word in the text sample, obtain a pre-trained network model:
determining a first loss corresponding to each word in the text sample according to the supervision tag and the first morpheme of each word in the text sample;
and adjusting the network parameters of the network model according to the first loss corresponding to each word in the text sample and the gradient descent method, to obtain a pre-training network model.
In some possible embodiments, before the text sample is acquired, the acquiring unit 601 is further configured to: acquiring a first text sequence;
the processing unit 602 is further configured to replace a target word in the first text sequence to obtain at least one second text sequence;
replacing an entity of each second text sequence in the at least one second text sequence to obtain at least one third text sequence corresponding to each second text sequence;
and taking each third text sequence in the at least one third text sequence corresponding to each second text sequence as the text sample.
In some possible implementations, in replacing the target word in the first text sequence to obtain at least one second text sequence, the processing unit 602 is specifically configured to:
shielding a target word in the first text sequence;
predicting the target word by using a Bert model to obtain at least one word to be replaced;
and replacing the target word by using each word to be replaced in the at least one word to be replaced to obtain at least one second text sequence.
In some possible embodiments, where the pre-trained network model is used for medical text correction, in terms of performing natural language processing using the pre-trained network model, the processing unit 602 is specifically configured to:
acquiring a medical text;
fine tuning the pre-trained network model using the medical text;
and correcting the medical text to be corrected by using the fine-tuned network model.
In some possible embodiments, in correcting the medical text to be corrected using the fine-tuned network model, the processing unit 602 is specifically configured to:
determining an entity in the medical text to be corrected;
acquiring a medical knowledge graph corresponding to the entity from a medical knowledge graph library which is constructed in advance;
and correcting the medical text to be corrected according to the medical knowledge graph corresponding to the entity to obtain corrected medical text.
In some possible embodiments, in terms of correcting the medical text to be corrected according to the medical knowledge graph corresponding to the entity to obtain corrected medical text, the processing unit 602 is specifically configured to:
encoding each word in the medical text to be corrected to obtain a word vector corresponding to each word in the medical text to be corrected;
encoding the medical knowledge graph corresponding to the entity to obtain a graph vector;
splicing the word vector corresponding to each word in the medical text to be corrected with the graph vector to obtain the target feature vector corresponding to each word in the medical text to be corrected;
and correcting the medical text to be corrected according to the target feature vector corresponding to each word in the medical text to be corrected, so as to obtain the corrected medical text.
It should be understood that the natural language processing device in the present application may include a smart phone (such as an Android phone, an iOS phone or a Windows Phone), a tablet computer, a palmtop computer, a notebook computer, a mobile internet device (MID) or a wearable device, etc. The above natural language processing devices are merely examples rather than an exhaustive list; the present application includes, but is not limited to, the above natural language processing devices. In practical applications, the natural language processing device may further include an intelligent vehicle-mounted terminal, a computer device, and the like.
The embodiment of the present application also provides a computer storage medium storing a computer program that is executed by a processor to implement some or all of the steps of any one of the natural language processing methods described in the above method embodiments.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform part or all of the steps of any one of the natural language processing methods described in the method embodiments above.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present application is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are alternative embodiments, and that the acts and modules referred to are not necessarily required for the present application.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to related descriptions of other embodiments.
In the several embodiments provided by the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, such as the division of the units, merely a logical function division, and there may be additional manners of dividing the actual implementation, such as multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, or may be in electrical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units described above may be implemented either in hardware or in software program modules.
The integrated units, if implemented in the form of software program modules, may be stored in a computer-readable memory for sale or use as a stand-alone product. Based on this understanding, the technical solution of the present application may be embodied essentially or partly in the form of a software product, or all or part of the technical solution, which is stored in a memory, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the method according to the embodiments of the present application. And the aforementioned memory includes: a U-disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a removable hard disk, a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Those of ordinary skill in the art will appreciate that all or a portion of the steps in the various methods of the above embodiments may be implemented by a program that instructs associated hardware, and the program may be stored in a computer readable memory, which may include: flash disk, read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), magnetic disk or optical disk.
The foregoing has outlined rather broadly the more detailed description of embodiments of the application, wherein the principles and embodiments of the application are explained in detail using specific examples, the above examples being provided solely to facilitate the understanding of the method and core concepts of the application; meanwhile, as those skilled in the art will have variations in the specific embodiments and application scope in accordance with the ideas of the present application, the present description should not be construed as limiting the present application in view of the above.

Claims (9)

1. A method of natural language processing, comprising:
acquiring a first text sequence;
replacing target words in the first text sequence to obtain at least one second text sequence;
replacing an entity of each second text sequence in the at least one second text sequence to obtain at least one third text sequence corresponding to each second text sequence;
taking each third text sequence in at least one third text sequence corresponding to each second text sequence as a text sample;
acquiring a text sample;
word segmentation is carried out on the text sample to obtain at least one word;
obtaining, from a pre-constructed semantic knowledge base, the morpheme corresponding to each word in the at least one word, and taking the morpheme corresponding to each word as the supervision tag of that word;
inputting the text sample into a network model to obtain a first morpheme of each word in the text sample;
according to the supervision labels and the first morphemes of each word in the text sample, adjusting the network parameters of the network model to obtain a pre-training network model;
and performing natural language processing by using the pre-training network model.
2. The method of claim 1, wherein adjusting network parameters of the network model based on the supervision labels and the first morphemes of each word in the text sample to obtain a pre-trained network model comprises:
determining a first loss corresponding to each word in the text sample according to the supervision tag and the first morpheme of each word in the text sample;
and adjusting the network parameters of the network model according to the first loss corresponding to each word in the text sample and the gradient descent method, to obtain a pre-training network model.
3. The method of claim 2, wherein replacing the target word in the first text sequence to obtain at least one second text sequence comprises:
shielding a target word in the first text sequence;
predicting the target word by using a Bert model to obtain at least one word to be replaced;
and replacing the target word by using each word to be replaced in the at least one word to be replaced to obtain at least one second text sequence.
4. The method of claim 1, wherein in the event of medical text correction using the pre-trained network model, the performing natural language processing using the pre-trained network model comprises:
acquiring a medical text;
fine tuning the pre-trained network model using the medical text;
and correcting the medical text to be corrected by using the fine-tuned network model.
5. The method of claim 4, wherein using the fine-tuned network model to correct the medical text to be corrected comprises:
determining an entity in the medical text to be corrected;
acquiring a medical knowledge graph corresponding to the entity from a medical knowledge graph library which is constructed in advance;
and correcting the medical text to be corrected according to the medical knowledge graph corresponding to the entity to obtain corrected medical text.
6. The method of claim 5, wherein the performing error correction on the medical text to be error corrected according to the medical knowledge graph corresponding to the entity to obtain the error corrected medical text comprises:
encoding each word in the medical text to be corrected to obtain a word vector corresponding to each word in the medical text to be corrected;
encoding the medical knowledge graph corresponding to the entity to obtain a graph vector;
splicing the word vector corresponding to each word in the medical text to be corrected with the graph vector to obtain the target feature vector corresponding to each word in the medical text to be corrected;
and correcting the medical text to be corrected according to the target feature vector corresponding to each word in the medical text to be corrected, so as to obtain the corrected medical text.
7. A natural language processing apparatus for performing the method of any one of claims 1-6, the apparatus comprising:
the acquisition unit is used for acquiring a text sample;
the processing unit is used for word segmentation of the text sample to obtain at least one word;
the processing unit is further configured to obtain a morpheme corresponding to each word in the at least one word from a pre-constructed semantic knowledge base, and use the morpheme corresponding to each word as a supervision tag of each word in the each word;
the processing unit is further used for inputting the text sample into a network model to obtain a first morpheme of each word in the text sample;
the processing unit is further used for adjusting network parameters of the network model according to the supervision labels and the first morphemes of each word in the text sample to obtain a pre-trained network model;
the processing unit is further used for performing natural language processing by using the pre-training network model.
8. A natural language processing device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps of the method of any of claims 1-6.
9. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which is executed by a processor to implement the method of any of claims 1-6.
CN202011152152.5A 2020-10-23 2020-10-23 Natural language processing method, device and storage medium Active CN112270184B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011152152.5A CN112270184B (en) 2020-10-23 2020-10-23 Natural language processing method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011152152.5A CN112270184B (en) 2020-10-23 2020-10-23 Natural language processing method, device and storage medium

Publications (2)

Publication Number Publication Date
CN112270184A CN112270184A (en) 2021-01-26
CN112270184B (en) 2023-11-14

Family

ID=74341694

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011152152.5A Active CN112270184B (en) 2020-10-23 2020-10-23 Natural language processing method, device and storage medium

Country Status (1)

Country Link
CN (1) CN112270184B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860896A (en) * 2021-03-05 2021-05-28 三一重工股份有限公司 Corpus generalization method and man-machine conversation emotion analysis method for industrial field
CN113569974B (en) * 2021-08-04 2023-07-18 网易(杭州)网络有限公司 Programming statement error correction method, device, electronic equipment and storage medium
CN114048321A (en) * 2021-08-12 2022-02-15 湖南达德曼宁信息技术有限公司 Multi-granularity text error correction data set generation method, device and equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442870A (en) * 2019-08-02 2019-11-12 深圳市珍爱捷云信息技术有限公司 Text error correction method, device, computer equipment and storage medium
CN110598213A (en) * 2019-09-06 2019-12-20 腾讯科技(深圳)有限公司 Keyword extraction method, device, equipment and storage medium
CN111062217A (en) * 2019-12-19 2020-04-24 江苏满运软件科技有限公司 Language information processing method and device, storage medium and electronic equipment
CN111507104A (en) * 2020-03-19 2020-08-07 北京百度网讯科技有限公司 Method and device for establishing label labeling model, electronic equipment and readable storage medium
CN111783451A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Method and apparatus for enhancing text samples

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7899666B2 (en) * 2007-05-04 2011-03-01 Expert System S.P.A. Method and system for automatically extracting relations between concepts included in text
CN108280061B (en) * 2018-01-17 2021-10-26 北京百度网讯科技有限公司 Text processing method and device based on ambiguous entity words

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110442870A (en) * 2019-08-02 2019-11-12 深圳市珍爱捷云信息技术有限公司 Text error correction method, device, computer equipment and storage medium
CN110598213A (en) * 2019-09-06 2019-12-20 腾讯科技(深圳)有限公司 Keyword extraction method, device, equipment and storage medium
CN111062217A (en) * 2019-12-19 2020-04-24 江苏满运软件科技有限公司 Language information processing method and device, storage medium and electronic equipment
CN111507104A (en) * 2020-03-19 2020-08-07 北京百度网讯科技有限公司 Method and device for establishing label labeling model, electronic equipment and readable storage medium
CN111783451A (en) * 2020-06-30 2020-10-16 北京百度网讯科技有限公司 Method and apparatus for enhancing text samples

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Dynamic Semantic Matching and Aggregation Network for Few-shot Intent Detection; Hoang Nguyen et al.; https://arxiv.org/abs/2010.02481; pp. 1-10 *

Also Published As

Publication number Publication date
CN112270184A (en) 2021-01-26

Similar Documents

Publication Publication Date Title
CN112270184B (en) Natural language processing method, device and storage medium
CN109918680B (en) Entity identification method and device and computer equipment
CN109783655B (en) Cross-modal retrieval method and device, computer equipment and storage medium
CN110162669B (en) Video classification processing method and device, computer equipment and storage medium
KR102577514B1 (en) Method, apparatus for text generation, device and storage medium
US10332507B2 (en) Method and device for waking up via speech based on artificial intelligence
WO2021189971A1 (en) Medical plan recommendation system and method based on knowledge graph representation learning
CN112528637B (en) Text processing model training method, device, computer equipment and storage medium
CN108899013B (en) Voice search method and device and voice recognition system
CN111931490B (en) Text error correction method, device and storage medium
CN112100349A (en) Multi-turn dialogue method and device, electronic equipment and storage medium
CN114245203B (en) Video editing method, device, equipment and medium based on script
CN113408287B (en) Entity identification method and device, electronic equipment and storage medium
Tündik et al. Joint word-and character-level embedding CNN-RNN models for punctuation restoration
US20230094730A1 (en) Model training method and method for human-machine interaction
CN113657105A (en) Medical entity extraction method, device, equipment and medium based on vocabulary enhancement
CN109933773A (en) A kind of multiple semantic sentence analysis system and method
TWI752406B (en) Speech recognition method, speech recognition device, electronic equipment, computer-readable storage medium and computer program product
CN117556276B (en) Method and device for determining similarity between text and video
CN116955579B (en) Chat reply generation method and device based on keyword knowledge retrieval
CN114049501A (en) Image description generation method, system, medium and device fusing cluster search
CN111368531B (en) Translation text processing method and device, computer equipment and storage medium
CN115712739B (en) Dance motion generation method, computer device and storage medium
CN114417891B (en) Reply statement determination method and device based on rough semantics and electronic equipment
CN116343747A (en) Speech synthesis method, speech synthesis device, electronic device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant