CN112016281A - Method and device for generating wrong medical text and storage medium - Google Patents


Info

Publication number
CN112016281A
Authority
CN
China
Prior art keywords
medical text
medical
text
neural network
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011135476.8A
Other languages
Chinese (zh)
Other versions
CN112016281B (en)
Inventor
张颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011135476.8A
Publication of CN112016281A
Application granted
Publication of CN112016281B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/226 Validation
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Abstract

The application relates to the field of medical science and technology, and discloses a method, a device, and a storage medium for generating erroneous medical text. The method comprises the following steps: acquiring a plurality of first medical texts, wherein each first medical text in the plurality of first medical texts is a correct medical text; and inputting each first medical text into a trained first neural network to obtain a second medical text corresponding to each first medical text, wherein the second medical text is an erroneous medical text. The method and the device help to enrich the corpus of generated erroneous medical text.

Description

Method and device for generating wrong medical text and storage medium
Technical Field
The application relates to the technical field of text recognition, in particular to a method and a device for generating an error medical text and a storage medium.
Background
Compared with natural-language text in general domains, medical text contains more specialized and transliterated words, such as "atorvastatin" and "metformin sustained-release tablets". Users are therefore more prone to spelling errors when entering such medical text. Moreover, in search systems and dialog systems, erroneous medical text input by a user may cause the text recognition system to fail to understand, or to misunderstand, the user's intent, making it difficult to return the result the user expects.
Therefore, to correctly understand the user's intent, a text error-correction model is trained on training samples. After the user inputs a medical text, the model corrects it to obtain the correct medical text, which can then be passed to a downstream search system or dialog system to output the result the user expects.
However, the training samples used to train the text error-correction model are constructed manually; their corpus is not rich enough and their number is small, so the trained text error-correction model generalizes poorly.
Disclosure of Invention
The embodiments of the present application provide a method, a device, and a storage medium for generating erroneous medical text, which can generate a large number of erroneous medical texts with a rich corpus and improve the generalization capability of a text error-correction model.
In a first aspect, an embodiment of the present application provides a method for generating an incorrect medical text, including:
acquiring a plurality of first medical texts, wherein each first medical text in the plurality of first medical texts is a correct medical text;
and inputting each first medical text into a first neural network which finishes training to obtain a second medical text corresponding to each first medical text, wherein the second medical text is an error medical text.
In a second aspect, an embodiment of the present application provides an apparatus for generating an incorrect medical text, including:
the medical image processing device comprises an acquisition unit, a processing unit and a display unit, wherein the acquisition unit is used for acquiring a plurality of first medical texts, and each first medical text in the plurality of first medical texts is a correct medical text;
and the processing unit is used for inputting each first medical text into the trained first neural network to obtain a second medical text corresponding to each first medical text, wherein the second medical text is an error medical text.
In a third aspect, embodiments of the present application provide an electronic device, including a processor, a memory, a communication interface, and one or more programs, where the one or more programs are stored in the memory and configured to be executed by the processor, and the program includes instructions for performing the steps in the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium, which stores a computer program, where the computer program makes a computer execute the method according to the first aspect.
In a fifth aspect, embodiments of the present application provide a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform the method according to the first aspect.
The embodiment of the application has the following beneficial effects:
It can be seen that, in the embodiments of the present application, a correct medical text can be turned into an erroneous second medical text by the first neural network. Since the number of correct medical texts is relatively large, a large number of erroneous second medical texts can be generated. Moreover, because the erroneous corpora are generated by a neural network rather than by hand-crafted rules, no human bias is introduced, the randomness of second-medical-text generation is improved, and the resulting corpus is rich. In addition, training a text correction model with the second medical texts can improve its recognition accuracy and generalization capability.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a schematic flowchart of a method for generating an error medical text according to an embodiment of the present application;
FIG. 2 is a schematic flowchart of another method for generating an erroneous medical text according to an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart of a method for training a first neural network according to an embodiment of the present disclosure;
fig. 4 is a schematic structural diagram of a first neural network according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a second neural network provided in an embodiment of the present application;
fig. 6 is a schematic structural diagram of an apparatus for generating an error medical text according to an embodiment of the present application;
fig. 7 is a block diagram illustrating functional units of an apparatus for generating an error medical text according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terms "first," "second," "third," and "fourth," etc. in the description and claims of this application and in the accompanying drawings are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not limited to only those steps or elements listed, but may alternatively include other steps or elements not listed, or inherent to such process, method, article, or apparatus.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments.
Referring to fig. 1, fig. 1 is a schematic flowchart of a method for generating an error medical text according to an embodiment of the present application. The method is applied to a device for generating the wrong medical text. The method comprises the following steps:
101: the generation device of the error medical texts acquires a plurality of first medical texts, wherein each first medical text in the plurality of first medical texts is a correct medical text.
Wherein the plurality of first medical texts may be read from various medical servers (e.g., servers of various hospitals) or obtained from a medical text library. The manner in which the first medical text is obtained is not limited in this application.
102: and inputting each first medical text into a first neural network which finishes training to obtain a second medical text corresponding to each first medical text, wherein the second medical text is an error medical text.
Illustratively, the trained first neural network performs a replacement operation on a target word in each first medical text to obtain at least one third medical text, wherein the target word comprises at least one of the following: entity words, similar words, verbs, nouns, and domain-specific terms. The first neural network is trained in advance; its training process is described in detail later and is not elaborated here.
Illustratively, the target word of each first medical text can be encoded by the trained first neural network to obtain the target intent of the target word; then, at least one first intent matching the target intent is queried from a pre-constructed dictionary library; finally, each first intent is decoded by the trained first neural network to obtain at least one word to be replaced corresponding to the at least one first intent, and each word to be replaced is used to replace the target word, obtaining the at least one third medical text. That is, each word to be replaced is substituted for the target word in the first medical text and, together with the unreplaced words, forms a third medical text.
Encoding words through a neural network to obtain their corresponding intents, and decoding intents to obtain corresponding words, are prior art and are not described here.
For example, if the target word is a verb and the first medical text is "I want to take metformin tablets", the intent of the verb "want" matches verbs such as "think of" and "take". Therefore, a third medical text can be obtained by replacing "I want to take metformin tablets" with "I think of taking metformin tablets", "I take metformin tablets", and so on.
In one embodiment of the present application, the target word may be replaced with a similar word corresponding to the target word, where a similar word includes at least one of the following: a word with the same pronunciation as the target word but different characters; or a word with the same meaning as the target word but different characters. Therefore, at least one similar word corresponding to the target word can be queried in the dictionary library and used as a word to be replaced; then each similar word is used to replace the target word, obtaining the at least one third medical text.
For example, suppose the first medical text is "我想吃二甲双胍片" (I want to take metformin tablets) and the target word is "二甲双胍片" (metformin tablets). Because "胍" (guā, guanidine) and "瓜" (guā, melon) have the same pronunciation, "二甲双瓜片" can serve as a similar word of "二甲双胍片", and replacing the latter with the former yields the third medical text "我想吃二甲双瓜片".
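The homophone substitution described above can be sketched as follows. This is a minimal illustration, not the patent's method: `HOMOPHONES` is a tiny hand-built stand-in for the pre-constructed dictionary library, using 二甲双胍片 (metformin tablets), where 胍 and 瓜 share the pronunciation guā.

```python
# Minimal sketch of homophone-based word substitution. HOMOPHONES is an
# illustrative stand-in for the patent's dictionary library, which would be
# built from real medical corpora.
HOMOPHONES = {
    # "胍" (gua1, guanidine) and "瓜" (gua1, melon) sound identical
    "二甲双胍片": ["二甲双瓜片"],
}

def substitute_homophones(text, target_word):
    """Return candidate erroneous texts by swapping target_word for each homophone."""
    candidates = []
    for alt in HOMOPHONES.get(target_word, []):
        candidates.append(text.replace(target_word, alt))
    return candidates

print(substitute_homophones("我想吃二甲双胍片", "二甲双胍片"))
```

A real implementation would look up homophones by pinyin rather than hard-code them.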
It should be understood that, in practical applications, words matching the intent of the target word and similar words corresponding to the target word may be retrieved from the dictionary library simultaneously, and all of them used as the words to be replaced for the target word.
It can be understood that, if there are multiple target words, the at least one third medical text can be obtained by combining the words to be replaced corresponding to each target word. Illustratively, if there are two target words, with two words to be replaced for the first and three for the other, six third medical texts can be combined.
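The combination step above (two candidates for one target word, three for the other, giving six third medical texts) can be sketched with a Cartesian product. The texts and target words here are invented placeholders:

```python
from itertools import product

def combine_replacements(text, replacements):
    """replacements: dict mapping each target word to its candidate substitutes.
    Returns one erroneous text per combination of substitutes (Cartesian product)."""
    words = list(replacements)
    results = []
    for combo in product(*(replacements[w] for w in words)):
        out = text
        for word, sub in zip(words, combo):
            out = out.replace(word, sub)
        results.append(out)
    return results

texts = combine_replacements(
    "A B",  # hypothetical text containing two target words "A" and "B"
    {"A": ["A1", "A2"], "B": ["B1", "B2", "B3"]},
)
print(len(texts))  # 2 candidates x 3 candidates = 6 combined texts
```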
Further, a third medical text is selected from the at least one third medical text as a second medical text corresponding to the first medical text.
For example, in daily life, wrong medical texts input by users are random and various. Therefore, in order to simulate a scene in which the user inputs an incorrect medical text, one third medical text may be randomly selected from the at least one third medical text as the second medical text, so that the second medical text generated by the trained first neural network may be matched with the scene in which the user inputs the incorrect medical text, and the generated incorrect medical text may have randomness.
For example, in the process of generating each third medical text, a score corresponding to each third medical text may be further generated, and the score corresponding to each third medical text is used for representing the similarity between the third medical text and the first medical text. That is, the higher the score is, the more similar the third medical text is to the first medical text, and the easier it is for the user to incorrectly input the first medical text as the third medical text. Therefore, the third medical text with the highest score can be used as the second medical text, and further the input habit of the user can be simulated to obtain the medical text which is most frequently mistakenly input by the user.
For example, in order to ensure the randomness of the occurrence of the erroneous medical texts, the score corresponding to each third medical text may be summed with a random number to obtain a final score corresponding to each third medical text, where the random number corresponding to each third medical text is generated by a random function. And finally, taking the third medical text with the maximum final score as the second medical text.
It can be seen that, since the random number generated by the random function has randomness, the final score corresponding to each third medical text is also randomly generated, for example, the original score of a certain third medical text is the lowest score, but may be summed with a random number with a larger value, so that the final score of the third medical text becomes the maximum. Therefore, the second medical text selected by the random number summation mode is not the medical text which is most prone to error input, but is randomly generated, so that the corpus of the second medical text is richer.
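The selection scheme just described (similarity score plus a random number, then arg-max) can be sketched as follows; the candidate texts and scores are invented for illustration:

```python
import random

def pick_second_text(scored_texts, rng=random.Random(0)):
    """scored_texts: list of (text, similarity_score) pairs.
    Adds a random perturbation to each score and returns the arg-max, so the
    selection is biased toward similar texts but remains random."""
    best_text, best_final = None, float("-inf")
    for text, score in scored_texts:
        final = score + rng.random()  # final score = original score + random number
        if final > best_final:
            best_text, best_final = text, final
    return best_text

candidates = [("text_a", 0.9), ("text_b", 0.3), ("text_c", 0.7)]
print(pick_second_text(candidates))
```

Because the perturbation is drawn in [0, 1), a low-scoring text can still win when the score gap is small, which is exactly the richness argument made above.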
As can be seen, in the embodiment of the present application, an erroneous medical text (the second medical text) corresponding to each first medical text is generated by the trained first neural network. Since the number of correct medical texts (first medical texts) is relatively large, a large number of erroneous medical texts can be obtained after processing by the trained first neural network. In the prior art, certain words are replaced manually, so the generated erroneous medical texts are shaped by human experience (for example, transliteration replacement) and are not rich enough. In this application, by contrast, the erroneous medical texts are generated by the trained first neural network without injecting human experience, and the second medical text can be selected in multiple ways, which ensures randomness of generation and a richer corpus. A text error-correction model trained with these erroneous medical texts therefore has high recognition accuracy and strong generalization capability.
Referring to fig. 2, fig. 2 is a schematic flowchart of another method for generating an error medical text according to an embodiment of the present application. The same contents in this embodiment as those in the embodiment shown in fig. 1 will not be repeated here. The method comprises the following steps:
201: obtaining a plurality of first medical texts, wherein each first medical text in the plurality of first medical texts is a correct medical text.
202: and inputting each first medical text into a first neural network which finishes training to obtain a second medical text corresponding to each first medical text, wherein the second medical text is an error medical text.
203: adding a second training label to a second medical text corresponding to each first medical text, and forming a second training sample by using the second training label and the second medical text corresponding to the second training label, wherein the second training label is used for indicating that the second medical text is an error medical text.
204: training a second neural network using the second training samples.
For example, the second training sample may be input into the second neural network to obtain a prediction result, i.e., whether the second training sample is predicted to be a correct or an erroneous medical text. The prediction result is then compared with the second training label of the second training sample (namely, erroneous medical text) to obtain a loss; the network parameters of the second neural network are adjusted according to this loss and gradient descent until the second neural network converges, completing its training.
Thus, the trained second neural network is a sentence-level classification network; that is, it can determine, as a whole, whether a medical text to be recognized is a correct medical text or an erroneous one.
It should be understood that the second training samples are only part of the samples (i.e., negative samples) used to train the second neural network. Positive samples (i.e., correct medical texts) are also needed during training; their use is similar to the training process with the second training samples described above and is not repeated.
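The sentence-level training loop of steps 203 and 204 can be illustrated with a toy stand-in: a logistic-regression classifier over character counts trained by gradient descent, rather than a real neural network. The sample texts, learning rate, and epoch count are all invented for illustration.

```python
import math

def train_binary_classifier(samples, epochs=200, lr=0.5):
    """Toy stand-in for the second neural network: logistic regression over
    character-count features, trained by gradient descent.
    samples: list of (text, label) with label 1 = correct, 0 = erroneous."""
    vocab = sorted({ch for text, _ in samples for ch in text})
    idx = {ch: i for i, ch in enumerate(vocab)}
    w = [0.0] * len(vocab)
    bias = 0.0

    def features(text):
        v = [0.0] * len(vocab)
        for ch in text:
            if ch in idx:
                v[idx[ch]] += 1.0
        return v

    def predict(text):
        x = features(text)
        z = bias + sum(wi * xi for wi, xi in zip(w, x))
        return 1.0 / (1.0 + math.exp(-z))  # probability the text is correct

    for _ in range(epochs):
        for text, label in samples:
            p = predict(text)
            g = p - label  # gradient of the cross-entropy loss w.r.t. the logit
            x = features(text)
            for i in range(len(w)):
                w[i] -= lr * g * x[i]
            bias -= lr * g
    return predict

# One positive (correct) and one negative (erroneous) training sample.
predict = train_binary_classifier([("abc", 1), ("abx", 0)])
```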
205: acquiring a medical text to be recognized, inputting the medical text to be recognized into a second neural network which completes training, classifying the medical text to be recognized, and determining whether the medical text to be recognized is a correct medical text.
The medical text to be recognized may be a medical text input by a user in a search box or a medical text input in a dialog box of the instant messaging tool.
It should be understood that the present application does not limit the manner in which the medical text to be recognized is obtained.
For example, a first probability that the medical text to be recognized is a correct medical text and a second probability that the medical text to be recognized is an incorrect medical text can be determined through the second neural network; based on the first probability and the second probability, it can be determined whether the medical text to be recognized is a correct medical text or an incorrect medical text. For example, if the first probability is greater than a first threshold, the medical text to be recognized is determined to be the correct medical text.
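One way the first and second probabilities and the threshold decision might be computed, assuming the second neural network emits one logit per class (an assumption; the patent does not specify the output head):

```python
import math

def classify(logit_correct, logit_error, threshold=0.5):
    """Turn two class logits into the first and second probabilities via a
    numerically stable softmax, then apply the first-probability threshold."""
    m = max(logit_correct, logit_error)
    e1 = math.exp(logit_correct - m)
    e2 = math.exp(logit_error - m)
    first_prob = e1 / (e1 + e2)   # probability the text is correct
    second_prob = e2 / (e1 + e2)  # probability the text is erroneous
    label = "correct" if first_prob > threshold else "erroneous"
    return first_prob, second_prob, label
```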
206: and under the condition that the medical text to be recognized is determined to be the correct medical text, directly outputting the medical text to be recognized.
It is understood that, in the case where the medical text to be recognized is determined to be the correct medical text, the medical text to be recognized may be directly output so that the next-level application uses the medical text to be recognized. For example, the text to be recognized may be directly used for searching a web page or completing the transmission of dialog information, and so on.
207: and under the condition that the medical text to be recognized is determined to be an erroneous medical text, correcting the medical text to be recognized, and outputting the corrected medical text.
It is understood that, in the case that the medical text to be recognized is determined to be an incorrect medical text, the text to be recognized may be corrected, and the corrected medical text (correct medical text) may be output so that the next-level application may use the correct medical text.
An existing text error-correction model may be used to correct the erroneous medical text; for example, a BERT-based model can be used. The error-correction process is not described in detail here.
In practical applications, users input correct medical text far more often than erroneous medical text. If every medical text input by a user were corrected word by word, correction would take a long time; for a user who inputs correct medical text this delays search or dialog and degrades the user experience.
It can be seen that, in the embodiment of the application, the second neural network determines whether the medical text to be recognized is correct as a whole, i.e., at sentence level; if it is correct, no error correction is needed. Since this sentence-level decision takes far less time than word-by-word correction, the search and dialog efficiency of users who input correct medical text is not affected, improving the user experience. For users who input erroneous medical text, the text is corrected so that the next-level application can use the correct medical text and fulfill the user's expectation and intent.
In one embodiment of the present application, the method for generating the error medical text may be applied to the field of smart medical technology. For example, the second neural network which completes training is used for detecting and correcting errors of medical texts input by doctors, so that the doctors are ensured to input correct medical texts, the precision of diagnosis results is further ensured, and the development of medical science and technology is promoted.
Referring to fig. 3, fig. 3 is a schematic flowchart of a process for training a first neural network according to an embodiment of the present disclosure.
301: and acquiring a first training sample and a first training label, wherein the first training sample is a correct medical text, and the first training label is a labeled wrong medical text corresponding to the second training sample.
Because the first neural network is a word-replacement network, relatively few training samples are needed. Therefore, the first training sample and first training label can be obtained from historical medical data: the first training label is a historical record in which the user intended to input the first training sample but mistyped it as the first training label. For example, the user intended to input "二甲双胍片" (metformin tablets) but mistakenly input "二甲双瓜片". Then "二甲双胍片" is the first training sample and "二甲双瓜片" is its first training label.
302: and inputting the first training sample into the first neural network to obtain a fourth medical text.
The first training sample is converted through the first neural network, and a fourth medical text, namely an error medical text, is obtained.
303: and adjusting the network parameters of the first neural network according to the fourth medical text and the first training label so as to train the first neural network.
Illustratively, a first loss is obtained according to the difference between the fourth medical text and the first training label, where the first loss can be expressed by formula (1):

$L_1 = \mathrm{dist}(y, \hat{y})$  (1)

where $L_1$ is the first loss, $y$ is the first training label, $\hat{y}$ is the fourth medical text, and $\mathrm{dist}(\cdot,\cdot)$ is a distance operation.
Then, a second loss is obtained according to the difference between the intent of each target word in the fourth medical text and the intent of the corresponding target word in the first training sample, where the second loss can be expressed by formula (2):

$L_2 = \sum_{i=1}^{N} w_i \, \mathrm{dist}(q_i, \hat{q}_i)$  (2)

where $L_2$ is the second loss, $N$ is the number of target words, $w_i$ is the weighting coefficient of the $i$-th target word, $q_i$ is the intent of the $i$-th target word in the first training sample, and $\hat{q}_i$ is the intent of the $i$-th target word in the fourth medical text; the $i$-th target word in the first training sample corresponds to the $i$-th target word in the fourth medical text.
Finally, the first loss and the second loss are weighted to obtain a final loss; the network parameters of the first neural network are adjusted according to the final loss and gradient descent until the first neural network converges, completing its training.
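Formulas (1) and (2) and their weighted combination can be sketched as follows. This is illustrative only: `dist` here is a toy token-mismatch count standing in for whatever distance the patent intends, and `alpha`/`beta` are invented weighting coefficients.

```python
def combined_loss(y_true, y_pred, intents_true, intents_pred, weights,
                  alpha=1.0, beta=1.0):
    """Weighted sum of the two losses from formulas (1) and (2).
    dist() is a toy mismatch count, purely for illustration."""
    def dist(a, b):
        # Count position-wise mismatches plus any length difference.
        return sum(1 for x, y in zip(a, b) if x != y) + abs(len(a) - len(b))

    l1 = dist(y_true, y_pred)                      # formula (1)
    l2 = sum(w * dist(t, p)                        # formula (2)
             for w, t, p in zip(weights, intents_true, intents_pred))
    return alpha * l1 + beta * l2
```

In training, this final loss would be minimized by gradient descent over the first neural network's parameters, as described above.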
In one embodiment of the present application, the first neural network and the second neural network may be basic text recognition models, such as convolutional neural networks, recurrent neural networks, Transformers, and the like. They may be the same type of neural network or different types, which is not limited in this application.
In addition, the second neural network and the first neural network may be trained end to end or trained separately; this application is not limited in this respect, and separate training is described as an example.
The following describes an example of a training process for the first neural network and the second neural network in conjunction with the network structure of the first neural network and the second neural network.
As shown in fig. 4, the first neural network includes an encoder, an intermediate layer, and a decoder; the encoder includes a plurality of convolutional layers and pooling layers, and the decoder includes a plurality of deconvolution layers and pooling layers. The first training sample is input to the encoder for encoding to obtain a first feature vector; the intermediate layer processes it into a second feature vector; the second feature vector is input to the decoder for decoding, which outputs the fourth medical text; finally, the network parameters of the first neural network are adjusted according to the fourth medical text and the first training label.
As shown in fig. 5, the second neural network includes a plurality of convolutional layers and pooling layers. It can be seen that the second training sample is input into the second neural network, and after convolution and pooling, a prediction result is output; and adjusting the network parameters of the second neural network according to the prediction result and the second training label of the second training sample.
Referring to fig. 6, fig. 6 is a schematic structural diagram of an apparatus for generating an error medical text according to an embodiment of the present application. As shown in fig. 6, the apparatus 600 for generating an incorrect medical text includes a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the program including instructions for:
acquiring a plurality of first medical texts, wherein each first medical text in the plurality of first medical texts is a correct medical text;
and inputting each first medical text into a first neural network which finishes training to obtain a second medical text corresponding to each first medical text, wherein the second medical text is an error medical text.
In some possible embodiments, in terms of inputting each first medical text into the trained first neural network to obtain the second medical text corresponding to each first medical text, the above program is specifically configured to execute instructions for the following steps:
performing replacement operation on a target word in each first medical text through the trained first neural network to obtain at least one third medical text, wherein the target word comprises at least one of the following words: entity words, adjectives, verbs, nouns and similar words in the first medical text;
and obtaining a second medical text corresponding to each first medical text according to at least one third medical text corresponding to each first medical text.
In some possible embodiments, in terms of obtaining, according to the at least one third medical text corresponding to each first medical text, the second medical text corresponding to each first medical text, the above program is specifically configured to execute instructions for the following steps:
randomly selecting one third medical text from at least one third medical text corresponding to each first medical text as the second medical text corresponding to each first medical text.
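A minimal sketch of this random-selection embodiment, assuming the third medical texts have already been generated (the candidate strings below are hypothetical):

```python
import random

# Each first medical text has several candidate third medical texts;
# one is drawn uniformly at random as the second medical text.
third_texts = [
    "patient has a chills",
    "patient has a cough",
    "patient has a headache",
]
second_text = random.choice(third_texts)
```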
In some possible embodiments, each third medical text in the at least one third medical text corresponds to a score, and the score corresponding to each third medical text is used for representing the similarity between that third medical text and the first medical text; in terms of obtaining, according to the at least one third medical text corresponding to each first medical text, the second medical text corresponding to each first medical text, the above program is specifically configured to execute instructions for the following steps:
summing the score corresponding to each third medical text with a random number to obtain a final score corresponding to each third medical text, wherein the random number corresponding to each third medical text is generated through a random function;
and according to the final score corresponding to each third medical text, taking the third medical text with the maximum final score as the second medical text corresponding to each first medical text.
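A sketch of this scored-selection embodiment, assuming each candidate arrives as a (text, similarity score) pair; the scores, texts, and fixed seed are illustrative only.

```python
import random

def pick_second_text(scored_third_texts, seed=None):
    # Sum each similarity score with a random number, then take the
    # third medical text whose final score is largest.
    rng = random.Random(seed)
    finals = [(text, score + rng.random()) for text, score in scored_third_texts]
    return max(finals, key=lambda pair: pair[1])[0]

candidates = [
    ("patient has a chills", 0.9),
    ("patient has a cough", 0.7),
    ("patient has a headache", 0.4),
]
second_text = pick_second_text(candidates, seed=0)
```

The random term keeps the selection from always returning the most similar candidate, so repeated runs over the same first medical text can yield different erroneous variants.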
In some possible embodiments, in terms of performing a replacement operation on the target word in each first medical text through the trained first neural network to obtain the at least one third medical text, the above program is specifically configured to execute instructions for the following steps:
coding the target words of each first medical text through the trained first neural network to obtain target intentions corresponding to the target words;
obtaining at least one first intention matched with the target intention from a dictionary library;
decoding each first intention through the second neural network to obtain at least one word to be replaced corresponding to the at least one first intention;
and respectively using each word to be replaced in the at least one word to be replaced to replace the target word to obtain the at least one third medical text.
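The four steps above can be sketched as follows. Since the intention encoding and decoding networks are not specified in detail, a plain dictionary lookup stands in for the encode-match-decode chain; the dictionary contents and texts are hypothetical.

```python
# Hypothetical stand-in for the encode/match/decode chain: the target
# word maps directly to the words to be replaced that the matched
# first intentions would decode to.
INTENTION_DICTIONARY = {
    "fever": ["chills", "cough", "headache"],
}

def replace_target_word(first_text, target_word):
    words_to_replace = INTENTION_DICTIONARY.get(target_word, [])
    # Each word to be replaced yields one third medical text.
    return [first_text.replace(target_word, word) for word in words_to_replace]

third_texts = replace_target_word("patient reports mild fever", "fever")
```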
In some possible embodiments, prior to the acquiring of the plurality of first medical texts, the above program is further configured to execute instructions for the following steps:
acquiring a first training sample and a first training label, wherein the first training sample is a correct medical text, and the first training label is a labeled wrong medical text corresponding to the first training sample;
inputting the first training sample into the first neural network to obtain a fourth medical text;
and adjusting the network parameters of the first neural network according to the fourth medical text and the first training label so as to train the first neural network.
In some possible embodiments, the above program is further configured to execute instructions for the following steps:
adding a second training label to a second medical text corresponding to each first medical text, and forming a second training sample by using the second training label and the second medical text corresponding to the second training label, wherein the second training label is used for indicating that the second medical text is an error medical text;
training a second neural network using the second training samples;
acquiring a medical text to be recognized, inputting the medical text to be recognized into the trained second neural network to classify the medical text to be recognized, and determining whether the medical text to be recognized is a correct medical text;
under the condition that the medical text to be recognized is determined to be the correct medical text, directly outputting the medical text to be recognized;
and under the condition that the medical text to be recognized is determined to be an erroneous medical text, correcting the medical text to be recognized, and outputting the corrected medical text.
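The classify-then-output-or-correct flow above can be sketched end to end. The classifier and corrector below are stubs standing in for the trained second neural network and the correction step; the error table is hypothetical.

```python
# Hypothetical table mapping known erroneous texts to corrections;
# it stands in for the trained second neural network and corrector.
KNOWN_ERRORS = {"patient has a chills": "patient has chills"}

def is_correct(text):
    # Stub for the second network's binary classification.
    return text not in KNOWN_ERRORS

def process(text):
    # Output directly if correct; otherwise correct first, then output.
    return text if is_correct(text) else KNOWN_ERRORS[text]
```

With these stubs, `process("patient has chills")` is output unchanged, while `process("patient has a chills")` is corrected before being output.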
Referring to fig. 7, fig. 7 is a block diagram illustrating functional units of an apparatus for generating an error medical text according to an embodiment of the present application. The apparatus 700 for generating an error medical text includes: an obtaining unit 710 and a processing unit 720, wherein:
an obtaining unit 710, configured to obtain a plurality of first medical texts, where each of the plurality of first medical texts is a correct medical text;
the processing unit 720 is configured to input each first medical text into the trained first neural network, and obtain a second medical text corresponding to each first medical text, where the second medical text is an incorrect medical text.
In some possible embodiments, in inputting each first medical text into the trained first neural network to obtain a second medical text corresponding to each first medical text, the processing unit 720 is specifically configured to:
performing replacement operation on a target word in each first medical text through the trained first neural network to obtain at least one third medical text, wherein the target word comprises at least one of the following words: entity words, adjectives, verbs, nouns and similar words in the first medical text;
and obtaining a second medical text corresponding to each first medical text according to at least one third medical text corresponding to each first medical text.
In some possible embodiments, in obtaining, according to the at least one third medical text corresponding to each first medical text, a second medical text corresponding to each first medical text, the processing unit 720 is specifically configured to:
randomly selecting one third medical text from at least one third medical text corresponding to each first medical text as the second medical text corresponding to each first medical text.
In some possible embodiments, each third medical text in the at least one third medical text corresponds to a score, and the score corresponding to each third medical text is used for representing the similarity between that third medical text and the first medical text; in terms of obtaining, according to the at least one third medical text corresponding to each first medical text, the second medical text corresponding to each first medical text, the processing unit 720 is specifically configured to:
summing the score corresponding to each third medical text with a random number to obtain a final score corresponding to each third medical text, wherein the random number corresponding to each third medical text is generated through a random function;
and according to the final score corresponding to each third medical text, taking the third medical text with the maximum final score as the second medical text corresponding to each first medical text.
In some possible embodiments, in terms of performing a replacement operation on the target word in each first medical text through the trained first neural network to obtain the at least one third medical text, the processing unit 720 is specifically configured to:
coding the target words of each first medical text through the trained first neural network to obtain target intentions corresponding to the target words;
obtaining at least one first intention matched with the target intention from a dictionary library;
decoding each first intention through the second neural network to obtain at least one word to be replaced corresponding to the at least one first intention;
and respectively using each word to be replaced in the at least one word to be replaced to replace the target word to obtain the at least one third medical text.
In some possible embodiments, before the acquiring the plurality of first medical texts, the acquiring unit 710 is further configured to:
acquiring a first training sample and a first training label, wherein the first training sample is a correct medical text, and the first training label is a labeled wrong medical text corresponding to the first training sample;
the processing unit 720 is further configured to input the first training sample into the first neural network, so as to obtain a fourth medical text; and adjusting the network parameters of the first neural network according to the fourth medical text and the first training label so as to train the first neural network.
In some possible embodiments, the processing unit 720 is further configured to:
adding a second training label to a second medical text corresponding to each first medical text, and forming a second training sample by using the second training label and the second medical text corresponding to the second training label, wherein the second training label is used for indicating that the second medical text is an error medical text;
training a second neural network using the second training samples;
acquiring a medical text to be recognized, inputting the medical text to be recognized into the trained second neural network to classify the medical text to be recognized, and determining whether the medical text to be recognized is a correct medical text;
under the condition that the medical text to be recognized is determined to be the correct medical text, directly outputting the medical text to be recognized;
and under the condition that the medical text to be recognized is determined to be an erroneous medical text, correcting the medical text to be recognized, and outputting the corrected medical text.
Embodiments of the present application also provide a computer storage medium, which stores a computer program, where the computer program is executed by a processor to implement part or all of the steps of any one of the methods for generating an incorrect medical text as described in the above method embodiments.
Embodiments of the present application also provide a computer program product comprising a non-transitory computer readable storage medium storing a computer program operable to cause a computer to perform some or all of the steps of any one of the methods for generating an erroneous medical text as set forth in the above method embodiments.
It should be understood that the device for generating the wrong medical text in the present application may include a smart phone (e.g., an Android phone, an iOS phone, etc.), a tablet computer, a palmtop computer, a notebook computer, a mobile Internet device (MID), a wearable device, or the like. The foregoing devices are merely examples rather than an exhaustive list; the device for generating the wrong medical text includes, but is not limited to, the above devices. In practical applications, the device for generating the wrong medical text may further include an intelligent vehicle-mounted terminal, a computer device, and the like.
It will be appreciated that, while for simplicity of description the foregoing method embodiments are described as a series of acts, those skilled in the art will appreciate that the present application is not limited by the order of acts described, as some steps may, in accordance with the present application, be performed in other orders or concurrently. Further, those skilled in the art should also appreciate that the embodiments described in the specification are exemplary embodiments, and that the acts and modules involved are not necessarily required by the present application.
In the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative; for instance, the division of the units is only a division of logical functions, and in actual implementation there may be other manners of division; for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit may be implemented in the form of hardware, or may be implemented in the form of a software program module.
If the integrated units are implemented in the form of software program modules and sold or used as stand-alone products, they may be stored in a computer-readable memory. Based on such understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product, which is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes: a USB flash drive, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and other media capable of storing program code.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing related hardware, and the program may be stored in a computer-readable memory, which may include: a flash memory disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and the like.
The embodiments of the present application are described in detail above; specific examples are used herein to illustrate the principles and implementations of the present application, and the description of the above embodiments is only provided to help understand the method and the core idea of the present application. Meanwhile, a person skilled in the art may, according to the idea of the present application, make changes to the specific embodiments and the application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A method for generating a wrong medical text, comprising:
acquiring a plurality of first medical texts, wherein each first medical text in the plurality of first medical texts is a correct medical text;
and inputting each first medical text into a first neural network which finishes training to obtain a second medical text corresponding to each first medical text, wherein the second medical text is an error medical text.
2. The method of claim 1, wherein the inputting each first medical text into the trained first neural network to obtain a second medical text corresponding to each first medical text comprises:
performing replacement operation on a target word in each first medical text through the trained first neural network to obtain at least one third medical text, wherein the target word comprises at least one of the following words: entity words, adjectives, verbs, nouns and similar words in the first medical text;
and obtaining a second medical text corresponding to each first medical text according to at least one third medical text corresponding to each first medical text.
3. The method according to claim 2, wherein the obtaining of the second medical text corresponding to each first medical text from the at least one third medical text corresponding to each first medical text comprises:
randomly selecting one third medical text from at least one third medical text corresponding to each first medical text as the second medical text corresponding to each first medical text.
4. The method according to claim 2, wherein each of the at least one third medical texts corresponds to a score, and the score corresponding to each third medical text is used for representing the similarity between each third medical text and the first medical text; obtaining a second medical text corresponding to each first medical text according to at least one third medical text corresponding to each first medical text, including:
summing the score corresponding to each third medical text with a random number to obtain a final score corresponding to each third medical text, wherein the random number corresponding to each third medical text is generated through a random function;
and according to the final score corresponding to each third medical text, taking the third medical text with the maximum final score as the second medical text corresponding to each first medical text.
5. The method according to any one of claims 2-4, wherein the performing, by the trained first neural network, a replacement operation on the target word in each of the first medical texts to obtain at least one third medical text comprises:
coding the target words of each first medical text through the trained first neural network to obtain target intentions corresponding to the target words;
obtaining at least one first intention matched with the target intention from a dictionary library;
decoding each first intention through a second neural network to obtain at least one word to be replaced corresponding to the at least one first intention;
and respectively using each word to be replaced in the at least one word to be replaced to replace the target word to obtain the at least one third medical text.
6. The method of claim 5, wherein prior to said obtaining a plurality of first medical texts, the method further comprises:
acquiring a first training sample and a first training label, wherein the first training sample is a correct medical text, and the first training label is a labeled wrong medical text corresponding to the first training sample;
inputting the first training sample into the first neural network to obtain a fourth medical text;
and adjusting the network parameters of the first neural network according to the fourth medical text and the first training label so as to train the first neural network.
7. The method of claim 6, further comprising:
adding a second training label to a second medical text corresponding to each first medical text, and forming a second training sample by using the second training label and the second medical text corresponding to the second training label, wherein the second training label is used for indicating that the second medical text is an error medical text;
training a second neural network using the second training samples;
acquiring a medical text to be recognized, inputting the medical text to be recognized into the trained second neural network to classify the medical text to be recognized, and determining whether the medical text to be recognized is a correct medical text;
under the condition that the medical text to be recognized is determined to be the correct medical text, directly outputting the medical text to be recognized;
and under the condition that the medical text to be recognized is determined to be an erroneous medical text, correcting the medical text to be recognized, and outputting the corrected medical text.
8. An apparatus for generating a wrong medical text, comprising:
the medical image processing device comprises an acquisition unit, a processing unit and a display unit, wherein the acquisition unit is used for acquiring a plurality of first medical texts, and each first medical text in the plurality of first medical texts is a correct medical text;
and the processing unit is used for inputting each first medical text into the trained first neural network to obtain a second medical text corresponding to each first medical text, wherein the second medical text is an error medical text.
9. An apparatus for generating an incorrect medical text, comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs comprising instructions for performing the steps in the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method according to any one of claims 1-7.
CN202011135476.8A 2020-10-22 2020-10-22 Method and device for generating wrong medical text and storage medium Active CN112016281B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011135476.8A CN112016281B (en) 2020-10-22 2020-10-22 Method and device for generating wrong medical text and storage medium


Publications (2)

Publication Number Publication Date
CN112016281A true CN112016281A (en) 2020-12-01
CN112016281B CN112016281B (en) 2021-02-05

Family

ID=73527984

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011135476.8A Active CN112016281B (en) 2020-10-22 2020-10-22 Method and device for generating wrong medical text and storage medium

Country Status (1)

Country Link
CN (1) CN112016281B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080037379A1 (en) * 2006-03-31 2008-02-14 Shinichiro Arakawa Optical disk device and method for generating random number data
US20170235721A1 (en) * 2016-02-17 2017-08-17 The King Abdulaziz City For Science And Technology Method and system for detecting semantic errors in a text using artificial neural networks
CN107357775A (en) * 2017-06-05 2017-11-17 百度在线网络技术(北京)有限公司 The text error correction method and device of Recognition with Recurrent Neural Network based on artificial intelligence
CN107705784A (en) * 2017-09-28 2018-02-16 百度在线网络技术(北京)有限公司 Text regularization model training method and device, text regularization method and device
US20200192983A1 (en) * 2018-12-17 2020-06-18 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and device for correcting error in text
CN111523306A (en) * 2019-01-17 2020-08-11 阿里巴巴集团控股有限公司 Text error correction method, device and system
CN111695356A (en) * 2020-05-28 2020-09-22 平安科技(深圳)有限公司 Synonym corpus generation method, synonym corpus generation device, computer system and readable storage medium


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113342271A (en) * 2021-06-04 2021-09-03 上海蓝色帛缔智能工程有限公司 Intelligent regional acquisition medical data storage method and electronic equipment
CN113342271B (en) * 2021-06-04 2023-02-21 上海蓝色帛缔智能工程有限公司 Intelligent regional acquisition medical data storage method and electronic equipment

Also Published As

Publication number Publication date
CN112016281B (en) 2021-02-05


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant