CN111414750A - Synonymy distinguishing method, device, equipment and storage medium for entries - Google Patents

Info

Publication number: CN111414750A
Authority
CN
China
Prior art keywords: synonymy, layer, entry, characteristic information, pair
Prior art date
Legal status: Granted
Application number: CN202010190072.2A
Other languages: Chinese (zh)
Other versions: CN111414750B (en)
Inventors: 郭辉, 徐伟建, 史亚冰, 罗雨, 彭卫华
Current Assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202010190072.2A
Publication of CN111414750A
Application granted; publication of CN111414750B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G06N 3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A: TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00: Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/10: Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a synonymy determination method, device, equipment and storage medium for entries, relating to the technical field of knowledge graphs. The specific implementation scheme is as follows: acquire feature information of an entry pair to be subjected to synonymy determination; input the feature information of the entry pair into a trained neural network model to obtain a synonymy determination result for the entry pair. In this embodiment, the structure, parameters and features of the pre-training layer are transferred directly (knowledge transfer), and the pre-training layer is used to learn the feature information of the entry pair. This reduces the amount of labeling required during training, saves a large amount of resources and manpower, further improves the efficiency of synonymy determination, and improves the accuracy of the determination result.

Description

Synonymy distinguishing method, device, equipment and storage medium for entries
Technical Field
The application relates to computer technology, and in particular to the technical field of knowledge graphs.
Background
In some industries, the terms describing the industry's entities have many aliases, and spoken descriptions vary widely. For example, in the medical field, entities for examinations, operations, clinical findings, medicines, diseases, etc. have many standard names and alternative names: "common cold" and "upper respiratory infection" describe the same disease, and pregnancy-induced hypertension likewise goes by several names that describe the same disease.
When an intelligent project is deployed, the standard name and the aliases of the same entity need to be unified so that the project can run successfully. At present, the synonymy relation between entries is mainly confirmed by manual review and labeling, and synonymous entries are unified on that basis.
Manual review and labeling requires a large amount of manpower, takes a long time, and can delay the deployment of the related intelligent projects.
Disclosure of Invention
The embodiment of the application provides a synonymy determination method, device, equipment and storage medium for entries, so as to improve the efficiency of entry discrimination, save human resources, and accelerate the deployment of related intelligent projects.
In a first aspect, an embodiment of the present application provides a synonymy determination method for an entry, including:
Acquiring characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Inputting the characteristic information of the entry pair into a trained neural network model to obtain a synonymy judgment result of the entry pair;
Wherein the neural network model comprises a pre-training layer, a fine-tuning layer and an output layer; the pre-training layer is trained in advance on language understanding tasks using natural language training samples, and is used for learning the feature information of the entry pair to obtain language understanding information of the entry pair; the fine-tuning layer is used for performing feature extraction and fusion on the language understanding information to obtain a feature representation of whether the entry pair is synonymous; and the output layer is used for obtaining the synonymy determination result according to the feature representation.
According to the embodiment of the application, synonymy determination is performed by inputting the feature information of the entry pair into a trained neural network model, so that the determination combines the deep learning capability of the neural network model at the feature level; this improves the efficiency of synonymy determination, saves human resources, and accelerates the deployment of related intelligent projects. In this embodiment, the pre-training layer is obtained by training language understanding tasks in advance on natural language training samples; because the natural language training samples and the entry pairs share the same feature distribution, the structure, parameters and features of the pre-training layer can be transferred directly, and the pre-training layer is used to learn the feature information of the entry pair, reducing the amount of labeling required during training, saving a large amount of resources and manpower, and further improving the efficiency of synonymy determination. The language understanding information accurately reflects whether the entry pair is synonymous; the fine-tuning layer then performs feature extraction and fusion on the language understanding information to obtain an accurate feature representation of whether the entry pair is synonymous, and the output layer obtains the synonymy determination result from that representation, improving the accuracy of the result.
Optionally, the pre-training layer is a neural network structure trained on multiple language understanding tasks; the multiple language understanding tasks comprise lexical-level tasks, grammar-level tasks and semantic-level tasks, and the language understanding information comprises lexical information, grammatical information and semantic information.
In an optional implementation of the application, the pre-training layer understands the information contained in the entry pair at three levels (lexical, grammatical and semantic), greatly enhancing its general semantic representation capability, so that a more accurate synonymy determination result is obtained through the neural network model.
Optionally, the fine-tuning layer includes a convolutional layer, a pooling layer, and a fully-connected layer;
The convolutional layer is used for performing feature extraction on the language understanding information, the pooling layer is used for reducing the dimensionality of the extracted features, and the fully-connected layer is used for fusing the dimension-reduced features to obtain a feature representation of whether the entry pair is synonymous.
In an optional implementation of the above application, the fine-tuning layer is implemented by the convolutional layer, the pooling layer and the fully-connected layer; the implementation is simple and effective. When the neural network model is trained, different features can be flexibly extracted by adjusting the parameters of the convolutional layer, the pooling layer and the fully-connected layer, and the features can be fused across different dimensions, so that an accurate feature representation of whether the entries are synonymous can be obtained for every type of entry pair in every field.
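As a rough sketch, the convolution, pooling and fully-connected fusion steps of such a fine-tuning layer can be written in a few lines of NumPy. The shapes, kernel sizes and random weights below are illustrative assumptions, not the patent's actual configuration.

```python
import numpy as np

def conv1d_features(x, kernels):
    """Slide each kernel over the sequence of language-understanding vectors,
    yielding one one-dimensional feature vector per kernel (valid convolution)."""
    seq_len = x.shape[0]
    feats = []
    for k in kernels:  # each kernel k has shape (width, dim)
        width = k.shape[0]
        vals = [float(np.sum(x[i:i + width] * k))
                for i in range(seq_len - width + 1)]
        feats.append(np.array(vals))
    return feats

def max_pool(feats):
    """Dimensionality reduction: keep only the maximum of each feature
    vector, concatenating the maxima into one fixed-size vector."""
    return np.array([f.max() for f in feats])

def fine_tune_head(x, kernels, w, b):
    """Convolution -> max pooling -> fully-connected fusion -> a probability
    that the entry pair is synonymous (sigmoid squashing)."""
    pooled = max_pool(conv1d_features(x, kernels))
    logit = float(pooled @ w + b)
    return 1.0 / (1.0 + np.exp(-logit))

# Illustrative run: 6 token vectors of dimension 4, two kernels of width 2.
rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))
kernels = [rng.normal(size=(2, 4)) for _ in range(2)]
prob = fine_tune_head(x, kernels, w=rng.normal(size=2), b=0.0)
```

Each kernel contributes one one-dimensional feature vector; max pooling reduces each to a scalar, and the single fully-connected step combines the scalars into one probability-style representation, matching the one-dimensional case described later in the embodiment.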
Optionally, the output layer includes a classification layer, configured to classify the feature representation of whether the entry pair is synonymous into synonymy and heteronymy, so as to obtain confidence values for the synonymy type and the heteronymy type.
In an optional implementation of the above application, the feature representation is classified into synonymy and heteronymy by the classification layer; this classification operation replaces normalizing the feature representation, and has higher accuracy than normalization.
Optionally, the output layer further includes an intervention layer, configured to determine whether the feature information of the entry pair satisfies a set heteronymy rule or synonymy rule; if the feature information of the entry pair satisfies the set heteronymy rule, the confidence of the synonymy type is reduced and/or the confidence of the heteronymy type is increased; if the feature information of the entry pair satisfies the set synonymy rule, the confidence of the synonymy type is increased and/or the confidence of the heteronymy type is reduced;
Wherein the synonymy rule comprises a first template for synonymous entry pairs, the first template including a first fixed vocabulary and a plurality of first candidate vocabularies, and the heteronymy rule comprises a second template for heteronymous entry pairs, the second template including a second fixed vocabulary and a plurality of second candidate vocabularies.
In an optional implementation of the above application, considering that the output of the classification layer may be erroneous for some special entry pairs, the confidence values are adjusted according to whether the feature information satisfies the synonymy rule or the heteronymy rule, so that the embodiment also applies to the classification of special entries. Furthermore, since the synonymy rule and the heteronymy rule comprise corresponding templates, fixed vocabularies and candidate vocabularies, they apply to the synonymy determination of any entry pair that matches the template, i.e. contains the fixed vocabulary and any of the candidate vocabularies; whether the feature information satisfies the corresponding rule is determined by template comparison.
Optionally, the obtaining of the feature information of the entry pair to be subjected to the synonymy discrimination includes at least one of the following operations:
Acquiring plain text characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Acquiring natural word segmentation characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Acquiring part characteristic information of the entry pair to be subjected to synonymy discrimination;
Acquiring degree characteristic information of the entry pair to be subjected to synonymy discrimination;
Acquiring direction characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Acquiring frequency characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Acquiring quantity characteristic information of the entry pairs to be subjected to synonymy discrimination;
And obtaining sensory characteristic information of the entry pair to be subjected to synonymy discrimination.
In an optional implementation of the above application, the feature information of the entry pair is constructed from multiple dimensions, so that the features of the entry pair are comprehensively and deeply mined, which helps produce a more accurate determination result.
Optionally, before inputting the feature information of the entry pair into the trained neural network model to obtain the synonymy discrimination result of the entry pair, the method further includes:
Acquiring a neural network model to be trained, wherein the neural network model to be trained comprises the pre-training layer, a fine-tuning layer to be trained and an output layer to be trained;
And training the neural network model to be trained, using the feature information of synonymous entry pairs as positive samples and the feature information of heteronymous (non-synonymous) entry pairs as negative samples, while keeping the parameters and feature representations of the pre-training layer unchanged during training.
In an optional implementation of the foregoing application, when training the neural network model, the parameters and feature representations of the pre-training layer are kept unchanged, preserving the strong language understanding capability of the pre-training layer; the fine-tuning layer and the output layer are trained simultaneously, so that the fine-tuning layer learns to extract synonymy-related features from the language understanding information and fuse them into a feature representation of whether the entry pair is synonymous, and the output layer learns to obtain the synonymy determination result from that representation.
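Keeping the pre-training layer fixed while the fine-tuning and output layers train amounts to excluding its parameters from the update step. A minimal plain-Python sketch, in which the parameter names, scalar weights and learning rate are illustrative assumptions:

```python
def sgd_step(params, grads, lr=0.1, frozen_prefix="pretrain."):
    """One gradient-descent step that skips every parameter belonging to the
    frozen pre-training layer; only fine-tuning and output parameters move."""
    return {
        name: value if name.startswith(frozen_prefix) else value - lr * grads[name]
        for name, value in params.items()
    }

# Illustrative parameters: one scalar weight per layer.
params = {"pretrain.w": 1.0, "finetune.w": 1.0, "output.w": 0.5}
grads = {"pretrain.w": 2.0, "finetune.w": 2.0, "output.w": 1.0}
updated = sgd_step(params, grads)
```

After the step, `pretrain.w` is untouched even though it received a gradient, while the fine-tuning and output weights are updated; this is the frozen-layer behavior the embodiment describes.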
Optionally, the entry pair is an entity name in the medical field; the entity names include entity common names and entity aliases.
An optional implementation of the above application defines the field and type of the entry pairs: the method provided by this embodiment can unify entity common names and entity aliases in the medical field, accelerating the deployment of medical projects.
In a second aspect, an embodiment of the present application further provides a synonymy determining device for an entry, including:
The acquisition module is used for acquiring the characteristic information of the entry pair to be subjected to synonymy discrimination;
The judging module is used for inputting the characteristic information of the entry pair into a trained neural network model to obtain a synonymy judging result of the entry pair;
Wherein the neural network model comprises a pre-training layer, a fine-tuning layer and an output layer; the pre-training layer is trained in advance on language understanding tasks using natural language training samples, and is used for learning the feature information of the entry pair to obtain language understanding information of the entry pair; the fine-tuning layer is used for performing feature extraction and fusion on the language understanding information to obtain a feature representation of whether the entry pair is synonymous; and the output layer is used for obtaining the synonymy determination result according to the feature representation.
In a third aspect, an embodiment of the present application further provides an electronic device, including:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein
The memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to perform the synonymy determination method for entries as provided in the embodiments of the first aspect.
In a fourth aspect, embodiments of the present application further provide a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the synonymy determination method for entries as provided in the embodiments of the first aspect.
Other effects of the above-described alternative will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a flowchart of a synonymy determination method for an entry in an embodiment of the present application;
FIG. 2 is a schematic structural diagram of a neural network model in the second embodiment of the present application;
FIG. 3 is a flowchart of a synonymy determination method for an entry in the third embodiment of the present application;
Fig. 4 is a structural diagram of a synonymy determination apparatus for entries in the fourth embodiment of the present application;
Fig. 5 is a block diagram of an electronic device for implementing the synonymy determination method for an entry according to the embodiment of the present application.
Detailed Description
The following description of exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments to aid understanding, and these details are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Descriptions of well-known functions and constructions are likewise omitted for clarity and conciseness.
Example one
Fig. 1 is a flowchart of a synonymy determination method for an entry in an embodiment of the present application. The embodiment is applicable to determining whether a pair of entries is synonymous or heteronymous; the method is executed by a synonymy determination device for entries, and the device is implemented by software and/or hardware and is specifically configured in an electronic device with a certain data processing capability.
A synonymy determination method for an entry as shown in fig. 1 includes:
S101, obtaining feature information of the entry pair to be subjected to synonymy discrimination.
The entry pair to be subjected to synonymy determination includes two entries; this embodiment does not limit the length, language, field or type of the entry pair. For example, an entry pair may come from the intellectual property field, such as a notice of review opinions and a notice of rejection, or "claims" and "claim items"; or from the medical field, such as novel coronavirus pneumonia and pneumonia, or SARS and severe acute respiratory syndrome.
The feature information of the entry pair is feature information of the entry pair in terms of natural language understanding, and for example, feature information that can reflect a lexical, grammatical, and semantic meaning. Optionally, the present operations include at least one of: acquiring plain text characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination; acquiring natural word segmentation characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination; acquiring part characteristic information of the entry pair to be subjected to synonymy discrimination; acquiring degree characteristic information of the entry pair to be subjected to synonymy discrimination; acquiring direction characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination; acquiring frequency characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination; acquiring quantity characteristic information of the entry pairs to be subjected to synonymy discrimination; and obtaining sensory characteristic information of the entry pair to be subjected to synonymy discrimination.
The plain text feature information of the entry pair is the text of each entry. The natural word-segmentation feature information is the segments obtained by splitting each entry into its word components. The part feature information is the text in each entry that indicates a body part, such as shoulder or hand. The degree feature information is the text in each entry that indicates degree, such as mild or severe. The direction feature information is the text in each entry that indicates direction, such as upper or lower. The frequency feature information is the text in each entry that indicates frequency, such as recurrent, primary or secondary. The quantity feature information is the text in each entry that indicates quantity, such as both lungs or five fractures. The sensory feature information is the sensory text contained in each entry, such as sound, smell or color.
For example, for the entry pair "ulcerative chronic colitis" and "chronic ulcerative colitis", the feature information of "ulcerative chronic colitis" includes: ulcerative chronic colitis, ulcerative, chronic, colitis, colon, ulcerative and chronic. The feature information of "chronic ulcerative colitis" includes: chronic ulcerative colitis, chronic, ulcerative, colitis, colon, chronic and ulcerative.
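The multi-dimensional feature construction above can be sketched as a lookup of each entry's word segments against small per-category lexicons. The segmentation and the lexicon contents below are illustrative assumptions, not the patent's actual dictionaries.

```python
def build_feature_info(entry, segments, lexicons):
    """Collect the feature information of one entry: its plain text, its
    natural word segments, and, per category (part, degree, quantity, ...),
    the segments found in that category's lexicon."""
    info = {"text": entry, "segments": list(segments)}
    for category, vocab in lexicons.items():
        info[category] = [s for s in segments if s in vocab]
    return info

# Tiny illustrative lexicons for three of the feature dimensions.
LEXICONS = {
    "part": {"colon", "shoulder", "hand"},
    "degree": {"mild", "severe"},
    "quantity": {"both lungs", "five"},
}
info = build_feature_info(
    "ulcerative chronic colitis",
    ["ulcerative", "chronic", "colitis", "colon"],
    LEXICONS,
)
```

Running this for both entries of a pair yields the two feature-information structures that are then fed into the neural network model in step S102.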
And S102, inputting the characteristic information of the entry pair into the trained neural network model to obtain a synonymy judgment result of the entry pair.
The neural network model comprises a pre-training layer, a fine-tuning layer and an output layer, connected in sequence. The pre-training layer receives the entry pair input by the user, and the output layer outputs the synonymy determination result of the entry pair, where the result is one of: synonymous or heteronymous.
Specifically, the pre-training layer is trained in advance on language understanding tasks using natural language training samples. Because the natural language training samples and the entry pairs share the same feature distribution, the structure, parameters and features of the pre-training layer are transferred directly, and the pre-training layer is used to learn the feature information of the entry pair to obtain language understanding information. Optionally, the pre-training layer is a neural network structure trained on multiple language understanding tasks; the multiple language understanding tasks comprise lexical-level, grammar-level and semantic-level tasks, and the language understanding information comprises lexical, grammatical and semantic information. Based on current research results, the structure, parameters and features of the pre-training layer in ERNIE 2.0 are migrated to the pre-training layer in this embodiment; that is, the entire framework of the ERNIE 2.0 pre-training layer is applied directly. ERNIE 2.0 obtains natural language information along multiple dimensions (lexical, grammatical and semantic) from its training data, greatly enhancing its general semantic representation capability. Furthermore, ERNIE 2.0 trains the model on encyclopedia, news and forum dialogue data, which strengthens its semantic representation and makes it particularly powerful for Chinese. The language understanding tasks of the ERNIE 2.0 pre-training layer include: knowledge masking, token-document relation prediction, capitalization prediction, sentence reordering, sentence distance, discourse relation, and IR relevance.
The fine-tuning layer is used for performing feature extraction and fusion on the language understanding information to obtain a feature representation of whether the entry pair is synonymous. The feature representation may be a probability value whose magnitude indicates whether the entry pair is synonymous. The output layer is used for obtaining the synonymy determination result from the feature representation. For example, if the feature representation is greater than or equal to a set threshold, such as 50%, the entry pair is judged synonymous; if it is less than the threshold, the entry pair is judged heteronymous.
According to the embodiment of the application, synonymy determination is performed by inputting the feature information of the entry pair into a trained neural network model, so that the determination combines the deep learning capability of the neural network model at the feature level; this improves the efficiency of synonymy determination, saves human resources, and accelerates the deployment of related intelligent projects. In this embodiment, the structure, parameters and features of the pre-training layer are transferred directly, and the pre-training layer is used to learn the feature information of the entry pair, reducing the amount of labeling required during training, saving a large amount of resources and manpower, and further improving determination efficiency. The language understanding information accurately reflects whether the entry pair is synonymous; the fine-tuning layer then performs feature extraction and fusion on it to obtain an accurate feature representation, and the output layer obtains the synonymy determination result from that representation, improving the accuracy of the result.
Furthermore, the feature information of the entry pairs is constructed from multiple dimensions, so that the features of the entry pairs are comprehensively and deeply mined, and more accurate judgment results are favorably obtained.
Furthermore, the pre-training layer understands the information contained in the entry pair at three levels (lexical, grammatical and semantic), greatly enhancing its general semantic representation capability, so that a more accurate synonymy determination result is obtained through the neural network model.
Example two
Fig. 2 is a schematic structural diagram of a neural network model in the second embodiment of the present application, and the embodiment of the present application defines structures of a fine tuning layer and an output layer on the basis of the technical solutions of the foregoing embodiments.
As shown in fig. 2, the neural network model includes: a pre-training layer 10, a fine tuning layer 20 and an output layer 30. Wherein the fine tuning layer 20 comprises a convolutional layer 21, a pooling layer 22 and a full connection layer 23.
The convolutional layer 21 is used to perform feature extraction on the language understanding information, producing a plurality of one-dimensional feature vectors. The pooling layer 22 may be a max-pooling layer or an average-pooling layer, used to reduce the dimensionality of the extracted features: for example, it takes the maximum or average value of each one-dimensional feature vector and concatenates the results as the output of the layer. The fully-connected layer 23 is used to fuse the dimension-reduced features into a feature representation of whether the entry pair is synonymous. There is at least one fully-connected layer 23; the number is determined by the dimensionality of the reduced features and the required output dimensionality, which improves the learning capacity of the network. When the reduced features are one-dimensional, only one fully-connected layer 23 is needed, and it combines the reduced features into a probability value.
As shown in fig. 2, the output layer 30 includes a classification layer 31 for classifying the feature representation of whether the entry pair is synonymous into synonymy and heteronymy, obtaining confidence values for the synonymy type and the heteronymy type. The classification layer 31 is a neural network structure, such as the commonly used softmax layer. The feature representation is classified into synonymy and heteronymy by the classification layer 31; this classification operation replaces normalizing the feature representation, and has higher accuracy than normalization.
Consider that the output of the classification layer 31 may be erroneous for some special entry pairs. For example, hepatitis B and viral hepatitis B are synonymous, but may be classified as heteronymous by the classification layer 31; respiratory tract infection and viral respiratory tract infection are not synonymous, but may be classified as synonymous. Such special entry pairs therefore require intervention.
Based on this, the output layer 30 further includes an intervention layer 32 connected after the classification layer 31, configured to determine whether the feature information of the entry pair satisfies a set synonymy rule or heteronymy rule. The synonymy rule comprises a first template for synonymous entry pairs, the first template including a first fixed vocabulary and a plurality of first candidate vocabularies; the heteronymy rule comprises a second template for heteronymous entry pairs, the second template including a second fixed vocabulary and a plurality of second candidate vocabularies.
Illustratively, a synonymy rule includes the template "hepatitis _ and viral hepatitis type _", where the fixed vocabulary is "hepatitis" and "viral hepatitis type" and the candidate vocabulary comprises A, B and C. If the input entry pair is hepatitis B and viral hepatitis type B, it matches the template, i.e., it satisfies the set synonymy rule, so the confidence of the synonymy type is increased and/or the confidence of the heteronymy type is decreased. The increase and the decrease may have the same or different magnitudes, and the magnitude can be customized according to the required precision of synonymy determination, for example 5%.
Illustratively, a heteronymy rule includes the template "respiratory tract infection and _ respiratory tract infection", where the fixed vocabulary is "respiratory tract infection" and the candidate vocabulary comprises "viral" and "bacterial". If the input entry pair is respiratory tract infection and viral respiratory tract infection, it matches the template, i.e., it satisfies the set heteronymy rule, so the confidence of the synonymy type is decreased and/or the confidence of the heteronymy type is increased. The decrease and the increase may have the same or different magnitudes, and the magnitude can be customized according to the required precision of synonymy determination, for example 5%.
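The two rule examples above can be sketched as data plus a matching routine. The rule encoding, the function names, and the `{c}` placeholder syntax are all assumptions for illustration; only the 5% adjustment follows the text:

```python
# Hypothetical encoding: each rule is a pair of templates sharing a candidate
# slot {c}; an entry pair matches if substituting some candidate makes the
# templates equal the pair.
SYNONYMY_RULES = [
    {"templates": ("hepatitis {c}", "viral hepatitis type {c}"),
     "candidates": {"A", "B", "C"}},
]
HETERONYMY_RULES = [
    {"templates": ("respiratory tract infection", "{c} respiratory tract infection"),
     "candidates": {"viral", "bacterial"}},
]
DELTA = 0.05  # the 5% adjustment magnitude mentioned in the text

def matches(rule, pair):
    t1, t2 = rule["templates"]
    return any({t1.format(c=c), t2.format(c=c)} == set(pair)
               for c in rule["candidates"])

def intervene(pair, conf_syn, conf_het):
    """Adjust the classification-layer confidences for special entry pairs."""
    if any(matches(r, pair) for r in SYNONYMY_RULES):
        conf_syn, conf_het = conf_syn + DELTA, conf_het - DELTA
    elif any(matches(r, pair) for r in HETERONYMY_RULES):
        conf_syn, conf_het = conf_syn - DELTA, conf_het + DELTA
    return conf_syn, conf_het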
In this embodiment, the fine-tuning layer is implemented by the convolutional layer, the pooling layer, and the fully connected layer, which is simple and effective. When the neural network model is trained, different features can be flexibly extracted by adjusting the parameters of the convolutional layer, the pooling layer, and the fully connected layer, and the features can be fused across different dimensions, so that an accurate feature representation of whether the entry pair is synonymous can be obtained for each type of entry pair in each field.
Furthermore, the confidence is intervened on according to whether the feature information satisfies the synonymy rule or the heteronymy rule, so this embodiment also applies to the classification of special entries. In addition, since the synonymy rule and the heteronymy rule comprise corresponding templates, fixed vocabularies and candidate vocabularies, they are suitable for synonymy determination of any entry pair that matches a template, i.e., contains the fixed vocabulary and any candidate vocabulary, and whether the feature information satisfies the corresponding rule is determined by template comparison.
Example Three
Fig. 3 is a flowchart of a synonymy determination method for an entry according to a third embodiment of the present invention. This embodiment is further optimized on the basis of the foregoing embodiments and specifically adds a training operation for the neural network model. The synonymy determination method for an entry shown in fig. 3 includes:
S301, obtaining a neural network model to be trained, wherein the neural network model to be trained comprises a pre-training layer, a fine-tuning layer to be trained and an output layer to be trained.
S302, training the neural network model to be trained by using the feature information of synonymous entry pairs as positive samples and the feature information of heteronymous entry pairs as negative samples, keeping the parameters and feature representations of the pre-training layer unchanged during training.
And S303, acquiring the feature information of the entry pair to be subjected to synonymy discrimination.
S304, inputting the characteristic information of the entry pair into the trained neural network model to obtain a synonymy judgment result of the entry pair.
Optionally, the entries of the entry pair are entity names in the medical field, and the entity names include entity common names and entity aliases. Specifically, the entry to be processed is paired with each entity common name to form entry pairs. If a certain entry pair is synonymous, a correspondence is established between the entry to be processed and the entity common name in that pair, so as to unify them. The method of this embodiment specifies the field and the type of the entry pairs, can unify entity common names and entity aliases in the medical field, and accelerates the landing of medical projects.
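The pairing-and-unification step above can be sketched minimally as follows. The common-name list and the stand-in judgment function are assumptions for illustration; in the patent, the judgment comes from the trained neural network model:

```python
# Hypothetical list of entity common names; values are illustrative.
COMMON_NAMES = ["novel coronavirus pneumonia", "secondary pulmonary tuberculosis"]

def is_synonymous(a, b):
    """Stand-in for the neural network model's synonymy judgment."""
    return a == "new coronavirus pneumonia" and b == "novel coronavirus pneumonia"

def unify(entry, common_names, judge=is_synonymous):
    """Pair the entry with each common name; map it to the first synonym found."""
    for name in common_names:
        if judge(entry, name):
            return {entry: name}   # correspondence: alias -> entity common name
    return {}                      # no synonymous common name found
```

Calling `unify` with an alias thus yields the correspondence used to unify aliases under a single common name.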
In the neural network model to be trained, the pre-training layer has been trained in advance on a language understanding task with natural language training samples, so its parameters and features adopt the pre-training results, while the parameters and features in the fine-tuning layer and the output layer are initial values that still need to be trained.
Taking the medical field as an example, a plurality of synonymous entry pairs and a plurality of heteronymous entry pairs are extracted from the medical records and prescriptions of various hospitals. For example, the synonymous entry pairs include: SARS and atypical pneumonia, and new coronavirus pneumonia and novel coronavirus pneumonia. The heteronymous entry pairs include: hypertension and hyperlipidemia, and recurrent pulmonary tuberculosis and secondary pulmonary tuberculosis. The feature information of the synonymous entry pairs and of the heteronymous entry pairs is extracted as training samples to train the whole neural network model. During training, the parameters of the fine-tuning layer and the output layer are iterated continuously while the parameters and feature representations of the pre-training layer are kept unchanged, until the model reaches a preset precision.
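The freeze described in this training procedure can be sketched with a toy NumPy model: gradients are computed and applied only to the fine-tuning weights, while the pre-training weights stay untouched. All shapes, the logistic loss, and plain gradient descent are assumptions, not the patent's actual setup:

```python
import numpy as np

rng = np.random.default_rng(1)
W_pre = rng.normal(size=(8, 4))      # pre-training layer weights: frozen
w_ft = np.zeros(4)                   # fine-tuning/output weights: trained

X = rng.normal(size=(32, 8))         # feature information of 32 entry pairs
y = rng.integers(0, 2, size=32)      # 1 = synonymous (positive), 0 = heteronymous

W_pre_before = W_pre.copy()
lr = 0.1
for _ in range(100):
    h = X @ W_pre                            # language understanding information
    p = 1 / (1 + np.exp(-(h @ w_ft)))        # predicted synonymy probability
    grad = h.T @ (p - y) / len(y)            # gradient w.r.t. fine-tune weights only
    w_ft -= lr * grad                        # pre-training parameters never updated

assert np.array_equal(W_pre, W_pre_before)   # frozen, as the text requires
```

In a deep-learning framework the same effect is obtained by excluding the pre-training layer's parameters from the optimizer.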
It should be noted that S301 and S302 are executed before S304; they may be executed either before or after S303.
In this embodiment, the parameters and feature representations of the pre-training layer are kept unchanged when training the neural network model, which preserves the strong language understanding capability of the pre-training layer. The fine-tuning layer and the output layer are trained simultaneously, so that the fine-tuning layer learns to extract synonymy-related features from the language understanding information and fuse them into the feature representation of whether the entry pair is synonymous, and the output layer learns to obtain the synonymy determination result from that feature representation.
Example Four
Fig. 4 is a structural diagram of a synonymy determination apparatus for an entry according to a fourth embodiment of the present application. This embodiment is applicable to determining whether an entry pair is synonymous or heteronymous. The apparatus is implemented by software and/or hardware and is specifically configured in an electronic device with certain data computing capability.
The synonymy determination apparatus 400 for an entry shown in fig. 4 includes: an obtaining module 401 and a determination module 402, wherein:
The obtaining module 401 is configured to obtain feature information of a vocabulary entry pair to be subjected to synonymy discrimination.
A determining module 402, configured to input feature information of the entry pair into the trained neural network model, to obtain a synonymy determining result of the entry pair;
The neural network model comprises a pre-training layer, a fine-tuning layer and an output layer. The pre-training layer has been trained in advance on a language understanding task with natural language training samples and is used to learn the feature information of the entry pair to obtain its language understanding information; the fine-tuning layer performs feature extraction and fusion on the language understanding information to obtain a feature representation of whether the entry pair is synonymous; and the output layer obtains the synonymy determination result from the feature representation.
According to this embodiment, the feature information of the entry pair is input into the trained neural network model for synonymy determination, so the determination combines the deep learning capability of the neural network model at the feature level, which improves the efficiency of synonymy determination, saves human resources, and accelerates the landing of related intelligent projects. The pre-training layer is obtained by training a language understanding task in advance with natural language training samples; since those samples and the entry pairs share the same feature distribution, the structure, parameters and features of the pre-training layer can be transferred directly, and the pre-training layer is used to learn the feature information of the entry pair, which reduces the amount of labeling required during training, saves considerable resources and manpower, and further improves the efficiency of synonymy determination. The language understanding information accurately reflects whether the entry pair is synonymous; the fine-tuning layer then performs feature extraction and fusion on it to obtain an accurate feature representation of whether the entry pair is synonymous, and the output layer obtains the synonymy determination result from that representation, which improves the accuracy of the result.
Further, the pre-training layer is a neural network structure of the multiple language understanding tasks; the multiple language understanding tasks comprise lexical level tasks, grammar level tasks and semantic level tasks, and the language understanding information comprises lexical information, grammar information and semantic information.
Further, the fine-tuning layer includes: a convolutional layer, a pooling layer, and a fully connected layer; the convolutional layer performs feature extraction on the language understanding information, the pooling layer reduces the dimension of the extracted features, and the fully connected layer fuses the dimension-reduced features to obtain the feature representation of whether the entry pair is synonymous.
Further, the output layer comprises a classification layer for classifying the feature representation of whether the entry pair is synonymous into the synonymous and heteronymous classes, so as to obtain the confidences of the synonymy type and the heteronymy type.
Further, the output layer also comprises an intervention layer for determining whether the feature information of the entry pair satisfies a set synonymy rule or heteronymy rule; if the feature information of the entry pair satisfies the set heteronymy rule, reducing the confidence of the synonymy type and/or increasing the confidence of the heteronymy type; if the feature information of the entry pair satisfies the set synonymy rule, increasing the confidence of the synonymy type and/or reducing the confidence of the heteronymy type. The synonymy rule comprises: a first template of synonymous entry pairs, the first template including a first fixed vocabulary and a plurality of first candidate vocabularies; the heteronymy rule comprises: a second template of heteronymous entry pairs, the second template including a second fixed vocabulary and a plurality of second candidate vocabularies.
Further, the obtaining module 401 is specifically configured to perform at least one of the following operations: acquiring plain text characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination; acquiring natural word segmentation characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination; acquiring part characteristic information of the entry pair to be subjected to synonymy discrimination; acquiring degree characteristic information of the entry pair to be subjected to synonymy discrimination; acquiring direction characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination; acquiring frequency characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination; acquiring quantity characteristic information of the entry pairs to be subjected to synonymy discrimination; and obtaining sensory characteristic information of the entry pair to be subjected to synonymy discrimination.
Further, the apparatus also comprises a training module specifically configured to: acquire a neural network model to be trained, the neural network model to be trained comprising a pre-training layer, a fine-tuning layer to be trained and an output layer to be trained; and train the neural network model to be trained by using the feature information of synonymous entry pairs as positive samples and the feature information of heteronymous entry pairs as negative samples, keeping the parameters and feature representations of the pre-training layer unchanged during training.
Further, the entry pair is an entity name in the medical field; the entity names include entity common names and entity aliases.
The synonymy judging device for the vocabulary entry can execute the synonymy judging method for the vocabulary entry provided by any embodiment of the application, and has the corresponding functional modules and beneficial effects of executing the synonymy judging method for the vocabulary entry.
Example Five
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 5 is a block diagram of an electronic device that implements the synonymy determination method for an entry according to the embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 5, the electronic apparatus includes: one or more processors 501, a memory 502, and interfaces for connecting the various components, including high-speed and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing some of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 5, one processor 501 is taken as an example.
Memory 502 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by the at least one processor, so that the at least one processor executes the synonymy determination method for the entry provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to execute the synonymy determination method for an entry provided herein.
Memory 502, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the method for synonymy determining of terms in the embodiments of the present application (e.g., shown in fig. 4 as comprising an obtaining module 401 and a determining module 402). The processor 501 executes various functional applications of the server and data processing, i.e., a method for implementing synonymy determination of the entries in the above method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 502.
The memory 502 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created by use of an electronic device that implements the synonymy determination method for the lemma, and the like. Further, the memory 502 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, memory 502 may optionally include memory located remotely from processor 501, and such remote memory may be connected over a network to an electronic device that performs a synonym determination method for terms. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device that executes the synonymy determination method for an entry may further include: an input device 503 and an output device 504. The processor 501, the memory 502, the input device 503 and the output device 504 may be connected by a bus or other means, and fig. 5 illustrates the connection by a bus as an example.
The input device 503 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, such as a touch screen, keypad, mouse, trackpad, touchpad, pointing stick, one or more mouse buttons, trackball, or joystick. The output device 504 may include a display device, an auxiliary lighting device (e.g., an LED), a tactile feedback device (e.g., a vibrating motor), and the like.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, programmable logic devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal.
The systems and techniques described here can be implemented, for providing interaction with the user, on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (e.g., a mouse or a trackball) by which the user can provide input to the computer.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, and the present invention is not limited thereto as long as the desired results of the technical solutions disclosed in the present application can be achieved.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (11)

1. A synonymy discrimination method of an entry is characterized by comprising the following steps:
Acquiring characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Inputting the characteristic information of the entry pair into a trained neural network model to obtain a synonymy judgment result of the entry pair;
Wherein the neural network model comprises a pre-training layer, a fine tuning layer and an output layer; the pre-training layer is used for training a language understanding task by adopting a natural language training sample in advance and is used for learning the characteristic information of the vocabulary entry pair to obtain the language understanding information of the vocabulary entry pair; the fine adjustment layer is used for carrying out feature extraction and fusion on the language understanding information to obtain feature representation of whether the vocabulary entry is synonymous or not; and the output layer is used for obtaining the synonymy judgment result according to the feature representation.
2. The method of claim 1, wherein the pre-training layer is a neural network structure of multiple language understanding tasks;
The multiple language understanding tasks comprise lexical level tasks, grammar level tasks and semantic level tasks, and the language understanding information comprises lexical information, grammar information and semantic information.
3. The method of claim 1, wherein the trim layer comprises: a convolutional layer, a pooling layer, and a full-link layer;
The convolution layer is used for carrying out feature extraction on the language understanding information, the pooling layer is used for carrying out dimensionality reduction on the extracted features, and the full-connection layer is used for fusing the dimensionality-reduced features to obtain feature representation of whether the vocabulary entry is synonymous or not.
4. The method of claim 1, wherein the output layer comprises a classification layer for classifying the feature representation of whether the entry pair is synonymous into the synonymous and heteronymous classes, resulting in confidences of a synonymy type and a heteronymy type.
5. The method according to claim 4, wherein the output layer further comprises an intervention layer for determining whether the feature information of the entry pair satisfies a set synonymy rule or heteronymy rule; if the feature information of the entry pair satisfies the set heteronymy rule, reducing the confidence of the synonymy type and/or increasing the confidence of the heteronymy type; if the feature information of the entry pair satisfies the set synonymy rule, increasing the confidence of the synonymy type and/or reducing the confidence of the heteronymy type;
Wherein the synonymy rule comprises: a first template of synonymous entry pairs, the first template including a first fixed vocabulary and a plurality of first candidate vocabularies; and the heteronymy rule comprises: a second template of heteronymous entry pairs, the second template including a second fixed vocabulary and a plurality of second candidate vocabularies.
6. The method according to claim 1, wherein the obtaining of the feature information of the entry pair to be subjected to synonymy determination includes at least one of:
Acquiring plain text characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Acquiring natural word segmentation characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Acquiring part characteristic information of the entry pair to be subjected to synonymy discrimination;
Acquiring degree characteristic information of the entry pair to be subjected to synonymy discrimination;
Acquiring direction characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Acquiring frequency characteristic information of a vocabulary entry pair to be subjected to synonymy discrimination;
Acquiring quantity characteristic information of the entry pairs to be subjected to synonymy discrimination;
And obtaining sensory characteristic information of the entry pair to be subjected to synonymy discrimination.
7. The method of claim 1, wherein before inputting the feature information of the entry pair into the trained neural network model to obtain the synonymy discrimination result of the entry pair, the method further comprises:
Acquiring a neural network model to be trained, wherein the neural network model to be trained comprises the pre-training layer, a fine-tuning layer to be trained and an output layer to be trained;
And training the neural network model to be trained by using the feature information of synonymous entry pairs as positive samples and the feature information of heteronymous entry pairs as negative samples, keeping the parameters and feature representations of the pre-training layer unchanged during training.
8. The method of any one of claims 1-7, wherein the pair of terms is an entity name of a medical domain;
The entity names include entity common names and entity aliases.
9. A synonymy discrimination device for an entry, comprising:
The acquisition module is used for acquiring the characteristic information of the entry pair to be subjected to synonymy discrimination;
The judging module is used for inputting the characteristic information of the entry pair into a trained neural network model to obtain a synonymy judging result of the entry pair;
Wherein the neural network model comprises a pre-training layer, a fine tuning layer and an output layer; the pre-training layer is used for training a language understanding task by adopting a natural language training sample in advance and is used for learning the characteristic information of the vocabulary entry pair to obtain the language understanding information of the vocabulary entry pair; the fine adjustment layer is used for carrying out feature extraction and fusion on the language understanding information to obtain feature representation of whether the vocabulary entry is synonymous or not; and the output layer is used for obtaining the synonymy judgment result according to the feature representation.
10. An electronic device, comprising:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform a synonymity determination method for an entry according to any one of claims 1-8.
11. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform a synonymity determination method for an entry according to any one of claims 1 to 8.
CN202010190072.2A 2020-03-18 2020-03-18 Synonym distinguishing method, device, equipment and storage medium Active CN111414750B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010190072.2A CN111414750B (en) 2020-03-18 2020-03-18 Synonym distinguishing method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111414750A true CN111414750A (en) 2020-07-14
CN111414750B CN111414750B (en) 2023-08-18

Family

ID=71492982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010190072.2A Active CN111414750B (en) 2020-03-18 2020-03-18 Synonym distinguishing method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111414750B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111949764A (en) * 2020-08-18 2020-11-17 桂林电子科技大学 Knowledge graph completion method based on bidirectional attention mechanism
CN112183088A (en) * 2020-09-28 2021-01-05 云知声智能科技股份有限公司 Word level determination method, model construction method, device and equipment
CN116167455A (en) * 2022-12-27 2023-05-26 北京百度网讯科技有限公司 Model training and data deduplication method, device, equipment and storage medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US4724523A (en) * 1985-07-01 1988-02-09 Houghton Mifflin Company Method and apparatus for the electronic storage and retrieval of expressions and linguistic information
CN101777042A (en) * 2010-01-21 2010-07-14 西南科技大学 Neural network and tag library-based statement similarity algorithm
US20150095017A1 (en) * 2013-09-27 2015-04-02 Google Inc. System and method for learning word embeddings using neural language models
US20160247091A1 (en) * 2012-10-22 2016-08-25 University Of Massachusetts Feature Type Spectrum Technique
US20160371254A1 (en) * 2015-06-17 2016-12-22 Panasonic Intellectual Property Management Co., Ltd. Method for assigning semantic information to word through learning using text corpus
CN106339369A (en) * 2016-08-30 2017-01-18 广东医科大学 Method and system for recognizing synonyms of data set
CN106844346A (en) * 2017-02-09 2017-06-13 北京红马传媒文化发展有限公司 Short-text semantic similarity discrimination method and system based on the deep learning model Word2Vec
CN107145910A (en) * 2017-05-08 2017-09-08 京东方科技集团股份有限公司 Medical-image representation generation system, training method therefor, and representation generation method
CN107316015A (en) * 2017-06-19 2017-11-03 南京邮电大学 High-accuracy facial expression recognition method based on deep spatio-temporal features
CN107797985A (en) * 2017-09-27 2018-03-13 百度在线网络技术(北京)有限公司 Method and apparatus for establishing a synonymy discrimination model and discriminating synonymous text
US20180101520A1 (en) * 2016-10-11 2018-04-12 The Japan Research Institute, Limited Natural language processing apparatus, natural language processing method, and recording medium
EP3528263A1 (en) * 2018-02-15 2019-08-21 Siemens Healthcare GmbH Providing a trained virtual tissue atlas and a synthetic image
CN110442684A (en) * 2019-08-14 2019-11-12 山东大学 Similar-case recommendation method based on text content
CN110674314A (en) * 2019-09-27 2020-01-10 北京百度网讯科技有限公司 Sentence recognition method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
KAI ZHOU: "Sentiment analysis of text based on CNN and bi-directional LSTM model", 2018 24th International Conference on Automation and Computing (ICAC) *
ZHANG FANGFANG: "Intelligent passage matching based on literal and semantic relevance matching", Journal of Shandong University (Natural Science) *
WANG ZHIHAO: "Research on natural language semantic feature representation for knowledge base question answering", Wanfang Database *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111949764A (en) * 2020-08-18 2020-11-17 桂林电子科技大学 Knowledge graph completion method based on bidirectional attention mechanism
CN112183088A (en) * 2020-09-28 2021-01-05 云知声智能科技股份有限公司 Word level determination method, model construction method, device and equipment
CN112183088B (en) * 2020-09-28 2023-11-21 云知声智能科技股份有限公司 Word level determining method, model building method, device and equipment
CN116167455A (en) * 2022-12-27 2023-05-26 北京百度网讯科技有限公司 Model training and data deduplication method, device, equipment and storage medium
CN116167455B (en) * 2022-12-27 2023-12-22 北京百度网讯科技有限公司 Model training and data deduplication method, device, equipment and storage medium

Also Published As

Publication number Publication date
CN111414750B (en) 2023-08-18

Similar Documents

Publication Publication Date Title
CN112560912B (en) Classification model training method and device, electronic equipment and storage medium
US10698932B2 (en) Method and apparatus for parsing query based on artificial intelligence, and storage medium
CN111414750A (en) Synonymy distinguishing method, device, equipment and storage medium for entries
CN111191428B (en) Comment information processing method and device, computer equipment and medium
CN111709247A (en) Data set processing method and device, electronic equipment and storage medium
CN111783451A (en) Method and apparatus for enhancing text samples
US10628426B2 (en) Text representation method and apparatus
CN110705460A (en) Image category identification method and device
US20210200963A1 (en) Machine translation model training method, apparatus, electronic device and storage medium
EP4113357A1 (en) Method and apparatus for recognizing entity, electronic device and storage medium
CN113220836A (en) Training method and device of sequence labeling model, electronic equipment and storage medium
KR102456535B1 (en) Medical fact verification method and apparatus, electronic device, storage medium, and program
CN113407698B (en) Method and device for training an intention recognition model and recognizing intentions
CN111709249A (en) Multi-language model training method and device, electronic equipment and storage medium
CN110851601A (en) Cross-domain emotion classification system and method based on layered attention mechanism
CN111078878A (en) Text processing method, device and equipment and computer readable storage medium
CN111274397A (en) Method and device for establishing entity relationship detection model
CN111783861A (en) Data classification method, model training device and electronic equipment
CN113850080A (en) Rhyme word recommendation method, device, equipment and storage medium
CN111738015A (en) Method and device for analyzing emotion polarity of article, electronic equipment and storage medium
CN114495143A (en) Text object identification method and device, electronic equipment and storage medium
CN113157829A (en) Method and device for comparing interest point names, electronic equipment and storage medium
CN112560425B (en) Template generation method and device, electronic equipment and storage medium
CN112270169B (en) Method and device for predicting dialogue roles, electronic equipment and storage medium
CN112948573A (en) Text label extraction method, device, equipment and computer storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant