CN116469526A - Training method, device, equipment and storage medium for traditional Chinese medicine diagnosis model - Google Patents

Publication number
CN116469526A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310440693.5A
Other languages
Chinese (zh)
Inventor
胡意仪
阮晓雯
吴振宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H20/00 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
    • G16H20/90 ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to alternative medicines, e.g. homeopathy or oriental medicines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F16/353 Clustering; Classification into predefined classes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Public Health (AREA)
  • Evolutionary Computation (AREA)
  • Primary Health Care (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Epidemiology (AREA)
  • Mathematical Physics (AREA)
  • Pharmacology & Pharmacy (AREA)
  • Pathology (AREA)
  • Alternative & Traditional Medicine (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The application relates to the technical field of artificial intelligence and discloses a training method, device, equipment and storage medium for a traditional Chinese medicine diagnosis model. The method comprises the following steps: when an identification model training instruction is received, extracting a target pattern from the identification model training instruction; searching a plurality of data sources for first description information corresponding to the target pattern, and acquiring, from the plurality of data sources, second description information corresponding to patterns other than the target pattern; constructing a positive example sample pair according to the first description information, and constructing a negative example sample pair according to the first description information and the second description information, wherein the pieces of description information within each positive example sample pair and within each negative example sample pair come from different data sources; performing data enhancement on the positive example sample pair to obtain a first training sample set, and performing data enhancement on the negative example sample pair to obtain a second training sample set; and training a diagnosis model according to the entity features, the similarity features and the text features of the first training sample set and the second training sample set.

Description

Training method, device, equipment and storage medium for traditional Chinese medicine diagnosis model
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a training method, device and equipment for a traditional Chinese medicine diagnosis model and a storage medium.
Background
At present, diagnosing a disease in traditional Chinese medicine requires determining not only the name of the disease but also its specific syndrome type (also called a pattern), which indicates the cause and pathogenesis of the disease. A single disease may have as few as 3-5 syndrome types or as many as several dozen, which poses a challenge for the intelligent diagnosis of specialized diseases in traditional Chinese medicine.
On the one hand, because of the diversity and complexity of pathogenesis, different syndrome types do not overlap; on the other hand, because syndrome types are unevenly distributed in actual diagnosis, it is difficult to collect enough high-quality data on every syndrome type for model learning. However, business application scenarios require the intelligent model to fit every syndrome type that may occur well, even a rare one.
Disclosure of Invention
The main purpose of the application is to provide a method, a device, equipment and a storage medium for training a traditional Chinese medicine diagnosis model, and aims to solve the problems that training data is insufficient and model training effect is poor when the traditional Chinese medicine diagnosis model is trained in the prior art.
In a first aspect, the present application provides a method for training a traditional Chinese medicine diagnosis model, comprising:
when an identification model training instruction is received, extracting a target pattern from the identification model training instruction;
searching a plurality of data sources for first description information corresponding to the target pattern, and acquiring, from the plurality of data sources, second description information corresponding to patterns other than the target pattern;
constructing a positive example sample pair according to the first description information, and constructing a negative example sample pair according to the first description information and the second description information, wherein the pieces of description information within each positive example sample pair and within each negative example sample pair come from different data sources;
performing data enhancement on the positive example sample pair to obtain a first training sample set, and performing data enhancement on the negative example sample pair to obtain a second training sample set;
and training a diagnosis model according to the entity features, the similarity features and the text features of the first training sample set and the second training sample set.
In a second aspect, the present application further provides a training apparatus for a traditional Chinese medicine diagnosis model, the training apparatus comprising:
the instruction receiving module, which is used for extracting a target pattern from an identification model training instruction when the identification model training instruction is received;
the data acquisition module, which is used for searching a plurality of data sources for first description information corresponding to the target pattern and acquiring, from the plurality of data sources, second description information corresponding to patterns other than the target pattern;
the data construction module, which is used for constructing a positive example sample pair according to the first description information and constructing a negative example sample pair according to the first description information and the second description information, wherein the pieces of description information within each positive example sample pair and within each negative example sample pair come from different data sources;
the data enhancement module, which is used for performing data enhancement on the positive example sample pair to obtain a first training sample set, and performing data enhancement on the negative example sample pair to obtain a second training sample set;
and the training module, which is used for training a diagnosis model according to the entity features, the similarity features and the text features of the first training sample set and the second training sample set.
In a third aspect, the present application also provides a computer device comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the method for training a traditional Chinese medicine diagnosis model as described above.
In a fourth aspect, the present application also provides a storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, implements the steps of the method for training a traditional Chinese medicine diagnosis model as described above.
The application provides a method, device, equipment and storage medium for training a traditional Chinese medicine diagnosis model. In the application, description information from different data sources is used to construct positive example sample pairs and negative example sample pairs, and data enhancement is performed on both, which addresses the problem of insufficient training data when training a traditional Chinese medicine diagnosis model in the prior art. In addition, the diagnosis model is trained according to features of multiple dimensions of the first training sample set and the second training sample set, which further improves the training effect of the diagnosis model.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings needed in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic step flow diagram of a training method of a diagnostic model of traditional Chinese medicine according to an embodiment of the present application;
Fig. 2 is a schematic block diagram of a training device for a diagnostic model of traditional Chinese medicine provided in an embodiment of the present application;
Fig. 3 is a schematic block diagram of a structure of a computer device according to an embodiment of the present application.
The realization, functional characteristics and advantages of the present application will be further described with reference to the embodiments, referring to the attached drawings.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all, of the embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
The flow diagrams depicted in the figures are merely illustrative and not necessarily all of the elements and operations/steps are included or performed in the order described. For example, some operations/steps may be further divided, combined, or partially combined, so that the order of actual execution may be changed according to actual situations. In addition, although the division of the functional modules is performed in the apparatus schematic, in some cases, the division of the modules may be different from that in the apparatus schematic.
The embodiment of the application provides a method, a device, equipment and a storage medium for training a traditional Chinese medicine diagnosis model. The method can be applied to terminal equipment or a server, wherein the terminal equipment can be electronic equipment such as mobile phones, tablet computers, notebook computers, desktop computers, personal digital assistants, wearable equipment and the like; the server may be a single server or a server cluster composed of a plurality of servers. The following explanation will be made taking the application of the method to a server as an example.
Some embodiments of the present application are described in detail below with reference to the accompanying drawings. The following embodiments and features of the embodiments may be combined with each other without conflict.
Referring to fig. 1, fig. 1 is a schematic step flow diagram of a training method of a diagnostic model of traditional Chinese medicine according to an embodiment of the present application.
As shown in fig. 1, the method for training the diagnostic model of the traditional Chinese medicine comprises steps S10 to S15.
Step S10: when an identification model training instruction is received, extracting a target pattern from the identification model training instruction.
The identification model training instruction instructs the execution body of the present application to train a diagnosis model for diagnosing the target pattern.
It can be understood that, when the execution body of the present application is a server, the identification model training instruction may be a network request that a user sends from a terminal device to the server, or an instant communication message that the user sends to the server through the terminal device after a full-duplex communication connection has been established between the terminal device and the server. The identification model training instruction may also be any other form of control instruction that the server can receive and recognize, which is not limited herein.
Correspondingly, when the execution body of the present application is a terminal device, the identification model training instruction may be control information that the terminal device captures and recognizes when the user taps the screen or presses a control such as a button of the terminal device. Likewise, the identification model training instruction may also be any other form of control instruction that the terminal device can receive and recognize, which is not limited herein.
Step S11: searching a plurality of data sources for first description information corresponding to the target pattern, and acquiring, from the plurality of data sources, second description information corresponding to patterns other than the target pattern.
The plurality of data sources include, but are not limited to, the "medical sleep diagnosis and treatment guide", the "medical diagnosis and treatment term standard", the "traditional Chinese medical mental disease science", "Baidu Encyclopedia", and the like.
Different data sources collect corpora about the various patterns through different channels. The corpora corresponding to the target pattern are searched for in the plurality of data sources to obtain the first description information; correspondingly, the corpora of the patterns other than the target pattern are acquired from the plurality of data sources to obtain the second description information. It can be understood that, because each data source contains one or more corpora for a given pattern, step S11 yields multiple pieces of first description information and multiple pieces of second description information.
Step S12: constructing a positive example sample pair according to the first description information, and constructing a negative example sample pair according to the first description information and the second description information, wherein the pieces of description information within each positive example sample pair and within each negative example sample pair come from different data sources.
It can be understood that each positive example sample pair includes two different pieces of first description information that come from different data sources. In addition, each negative example sample pair includes one piece of first description information and one piece of second description information, which also come from different data sources. Through step S12, a plurality of positive example sample pairs and a plurality of negative example sample pairs can be constructed.
Because the data in different data sources are written by different people, different data sources describe even the same pattern in different ways; however different the wording, the described content still corresponds to one specific pattern.
For example, in the data source "the medical guide for sleep diagnosis and treatment of traditional Chinese medicine", one piece of description information of "internal blood stasis syndrome" is: "Internal blood stasis syndrome: long-term insomnia, restlessness, chest stuffiness, frightening dreams at night, inability to sleep at night and restless sleep at night." In the data source "Baidu Encyclopedia", one piece of description information of "internal blood stasis syndrome" is: "Internal blood stasis syndrome, the name of a syndrome in traditional Chinese medicine. It refers to a syndrome of internal blood stasis and obstructed blood flow, whose common symptoms are local bluish-purple lumps and pain that refuses pressure, or abdominal masses with fixed stabbing pain that refuses pressure, or dark purple and lumpy bleeding, together with a dark purple or spotted tongue and a wiry and unsmooth pulse."
In this embodiment, first description information from different data sources is used to construct positive example sample pairs, and first description information and second description information from different data sources are used to construct negative example sample pairs. Training data built from these pairs is then used to train the diagnosis model, so that the diagnosis model can cross-compare data from different data sources and cross-compare different patterns in terms of both traditional Chinese medicine diagnostic principles and semantics, which can improve the accuracy with which the diagnosis model identifies the target pattern.
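The pairing rule of step S12 can be sketched as follows. This is a minimal illustration, not the applicant's implementation: the function name `build_pairs`, the `(source, text)` tuple layout, the 1/0 labels and the toy data are all assumptions introduced here.

```python
from itertools import combinations, product


def build_pairs(first_info, second_info):
    """Build labelled sample pairs from (source, text) tuples (hypothetical layout).

    first_info:  descriptions of the target pattern.
    second_info: descriptions of other patterns.
    """
    # Positive pairs: two descriptions of the target pattern from different sources.
    positives = [
        (a_text, b_text, 1)
        for (a_src, a_text), (b_src, b_text) in combinations(first_info, 2)
        if a_src != b_src
    ]
    # Negative pairs: one target description and one non-target description,
    # again taken from different sources.
    negatives = [
        (a_text, b_text, 0)
        for (a_src, a_text), (b_src, b_text) in product(first_info, second_info)
        if a_src != b_src
    ]
    return positives, negatives


# Toy data: source names and texts are placeholders, not real corpus entries.
first_info = [
    ("guide", "target text A"),
    ("encyclopedia", "target text B"),
    ("guide", "target text C"),
]
second_info = [("guide", "other text X"), ("encyclopedia", "other text Y")]
positives, negatives = build_pairs(first_info, second_info)
```

With this toy data, the pair (A, C) is skipped because both texts come from the same source, which is the constraint the embodiment places on every pair.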
In some embodiments, the constructing a negative example pair according to the first description information and the second description information includes:
acquiring a node of the target pattern in a preset clinical term standard as a target node;
acquiring a node adjacent to the target node from the clinical term standard as a neighbor node, and acquiring a pattern corresponding to the neighbor node in the clinical term standard as a neighbor pattern;
and screening target description information corresponding to the neighbor pattern from the second description information, and constructing a negative example sample pair according to the first description information and the target description information.
In some embodiments, the clinical term standard includes, but is not limited to, the "clinical terms standard of traditional Chinese medicine".
It can be understood that the clinical term standard classifies the various diseases and the syndromes under each disease, and organizes them according to the similarity between diseases, between syndromes under a disease, and between syndromes under a syndrome. Thus, the more similar two diseases are, the closer their nodes are in the clinical term standard; similarly, the more similar two syndromes under the same disease are, the closer their nodes are in the clinical term standard.
The neighbor pattern and the target pattern are different patterns under the same disease. Because the neighbor node corresponding to the neighbor pattern is adjacent to the target node corresponding to the target pattern, the symptoms of the two patterns have a great deal in common and are very similar.
For example, assume that the target pattern is "internal blood stasis syndrome" and that the pattern corresponding to one of its neighbor nodes in the "clinical terms standard of traditional Chinese medicine" is "blood stasis and blood stasis syndrome". The target description information corresponding to that neighbor pattern and the first description information of "internal blood stasis syndrome" are then obtained to construct a negative example sample pair.
In this embodiment, negative example sample pairs are constructed from the target description information corresponding to the neighbor pattern and the first description information corresponding to the target pattern. When the diagnosis model is subsequently trained with these negative example sample pairs, it can better learn the features that distinguish the target pattern from the neighbor pattern, which can improve the accuracy with which the diagnosis model identifies the target pattern.
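A sketch of this neighbor-based negative sampling is given below. It assumes, purely for illustration, that the clinical term standard is available as a dictionary mapping each disease node to its child patterns and that "adjacent" means sharing the same parent; the source does not specify the data structure, and all names (`term_tree`, `neighbor_patterns`, `hard_negative_pairs`, `pattern_A`, and so on) are hypothetical.

```python
def neighbor_patterns(term_tree, target):
    """Return the patterns listed under the same parent node as the target.

    term_tree: dict mapping a disease node to its child patterns (assumed layout).
    """
    for children in term_tree.values():
        if target in children:
            return [c for c in children if c != target]
    return []


def hard_negative_pairs(first_info, second_info, neighbors):
    """Pair target descriptions with neighbor-pattern descriptions.

    first_info:  list of (source, text) for the target pattern.
    second_info: list of (pattern, source, text) for all other patterns.
    """
    neighbor_set = set(neighbors)
    # Screen the second description information down to neighbor patterns only.
    target_desc = [(src, text) for pat, src, text in second_info if pat in neighbor_set]
    # Keep only pairs whose two descriptions come from different data sources.
    return [
        (f_text, n_text, 0)
        for f_src, f_text in first_info
        for n_src, n_text in target_desc
        if f_src != n_src
    ]


# Placeholder taxonomy and descriptions.
term_tree = {
    "disease_1": ["pattern_A", "pattern_B", "pattern_C"],
    "disease_2": ["pattern_D"],
}
neighbors = neighbor_patterns(term_tree, "pattern_A")
first_info = [("source_1", "description of pattern_A")]
second_info = [
    ("pattern_B", "source_2", "description of pattern_B"),
    ("pattern_D", "source_2", "description of pattern_D"),
    ("pattern_C", "source_1", "description of pattern_C"),
]
pairs = hard_negative_pairs(first_info, second_info, neighbors)
```

Restricting negatives to neighbor patterns yields harder examples than random negatives, which matches the stated goal of teaching the model the features that separate similar patterns.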
Step S13: performing data enhancement on the positive example sample pair to obtain a first training sample set, and performing data enhancement on the negative example sample pair to obtain a second training sample set.
It can be understood that data enhancement expands the data: performing data enhancement on the positive example sample pairs and the negative example sample pairs respectively increases the amount of training data available for training the model.
The first training sample set is the set formed by the enhancement results obtained by performing data enhancement on the positive example sample pairs; similarly, the second training sample set is the set formed by the enhancement results obtained by performing data enhancement on the negative example sample pairs.
In some embodiments, the data enhancing the positive example sample pair to obtain a first training sample set includes:
copying the positive example sample pair to obtain a copy result, and identifying symptom entity information in the copy result to obtain an entity information list;
randomly extracting a plurality of pieces of target entity information from the entity information list, and acquiring matching synonyms corresponding to the target entity information;
and performing data enhancement on the positive example sample pair according to the matching synonyms to obtain a first training sample set.
Illustratively, in the sentence "Xiaoming graduated from university in 1992", "Xiaoming", "1992" and "university" are named entities.
A named entity related to a symptom in the copy result is symptom entity information. The copy result may contain multiple pieces of symptom entity information, and the list they form is the entity information list. The target entity information is one or more pieces of symptom entity information randomly selected from the entity information list, and a synonym corresponding to the target entity information is a matching synonym.
In some embodiments, the symptom entity information in the copy result may be identified by a named entity recognition technique. Alternatively, the symptom entity information in the positive example sample pair may be labeled manually in advance, and the execution body of the present application determines the symptom entity information by reading the labels of the copy result. Of course, the symptom entity information in the copy result may also be identified by other means, which is not limited herein.
In some embodiments, a pre-constructed mapping dictionary for the traditional Chinese medicine field may be obtained from a database. The mapping dictionary records the synonyms corresponding to various pieces of symptom entity information, so the matching synonyms corresponding to the target entity information can be looked up in it. Alternatively, the word vector corresponding to the symptom entity information may be obtained as a first word vector and compared with the word vectors of the words in a preset word stock; words whose word vectors are similar to the first word vector are selected from the preset word stock as the matching synonyms corresponding to the target entity information. Of course, the matching synonyms may also be obtained in other ways, which is not limited herein.
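The two synonym lookups described above (mapping dictionary, then word-vector comparison) might be combined as in the sketch below. The cosine similarity measure, the 0.9 threshold, the dictionary-first ordering and the toy vectors are assumptions made for illustration; the source only says that words with similar word vectors are selected.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def matching_synonyms(entity, mapping_dict, vectors, threshold=0.9):
    """Look up synonyms in the mapping dictionary, else fall back to word vectors.

    mapping_dict: entity -> list of synonyms (stand-in for the mapping dictionary).
    vectors:      word -> vector (stand-in for the preset word stock).
    threshold:    assumed similarity cut-off, not taken from the source.
    """
    if entity in mapping_dict:
        return list(mapping_dict[entity])
    if entity not in vectors:
        return []
    first_vec = vectors[entity]  # the "first word vector"
    return [
        word
        for word, vec in vectors.items()
        if word != entity and cosine(first_vec, vec) >= threshold
    ]


# Toy dictionary and two-dimensional toy vectors.
mapping_dict = {"insomnia": ["sleeplessness"]}
vectors = {
    "chest tightness": [1.0, 0.0],
    "chest stuffiness": [0.9, 0.1],
    "headache": [0.0, 1.0],
}
from_dict = matching_synonyms("insomnia", mapping_dict, vectors)
from_vectors = matching_synonyms("chest tightness", mapping_dict, vectors)
```

In practice the vectors would come from a trained embedding model, and the threshold would need tuning so that near-synonyms pass while merely related symptoms do not.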
In some embodiments, the data enhancing the positive example sample pair according to the matching synonym to obtain a first training sample set includes:
replacing the target entity information in the copy result with the matching synonym to obtain a first enhanced sample of the positive example sample pair;
and constructing a first training sample set according to the positive example sample pair and the first enhanced sample.
It can be understood that the result of replacing the target entity information in the copy result with the matching synonym is the first enhanced sample of the positive example sample pair.
The copy result contains one or more pieces of symptom entity information, and each piece of symptom entity information has one or more matching synonyms. Therefore, selecting different symptom entity information as the target entity information, or selecting different matching synonyms to replace a given piece of target entity information, produces different replacement results, so many first enhanced samples may be obtained.
In this embodiment, data enhancement of the positive example sample pair is realized by copying the positive example sample pair and performing synonym replacement on several pieces of symptom entity information in the copy result, which expands the amount of training data for model training.
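The synonym-replacement enhancement can be sketched as below. The sketch assumes the entity list and synonym table are already available (for example from named entity recognition and a mapping dictionary), replaces one entity with one synonym per enhanced sample, and uses plain string replacement; `synonym_replace`, its parameters and the toy pair are hypothetical.

```python
import random


def synonym_replace(pair, entity_list, synonyms, k=1, seed=0):
    """Return first enhanced samples for one positive example sample pair.

    pair:        tuple of two description texts (the copy result).
    entity_list: symptom entities found in the copy result.
    synonyms:    entity -> list of matching synonyms.
    k:           number of target entities to draw at random.
    """
    rng = random.Random(seed)
    targets = rng.sample(entity_list, min(k, len(entity_list)))
    samples = []
    for entity in targets:
        for synonym in synonyms.get(entity, []):
            # One enhanced sample per (target entity, matching synonym) choice.
            samples.append(tuple(text.replace(entity, synonym) for text in pair))
    return samples


pair = ("insomnia, chest stuffiness", "chest stuffiness at night")
entity_list = ["insomnia", "chest stuffiness"]
synonyms = {
    "insomnia": ["sleeplessness"],
    "chest stuffiness": ["chest tightness", "chest oppression"],
}
enhanced = synonym_replace(pair, entity_list, synonyms, k=2)
# The first training sample set holds the original pair plus its enhanced samples.
first_training_set = [pair] + enhanced
```

Here two entities with one and two synonyms respectively give three enhanced samples, illustrating how the number of first enhanced samples grows with the choices available.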
In some embodiments, the data enhancing the positive example sample pair according to the matching synonym to obtain a first training sample set includes:
acquiring a target description sentence containing the target entity information from the copy result;
identifying, from the first description information, matching description sentences corresponding to the matching synonyms by using a pre-trained normalization model;
replacing the target description sentence in the copy result with the matching description sentence to obtain a second enhanced sample of the positive example sample pair;
and constructing a first training sample set according to the positive example sample pair and the second enhanced sample.
It can be understood that the copy result is formed by joining a plurality of description sentences with punctuation marks, and the description sentences in the copy result that contain the target entity information are the target description sentences.
After the pre-trained normalization (Norm) model is obtained, it can be used to align each sentence in each piece of first description information to a piece of symptom entity information, which yields a mapping table. The description sentences that the mapping table associates with the symptom entity information serving as the matching synonym are the matching description sentences. Replacing the target description sentence in the copy result with a matching description sentence yields a second enhanced sample. In addition, because a matching synonym may correspond to one or more matching description sentences, replacing the target description sentence in the copy result yields one or more second enhanced samples, and the set constructed from each positive example sample pair and each second enhanced sample is the first training sample set.
Illustratively, assume that the copy result is "a, B, C, D, E", wherein A, B, C, D, E are five sentences constituting the copy result, and sentence D is a target description sentence containing target entity information.
The mapping results obtained after the analysis of each first description information by using the pre-trained normalization model are shown in table 1 below.
TABLE 1,
Assuming that the matching synonym corresponding to the target entity information is e3, matching description sentences F6, F7, F8, F9, F10, and F11 can be obtained from table 1.
The second enhancement sample obtained after replacing the target description sentence in the copy result with the matching description sentence is shown in table 2 below.
TABLE 2,
Of course, if the copy result contains several pieces of target entity information, several matching description sentences may take part in the replacement. For example, assume that in the copy result "A, B, C, D, E", sentence D is a target description sentence containing two pieces of target entity information, E00 and E01. Assume further that the matching description sentence obtained from the mapping table for the matching synonym of E00 is F15, and that the one obtained for the matching synonym of E01 is F16. Splicing F15 and F16 gives the splice result "F15, F16". After the target description sentence D in the copy result is replaced with this splice result, the second enhancement sample finally obtained for the positive example sample pair is "A, B, C, F15, F16, E".
In the embodiment, the data enhancement of the positive example sample pair is realized by copying the positive example sample pair and replacing the target description statement containing the symptom entity information in the copying result with the matching description statement, so that the data volume of training data for model training is expanded.
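The sentence-level replacement described above can be illustrated with a minimal Python sketch. The function name `second_enhancement_samples`, the ", " separator, and the assumption that the matching description sentences have already been looked up from the mapping table are illustrative choices, not details given in this application.

```python
def second_enhancement_samples(copy_result, target_sentence, matching_sentences, sep=", "):
    """Build one augmented text per matching description sentence.

    copy_result        : description sentences joined by `sep` (assumed format)
    target_sentence    : the sentence that contains the target entity information
    matching_sentences : sentences already looked up from the mapping table
    """
    sentences = copy_result.split(sep)
    samples = []
    for match in matching_sentences:
        # Swap only the target sentence; every other sentence is kept as-is.
        replaced = [match if s == target_sentence else s for s in sentences]
        samples.append(sep.join(replaced))
    return samples


# One target entity: each matching sentence yields one second enhancement sample.
single = second_enhancement_samples("A, B, C, D, E", "D", ["F6", "F7"])

# Two target entities in sentence D: the two matches are spliced before replacing.
spliced = ", ".join(["F15", "F16"])
multi = second_enhancement_samples("A, B, C, D, E", "D", [spliced])
```

With the inputs above, `multi` holds the single sample "A, B, C, F15, F16, E", matching the worked example in the text.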
The technical scheme for data enhancement on the negative example sample pair may refer to the technical scheme for data enhancement on the positive example sample pair, and will not be described in detail herein.
And step S14, training a diagnosis model according to the entity characteristics, the similarity characteristics and the text characteristics of the first training sample set and the second training sample set.
It can be understood that after the first training sample set and the second training sample set are constructed, they can be input into a neural network model so that the neural network model is trained on the entity features, similarity features, and text features of the two sets. When training of the neural network model is complete, a diagnosis model for diagnosing the target pattern is obtained.
The entity features are the features of the symptom entity information carried by each piece of description information in the first training sample set and the second training sample set. The similarity features are the similarity between the two pieces of description information constituting a positive example sample pair in the first training sample set, and the similarity between the two pieces of description information constituting a negative example sample pair in the second training sample set. The text features are the way in which each piece of description information in the first training sample set and the second training sample set expresses a given syndrome.
When the neural network model is trained, each negative example sample pair in the second training sample set allows cross comparison between different patterns in terms of both traditional Chinese medicine diagnostic principles and semantics, so that the model learns the relations and distinctions between the target pattern and other patterns, which improves its pattern discrimination capability. In addition, the two pieces of description information in each positive example sample pair of the first training sample set come from different data sources, as do the two pieces of description information in each negative example sample pair of the second training sample set. The neural network model can therefore cross-compare data from different data sources during training, which alleviates the problem of insufficient training data and increases the authority of the training data. Furthermore, during training the neural network model can learn the text features and entity features of the description information in the first training sample set and the second training sample set, as well as the similarity features between the two pieces of description information in each positive or negative example sample pair, which further improves the accuracy with which the resulting diagnosis model recognizes the target pattern.
In some embodiments, the training a diagnostic model based on the entity features, the similarity features, and the text features of the first training sample set and the second training sample set comprises:
Extracting symptom entity information from each sample of the first training sample set, and calculating mutual information of the symptom entity information and the target syndrome;
constructing a diagnosis rule according to mutual information corresponding to each symptom entity information, and labeling the first training sample set and the second training sample set according to the diagnosis rule;
and after similarity feature labeling processing is carried out on the first training sample set and the second training sample set, training a diagnosis model by using the first training sample set and the second training sample set.
It is appreciated that the interaction between symptom entity information and the target pattern can be measured using the calculated mutual information (Mutual Information).
Because the higher the mutual information of the symptom entity information and the target pattern is, the higher the association degree of the symptom entity information and the target pattern is, the diagnosis rule can be constructed according to the mutual information after the mutual information of each symptom entity information and the target pattern is calculated.
Specifically, the weight level of each piece of symptom entity information for diagnosing the target syndrome can be determined from its mutual information, and the symptom entity information in the description information of the training sample sets is labeled with that weight level. This embodiment thereby labels the entity features of the first training sample set and the second training sample set. The higher the weight level of a piece of symptom entity information, the more attention the diagnosis model pays to it during training, so the technical scheme of this embodiment improves the ability of the diagnosis model to identify the key symptom entity information used to judge the target syndrome.
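As a rough illustration of the mutual information step, the sketch below computes the mutual information between two binary indicators: whether a description mentions a given symptom entity, and whether the description belongs to the target syndrome. This application does not specify how the probabilities are estimated or how weight levels are derived, so the count-based estimate, the function names, and the thresholds in `weight_level` are all assumptions.

```python
import math
from collections import Counter


def mutual_information(has_entity, is_target):
    """Mutual information (in bits) between two equally long binary sequences."""
    n = len(has_entity)
    joint = Counter(zip(has_entity, is_target))
    p_x = Counter(has_entity)
    p_y = Counter(is_target)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # Sum p(x, y) * log2( p(x, y) / (p(x) * p(y)) ) over observed outcomes.
        mi += p_xy * math.log2(p_xy / ((p_x[x] / n) * (p_y[y] / n)))
    return mi


def weight_level(mi, thresholds=(0.1, 0.5)):
    """Map a mutual information value to a weight level (thresholds are hypothetical)."""
    low, high = thresholds
    if mi >= high:
        return "high"
    if mi >= low:
        return "medium"
    return "low"


# The entity appears exactly in the target-syndrome descriptions: strongly associated.
mi_strong = mutual_information([1, 1, 0, 0], [1, 1, 0, 0])
# The entity appears independently of the syndrome: no association.
mi_none = mutual_information([1, 0, 1, 0], [1, 1, 0, 0])
```

In this toy data the perfectly aligned entity receives 1 bit of mutual information and the independent one receives 0, so under the assumed thresholds they would be labeled "high" and "low" respectively.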
In some embodiments, after the similarity feature labeling process is performed on the first training sample set and the second training sample set, training a diagnostic model using the first training sample set and the second training sample set includes:
extracting average vector features corresponding to the positive sample according to the first training set;
selecting a target sample pair from the first training sample set or the second training sample set, and extracting a first sample and a second sample from the target sample pair;
acquiring vector features corresponding to the first sample as first vector features, and acquiring vector features corresponding to the second sample as second vector features;
labeling the first sample according to the distance between the first vector feature and the average vector feature, and labeling the second sample according to the distance between the second vector feature and the average vector feature;
inputting the first sample and the second sample into a pre-constructed neural network model to train the neural network model;
and when the neural network model training is completed, obtaining a diagnosis model.
In some embodiments, the extracting, according to the first training set, average vector features corresponding to positive examples samples includes:
obtaining the number of positive example samples in the first training set as a first number, and extracting positive example samples from the first training set one by one as target samples;
coding the target sample by using a pre-trained vector acquisition model to obtain a target sample vector;
and accumulating the target sample vectors to obtain a target sample total vector, and dividing the target sample total vector by the first number to obtain the average vector feature.
It may be understood that the first training set includes a positive example sample pair and a first enhancement sample obtained by enhancing the positive example sample pair, and each positive example sample pair and the first enhancement sample include two positive example samples. The total number of positive examples included in the first training set is the first number.
The target sample vector is a vector that represents the semantics of the target sample. After the vector acquisition model has produced the target sample vector of every positive example sample in the first training set, the target sample vectors are accumulated to obtain the target sample total vector; the quotient of the target sample total vector and the first number is the average vector feature.
In some embodiments, the vector acquisition model may be a BERT (Bidirectional Encoder Representations from Transformers) model, a word2vec (word embedding) model, or another model that can obtain the word vector corresponding to a keyword, which is not limited herein.
It will be appreciated that the first training sample set and the second training sample set each include a plurality of sample pairs: the first training sample set consists of positive example sample pairs, each composed of two positive example samples, and the second training sample set consists of negative example sample pairs, each composed of one positive example sample and one negative example sample. Sample pairs are extracted one by one from the first training sample set or the second training sample set for similarity feature labeling, and the sample pair currently extracted is the target sample pair. The two pieces of description information constituting the target sample pair are the first sample and the second sample, respectively.
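The accumulate-and-divide computation of the average vector feature can be sketched as follows. The encoder `toy_encode` is a hypothetical stand-in for the pre-trained vector acquisition model (for example a BERT or word2vec model); it only exists so that the example runs without external dependencies.

```python
def average_vector_feature(positive_samples, encode):
    """Average the encoded vectors of all positive example samples."""
    first_number = len(positive_samples)
    if first_number == 0:
        raise ValueError("the first training set contains no positive example samples")
    total = None
    for target_sample in positive_samples:
        vec = encode(target_sample)
        # Accumulate the target sample vectors into the target sample total vector.
        total = list(vec) if total is None else [t + v for t, v in zip(total, vec)]
    # The quotient of the total vector and the first number is the average feature.
    return [t / first_number for t in total]


def toy_encode(text):
    """Hypothetical encoder: (character count, space count). Not a real semantic model."""
    return [float(len(text)), float(text.count(" "))]


avg = average_vector_feature(["aa", "a b c"], toy_encode)
```

Here the two toy vectors are [2.0, 0.0] and [5.0, 2.0], so `avg` is [3.5, 1.0]. In practice the encoder would return a dense semantic vector, but the averaging logic stays the same.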
In addition, a vector acquisition model may be used to acquire vector features of the first sample and the second sample, where the acquired vector feature corresponding to the first sample is the first vector feature, and the acquired vector feature corresponding to the second sample is the second vector feature.
It will be appreciated that the smaller the distance between the first vector feature and the average vector feature, the closer the first sample is to the positive example samples; conversely, the greater that distance, the closer the first sample is to the negative example samples. Similarly, the smaller the distance between the second vector feature and the average vector feature, the closer the second sample is to the positive example samples; conversely, the greater that distance, the closer the second sample is to the negative example samples.
In this embodiment, the first sample is labeled according to the distance between the first vector feature and the average vector feature, and the second sample is labeled according to the distance between the second vector feature and the average vector feature. The first sample and the second sample are then input into the pre-constructed neural network model for training, so that during training the neural network model can evaluate how close each sample is to the positive example samples from its distance to the average vector feature, which further improves the learning efficiency and learning effect of the neural network model.
When the neural network model has been trained until the loss function reaches a preset degree of convergence, the neural network model is determined to be the diagnosis model.
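A minimal sketch of the distance-based labeling is given below. This application does not state which distance metric is used or what form the label takes, so the Euclidean distance and the dictionary layout returned by `label_pair_by_distance` are assumptions made for illustration.

```python
import math


def label_pair_by_distance(first_vector, second_vector, average_vector):
    """Attach to each sample of a target sample pair its distance to the average vector feature.

    A smaller distance means the sample is closer to the positive example samples.
    Euclidean distance is an assumed choice of metric.
    """
    return {
        "first_distance": math.dist(first_vector, average_vector),
        "second_distance": math.dist(second_vector, average_vector),
    }


# Toy two-dimensional vectors; real vector features would come from the encoder.
labels = label_pair_by_distance([3.0, 4.0], [0.0, 1.0], [0.0, 0.0])
closer_to_positive = min(labels, key=labels.get)
```

With these toy vectors the first sample lies at distance 5.0 and the second at distance 1.0 from the average vector feature, so the second sample would be treated as closer to the positive example samples.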
In the method, the positive example sample pair and the negative example sample pair are constructed by using the description information of different data sources, and the data of the positive example sample pair and the negative example sample pair are enhanced, so that the problem of insufficient training data in the traditional Chinese medicine diagnosis model training in the prior art is solved; in addition, the diagnosis model is trained according to the plurality of dimension characteristics of the first training sample set and the second training sample set, so that the training effect of the diagnosis model is further improved.
Referring to fig. 2, fig. 2 is a schematic block diagram of a training device for a diagnostic model of traditional Chinese medicine according to an embodiment of the present application.
As shown in fig. 2, the apparatus 201 for training a diagnostic model of traditional Chinese medicine comprises:
the instruction receiving module 2011 is configured to extract a target pattern from an identification model training instruction when the identification model training instruction is received;
a data acquisition module 2012, configured to search a plurality of data sources for first description information corresponding to the target pattern, and acquire second description information corresponding to patterns other than the target pattern from the plurality of data sources;
a data construction module 2013, configured to construct a positive example sample pair according to the first description information, and construct a negative example sample pair according to the first description information and the second description information, where the data sources of the description information in the positive example sample pair and the negative example sample pair are different;
a data enhancement module 2014, configured to perform data enhancement on the positive example sample pair to obtain a first training sample set, and perform data enhancement on the negative example sample pair to obtain a second training sample set;
a training module 2015, configured to train a diagnostic model according to the entity features, the similarity features, and the text features of the first training sample set and the second training sample set.
In some embodiments, the data enhancement module 2014, when performing data enhancement on the positive example sample pair to obtain a first training sample set, includes:
copying the positive sample pair to obtain a copy result, and identifying symptom entity information in the copy result to obtain an entity information list;
randomly extracting a plurality of target entity information from the entity information list, and acquiring matching synonyms corresponding to the target entity information;
and carrying out data enhancement on the positive sample pair according to the matching synonym so as to obtain a first training sample set.
In some embodiments, the data enhancement module 2014, when performing data enhancement on the positive example sample pair according to the matching synonym to obtain a first training sample set, includes:
Replacing the target entity information in the copying result with the matching synonym to obtain a first enhancement sample of the positive sample pair;
and constructing a first training sample set according to the positive example sample pair and the first enhanced sample.
In some embodiments, the data enhancement module 2014, when performing data enhancement on the positive example sample pair according to the matching synonym to obtain a first training sample set, includes:
acquiring a target description statement containing the target entity information from the copy result;
identifying matching description sentences corresponding to the matching synonyms from the first description information by using a pre-trained normalization model;
replacing the target description statement in the copy result with the matching description statement to obtain a second enhancement sample of the positive sample pair;
and constructing a first training sample set according to the positive example sample pair and the second enhanced sample.
In some embodiments, the data construction module 2013, when constructing a negative example pair according to the first description information and the second description information, includes:
acquiring a node of the target pattern in a preset clinical term standard as a target node;
Acquiring a node adjacent to the target node from the clinical term standard as a neighbor node, and acquiring a pattern corresponding to the neighbor node in the clinical term standard as a neighbor pattern;
and screening target description information corresponding to the neighbor pattern from the second description information, and constructing a negative example sample pair according to the first description information and the target description information.
In some embodiments, the training module 2015, when training a diagnostic model based on the entity features, the similarity features, and the text features of the first training sample set and the second training sample set, comprises:
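The neighbor-based construction of negative example sample pairs can be sketched in Python as follows. The clinical term standard is modeled here as a plain adjacency dictionary, and the pattern names ("P_target", "P_near", "P_far") and source names are placeholders; none of these representations are specified in this application. The check that the two descriptions come from different data sources reflects the requirement stated for the sample pairs.

```python
def build_negative_pairs(target_pattern, adjacency, first_descriptions, second_descriptions):
    """Pair target-pattern descriptions with descriptions of neighbor patterns.

    adjacency           : {pattern: [adjacent patterns]} (assumed model of the term standard)
    first_descriptions  : [(source, text)] for the target pattern
    second_descriptions : [(source, pattern, text)] for other patterns
    """
    neighbor_patterns = set(adjacency.get(target_pattern, ()))
    pairs = []
    for pos_source, pos_text in first_descriptions:
        for neg_source, pattern, neg_text in second_descriptions:
            # Keep only neighbor patterns, and only pairs drawn from different data sources.
            if pattern in neighbor_patterns and neg_source != pos_source:
                pairs.append((pos_text, neg_text))
    return pairs


adjacency = {"P_target": ["P_near"]}
first = [("src1", "p1")]
second = [("src2", "P_near", "n1"), ("src1", "P_near", "n2"), ("src2", "P_far", "n3")]
negative_pairs = build_negative_pairs("P_target", adjacency, first, second)
```

Only "n1" survives the screening: "n2" shares a data source with "p1", and "n3" describes a pattern that is not adjacent to the target pattern.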
extracting symptom entity information from each sample of the first training sample set, and calculating mutual information of the symptom entity information and the target syndrome;
constructing a diagnosis rule according to mutual information corresponding to each symptom entity information, and labeling the first training sample set and the second training sample set according to the diagnosis rule;
and after similarity feature labeling processing is carried out on the first training sample set and the second training sample set, training a diagnosis model by using the first training sample set and the second training sample set.
In some embodiments, the training module 2015, after performing similarity feature labeling processing on the first training sample set and the second training sample set, when training a diagnostic model using the first training sample set and the second training sample set, includes:
extracting average vector features corresponding to the positive sample according to the first training set;
selecting a target sample pair from the first training sample set or the second training sample set, and extracting a first sample and a second sample from the target sample pair;
acquiring vector features corresponding to the first sample as first vector features, and acquiring vector features corresponding to the second sample as second vector features;
labeling the first sample according to the distance between the first vector feature and the average vector feature, and labeling the second sample according to the distance between the second vector feature and the average vector feature;
inputting the first sample and the second sample into a pre-constructed neural network model to train the neural network model;
and when the neural network model training is completed, obtaining a diagnosis model.
It should be noted that, for convenience and brevity of description, specific working processes of the above-described apparatus and each module and unit may refer to corresponding processes in the foregoing embodiment of the training method of the diagnostic model of traditional Chinese medicine, and will not be described herein.
The apparatus provided by the above embodiments may be implemented in the form of a computer program which may be run on a computer device as shown in fig. 3.
Referring to fig. 3, fig. 3 is a schematic block diagram of a computer device according to an embodiment of the present application. The computer device includes, but is not limited to, a server.
As shown in fig. 3, the computer device 301 includes a processor 3011, a memory, and a network interface connected via a system bus, wherein the memory may include a storage medium 3012 and an internal memory 3015, and the storage medium 3012 may be non-volatile or volatile.
The storage medium 3012 may store an operating system and computer programs. The computer program comprises program instructions that, when executed, cause the processor 3011 to perform any of the traditional Chinese medicine diagnosis model training methods.
The processor 3011 is used to provide computing and control capabilities to support the operation of the overall computer device.
The internal memory 3015 provides an environment for running the computer program in the storage medium 3012; when executed by the processor 3011, the computer program causes the processor 3011 to perform any of the traditional Chinese medicine diagnosis model training methods.
The network interface is used for network communication such as transmitting assigned tasks and the like. It will be appreciated by those skilled in the art that the structure shown in fig. 3 is merely a block diagram of some of the structures associated with the present application and is not limiting of the computer device to which the present application may be applied, and that a particular computer device may include more or fewer components than shown, or may combine certain components, or have a different arrangement of components.
It is to be appreciated that the processor 3011 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
In some embodiments, the processor 3011 is configured to run a computer program stored in the memory to implement the following steps:
when an identification model training instruction is received, extracting a target pattern from the identification model training instruction;
searching first description information corresponding to the target pattern from a plurality of data sources, and acquiring second description information corresponding to patterns other than the target pattern from the plurality of data sources;
constructing a positive example sample pair according to the first descriptive information, and constructing a negative example sample pair according to the first descriptive information and the second descriptive information, wherein the data sources of the descriptive information in the positive example sample pair and the negative example sample pair are different;
data enhancement is carried out on the positive example sample pair to obtain a first training sample set, and data enhancement is carried out on the negative example sample pair to obtain a second training sample set;
and training a diagnosis model according to the entity characteristics, the similarity characteristics and the text characteristics of the first training sample set and the second training sample set.
In some embodiments, the processor 3011 is configured to, when performing data enhancement on the positive sample pair to obtain a first training sample set, implement:
Copying the positive sample pair to obtain a copy result, and identifying symptom entity information in the copy result to obtain an entity information list;
randomly extracting a plurality of target entity information from the entity information list, and acquiring matching synonyms corresponding to the target entity information;
and carrying out data enhancement on the positive sample pair according to the matching synonym so as to obtain a first training sample set.
In some embodiments, the processor 3011 is configured to, when performing data enhancement on the positive example sample pair according to the matching synonyms to obtain a first training sample set, implement:
replacing the target entity information in the copying result with the matching synonym to obtain a first enhancement sample of the positive sample pair;
and constructing a first training sample set according to the positive example sample pair and the first enhanced sample.
In some embodiments, the processor 3011 is configured to, when performing data enhancement on the positive example sample pair according to the matching synonyms to obtain a first training sample set, implement:
acquiring a target description statement containing the target entity information from the copy result;
Identifying matching description sentences corresponding to the matching synonyms from the first description information by using a pre-trained normalization model;
replacing the target description statement in the copy result with the matching description statement to obtain a second enhancement sample of the positive sample pair;
and constructing a first training sample set according to the positive example sample pair and the second enhanced sample.
In some embodiments, the processor 3011 is configured to implement, when constructing a negative example sample pair from the first description information and the second description information:
acquiring a node of the target pattern in a preset clinical term standard as a target node;
acquiring a node adjacent to the target node from the clinical term standard as a neighbor node, and acquiring a pattern corresponding to the neighbor node in the clinical term standard as a neighbor pattern;
and screening target description information corresponding to the neighbor pattern from the second description information, and constructing a negative example sample pair according to the first description information and the target description information.
In some embodiments, the processor 3011 is configured to implement, when training a diagnostic model based on the entity features, the similarity features, and the text features of the first training sample set and the second training sample set:
Extracting symptom entity information from each sample of the first training sample set, and calculating mutual information of the symptom entity information and the target syndrome;
constructing a diagnosis rule according to mutual information corresponding to each symptom entity information, and labeling the first training sample set and the second training sample set according to the diagnosis rule;
and after similarity feature labeling processing is carried out on the first training sample set and the second training sample set, training a diagnosis model by using the first training sample set and the second training sample set.
In some embodiments, the processor 3011 is configured to implement, when training a diagnostic model using the first training sample set and the second training sample set after performing similarity feature labeling processing on the first training sample set and the second training sample set:
extracting average vector features corresponding to the positive sample according to the first training set;
selecting a target sample pair from the first training sample set or the second training sample set, and extracting a first sample and a second sample from the target sample pair;
acquiring vector features corresponding to the first sample as first vector features, and acquiring vector features corresponding to the second sample as second vector features;
Labeling the first sample according to the distance between the first vector feature and the average vector feature, and labeling the second sample according to the distance between the second vector feature and the average vector feature;
inputting the first sample and the second sample into a pre-constructed neural network model to train the neural network model;
and when the neural network model training is completed, obtaining a diagnosis model.
It should be noted that, for convenience and brevity of description, the specific working process of the computer device may refer to the corresponding process in the foregoing embodiment of the training method of the diagnostic model of traditional Chinese medicine, which is not described herein again.
The embodiment of the application also provides a storage medium, which is a computer readable storage medium storing a computer program, wherein the computer program comprises program instructions; for the method implemented when the program instructions are executed, reference may be made to the embodiments of the traditional Chinese medicine diagnosis model training method.
The computer readable storage medium may be an internal storage unit of the computer device according to the foregoing embodiment, for example, a hard disk or a memory of the computer device. The computer readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like, which are provided on the computer device.
It is to be understood that the terminology used in the description of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations. It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.
The foregoing embodiment numbers of the present application are for description only and do not represent the relative merits of the embodiments. While the invention has been described with reference to certain preferred embodiments, those skilled in the art will understand that various changes and equivalent substitutions may be made without departing from the scope of the invention. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for training a diagnostic model of traditional Chinese medicine, the method comprising:
when an identification model training instruction is received, extracting a target pattern from the identification model training instruction;
searching first description information corresponding to the target pattern from a plurality of data sources, and acquiring second description information corresponding to patterns other than the target pattern from the plurality of data sources;
constructing a positive example sample pair according to the first descriptive information, and constructing a negative example sample pair according to the first descriptive information and the second descriptive information, wherein the data sources of the descriptive information in the positive example sample pair and the negative example sample pair are different;
Data enhancement is carried out on the positive example sample pair to obtain a first training sample set, and data enhancement is carried out on the negative example sample pair to obtain a second training sample set;
and training a diagnosis model according to the entity characteristics, the similarity characteristics and the text characteristics of the first training sample set and the second training sample set.
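The pair-construction step of claim 1 can be illustrated with a minimal Python sketch. It assumes that each item of description information is stored as a hypothetical `(source, text)` tuple; the function name `build_pairs`, the source names, and the placeholder texts are invented for illustration and do not come from the application.

```python
from itertools import combinations, product

def build_pairs(first_info, second_info):
    """Build sample pairs from (source, text) tuples (hypothetical layout).

    first_info:  descriptions of the target pattern
    second_info: descriptions of patterns other than the target pattern
    """
    # Positive pairs: two descriptions of the target pattern
    # that come from different data sources.
    positives = [
        (a_text, b_text)
        for (a_src, a_text), (b_src, b_text) in combinations(first_info, 2)
        if a_src != b_src
    ]
    # Negative pairs: one target-pattern description and one description
    # of another pattern, again taken from different data sources.
    negatives = [
        (a_text, b_text)
        for (a_src, a_text), (b_src, b_text) in product(first_info, second_info)
        if a_src != b_src
    ]
    return positives, negatives

# Toy data: source names and texts are placeholders.
first = [("source_1", "target description A"),
         ("source_2", "target description B")]
second = [("source_1", "other description C"),
          ("source_3", "other description D")]
positives, negatives = build_pairs(first, second)
```

Filtering on `a_src != b_src` is one simple way to enforce the "different data sources" condition; the application does not state how that condition is checked.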
2. The method of claim 1, wherein performing data enhancement on the positive example sample pair to obtain the first training sample set comprises:
copying the positive example sample pair to obtain a copy result, and identifying symptom entity information in the copy result to obtain an entity information list;
randomly extracting a plurality of pieces of target entity information from the entity information list, and acquiring matching synonyms corresponding to the target entity information;
and performing data enhancement on the positive example sample pair according to the matching synonyms to obtain the first training sample set.
3. The method of claim 2, wherein performing data enhancement on the positive example sample pair according to the matching synonyms to obtain the first training sample set comprises:
replacing the target entity information in the copy result with the matching synonyms to obtain a first enhanced sample of the positive example sample pair;
and constructing the first training sample set according to the positive example sample pair and the first enhanced sample.
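The synonym-replacement enhancement of claims 2 and 3 might look roughly like the following Python sketch. It assumes that symptom entities are found by simple substring matching against a hypothetical synonym dictionary; the application does not specify the entity recognizer or the synonym source, and `SYNONYMS` and `augment_pair` are illustrative names only.

```python
import random

# Hypothetical synonym dictionary: symptom entity -> matching synonym.
SYNONYMS = {"fatigue": "tiredness", "loose stools": "watery stools"}

def augment_pair(pair, synonyms, k=1, seed=0):
    """Return [original pair, enhanced pair] via synonym replacement."""
    rng = random.Random(seed)
    copied = list(pair)  # the "copy result"; the original pair is untouched
    # Entity information list: every known entity that occurs in the copy.
    entity_list = sorted({term for text in copied
                          for term in synonyms if term in text})
    # Randomly extract up to k target entities.
    targets = rng.sample(entity_list, min(k, len(entity_list)))
    # Replace each target entity with its matching synonym.
    for target in targets:
        copied = [text.replace(target, synonyms[target]) for text in copied]
    return [tuple(pair), tuple(copied)]

pair = ("fatigue and loose stools", "patient reports fatigue")
first_training_set = augment_pair(pair, SYNONYMS, k=2)
```

With `k=2` both entities in this toy pair are replaced, so the result does not depend on the random draw; with a smaller `k` the enhanced sample varies with `seed`.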
4. The method of claim 2, wherein performing data enhancement on the positive example sample pair according to the matching synonyms to obtain the first training sample set comprises:
acquiring, from the copy result, a target description statement containing the target entity information;
identifying, from the first description information, a matching description statement corresponding to the matching synonyms by using a pre-trained normalization model;
replacing the target description statement in the copy result with the matching description statement to obtain a second enhanced sample of the positive example sample pair;
and constructing the first training sample set according to the positive example sample pair and the second enhanced sample.
5. The method of claim 1, wherein constructing the negative example sample pair according to the first description information and the second description information comprises:
acquiring the node of the target pattern in a preset clinical term standard as a target node;
acquiring a node adjacent to the target node from the clinical term standard as a neighbor node, and acquiring the pattern corresponding to the neighbor node in the clinical term standard as a neighbor pattern;
and screening, from the second description information, target description information corresponding to the neighbor pattern, and constructing the negative example sample pair according to the first description information and the target description information.
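The neighbor-based negative sampling of claim 5 can be sketched as follows, assuming the clinical term standard is available as a hypothetical adjacency list (`graph`) plus a node-to-pattern mapping. The node identifiers, pattern names, and function names are placeholders, and what counts as "adjacent" in the real standard is not stated in the application.

```python
def neighbor_patterns(graph, node_to_pattern, target_pattern):
    """Patterns whose nodes are adjacent to the target pattern's node."""
    pattern_to_node = {p: n for n, p in node_to_pattern.items()}
    target_node = pattern_to_node[target_pattern]
    return {node_to_pattern[n] for n in graph.get(target_node, ())}

def build_neighbor_negatives(first_info, second_info, neighbors):
    """Pair target descriptions with descriptions of neighbor patterns only.

    first_info:  list of texts describing the target pattern
    second_info: list of (pattern, text) tuples for other patterns
    """
    target_descriptions = [text for pattern, text in second_info
                           if pattern in neighbors]
    return [(a, b) for a in first_info for b in target_descriptions]

# Toy terminology graph (assumed structure, not from the application).
graph = {"n1": ["n2", "n3"], "n2": ["n1"], "n3": ["n1"], "n4": []}
node_to_pattern = {"n1": "pattern_A", "n2": "pattern_B",
                   "n3": "pattern_C", "n4": "pattern_D"}

neighbors = neighbor_patterns(graph, node_to_pattern, "pattern_A")
negatives = build_neighbor_negatives(
    ["description of A"],
    [("pattern_B", "description of B"), ("pattern_D", "description of D")],
    neighbors,
)
```

Restricting negatives to neighbor patterns yields pairs that are close to the target in the terminology graph, which is presumably why the claim screens the second description information this way.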
6. The method of claim 1, wherein training the diagnosis model according to the entity features, the similarity features, and the text features of the first training sample set and the second training sample set comprises:
extracting symptom entity information from each sample of the first training sample set, and calculating mutual information between the symptom entity information and the target pattern;
constructing a diagnosis rule according to the mutual information corresponding to each piece of symptom entity information, and labeling the first training sample set and the second training sample set according to the diagnosis rule;
and after performing similarity feature labeling on the first training sample set and the second training sample set, training the diagnosis model by using the first training sample set and the second training sample set.
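One possible reading of the mutual-information step in claim 6 is sketched below. It assumes each sample has already been reduced to a set of symptom entities plus a boolean flag for the target pattern, and it computes mutual information between the two binary variables "symptom present" and "sample belongs to the target pattern". The threshold, the direction filter in `build_rule`, and the labeling function are assumptions; the application does not give the exact form of the diagnosis rule.

```python
import math
from collections import Counter

def mutual_information(samples, symptom):
    """Mutual information in bits between symptom presence and target label.

    samples: list of (set_of_symptom_entities, is_target_pattern) tuples.
    """
    n = len(samples)
    joint = Counter((symptom in entities, label) for entities, label in samples)
    p_x = Counter(symptom in entities for entities, _ in samples)
    p_y = Counter(label for _, label in samples)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        mi += p_xy * math.log2(p_xy / ((p_x[x] / n) * (p_y[y] / n)))
    return mi

def build_rule(samples, threshold):
    """Keep symptoms that are informative AND more frequent in target samples."""
    vocabulary = sorted(set().union(*(entities for entities, _ in samples)))
    rule = []
    for symptom in vocabulary:
        in_target = sum(1 for e, y in samples if y and symptom in e)
        in_other = sum(1 for e, y in samples if not y and symptom in e)
        if in_target > in_other and mutual_information(samples, symptom) >= threshold:
            rule.append(symptom)
    return rule

def apply_rule(entities, rule):
    """Entity-feature label: does the sample contain any rule symptom?"""
    return any(symptom in entities for symptom in rule)

samples = [
    ({"symptom_a", "symptom_c"}, True),
    ({"symptom_a"}, True),
    ({"symptom_b", "symptom_c"}, False),
    ({"symptom_b"}, False),
]
rule = build_rule(samples, threshold=0.5)
entity_labels = [apply_rule(entities, rule) for entities, _ in samples]
```

Mutual information alone does not say whether a symptom points toward or away from the target pattern, which is why this sketch adds the frequency comparison before admitting a symptom to the rule.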
7. The method of claim 6, wherein, after performing similarity feature labeling on the first training sample set and the second training sample set, training the diagnosis model by using the first training sample set and the second training sample set comprises:
extracting an average vector feature corresponding to the positive example samples according to the first training sample set;
selecting a target sample pair from the first training sample set or the second training sample set, and extracting a first sample and a second sample from the target sample pair;
acquiring the vector feature corresponding to the first sample as a first vector feature, and acquiring the vector feature corresponding to the second sample as a second vector feature;
labeling the first sample according to the distance between the first vector feature and the average vector feature, and labeling the second sample according to the distance between the second vector feature and the average vector feature;
inputting the first sample and the second sample into a pre-constructed neural network model to train the neural network model;
and obtaining the diagnosis model when training of the neural network model is completed.
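The distance-based similarity labeling of claim 7 can be sketched in plain Python as below. The sketch assumes that vector features are fixed-length lists of floats produced by some encoder that is not shown, that the distance is Euclidean, and that a hypothetical `threshold` turns the distance into a label. None of these choices is specified in the application.

```python
import math

def average_vector(vectors):
    """Element-wise mean of the positive-sample vector features."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def label_by_distance(vector, centroid, threshold):
    """Label a sample by its Euclidean distance to the average vector."""
    distance = math.dist(vector, centroid)
    label = "similar" if distance <= threshold else "dissimilar"
    return label, distance

# Toy vector features for two positive samples (placeholder values).
positive_vectors = [[0.0, 0.0], [2.0, 0.0]]
centroid = average_vector(positive_vectors)

# First and second sample of a target sample pair.
first_label = label_by_distance([1.0, 0.0], centroid, threshold=1.5)
second_label = label_by_distance([4.0, 4.0], centroid, threshold=1.5)
```

The labeled samples would then be fed to the neural network model together with the entity and text features; that training loop is outside the scope of this sketch.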
8. A training device for a traditional Chinese medicine diagnosis model, comprising:
an instruction receiving module, configured to extract a target pattern from an identification model training instruction when the identification model training instruction is received;
a data acquisition module, configured to search a plurality of data sources for first description information corresponding to the target pattern, and to acquire, from the plurality of data sources, second description information corresponding to patterns other than the target pattern;
a data construction module, configured to construct a positive example sample pair according to the first description information and to construct a negative example sample pair according to the first description information and the second description information, wherein the items of description information within each positive example sample pair and within each negative example sample pair come from different data sources;
a data enhancement module, configured to perform data enhancement on the positive example sample pair to obtain a first training sample set, and to perform data enhancement on the negative example sample pair to obtain a second training sample set;
and a training module, configured to train a diagnosis model according to the entity features, the similarity features, and the text features of the first training sample set and the second training sample set.
9. A computer device comprising a processor, a memory, and a computer program stored on the memory and executable by the processor, wherein the computer program, when executed by the processor, implements the steps of the method for training a traditional Chinese medicine diagnosis model according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method for training a traditional Chinese medicine diagnosis model according to any one of claims 1 to 7.
CN202310440693.5A 2023-04-14 2023-04-14 Training method, device, equipment and storage medium for traditional Chinese medicine diagnosis model Pending CN116469526A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310440693.5A CN116469526A (en) 2023-04-14 2023-04-14 Training method, device, equipment and storage medium for traditional Chinese medicine diagnosis model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310440693.5A CN116469526A (en) 2023-04-14 2023-04-14 Training method, device, equipment and storage medium for traditional Chinese medicine diagnosis model

Publications (1)

Publication Number Publication Date
CN116469526A true CN116469526A (en) 2023-07-21

Family

ID=87180387

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310440693.5A Pending CN116469526A (en) 2023-04-14 2023-04-14 Training method, device, equipment and storage medium for traditional Chinese medicine diagnosis model

Country Status (1)

Country Link
CN (1) CN116469526A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117744785A (en) * 2024-02-19 2024-03-22 北京博阳世通信息技术有限公司 Space-time knowledge graph intelligent construction method and system based on network acquisition data

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination