CN106874643A - Method and system for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment
- Publication number: CN106874643A
- Application number: CN201611222893.XA
- Authority: CN (China)
- Prior art keywords: disease, correlation factor, correlation, dictionary, expression
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
- G06N5/043—Distributed expert systems; Blackboards
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
Abstract
The present invention relates to a method and system for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment. The method may include: obtaining a patient description; performing keyword matching on the patient description using an expanded disease-disease correlation factor dictionary built from word vectors, and extracting the medically relevant words and expressions in the patient description; detecting whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary; based on the detection result, and in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary, calculating a score for each disease; ranking the disease scores; and determining the disease according to the ranking result. The invention thereby solves the technical problem of how to make a prediction from a patient's colloquial description of their condition.
Description
Technical field
Embodiments of the present invention relate to the technical field of data processing, and more particularly to a method and system for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment.
Background art
With the rapid development of online doctor-patient question-and-answer websites and mobile application services in the Internet healthcare field, massive numbers of colloquial descriptions of patient conditions and other general information, together with the corresponding diagnoses, form question-answer pairs that constitute a valuable consultation knowledge base. Because these records are usually unstructured and contain many non-standard medical terms arising from colloquial description, using the data directly poses many challenges. At the same time, online consultation cases involve a great deal of repetitive work, which wastes scarce physician resources. If an artificial intelligence algorithm could produce a preliminary diagnosis in place of the doctor, consultation efficiency would improve greatly. The task can be summarized as follows: given a newly entered patient description covering general information such as sex, age, symptoms, and medical history, use sentence analysis and related algorithms, together with a domain knowledge graph built in advance, to return a predicted diagnosis for the patient.
Existing technical solutions mainly take one of two approaches:
1. Search the question-and-answer base for the question most similar to the patient description and return the corresponding diagnosis. The main problem with this approach is that it does not actually analyze the disease information appearing in the patient description; textual similarity does not fully reflect the similarity of patient conditions, so matching accuracy is poor.
2. Have the patient click to select information such as symptoms and disease sites related to the condition, accumulate the scores that experts have assigned in advance to each information label for each disease, and finally return a ranking of the probabilities of possible diseases. The problem with this approach is that manual scoring is highly unstable and subjective, and when the number of diseases to be labeled is large it consumes a great deal of manpower and time. In addition, the diagnostic system cannot analyze or use any information outside the selectable symptoms.
In view of this, the present invention is proposed.
Summary of the invention
In order to solve the above problem in the prior art, namely the technical problem of how to make a prediction from a patient's colloquial description of their condition, embodiments of the present invention provide a method for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment. In addition, embodiments of the present invention also provide a system for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment.
To achieve the above object, according to one aspect of the present invention, the following technical solution is provided:
A method for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment, the method including:
obtaining a patient description;
performing keyword matching on the patient description using an expanded disease-disease correlation factor dictionary built from word vectors, and extracting the medically relevant words and expressions in the patient description;
detecting whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary;
based on the detection result, and in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary, calculating the score of each disease;
ranking the disease scores;
determining the disease according to the ranking result.
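The matching and detection steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes greedy longest-match search over plain strings, and the function names, the English sample text, and the entry "snotty nose" are all hypothetical.

```python
def extract_terms(description, expanded_dictionary):
    """Greedy longest-match keyword extraction over the raw text."""
    # Try longer entries first so "high fever" wins over "fever".
    entries = sorted(expanded_dictionary, key=len, reverse=True)
    found, pos = [], 0
    while pos < len(description):
        for entry in entries:
            if description.startswith(entry, pos):
                found.append(entry)
                pos += len(entry)
                break
        else:
            pos += 1  # no entry starts here; move on by one character
    return found


def split_by_standard(terms, standard_dictionary):
    """Detection step: separate standard terms from non-standard variants."""
    standard = [t for t in terms if t in standard_dictionary]
    nonstandard = [t for t in terms if t not in standard_dictionary]
    return standard, nonstandard


# Toy dictionaries (hypothetical entries).
standard_dictionary = {"fever", "runny nose"}
expanded_dictionary = standard_dictionary | {"high fever", "snotty nose"}

description = "i have had a high fever and a runny nose since monday"
terms = extract_terms(description, expanded_dictionary)
standard, nonstandard = split_by_standard(terms, standard_dictionary)
```

Terms that land in `nonstandard` are the ones that would need to be mapped back to a standard expression before scoring.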
Further, the expanded disease-disease correlation factor dictionary may be established in the following manner:
training, on medical information, a distributed word vector embedding representation model for diseases and disease correlation factors;
based on the distributed word vector embedding representation model, expanding the standard disease-disease correlation factor dictionary using a distance metric, thereby establishing the expanded disease-disease correlation factor dictionary.
Further, training the distributed word vector embedding representation model for diseases and disease correlation factors on medical information may specifically include:
obtaining a medical information training corpus;
cleaning the medical information training corpus;
counting the high-frequency expressions that occur in the question-and-answer base records, increasing the weight of these high-frequency expressions in the word segmentation model, and performing Chinese word segmentation to obtain the training text;
training on the training text to generate the distributed word vector embedding representation model.
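The cleaning and high-frequency counting steps might look like the sketch below. It is illustrative only and rests on stated assumptions: the sample records are English and split on whitespace, whereas the source performs Chinese word segmentation; `clean`, `frequent_expressions`, and the thresholds are hypothetical names and values.

```python
import re
from collections import Counter


def clean(text):
    """Replace runs of non-word characters (punctuation, symbols) with a space."""
    # In Python 3, \w on str also matches CJK characters, so Chinese text survives.
    return re.sub(r"[^\w]+", " ", text).strip()


def frequent_expressions(records, n=2, min_count=2):
    """Count word n-grams across records and keep those seen at least min_count times."""
    counts = Counter()
    for record in records:
        tokens = clean(record).lower().split()
        counts.update(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return {gram: c for gram, c in counts.items() if c >= min_count}


records = [
    "High fever since yesterday!!! ###",
    "My son has a high fever & cough...",
    "Runny nose, no fever.",
]
boosted = frequent_expressions(records)
```

In a real pipeline the entries of `boosted` would be handed to the segmenter, for example as user-dictionary entries with a raised frequency, so that they are kept together as single tokens in the training text.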
Further, the correlation score of a disease correlation factor with respect to a disease may be determined in the following manner:
based on the distributed word vector embedding representation model, expanding the standard disease-disease correlation factor dictionary using a distance metric and establishing a replacement table;
using the expanded disease-disease correlation factor dictionary and the replacement table, matching the diseases and disease correlation factors in the medical information, and calculating the correlation score of each disease correlation factor with respect to each disease.
Further, using the expanded disease-disease correlation factor dictionary and the replacement table to match the diseases and disease correlation factors in the medical information and calculate the correlation score of each disease correlation factor with respect to each disease may specifically include:
performing keyword matching on the doctor-patient question-and-answer records using the expanded disease-disease correlation factor dictionary, and extracting the medically relevant words and expressions in those records;
detecting whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary;
if not, normalizing the extracted words and expressions to the corresponding standard expressions according to the replacement table;
based on the standard expressions, counting how often each disease co-occurs with its correlation factors, to obtain a co-occurrence frequency matrix of disease correlation factors and diseases;
based on the co-occurrence frequency matrix of disease correlation factors and diseases, applying a non-linear transformation to obtain the correlation score of each disease correlation factor with respect to each disease.
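The co-occurrence counting step can be sketched as below. This is a minimal illustration, assuming each record has already been reduced to a diagnosed disease plus a list of standardized factors; the record layout and the function name are hypothetical.

```python
from collections import defaultdict


def cooccurrence_counts(records):
    """Build N[disease][factor] from (disease, [standardized factors]) pairs."""
    counts = defaultdict(lambda: defaultdict(int))
    for disease, factors in records:
        for factor in set(factors):  # count each factor once per record
            counts[disease][factor] += 1
    return counts


# Toy records, assumed to be already normalized to standard expressions.
records = [
    ("common cold", ["fever", "runny nose"]),
    ("common cold", ["runny nose", "nasal congestion"]),
    ("pharyngitis", ["fever", "sore throat"]),
]
N = cooccurrence_counts(records)
```

A nested dictionary is used here because the matrix is typically sparse; a dense array indexed by disease and factor ids would work equally well.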
Further, the method may also include:
based on the distributed word vector embedding representation model, expanding the standard disease-disease correlation factor dictionary using a distance metric and establishing a replacement table;
detecting whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary specifically includes:
if they are not, normalizing the extracted words and expressions to the corresponding standard expressions according to the replacement table, to obtain standardized disease correlation factors;
calculating the score of each disease based on the detection result, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary, specifically includes:
based on the standardized disease correlation factors, and in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary, calculating the score of each disease.
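The normalization step can be sketched as a lookup in the replacement table. This is a minimal illustration under assumptions: the table is modeled as a plain mapping from variant to standard expression, terms found in neither structure are simply dropped, and all names are hypothetical.

```python
def normalize(terms, standard_dictionary, replacement_table):
    """Map each extracted term to its standard expression."""
    normalized = []
    for term in terms:
        if term in standard_dictionary:
            normalized.append(term)                     # already standard
        elif term in replacement_table:
            normalized.append(replacement_table[term])  # variant -> standard
        # terms found in neither structure are dropped (an assumption)
    return normalized


standard_dictionary = {"fever", "runny nose"}
replacement_table = {"high fever": "fever", "hyperpyrexia": "fever"}
factors = normalize(["high fever", "runny nose"], standard_dictionary, replacement_table)
```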
Further, the correlation score of a disease correlation factor with respect to a disease may be determined by the following formula:
[Formula image not reproduced.]
where $\mathrm{Score}(i, j)$ denotes the correlation score of a disease correlation factor with respect to a disease; $P(D_i \mid F_j)$ denotes the conditional probability of having the disease; $D_i$ denotes a disease; $F_j$ denotes a disease correlation factor; $N_i$ denotes the disease frequency, $N_i = \sum_j N_{ij}$; and $N_{ij}$ denotes the record frequency.
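The quantities defined above can be estimated from a co-occurrence count matrix as in the sketch below. The conditional probability is the usual empirical estimate. The score itself is only an assumed stand-in, a PMI-style log ratio, because the patent's formula is not available in this text; it should not be read as the patent's formula. All function names are hypothetical.

```python
import math


def conditional_probability(N, disease, factor):
    """Empirical P(D_i | F_j): N_ij divided by the factor's total over all diseases."""
    factor_total = sum(row.get(factor, 0) for row in N.values())
    return N[disease].get(factor, 0) / factor_total if factor_total else 0.0


def score(N, disease, factor):
    """ASSUMED non-linear transform: log(P(D_i | F_j) / P(D_i)); 0.0 if no co-occurrence."""
    p_cond = conditional_probability(N, disease, factor)
    if p_cond == 0.0:
        return 0.0
    N_i = sum(N[disease].values())                        # N_i = sum_j N_ij
    total = sum(sum(row.values()) for row in N.values())  # all recorded co-occurrences
    return math.log(p_cond / (N_i / total))


# Toy co-occurrence counts N[disease][factor].
N = {
    "common cold": {"fever": 1, "runny nose": 3},
    "pharyngitis": {"fever": 3, "sore throat": 4},
}
p = conditional_probability(N, "pharyngitis", "fever")
s = score(N, "pharyngitis", "fever")
```

With this stand-in, a factor that is seen with a disease more often than the disease's overall share of the records gets a positive score, and a factor seen less often gets a negative one.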
Further, the score of a disease may be obtained by the following formula:
[Formula image not reproduced.]
where $DS(D_i)$ denotes the score of the disease; $D_i$ denotes a disease; $W(F_j)$ denotes the disease category mapping weight; and $\mathrm{Score}(i, j)$ denotes the correlation score of a disease correlation factor with respect to a disease.
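One plausible reading of the symbols above is a weighted sum over the factors matched in the patient description, followed by a sort. The sketch below assumes that form, DS(D_i) = sum over j of W(F_j) * Score(i, j); the patent's formula is not available in this text, so the form, the default weight of 1.0, and all names and numbers are assumptions.

```python
def disease_scores(factors, scores, weights):
    """ASSUMED form: DS(D_i) = sum over matched F_j of W(F_j) * Score(i, j)."""
    totals = {}
    for disease, row in scores.items():
        totals[disease] = sum(weights.get(f, 1.0) * row.get(f, 0.0) for f in factors)
    return totals


def rank(totals):
    """Sort diseases by score, highest first."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Toy correlation scores Score(i, j) and weights W(F_j).
scores = {
    "common cold": {"fever": 0.2, "runny nose": 0.9},
    "pharyngitis": {"fever": 0.5, "sore throat": 0.8},
}
weights = {"fever": 1.0, "runny nose": 2.0}
ranking = rank(disease_scores(["fever", "runny nose"], scores, weights))
```

The first entry of `ranking` is the candidate that would be returned as the most likely disease.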
To achieve the above object, according to another aspect of the present invention, the following technical solution is also provided:
A system for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment, the system including:
an acquisition module, for obtaining a patient description;
an extraction module, for performing keyword matching on the patient description using an expanded disease-disease correlation factor dictionary built from word vectors, and extracting the medically relevant words and expressions in the patient description;
a detection module, for detecting whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary;
a computing module, for calculating the score of each disease based on the detection result, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary;
a ranking module, for ranking the disease scores;
a determining module, for determining the disease according to the ranking result.
Further, the extraction module may specifically include:
a word vector model establishing unit, for training, on medical information, a distributed word vector embedding representation model for diseases and disease correlation factors;
an expanded dictionary establishing unit, for expanding the standard disease-disease correlation factor dictionary using a distance metric, based on the distributed word vector embedding representation model, thereby establishing the expanded disease-disease correlation factor dictionary.
Further, the word vector model establishing unit may specifically include:
an acquiring unit, for obtaining a medical information training corpus;
a cleaning unit, for cleaning the medical information training corpus;
a first statistics unit, for counting the high-frequency expressions that occur in the question-and-answer base records, increasing the weight of these high-frequency expressions in the word segmentation model, and performing Chinese word segmentation to obtain the training text;
a generation unit, for training on the training text to generate the distributed word vector embedding representation model.
Further, the computing module may specifically include:
a first replacement table establishing unit, for expanding the standard disease-disease correlation factor dictionary using a distance metric, based on the distributed word vector embedding representation model, and establishing a replacement table;
a correlation score computing unit, for using the expanded disease-disease correlation factor dictionary and the replacement table to match the diseases and disease correlation factors in the medical information, and calculating the correlation score of each disease correlation factor with respect to each disease.
Further, the correlation score computing unit may specifically include:
an extraction unit, for performing keyword matching on the doctor-patient question-and-answer records using the expanded disease-disease correlation factor dictionary, and extracting the medically relevant words and expressions in those records;
a detection unit, for detecting whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary;
a first normalization unit, for normalizing the extracted words and expressions to the corresponding standard expressions according to the replacement table when they are not in the standard disease-disease correlation factor dictionary;
a second statistics unit, for counting, based on the standard expressions, how often each disease co-occurs with its correlation factors, to obtain a co-occurrence frequency matrix of disease correlation factors and diseases;
a non-linear conversion unit, for applying a non-linear transformation to the co-occurrence frequency matrix of disease correlation factors and diseases to obtain the correlation score of each disease correlation factor with respect to each disease.
Further, the system may also include:
a second replacement table establishing unit, for expanding the standard disease-disease correlation factor dictionary using a distance metric, based on the distributed word vector embedding representation model, and establishing a replacement table;
the above detection module may specifically include:
a second normalization unit, for normalizing the extracted words and expressions to the corresponding standard expressions according to the replacement table when they are not in the standard disease-disease correlation factor dictionary, to obtain standardized disease correlation factors;
the above computing module may specifically include:
a disease score computing unit, for calculating the score of each disease based on the standardized disease correlation factors, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary.
Embodiments of the present invention provide a method and system for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment. The method may include: obtaining a patient description; performing keyword matching on the patient description using an expanded disease-disease correlation factor dictionary built from word vectors, and extracting the medically relevant words and expressions in the patient description; detecting whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary; based on the detection result, and in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary, calculating the score of each disease; ranking the disease scores; and determining the disease according to the ranking result. Embodiments of the present invention use a distributed word vector representation trained for the medical domain to build an expanded keyword dictionary of diseases and disease correlation factors. Multi-source medical information, including general medical data and colloquial Internet doctor-patient question-and-answer records, can be used to learn and build a disease knowledge graph and to analyze and process non-standardized, colloquial descriptions of patient conditions. The invention thereby solves the technical problem of how to make a prediction from a patient's colloquial description of their condition.
Brief description of the drawings
In order to explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of the method for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the system for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment according to an embodiment of the present invention.
Detailed description of the embodiments
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art will understand that these embodiments serve only to explain the technical principle of the invention and are not intended to limit its scope of protection.
The basic idea of the embodiments of the present invention is to use word vector embedding to generate distributed representations of the diseases and disease correlation factors found in general medical information, colloquial online doctor-patient question-and-answer records, and patient case databases, to automatically build a knowledge graph of diseases and disease correlation factors, and thereby to realize assisted diagnosis from a patient's colloquial description of their condition.
The terms that need explanation are defined as follows:
Disease correlation factor: any factor that may cause a disease, help in judging it, or carry information about it, such as disease symptoms, medical history, age, signs of onset, and sex.
Word vector embedding (also rendered "term vector embedding"): a method that uses a "distributed representation" to represent a word (or phrase) as a continuous, low-dimensional real-valued vector (for example, fewer than 1000 dimensions), so that these vectors can be used to distinguish or represent the words and to handle natural language processing tasks such as text classification and relation extraction.
Co-occurrence frequency: within a paragraph or document, the simultaneous appearance of several words or concepts is called one co-occurrence; counting the number of such co-occurrences across all representative documents gives the co-occurrence frequency.
An embodiment of the present invention provides a method for automatically building a knowledge base based on word vectors to realize assisted diagnosis and treatment. As shown in Fig. 1, the method may include:
S100: Obtain a patient description.
S110: Perform keyword matching on the patient description using an expanded disease-disease correlation factor dictionary built from word vectors, and extract the medically relevant words and expressions in the patient description.
The expanded disease-disease correlation factor dictionary is established through steps S112 to S114.
S112: Train, on medical information, a distributed word vector embedding representation model for diseases and disease correlation factors.
The medical information includes but is not limited to general medical information, doctor-patient question-and-answer records, patient cases, and text data on diseases and disease-related factors. General medical information includes but is not limited to medical literature (for example, medical papers and medical patent documents), textbooks (especially medical textbooks), and medical theses.
Preferably, the doctor-patient question-and-answer records are colloquial online doctor-patient question-and-answer records.
Specifically, step S112 may include:
S1121: Obtain a medical information training corpus.
The medical information training corpus may include but is not limited to question-and-answer bases, medical textbooks, and case bases.
S1122: Clean the medical information training corpus.
The purpose of this step is to remove meaningless characters.
S1123: Count the high-frequency expressions that occur in the question-and-answer base records, increase the weight of these high-frequency expressions in the word segmentation model, and perform Chinese word segmentation to obtain the training text.
S1124: Train on the training text to generate the distributed word vector embedding representation model.
The training corpus that can be used in the training process includes but is not limited to online doctor-patient question-and-answer records, patient medical histories, and textbooks. The embodiment of the present invention uses, but is not limited to, the open-source word2vec tool proposed by Tomas Mikolov et al. (https://github.com/danielfrg/word2vec) to train and generate the word vector embedding representation model for the medical domain, and saves the model in the knowledge base. A neural network or another training algorithm may be used in the training process. For methods of word vector training, see the documents with application numbers 201610179115.0 and 201510096570.X, which are hereby incorporated by reference. Experiments reported in papers in the representation learning field show that a larger training corpus yields better word vectors.
For example, in practical applications, the segmented and cleaned text data can be used to train 300-dimensional word vectors for hundreds of thousands of expressions. Some of the high-frequency words are represented, for example, as:
<Tumour: 0.176907, 0.470268, -0.008468, ... (300 dimensions in total)>
<Blood sugar: 0.149234, 0.278761, -0.474681, ... (300 dimensions in total)>
<Fever: 0.184283, 0.046142, -0.107758, ... (300 dimensions in total)>
<Have a high fever: 0.204092, 0.089622, 0.0057266, ... (300 dimensions in total)>
<Hyperpyrexia: 0.366153, 0.314256, 0.073571, ... (300 dimensions in total)>
In some optional implementations, the step of training on the training text may also include: training low-dimensional real-valued vector representations of the high-frequency words in the training text.
The number of dimensions can be set according to actual conditions, for example to fewer than 1000.
S114: Based on the distributed word vector embedding representation model, expand the standard disease-disease correlation factor dictionary using a distance metric, and establish the expanded disease-disease correlation factor dictionary.
Those skilled in the art will understand that a replacement table can be established in the course of establishing the expanded disease-disease correlation factor dictionary; that is, based on the distributed word vector embedding representation model, the standard disease-disease correlation factor dictionary is expanded using a distance metric and a replacement table is established.
The standard disease-disease correlation factor dictionary is constructed and maintained by medical experts for the specific prediction task. It is a set of diseases and disease factors that medical experts draw up, correct, and maintain with reference to authoritative textbooks and industry standards; in other words, it is the set of standard terms for diseases and disease correlation factors. It needs to be compiled and maintained with regard to the specific diseases to be predicted and related information such as disease symptoms, medical history, age, signs of onset, and sex. For example, heart disease and depression can be two elements of the standard disease dictionary (set), while insomnia and a history of diabetes can be two elements of the standard disease correlation factor dictionary (set).
Distance metric methods include, but are not limited to, cosine distance, Euclidean distance and other distance metrics.
For each element in the standard disease-disease correlation factor dictionary, the distance metric method is used to find the k closest word or phrase expressions in the term vector vocabulary, and these are recorded as replacements for that element of the standard disease-disease correlation factor dictionary. This builds the replacement vocabulary that maps heterogeneous expressions to standard expressions and, at the same time, builds the expanded disease-disease correlation factor dictionary of the knowledge base. That is, each replaceable element is added to the original standard disease-disease correlation factor dictionary to form the expanded disease-disease correlation factor dictionary. Here, k is a parameter that can be tuned for the specific task and data.
The process of obtaining the expanded disease-disease correlation factor dictionary and the replacement vocabulary is described in detail below through a preferred embodiment, taking the standard symptom correlation factor "fever" as an example. It specifically includes steps A1 to A3.
Step A1: Use cosine distance to find the word or phrase expressions closest to "fever", obtaining "have a high fever" and "hyperpyrexia". Here, the parameter k is 2.
Step A2: Add "have a high fever" and "hyperpyrexia" to the expanded disease-disease correlation factor dictionary, and record "have a high fever" and "hyperpyrexia" in the replacement vocabulary of the standard disease correlation factor "fever".
Step A3: Using the trained medical-domain term-vector embedded distributed representation model, perform the same operation on every element of the standard disease-disease correlation factor dictionary, thereby obtaining the expanded disease-disease correlation factor dictionary and the replacement vocabulary.
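Steps A1 to A3 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names `cosine_distance` and `expand_dictionary`, the two-dimensional toy vectors, and the choice to keep a list of standard terms for each replacement are all hypothetical; real term vectors would have around 300 dimensions.

```python
import math


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine similarity of u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def expand_dictionary(standard_terms, vectors, k):
    """Return (expanded dictionary, replacement vocabulary).

    For every standard term, the k nearest non-standard expressions are
    added to the expanded dictionary and recorded as replacements.
    """
    standard = set(standard_terms)
    expanded = set(standard)
    replacement = {}  # heterogeneous expression -> list of standard terms
    for term in standard_terms:
        candidates = [w for w in vectors if w not in standard]
        candidates.sort(key=lambda w: cosine_distance(vectors[term], vectors[w]))
        for w in candidates[:k]:
            expanded.add(w)
            replacement.setdefault(w, []).append(term)
    return expanded, replacement


# Toy 2-dimensional vectors (invented for illustration only).
vectors = {
    "fever": [1.0, 0.1],
    "have a high fever": [0.9, 0.2],
    "hyperpyrexia": [0.8, 0.3],
    "tumour": [0.1, 1.0],
    "blood sugar": [-0.2, 0.9],
}
expanded, replacement = expand_dictionary(["fever"], vectors, k=2)
```

With k = 2, the two expressions nearest to "fever" in these toy vectors are "have a high fever" and "hyperpyrexia", mirroring the example in steps A1 and A2.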
S120: Detect whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary.
In this step, if the extracted words and expressions are detected in the standard disease-disease correlation factor dictionary, no processing is performed; if they are not, the extracted words and expressions are normalized to the corresponding standard expressions according to the replacement vocabulary, yielding standardized disease correlation factors. The replacement vocabulary is built by expanding the standard disease-disease correlation factor dictionary using a distance metric method, based on the term-vector embedded distributed representation model.
"No processing" above means that the standardized disease correlation factors already in the standard disease-disease correlation factor dictionary are used directly in subsequent processing.
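The detection and normalization of step S120 can be sketched as a dictionary lookup. The function name `normalize_expressions`, the single-valued replacement vocabulary, and the decision to drop expressions found in neither structure are assumptions made for this illustration.

```python
def normalize_expressions(extracted, standard_dict, replacement_vocab):
    """Map each extracted expression to a standard expression (step S120)."""
    standardized = []
    for expr in extracted:
        if expr in standard_dict:
            # Already a standard expression: use it unchanged.
            standardized.append(expr)
        elif expr in replacement_vocab:
            # Heterogeneous expression: normalize via the replacement vocabulary.
            standardized.append(replacement_vocab[expr])
        # Expressions found in neither structure are dropped in this sketch.
    return standardized


standard_dict = {"fever", "smoking habit"}
replacement_vocab = {"have a high fever": "fever", "hyperpyrexia": "fever"}
factors = normalize_expressions(
    ["hyperpyrexia", "smoking habit"], standard_dict, replacement_vocab
)
```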
S130: Based on the detection result, and in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary, calculate the disease scores.
In this embodiment, when the extracted words and expressions are not detected in the standard disease-disease correlation factor dictionary, they are normalized to the corresponding standard expressions according to the replacement vocabulary, yielding standardized disease correlation factors; based on these standardized disease correlation factors, and in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary, the disease scores are calculated. When the extracted words and expressions are detected in the standard disease-disease correlation factor dictionary, the standardized disease correlation factors in that dictionary are used, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary, to calculate the disease scores.
Here, the correlation score of a disease correlation factor with respect to a disease is determined through steps S132 to S134.
S132: Based on the term-vector embedded distributed representation model, expand the standard disease-disease correlation factor dictionary using a distance metric method, and build the replacement vocabulary.
S134: Using the expanded disease-disease correlation factor dictionary and the replacement vocabulary, match the diseases and disease correlation factors in the medical information, and calculate the correlation score of each disease correlation factor with respect to each disease.
Specifically, step S134 can include:
S1341: Using the expanded disease-disease correlation factor dictionary, perform keyword matching on the doctor-patient question-and-answer records, and extract the medically relevant words and expressions in those records.
In a preferred embodiment, this step can use the expanded disease-disease correlation factor dictionary to perform keyword matching on the condition descriptions and diagnostic results in the doctor-patient question-and-answer database, extracting the medically relevant words and expressions in the doctor-patient question-and-answer records.
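The keyword matching of step S1341 can be sketched as a longest-first substring search. The source does not specify the matching algorithm, so the function `extract_keywords` and its longest-first strategy are assumptions; for Chinese text, the segmented tokens would normally be matched instead of raw substrings.

```python
def extract_keywords(text, dictionary):
    """Return the dictionary entries that occur in the text.

    Longer entries are matched first and removed from the text, so that
    a short entry is not reported when it only occurs inside a longer one.
    """
    found = []
    remaining = text
    for term in sorted(dictionary, key=len, reverse=True):
        if term in remaining:
            found.append(term)
            remaining = remaining.replace(term, " ")
    return found


expanded_dict = {"fever", "have a high fever", "hyperpyrexia", "smoking habit"}
description = "hyperpyrexia for several days, smoking habit, what disease is it"
keywords = extract_keywords(description, expanded_dict)
```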
S1342: Detect whether the medically relevant words and expressions extracted from the doctor-patient question-and-answer records are in the standard disease-disease correlation factor dictionary. If so, perform step S1343; otherwise, perform step S1344.
This step checks the extracted relevant words and expressions one by one against the standard disease-disease correlation factor dictionary; those that are present need no special processing, and those that are not are normalized to the corresponding standard expressions according to the replacement vocabulary.
S1343: No processing is performed.
This step means that the standard expressions in the standard disease-disease correlation factor dictionary are used for subsequent processing.
S1344: According to the replacement vocabulary, normalize the medically relevant words and expressions extracted from the doctor-patient question-and-answer records to the corresponding standard expressions.
Step S1344 can further include: when a word or expression corresponds to multiple standard diseases or disease correlation factors, standardizing the medically relevant word or expression.
Specifically, when an expression corresponds to multiple standard diseases or disease correlation factors, the standard correlation factor closest to the expression is determined and used to replace it, yielding the standardized disease correlation factors corresponding to the patient description.
For example, when a word or expression corresponds to more than one standard disease or disease correlation factor, cosine distance, Euclidean distance or another distance metric is used to find the closest standard concept, which replaces the current expression; that is, the medically relevant word or expression is standardized. After this operation, the input for the patient comprises Q standardized disease correlation factors: {F_1, F_2, ..., F_j, ..., F_Q}.
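The disambiguation described above, choosing the closest standard factor when an expression has several candidates, can be sketched as follows. The function `nearest_standard`, the toy vectors and the second candidate "heatstroke" are hypothetical and serve only to illustrate the selection by cosine distance.

```python
import math


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine similarity of u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def nearest_standard(expression, candidates, vectors):
    """Pick the candidate standard term closest to the expression."""
    return min(
        candidates,
        key=lambda s: cosine_distance(vectors[expression], vectors[s]),
    )


# Toy 2-dimensional vectors (invented for illustration only).
vectors = {
    "hyperpyrexia": [0.8, 0.3],
    "fever": [1.0, 0.1],
    "heatstroke": [0.3, 0.9],
}
chosen = nearest_standard("hyperpyrexia", ["fever", "heatstroke"], vectors)
```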
S1345: Based on the standard expressions, count the co-occurrence frequency of diseases and their correlation factors to obtain the co-occurrence frequency matrix of disease correlation factors and diseases.
The standard disease-disease correlation factor dictionary contains two kinds of elements: diseases and disease correlation factors. For example, the m diseases are denoted {D_1, D_2, ..., D_i, ..., D_m} and the n disease correlation factors are denoted {F_1, F_2, ..., F_j, ..., F_n}; every N_ij is initialized to zero. For the P records in the question-and-answer database, {R_1, R_2, ..., R_s, ..., R_P}, if D_i and F_j both appear in R_s, N_ij is increased by 1; that is, one co-occurrence of that disease and that disease correlation factor is recorded. Counting over all P records yields the m × n co-occurrence frequency matrix of disease correlation factors and diseases.
Here, P denotes the number of records in the question-and-answer database; R_1, R_2, ..., R_s, ..., R_P denote the records; N_ij denotes the recorded frequency.
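The counting procedure of step S1345 can be sketched as below. Representing each record as a set of already normalized terms and the function name `cooccurrence_matrix` are assumptions made for the example; the second record is invented to show a count above 1.

```python
def cooccurrence_matrix(records, diseases, factors):
    """Build the m x n matrix N, where N[i][j] counts the records in which
    disease D_i and factor F_j appear together."""
    N = [[0] * len(factors) for _ in diseases]
    for record in records:
        present = set(record)
        for i, disease in enumerate(diseases):
            if disease not in present:
                continue
            for j, factor in enumerate(factors):
                if factor in present:
                    N[i][j] += 1
    return N


diseases = ["cold"]
factors = ["sore and swollen throat", "fever", "nasal congestion and runny nose"]
records = [
    {"sore and swollen throat", "fever", "nasal congestion and runny nose", "cold"},
    {"fever", "cold"},  # hypothetical second record
]
N = cooccurrence_matrix(records, diseases, factors)
```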
S1346: Based on the co-occurrence frequency matrix of disease correlation factors and diseases, obtain the correlation score of each disease correlation factor with respect to each disease using a nonlinear transformation method.
In a specific implementation of this step, the following is taken into account: given that disease correlation factor F_j appears in a record, the conditional probability of having disease D_i is P(D_i | F_j) = N_ij / Σ_i N_ij. Although this conditional probability reflects, to some extent, how strongly a disease correlation factor points to a disease, it is easily affected by the cumulative effect of high-frequency common diseases: common diseases that occur many times in the records obtain high conditional probabilities. The final scoring function should therefore also include a control term related to N_i = Σ_j N_ij. This is similar to the inverse document frequency idea used in document classification.
Preferably, the correlation score of a disease correlation factor with respect to a disease can be determined by the following formula:
where Score(i, j) denotes the correlation score of disease correlation factor F_j with respect to disease D_i; P(D_i | F_j) denotes the conditional probability of having the disease; D_i denotes the disease; F_j denotes the disease correlation factor; N_i denotes the disease frequency, N_i = Σ_j N_ij; and N_ij denotes the recorded frequency.
The above formula combines the conditional probability with a nonlinear transformation of the reciprocal of the disease frequency. In the end, each disease correlation factor corresponds to at least one related disease, and the corresponding score is given by Score(i, j).
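The idea of combining the conditional probability with a damping term on the disease frequency can be sketched as follows. The exact scoring formula is not reproduced in this text, so the damping factor 1 / log(1 + N_i) used here is purely an assumption chosen to resemble inverse document frequency; the function names and the toy matrix are likewise hypothetical.

```python
import math


def conditional_probability(N, i, j):
    """P(D_i | F_j) = N_ij / sum over i of N_ij."""
    column_total = sum(row[j] for row in N)
    return N[i][j] / column_total if column_total else 0.0


def score(N, i, j):
    """Conditional probability damped by the disease frequency N_i.

    ASSUMPTION: the damping 1 / log(1 + N_i) is illustrative only and is
    not taken from the source.
    """
    n_i = sum(N[i])
    if n_i == 0:
        return 0.0
    return conditional_probability(N, i, j) / math.log(1 + n_i)


# Toy counts: row 0 is a common disease, row 1 a rare one.
N = [
    [8, 2],
    [2, 0],
]
common = score(N, 0, 0)
rare = score(N, 1, 0)
```

In this toy matrix the common disease has four times the conditional probability of the rare one for factor 0, but its score is less than twice as large, which is the kind of correction for high-frequency diseases that the paragraph above describes.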
The above steps use the expanded disease-disease correlation factor dictionary to match the diseases and disease correlation factors in the medical information, calculate the correlation scores of disease correlation factors with respect to diseases and store them in the knowledge graph, so that a knowledge graph for disease prediction can be learned and built automatically.
In a preferred embodiment, the following can also be included after step S1346: periodically testing the scoring function by the A/B testing method, and updating the correlation scores of disease correlation factors with respect to diseases.
This step takes into account that both the quality and the quantity of the data in the original question-and-answer database affect the correlation scores of disease correlation factors, and that online medical consultation platforms produce a large number of new records every day. The scoring functions for the disease correlation factors are therefore stored in an offline knowledge base, and the scoring function version that performs better in periodic online A/B tests is selected and put online.
The training data of each version forms its own data version of the correlation scores of disease correlation factors with respect to diseases. Because this correlation score is not exactly equivalent to the prior probability relating a disease factor to a disease, the medical experts' evaluation of the scores serves only as a reference; whether a version improves the accuracy and user-friendliness of disease determination is the final evaluation index and the basis for deciding whether to switch to another version.
During knowledge base construction, in combination with the existing knowledge base, the condition and basic information entered by a patient can be analyzed and the diseases the patient may have can be provided.
The process of obtaining the correlation scores of disease correlation factors with respect to diseases is described in detail below with a preferred embodiment. Here, "sore and swollen throat", "cold" and "nasal congestion and runny nose" are in the standard dictionary. The process of obtaining the correlation scores can include steps B1 to B5.
Step B1: Obtain from the original question-and-answer database a question-answer pair consisting of "I have had a sore and swollen throat for the past few days, I keep having a high fever, and I have nasal congestion and a runny nose. Doctor, what disease do I have?" and "You may have a cold".
Step B2: Process the question-answer pair, matching "sore and swollen throat", "have a high fever", "nasal congestion and runny nose" and "cold".
Step B3: According to step S121 and step S122, use the replacement vocabulary to replace "have a high fever" with "fever".
Step B4: Match the 3 disease correlation factors against the 1 disease one by one, count the frequency with which the disease and the correlation factors co-occur, and obtain the co-occurrence frequency matrix of disease correlation factors and diseases.
Step B5: Determine the correlation score of each disease correlation factor with respect to the disease according to the following formula:
where Score(i, j) denotes the correlation score of disease correlation factor F_j with respect to disease D_i; P(D_i | F_j) denotes the conditional probability of having the disease; D_i denotes the disease; F_j denotes the disease correlation factor; N_i denotes the disease frequency, N_i = Σ_j N_ij; and N_ij denotes the recorded frequency.
In a preferred embodiment, the disease score can be obtained by the following formula:
where DS(D_i) denotes the disease score; D_i denotes the disease; W(F_j) denotes the disease category mapping weight; and Score(i, j) denotes the correlation score of the disease correlation factor with respect to the disease.
For example, for each factor F_j in {F_1, F_2, ..., F_j, ..., F_Q} related to the patient description, the scores are accumulated on each associated disease D_i according to the category of the standardized disease correlation factor, using the following formula:
where DS(D_i), D_i, W(F_j) and Score(i, j) are as defined above.
In the above formula, because different factors carry different confidence for disease prediction, different disease category mapping weights are assigned according to the category of the factor. The mapping can be formulated by experts according to category attributes. For example, "smoking habit" is a disease correlation factor of the lifestyle category, and "fever" is a disease correlation factor of the disease symptom category; when the calculation is performed, the weight of each category is determined and the corresponding disease category mapping weights are used.
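The accumulation of weighted scores described above can be sketched as follows. The disease score formula itself is not reproduced in this text; the sketch assumes a weighted sum, DS(D_i) = Σ_j W(F_j) · Score(i, j), over the patient's Q standardized factors, which is consistent with the symbol definitions but remains an assumption. All weights, scores and names below are invented for illustration.

```python
def disease_scores(patient_factors, factor_category, category_weight, score_table):
    """Accumulate W(F_j) * Score(i, j) on every disease linked to a factor.

    ASSUMPTION: a plain weighted sum; the source formula is not available.
    """
    totals = {}
    for factor in patient_factors:
        weight = category_weight[factor_category[factor]]
        for disease, s in score_table.get(factor, {}).items():
            totals[disease] = totals.get(disease, 0.0) + weight * s
    return totals


# Hypothetical categories, weights and correlation scores.
factor_category = {"fever": "symptom", "smoking habit": "lifestyle"}
category_weight = {"symptom": 1.0, "lifestyle": 0.5}
score_table = {
    "fever": {"acute pharyngitis": 0.2, "cold": 0.1},
    "smoking habit": {"acute pharyngitis": 0.1},
}
totals = disease_scores(
    ["fever", "smoking habit"], factor_category, category_weight, score_table
)
```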
S140: Rank the disease scores.
S150: Determine the disease according to the ranking result.
The scoring and ranking of suspected diseases obtained with the embodiment of the present invention is described in detail below with a preferred embodiment. Here, the patient's condition is described as "Hyperpyrexia has occurred continuously over the past few days, and I have a smoking habit; what disease is it?". The process of scoring and ranking can include steps C1 to C7.
Step C1: Using the expanded disease-disease correlation factor dictionary, perform keyword matching on "Hyperpyrexia has occurred continuously over the past few days, and I have a smoking habit; what disease is it?" and extract "hyperpyrexia" and "smoking habit".
Step C2: Detect that "hyperpyrexia" is in the expanded disease-disease correlation factor dictionary but not in the standard disease-disease correlation factor dictionary.
Step C3: According to the replacement vocabulary, replace "hyperpyrexia" with "fever".
Step C4: Determine the mapping weights according to the categories to which "fever" and "smoking habit" belong.
Step C5: Determine the patient's score for each disease according to the following formula:
Step C6: Rank the scores of the diseases.
Step C7: Output the top three diseases: <Acute pharyngitis: 0.143531>, <Acute tonsil enlargement: 0.129281>, <Tracheal disease: 0.062088>.
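Steps C6 and C7 amount to sorting the disease scores in descending order and keeping the first three. A minimal sketch, in which the function name `top_diseases` and the fourth entry ("cold") are hypothetical, while the other three scores are the values listed in step C7:

```python
def top_diseases(scores, n=3):
    """Return the n highest-scoring (disease, score) pairs, best first."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]


scores = {
    "tracheal disease": 0.062088,
    "acute pharyngitis": 0.143531,
    "cold": 0.010000,  # hypothetical lower-ranked disease
    "acute tonsil enlargement": 0.129281,
}
ranking = top_diseases(scores)
```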
Although the steps in the above embodiment are described in the above order, those skilled in the art will understand that, to achieve the effect of this embodiment, the different steps need not be performed in that order; they can be performed simultaneously (in parallel) or in reverse order, and these simple variations all fall within the protection scope of the present invention.
Based on the same technical concept as the above method embodiment, an embodiment of the present invention provides a system for automatically building a knowledge base based on term vectors to realize assisted diagnosis and treatment. The system can perform the above embodiment of the method for automatically building a knowledge base based on term vectors to realize assisted diagnosis and treatment. As shown in Fig. 2, the system 20 can include: an acquisition module 21, an extraction module 22, a detection module 23, a calculation module 24, a sorting module 25 and a determination module 26. The acquisition module 21 is used to obtain the patient description. The extraction module 22 is used to perform keyword matching on the patient description using the expanded disease-disease correlation factor dictionary built based on term vectors, and to extract the medically relevant words and expressions in the patient description. The detection module 23 is used to detect whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary. The calculation module 24 is used to calculate the disease scores based on the detection result, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary. The sorting module 25 is used to rank the disease scores. The determination module 26 is used to determine the disease according to the ranking result.
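How modules 21 to 26 hand data to one another can be sketched as a small class. This is only a hypothetical wiring under simplifying assumptions: the class and method names are invented, extraction is plain substring matching, each factor carries its own weight, and the disease score is an assumed weighted sum.

```python
class AssistedDiagnosisSystem:
    """Hypothetical wiring of modules 21-26; not the patented implementation."""

    def __init__(self, standard_dict, replacement_vocab, score_table, factor_weight):
        self.standard_dict = set(standard_dict)
        self.replacement_vocab = dict(replacement_vocab)
        # Expanded dictionary = standard terms plus their replacements.
        self.expanded_dict = self.standard_dict | set(self.replacement_vocab)
        self.score_table = score_table
        self.factor_weight = factor_weight

    def extract(self, description):
        """Extraction module 22: keyword matching on the description."""
        return [t for t in sorted(self.expanded_dict) if t in description]

    def detect_and_normalize(self, terms):
        """Detection module 23: keep standard terms, normalize the others."""
        return [
            t if t in self.standard_dict else self.replacement_vocab[t]
            for t in terms
        ]

    def calculate(self, factors):
        """Calculation module 24: assumed weighted sum of correlation scores."""
        totals = {}
        for f in factors:
            weight = self.factor_weight.get(f, 1.0)
            for disease, s in self.score_table.get(f, {}).items():
                totals[disease] = totals.get(disease, 0.0) + weight * s
        return totals

    def diagnose(self, description, top_n=3):
        """Acquisition (21) is the call itself; 25 ranks and 26 selects."""
        factors = self.detect_and_normalize(self.extract(description))
        scores = self.calculate(factors)
        ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
        return ranked[:top_n]


system = AssistedDiagnosisSystem(
    standard_dict={"fever", "smoking habit"},
    replacement_vocab={"hyperpyrexia": "fever"},
    score_table={
        "fever": {"acute pharyngitis": 0.2, "cold": 0.1},
        "smoking habit": {"acute pharyngitis": 0.1},
    },
    factor_weight={"fever": 1.0, "smoking habit": 0.5},
)
result = system.diagnose("hyperpyrexia for several days, smoking habit")
```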
In a preferred embodiment, the extraction module can further include: a term vector model building unit and an expanded dictionary building unit. The term vector model building unit is used to train, using medical information, the term-vector embedded distributed representation model for diseases and disease correlation factors. The expanded dictionary building unit is used to expand the standard disease-disease correlation factor dictionary using a distance metric method, based on the term-vector embedded distributed representation model, and to build the expanded disease-disease correlation factor dictionary.
In a preferred embodiment, the term vector model building unit can specifically include: an acquisition unit, a cleaning unit, a first statistics unit and a generation unit. The acquisition unit is used to obtain the medical information training corpus. The cleaning unit is used to clean the medical information training corpus. The first statistics unit is used to count the high-frequency expressions occurring in the question-and-answer database records, increase the weight of the high-frequency expressions in the word segmentation model, and perform Chinese word segmentation to obtain the training text. The generation unit is used to train on the training text and generate the term-vector embedded distributed representation model.
In a preferred embodiment, the calculation module can further include: a first replacement vocabulary building unit and a correlation score calculation unit. The first replacement vocabulary building unit is used to expand the standard disease-disease correlation factor dictionary using a distance metric method, based on the term-vector embedded distributed representation model, and to build the replacement vocabulary. The correlation score calculation unit is used to match the diseases and disease correlation factors in the medical information using the expanded disease-disease correlation factor dictionary and the replacement vocabulary, and to calculate the correlation scores of disease correlation factors with respect to diseases.
In a preferred embodiment, the correlation score calculation unit can specifically include: an extraction unit, a detection unit, a first normalization unit, a second statistics unit and a nonlinear transformation unit. The extraction unit is used to perform keyword matching on the doctor-patient question-and-answer records using the expanded disease-disease correlation factor dictionary, and to extract the medically relevant words and expressions in those records. The detection unit is used to detect whether the medically relevant words and expressions extracted from the doctor-patient question-and-answer records are in the standard disease-disease correlation factor dictionary. The first normalization unit is used, when the words and expressions are not in the standard disease-disease correlation factor dictionary, to normalize the medically relevant words and expressions extracted from the doctor-patient question-and-answer records to the corresponding standard expressions according to the replacement vocabulary. The second statistics unit is used to count, based on the standard expressions, the co-occurrence frequency of diseases and their correlation factors, obtaining the co-occurrence frequency matrix of disease correlation factors and diseases. The nonlinear transformation unit is used to obtain the correlation scores of disease correlation factors with respect to diseases from the co-occurrence frequency matrix using a nonlinear transformation method.
In a preferred embodiment, the system can further include a second replacement vocabulary building unit, which is used to expand the standard disease-disease correlation factor dictionary using a distance metric method, based on the term-vector embedded distributed representation model, and to build the replacement vocabulary. The detection module can further include a second normalization unit, which is used, when the extracted words and expressions are not in the standard disease-disease correlation factor dictionary, to normalize them to the corresponding standard expressions according to the replacement vocabulary, obtaining standardized disease correlation factors. The calculation module can further include a disease score calculation unit, which is used to calculate the disease scores based on the standardized disease correlation factors, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary.
For the specific working process of the system described above and the related explanations, reference may be made to the corresponding process in the foregoing method embodiment, which is not repeated here.
Those skilled in the art will understand that the above system for automatically building a knowledge base based on term vectors to realize assisted diagnosis and treatment can also include other known structures, such as a processor, a controller, a memory and a bus. The memory includes, but is not limited to, random access memory, flash memory, read-only memory, programmable read-only memory, volatile memory, non-volatile memory, serial memory, parallel memory, registers and the like; the processor includes, but is not limited to, a single-core processor, a multi-core processor, an x86-architecture processor, a CPLD/FPGA, a DSP, an ARM processor, a MIPS processor and the like; the bus can include a data bus, an address bus and a control bus. So as not to unnecessarily obscure the embodiments of the present disclosure, these known structures are not shown in Fig. 2. It should also be noted that the number of each module in Fig. 2 is only schematic; each module can be present in any number according to actual needs.
It should be noted that the above division into modules is only an example; in practical applications, other divisions are possible. In addition, each module can be further decomposed into other modules, which is not described in detail here. Each module can be implemented in hardware, in software, or in a combination of software and hardware. In practical applications, the above modules can be implemented by, for example, a central processing unit, a microprocessor, a digital signal processor or a field-programmable gate array. Exemplary hardware platforms for implementing the modules may include, for example, Intel x86-based platforms with a compatible operating system, Mac platforms, Mac OS, iOS, Android OS and the like.
It should be noted that expressions such as "first" and "second" used herein should not be construed as limiting the scope of the present invention in any way.
The specific embodiments and experimental examples described above describe the technical solution, implementation details and algorithm validity in detail. It should be noted that the foregoing is only specific embodiments of the present invention and is not intended to limit it; any modification, equivalent substitution, improvement and the like made within the spirit and principle of the present invention shall be included within the protection scope of the present invention.
Claims (14)
1. A method for automatically building a knowledge base based on term vectors to realize assisted diagnosis and treatment, characterized in that the method includes:
obtaining a patient description;
performing keyword matching on the patient description using the expanded disease-disease correlation factor dictionary built based on the term vectors, and extracting the medically relevant words and expressions in the patient description;
detecting whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary;
calculating the disease scores based on the detection result, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary;
ranking the disease scores;
determining the disease according to the ranking result.
2. The method according to claim 1, characterized in that the expanded disease-disease correlation factor dictionary is built in the following manner:
training, using medical information, a term-vector embedded distributed representation model for diseases and disease correlation factors;
expanding the standard disease-disease correlation factor dictionary using a distance metric method, based on the term-vector embedded distributed representation model, and building the expanded disease-disease correlation factor dictionary.
3. The method according to claim 2, characterized in that training, using medical information, the term-vector embedded distributed representation model for diseases and disease correlation factors specifically includes:
obtaining a medical information training corpus;
cleaning the medical information training corpus;
counting the high-frequency expressions occurring in the question-and-answer database records, increasing the weight of the high-frequency expressions in the word segmentation model, and performing Chinese word segmentation to obtain the training text;
training on the training text to generate the term-vector embedded distributed representation model.
4. The method according to claim 2, characterized in that the correlation score of the disease correlation factor with respect to the disease is determined in the following manner:
expanding the standard disease-disease correlation factor dictionary using a distance metric method, based on the term-vector embedded distributed representation model, and building a replacement vocabulary;
matching the diseases and disease correlation factors in the medical information using the expanded disease-disease correlation factor dictionary and the replacement vocabulary, and calculating the correlation score of the disease correlation factor with respect to the disease.
5. The method according to claim 4, characterized in that matching the diseases and disease correlation factors in the medical information using the expanded disease-disease correlation factor dictionary and the replacement vocabulary, and calculating the correlation score of the disease correlation factor with respect to the disease, specifically includes:
performing keyword matching on the doctor-patient question-and-answer records using the expanded disease-disease correlation factor dictionary, and extracting the medically relevant words and expressions in the doctor-patient question-and-answer records;
detecting whether the medically relevant words and expressions extracted from the doctor-patient question-and-answer records are in the standard disease-disease correlation factor dictionary;
if not, normalizing the medically relevant words and expressions extracted from the doctor-patient question-and-answer records to the corresponding standard expressions according to the replacement vocabulary;
counting, based on the standard expressions, the co-occurrence frequency of diseases and their correlation factors to obtain the co-occurrence frequency matrix of disease correlation factors and diseases;
obtaining the correlation score of the disease correlation factor with respect to the disease from the co-occurrence frequency matrix of disease correlation factors and diseases, using a nonlinear transformation method.
6. The method according to claim 2, characterized in that the method includes:
expanding the standard disease-disease correlation factor dictionary using a distance metric method, based on the term-vector embedded distributed representation model, and building a replacement vocabulary;
detecting whether the extracted words and expressions are in the standard disease-disease correlation factor dictionary specifically includes:
if they are not detected, normalizing the extracted words and expressions to the corresponding standard expressions according to the replacement vocabulary, obtaining standardized disease correlation factors;
calculating the disease scores based on the detection result, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary, specifically includes:
calculating the disease scores based on the standardized disease correlation factors, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease-disease correlation factor dictionary.
7. The method according to claim 5, characterized in that the correlation score of the disease correlation factor with respect to the disease is determined by the following formula:
where Score(i, j) denotes the correlation score of the disease correlation factor with respect to the disease; P(D_i | F_j) denotes the conditional probability of the disease given the disease correlation factor; D_i denotes the disease; F_j denotes the disease correlation factor; N_i denotes the disease frequency, N_i = Σ_j N_ij; and N_ij denotes the recorded frequency.
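The quantities named in claim 7 can be computed from the co-occurrence counts as sketched below. The scoring formula itself is not reproduced in this text, so the `placeholder_transform` function is purely an illustrative stand-in for the non-linear transformation, and estimating P(D_i | F_j) as N_ij divided by the column total is likewise an assumption.

```python
import math


def placeholder_transform(p, n_ij, n_i):
    """Illustrative non-linear transform only; NOT the patent's formula."""
    return p * math.log1p(n_ij)


def correlation_scores(counts, transform=placeholder_transform):
    """counts maps (disease, factor) -> N_ij; returns the same keys -> score."""
    n_i = {}  # N_i = sum over j of N_ij (disease frequency)
    n_j = {}  # sum over i of N_ij, used to estimate P(D_i | F_j)
    for (disease, factor), n in counts.items():
        n_i[disease] = n_i.get(disease, 0) + n
        n_j[factor] = n_j.get(factor, 0) + n
    scores = {}
    for (disease, factor), n in counts.items():
        p = n / n_j[factor]  # assumed estimate of P(D_i | F_j)
        scores[(disease, factor)] = transform(p, n, n_i[disease])
    return scores


# Hypothetical counts for illustration.
counts = {
    ("influenza", "fever"): 2,
    ("influenza", "cough"): 1,
    ("bronchitis", "cough"): 1,
}
scores = correlation_scores(counts)
```

Passing a different `transform` swaps in another scoring rule without changing how N_i and P(D_i | F_j) are derived.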
8. The method according to claim 6, characterized in that the score of the disease is obtained by the following formula:
where DS(D_i) denotes the score of the disease; D_i denotes the disease; W(F_j) denotes the disease category mapping weight; and Score(i, j) denotes the correlation score of the disease correlation factor with respect to the disease.
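One plausible reading of the symbols in claim 8 is a weighted sum of correlation scores over the factors found in the patient description, followed by ranking. The formula is not reproduced in this text, so the weighted-sum form, the default weight of 1.0, and all names and numbers below are assumptions for illustration.

```python
def disease_scores(patient_factors, scores, weights):
    """Assumed form: DS(D_i) = sum over matched F_j of W(F_j) * Score(i, j)."""
    totals = {}
    for (disease, factor), score in scores.items():
        if factor in patient_factors:
            weight = weights.get(factor, 1.0)  # default weight is an assumption
            totals[disease] = totals.get(disease, 0.0) + weight * score
    return totals


def rank_diseases(totals):
    """Sort diseases from highest to lowest score."""
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical correlation scores and weights.
scores = {
    ("influenza", "fever"): 1.0,
    ("influenza", "cough"): 0.5,
    ("bronchitis", "cough"): 0.5,
}
weights = {"fever": 2.0, "cough": 1.0}
ranking = rank_diseases(disease_scores({"fever", "cough"}, scores, weights))
```

The first entry of `ranking` is the candidate disease with the highest score, which corresponds to the sorting and determining steps of the method.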
9. A system for automatically constructing a knowledge base based on word vectors to realize auxiliary diagnosis and treatment, characterized in that the system comprises:
an acquisition module, configured to obtain a patient description;
an extraction module, configured to perform keyword matching on the patient description using the expanded disease and disease-correlation-factor dictionary built from the word vectors, and to extract the medically relevant words and expressions from the patient description;
a detection module, configured to detect whether the extracted words and expressions are in the standard disease and disease-correlation-factor dictionary;
a computing module, configured to calculate the score of each disease based on the detection result, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease and disease-correlation-factor dictionary;
a sorting module, configured to sort the disease scores;
a determining module, configured to determine the disease according to the sorting result.
10. The system according to claim 9, characterized in that the extraction module specifically comprises:
a word-vector model building unit, configured to train, using medical information, a word-vector embedded distributed representation model of diseases and disease correlation factors;
an expanded-dictionary building unit, configured to expand the standard disease and disease-correlation-factor dictionary using a distance metric, based on the word-vector embedded distributed representation model, and to build the expanded disease and disease-correlation-factor dictionary.
11. The system according to claim 10, characterized in that the word-vector model building unit specifically comprises:
an acquiring unit, configured to obtain a medical information training corpus;
a cleaning unit, configured to clean the medical information training corpus;
a first statistics unit, configured to count the high-frequency expressions that occur in the question-and-answer records, increase the weight of those expressions in the word segmentation model, and perform Chinese word segmentation to obtain a training text;
a generation unit, configured to train on the training text and generate the word-vector embedded distributed representation model.
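The cleaning and high-frequency-expression steps can be sketched with the standard library alone. The claims do not specify how expressions are detected or how their weight is raised, so counting character n-grams, the `min_count` and `boost` parameters, and the "word frequency" output lines (a format of the kind some segmenters accept as a user dictionary) are assumptions for illustration.

```python
import re
from collections import Counter


def clean(text):
    """Replace punctuation and other non-word characters with spaces."""
    return re.sub(r"[^\w]+", " ", text).strip()


def high_frequency_expressions(records, n=2, min_count=2):
    """Count character n-grams in each cleaned chunk and keep the frequent ones."""
    counts = Counter()
    for record in records:
        for chunk in clean(record).split():
            for i in range(len(chunk) - n + 1):
                counts[chunk[i:i + n]] += 1
    return {gram: c for gram, c in counts.items() if c >= min_count}


def to_user_dict_lines(expressions, boost=10):
    """Emit 'word frequency' lines with a boosted frequency for each expression."""
    return [f"{word} {count * boost}" for word, count in sorted(expressions.items())]


# Hypothetical question-and-answer snippets.
records = ["头痛，发烧！", "发烧 咳嗽", "头痛"]
expressions = high_frequency_expressions(records)
lines = to_user_dict_lines(expressions)
```

The resulting lines would then be supplied to the segmentation model before the corpus is segmented and passed to word-vector training.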
12. The system according to claim 10, characterized in that the computing module specifically comprises:
a first replacement-vocabulary building unit, configured to expand the standard disease and disease-correlation-factor dictionary using a distance metric, based on the word-vector embedded distributed representation model, and to build a replacement vocabulary;
a correlation score computing unit, configured to match the diseases and disease correlation factors in the medical information using the expanded disease and disease-correlation-factor dictionary and the replacement vocabulary, and to calculate the correlation score of the disease correlation factor with respect to the disease.
13. The system according to claim 12, characterized in that the correlation score computing unit specifically comprises:
an extraction unit, configured to perform keyword matching on doctor-patient question-and-answer records using the expanded disease and disease-correlation-factor dictionary, and to extract the medically relevant words and expressions from those records;
a detection unit, configured to detect whether the extracted medically relevant words and expressions are in the standard disease and disease-correlation-factor dictionary;
a first normalization unit, configured to normalize the extracted medically relevant words and expressions to the corresponding standard expressions according to the replacement vocabulary when the words and expressions are not in the standard disease and disease-correlation-factor dictionary;
a second statistics unit, configured to count, based on the standard expressions, the frequency with which each disease co-occurs with its correlation factors, to obtain a co-occurrence frequency record matrix of disease correlation factors and diseases;
a non-linear transformation unit, configured to apply a non-linear transformation to the co-occurrence frequency record matrix of disease correlation factors and diseases, to obtain the correlation score of the disease correlation factor with respect to the disease.
14. The system according to claim 10, characterized in that the system comprises:
a second replacement-vocabulary building unit, configured to expand the standard disease and disease-correlation-factor dictionary using a distance metric, based on the word-vector embedded distributed representation model, and to build a replacement vocabulary;
the detection module specifically comprises:
a second normalization unit, configured to normalize the extracted words and expressions to the corresponding standard expressions according to the replacement vocabulary when the extracted words and expressions are not in the standard disease and disease-correlation-factor dictionary, to obtain standardized disease correlation factors;
the computing module specifically comprises:
a disease score computing unit, configured to calculate the score of the disease based on the standardized disease correlation factors, in combination with the correlation scores of disease correlation factors with respect to diseases obtained from the expanded disease and disease-correlation-factor dictionary.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611222893.XA CN106874643B (en) | 2016-12-27 | 2016-12-27 | Method and system for automatically constructing knowledge base to realize auxiliary diagnosis and treatment based on word vectors |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611222893.XA CN106874643B (en) | 2016-12-27 | 2016-12-27 | Method and system for automatically constructing knowledge base to realize auxiliary diagnosis and treatment based on word vectors |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106874643A (en) | 2017-06-20 |
CN106874643B (en) | 2020-02-28 |
Family
ID=59165041
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611222893.XA Active CN106874643B (en) | 2016-12-27 | 2016-12-27 | Method and system for automatically constructing knowledge base to realize auxiliary diagnosis and treatment based on word vectors |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106874643B (en) |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070288304A1 (en) * | 2006-06-08 | 2007-12-13 | Adknowledge, Inc. | System and method for behaviorally targeted electronic communications |
CN101158969A (en) * | 2007-11-23 | 2008-04-09 | 腾讯科技(深圳)有限公司 | Whole sentence generating method and device |
JP2011180746A (en) * | 2010-02-26 | 2011-09-15 | National Institute Of Information & Communication Technology | Relational information expansion device, relational information expansion method and program |
CN104375989A (en) * | 2014-12-01 | 2015-02-25 | 国家电网公司 | Natural language text keyword association network construction system |
CN104391963A (en) * | 2014-12-01 | 2015-03-04 | 北京中科创益科技有限公司 | Method for constructing correlation networks of keywords of natural language texts |
CN104572624A (en) * | 2015-01-20 | 2015-04-29 | 浙江大学 | Method for discovering treatment relation between single medicine and disease based on term vector |
CN104965992A (en) * | 2015-07-13 | 2015-10-07 | 南开大学 | Text mining method based on online medical question and answer information |
CN105069123A (en) * | 2015-08-13 | 2015-11-18 | 易保互联医疗信息科技(北京)有限公司 | Automatic coding method and system for Chinese surgical operation information |
CN105138829A (en) * | 2015-08-13 | 2015-12-09 | 易保互联医疗信息科技(北京)有限公司 | Natural language processing method and system for Chinese diagnosis and treatment information |
CN105426358A (en) * | 2015-11-09 | 2016-03-23 | 中国农业大学 | Automatic disease noun identification method |
CN105740612A (en) * | 2016-01-27 | 2016-07-06 | 北京国医精诚科技有限公司 | Traditional Chinese medicine clinical medical record based disease diagnose and treatment method and system |
CN105786782A (en) * | 2016-03-25 | 2016-07-20 | 北京搜狗科技发展有限公司 | Word vector training method and device |
CN106096273A (en) * | 2016-06-08 | 2016-11-09 | 江苏华康信息技术有限公司 | A kind of disease symptoms derivation method based on TF IDF innovatory algorithm |
CN106156272A (en) * | 2016-06-21 | 2016-11-23 | 北京工业大学 | A kind of information retrieval method based on multi-source semantic analysis |
Non-Patent Citations (3)
Title |
---|
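DAVID AFONSO et al.: "An Ultrasonographic Risk Score For Detecting Symptomatic Carotid Atherosclerotic Plaques", 《IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS》 * |
Chang Peng et al.: "An efficient method for extracting topic words from short texts", 《Computer Engineering and Applications》 * |
Liang Yaobo: "Research and implementation of an intelligent medical diagnosis system", 《China Master's Theses Full-text Database, Information Science and Technology》 * |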
Cited By (43)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107358315A (en) * | 2017-06-26 | 2017-11-17 | 深圳市金立通信设备有限公司 | A kind of information forecasting method and terminal |
CN110019826B (en) * | 2017-07-27 | 2023-02-28 | 北大医疗信息技术有限公司 | Construction method, construction device, equipment and storage medium of medical knowledge map |
CN110019826A (en) * | 2017-07-27 | 2019-07-16 | 北大医疗信息技术有限公司 | Construction method, construction device, equipment and the storage medium of medical knowledge map |
CN107633882B (en) * | 2017-09-11 | 2019-05-14 | 合肥工业大学 | Mix the minimally invasive medical service system and its aid decision-making method under cloud framework |
CN107633882A (en) * | 2017-09-11 | 2018-01-26 | 合肥工业大学 | Mix the minimally invasive medical service system and its aid decision-making method under cloud framework |
CN107863147A (en) * | 2017-10-24 | 2018-03-30 | 清华大学 | The method of medical diagnosis based on depth convolutional neural networks |
CN107863147B (en) * | 2017-10-24 | 2021-03-16 | 清华大学 | Medical diagnosis method based on deep convolutional neural network |
CN107610779A (en) * | 2017-10-25 | 2018-01-19 | 医渡云(北京)技术有限公司 | Disease Assessment Scale and risk appraisal procedure and device |
CN107833629A (en) * | 2017-10-25 | 2018-03-23 | 厦门大学 | Aided diagnosis method and system based on deep learning |
CN108182972A (en) * | 2017-12-15 | 2018-06-19 | 上海长江科技发展有限公司 | The intelligent coding method and system of Chinese medical diagnosis on disease based on participle network |
TWI665684B (en) * | 2017-12-27 | 2019-07-11 | 瑞友資訊股份有限公司 | Care system capable of drawing up intelligent care plan and using method thereof |
CN108182973A (en) * | 2017-12-29 | 2018-06-19 | 湖南大学 | A kind of Intelligent Diagnosis Technology of knowledge based collection of illustrative plates reasoning |
CN110164544A (en) * | 2018-02-11 | 2019-08-23 | 深圳欧德蒙科技有限公司 | A kind of method, apparatus and terminal device of illness information processing |
CN109243599A (en) * | 2018-03-16 | 2019-01-18 | 申朴信息技术(上海)股份有限公司 | A kind of disease based on various dimensions information retrieval is to code method |
CN108614885A (en) * | 2018-05-03 | 2018-10-02 | 杭州认识科技有限公司 | Knowledge mapping analysis method based on medical information and device |
CN110459287B (en) * | 2018-05-08 | 2024-03-22 | 西门子医疗有限公司 | Structured report data from medical text reports |
CN109240258A (en) * | 2018-07-09 | 2019-01-18 | 上海万行信息科技有限公司 | Vehicle failure intelligent auxiliary diagnosis method and system based on term vector |
CN109473169A (en) * | 2018-10-18 | 2019-03-15 | 安吉康尔(深圳)科技有限公司 | A kind of methods for the diagnosis of diseases, device and terminal device |
CN109684445A (en) * | 2018-11-13 | 2019-04-26 | 中国科学院自动化研究所 | Colloquial style medical treatment answering method and system |
CN109684445B (en) * | 2018-11-13 | 2021-05-28 | 中国科学院自动化研究所 | Spoken medical question-answering method and spoken medical question-answering system |
CN109817330A (en) * | 2019-01-25 | 2019-05-28 | 华院数据技术(上海)有限公司 | A kind of disease forecasting device |
CN111798941A (en) * | 2019-04-04 | 2020-10-20 | Iqvia 有限公司 | Predictive system for generating clinical queries |
CN111798941B (en) * | 2019-04-04 | 2023-10-13 | Iqvia 有限公司 | Predictive system for generating clinical queries |
US11615148B2 (en) | 2019-04-04 | 2023-03-28 | Iqvia Inc. | Predictive system for generating clinical queries |
CN110276749A (en) * | 2019-06-14 | 2019-09-24 | 辽宁万象联合医疗科技有限公司 | Children penetrate the quality control artificial intelligence system and its quality control method of piece and diagnosis |
CN110276749B (en) * | 2019-06-14 | 2022-04-01 | 辽宁万象联合医疗科技有限公司 | Quality control artificial intelligence system and quality control method for children radiation shooting and diagnosis |
CN110867228A (en) * | 2019-11-15 | 2020-03-06 | 北京大学人民医院(北京大学第二临床医学院) | Intelligent information grabbing and evaluating method and system for wound severity of wound inpatient |
CN111599489A (en) * | 2020-05-19 | 2020-08-28 | 万达信息股份有限公司 | Disease information acquisition method, terminal equipment and storage medium |
CN111985246A (en) * | 2020-08-27 | 2020-11-24 | 武汉东湖大数据交易中心股份有限公司 | Disease cognitive system based on main symptoms and accompanying symptom words |
CN111985246B (en) * | 2020-08-27 | 2023-08-15 | 武汉东湖大数据交易中心股份有限公司 | Disease cognitive system based on main symptoms and accompanying symptom words |
CN112017773A (en) * | 2020-08-31 | 2020-12-01 | 吾征智能技术(北京)有限公司 | Disease cognition model construction method based on nightmare and disease cognition system |
CN112017773B (en) * | 2020-08-31 | 2024-03-26 | 吾征智能技术(北京)有限公司 | Disease cognitive model construction method and disease cognitive system based on nightmare |
CN111968740A (en) * | 2020-09-03 | 2020-11-20 | 卫宁健康科技集团股份有限公司 | Diagnostic label recommendation method and device, storage medium and electronic equipment |
CN112364055A (en) * | 2020-10-29 | 2021-02-12 | 上海德衡数据科技有限公司 | Service management software system and method |
CN112364055B (en) * | 2020-10-29 | 2023-11-03 | 上海德衡数据科技有限公司 | Service management software system and method |
CN112331355A (en) * | 2020-11-26 | 2021-02-05 | 微医云(杭州)控股有限公司 | Generation method and device of disease category evaluation table, electronic equipment and storage medium |
CN112331355B (en) * | 2020-11-26 | 2024-03-19 | 微医云(杭州)控股有限公司 | Disease type evaluation table generation method and device, electronic equipment and storage medium |
CN112988953A (en) * | 2021-04-26 | 2021-06-18 | 成都索贝数码科技股份有限公司 | Adaptive broadcast television news keyword standardization method |
CN113505236A (en) * | 2021-06-29 | 2021-10-15 | 医智泉(杭州)医疗科技有限公司 | Construction method, device and equipment of medical knowledge graph and computer readable medium |
CN113505236B (en) * | 2021-06-29 | 2023-08-04 | 朱一帆 | Medical knowledge graph construction method, device, equipment and computer readable medium |
CN113793668A (en) * | 2021-09-17 | 2021-12-14 | 平安科技(深圳)有限公司 | Symptom standardization method and device based on artificial intelligence, electronic equipment and medium |
CN114628012A (en) * | 2022-03-21 | 2022-06-14 | 中国人民解放军西部战区总医院 | Emergency department's preliminary examination go-no-go system |
CN114628012B (en) * | 2022-03-21 | 2023-09-05 | 中国人民解放军西部战区总医院 | Emergency department's preliminary examination sorting system |
Also Published As
Publication number | Publication date |
---|---|
CN106874643B (en) | 2020-02-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106874643A (en) | Method and system for automatically constructing knowledge base to realize auxiliary diagnosis and treatment based on word vectors | |
CN109460473B (en) | Electronic medical record multi-label classification method based on symptom extraction and feature representation | |
CN111274806B (en) | Method and device for recognizing word segmentation and part of speech and method and device for analyzing electronic medical record | |
CN110459282B (en) | Sequence labeling model training method, electronic medical record processing method and related device | |
CN109599185B (en) | Disease data processing method and device, electronic equipment and computer readable medium | |
CN110472229B (en) | Sequence labeling model training method, electronic medical record processing method and related device | |
CN109670179B (en) | Medical record text named entity identification method based on iterative expansion convolutional neural network | |
CN110069779B (en) | Symptom entity identification method of medical text and related device | |
CN110838368B (en) | Active inquiry robot based on traditional Chinese medicine clinical knowledge map | |
CN110705293A (en) | Electronic medical record text named entity recognition method based on pre-training language model | |
US20190057773A1 (en) | Method and system for performing triage | |
Matci et al. | Address standardization using the natural language process for improving geocoding results | |
CN112002411A (en) | Cardiovascular and cerebrovascular disease knowledge map question-answering method based on electronic medical record | |
CN109166608A (en) | Electronic health record information extracting method, device and equipment | |
CN110931128B (en) | Method, system and device for automatically identifying unsupervised symptoms of unstructured medical texts | |
CN110337645A (en) | The processing component that can be adapted to | |
CN111680089A (en) | Text structuring method, device and system and non-volatile storage medium | |
CN109378066A (en) | A kind of control method and control device for realizing disease forecasting based on feature vector | |
CN112541066B (en) | Text-structured-based medical and technical report detection method and related equipment | |
CN111477320B (en) | Treatment effect prediction model construction system, treatment effect prediction system and terminal | |
CN108231146A (en) | A kind of medical records model building method, system and device based on deep learning | |
CN104063579A (en) | Health dynamic prediction method and equipment based on multivariate medical consumption data | |
Whitney | Bootstrapping via graph propagation | |
CN109299467A (en) | Medicine text recognition method and device, sentence identification model training method and device | |
CN112349367B (en) | Method, device, electronic equipment and storage medium for generating simulated medical record |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||