CN109300550A - Medical data relationship mining method and device - Google Patents
- Publication number
- CN109300550A (application number CN201811330207.XA)
- Authority
- CN
- China
- Prior art keywords
- medical data
- medical
- data
- relationship
- feature
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Public Health (AREA)
- Health & Medical Sciences (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Medical Informatics (AREA)
- Epidemiology (AREA)
- Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Biomedical Technology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The present invention relates to a medical data relationship mining method and device, an electronic device, and a computer-readable medium. The method comprises: obtaining first medical data and second medical data in a target text; performing feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data; and inputting the feature vector into a trained classification model to determine a target relationship between the first medical data and the second medical data. The invention can effectively identify relationships between medical data in clinical case text and improves the efficiency of medical data relationship mining, which facilitates further statistical analysis of the data.
Description
Technical field
The present invention relates to the field of medical information extraction, and in particular to a medical data relationship mining method and device, an electronic device, and a computer-readable medium.
Background art
In clinical case text, much information is recorded as long free text, which is inconvenient for subsequent statistical analysis. Structuring clinical cases can solve this technical problem, and mining the relationships between medical terms in long text is a very important step in structuring clinical data.
In the prior art, medical data relationship mining is performed either with manually abstracted rules or with methods based on syntactic dependency parsing from natural language processing.
However, manually written rules are a one-size-fits-all approach whose effectiveness depends on how fine-grained the rules are. Methods based on syntactic dependency parsing, if trained for a specific domain, incur very high annotation costs, so they are rarely applied directly to clinical cases.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the invention, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
The purpose of the present invention is to provide a medical data relationship mining method and a medical data relationship mining device that can effectively identify relationships between medical data in clinical case text and improve the efficiency of medical data relationship mining.
According to one aspect of the present invention, a medical data relationship mining method is provided, comprising: obtaining first medical data and second medical data in a target text; performing feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data; and inputting the feature vector into a trained classification model to determine a target relationship between the first medical data and the second medical data.
In an exemplary embodiment of the invention, the target relationship includes any one of: a negation word and medical data relationship, a time and medical data relationship, a numerical value and medical data relationship, an anatomical site and medical data relationship, an action and medical data relationship, and a relative and medical data relationship.
In an exemplary embodiment of the invention, performing feature extraction on the first medical data and the second medical data comprises obtaining at least one of: the features of the first medical data itself, the features of the second medical data itself, the surrounding-text features of the first medical data and the second medical data, syntactic dependency parsing features, and sentence-form features.
In an exemplary embodiment of the invention, the features of the first medical data itself include at least one of the following: whether the first medical data is a diagnosis; whether it is an anatomical site; whether it is a symptom; whether it is a lesion word; whether it is a negation word; whether it contains a verb; whether it contains a number; whether its length exceeds a preset number of bytes; whether it contains a time word.
In an exemplary embodiment of the invention, the surrounding-text features include at least one of: a text feature of the information preceding the first medical data, a text feature of the information following the second medical data, and a text feature of the text between the first medical data and the second medical data.
In an exemplary embodiment of the invention, the preceding-text feature of the first medical data includes at least one of the following, each evaluated within a preset number of characters before the first medical data: whether there is a period; whether there is a comma; whether there is a space or an enumeration comma; whether there is a negation word; whether there is a negation word that only acts backward; whether the word "accompanied by" occurs; whether the word "occasionally" occurs; whether there is an ellipsis word; whether there is a verb expressing an action; whether there is a diagnosis; whether there is an anatomical site; whether there is a symptom; whether there is a lesion; whether there is a pattern of consecutive concepts separated by punctuation; whether there is a time expression; whether there is a number; whether there is a verb.
In an exemplary embodiment of the invention, the text feature between the first medical data and the second medical data includes at least one of the following, each evaluated on the first medical data, the second medical data, and the text between them: the distance between the two; the order of the two; the number of periods between them; the number of commas between them; the number of spaces or enumeration commas between them; whether the word "accompanied by" occurs between them; whether the word "occasionally" occurs between them; whether there is a verb expressing an action between them; whether there is a negation word that only acts backward between them; whether there is an ellipsis word between them; whether there is a negation word between them; whether there is a diagnosis between them; whether there is an anatomical site between them; whether there is a symptom between them; whether there is a lesion between them; whether there is a pattern of consecutive concepts separated by punctuation between them; whether there is a number between them; whether there is a time expression between them; whether there is a verb between them.
In an exemplary embodiment of the invention, the syntactic dependency parsing features include at least one of the following: whether the first medical data and the second medical data are in a parent-child relationship; the length of the path between them on the dependency tree; whether the path between them contains a subject-predicate edge; whether the path contains a verb-object edge; whether the path contains an attributive-head or adverbial-head edge; whether the first edge on the path is a verb-object or subject-predicate relationship; whether the first edge on the path is an attributive-head or adverbial-head relationship; whether the last edge on the path is a verb-object or subject-predicate relationship; whether the last edge on the path is a verb-object or subject-predicate relationship.
In an exemplary embodiment of the invention, the sentence-form features include at least one of the following: whether the first medical data and the second medical data are in the same paragraph; whether they are in the same sentence; whether they are in the same clause; whether they are in the same paragraph with no other medical data of the same kind as the first medical data or the second medical data between them; whether they are in the same sentence with no other medical data of the same kind as the first medical data or the second medical data between them; whether they are in the same clause with no other medical data of the same kind as the first medical data or the second medical data between them.
According to one aspect of the present invention, a medical data relationship mining device is provided, comprising: a medical data obtaining module configured to obtain first medical data and second medical data in a target text; a feature extraction module configured to perform feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data; and a target relationship determination module configured to input the feature vector into a trained classification model and determine a target relationship between the first medical data and the second medical data.
According to one aspect of the present invention, a computer-readable medium is provided on which a computer program is stored; when executed by a processor, the program implements the medical data relationship mining method of any of the above embodiments.
According to one aspect of the present invention, an electronic device is provided, comprising: one or more processors; and a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the medical data relationship mining method of any of the above embodiments.
With the medical data relationship mining method and device in an exemplary embodiment of the invention, first medical data and second medical data in a target text are obtained; feature extraction is performed on the first medical data and the second medical data to obtain their feature vector; and the feature vector is input into a trained classification model to determine the target relationship between the first medical data and the second medical data. In this way, relationships between medical data in clinical case text can be effectively identified and the efficiency of medical data relationship mining is improved, which facilitates further statistical analysis of the data.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory and do not limit the present invention.
Brief description of the drawings
The above and other features and advantages of the invention will become more apparent from the detailed description of its example embodiments with reference to the accompanying drawings.
Fig. 1 shows a flow chart of a medical data relationship mining method according to an exemplary embodiment of the invention;
Fig. 2 shows a schematic diagram of the feature set of the classification model according to an exemplary embodiment of the invention;
Fig. 3 shows a flow chart of a medical data relationship mining method according to another exemplary embodiment of the invention;
Fig. 4 shows a flow chart of a medical data relationship mining method according to another exemplary embodiment of the invention;
Fig. 5 shows a flow chart of a medical data relationship mining method according to another exemplary embodiment of the invention;
Fig. 6 shows a block diagram of a medical data relationship mining device according to an exemplary embodiment of the invention;
Fig. 7 shows a schematic diagram of an exemplary system architecture to which the medical data relationship mining method or device of an embodiment of the invention can be applied;
Fig. 8 shows a structural schematic diagram of an electronic device suitable for implementing an embodiment of the invention.
Detailed description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in many forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present invention will be thorough and complete and will fully convey the concept of the example embodiments to those skilled in the art. The same reference numerals in the drawings denote the same or similar parts, and repeated description of them will be omitted.
In addition, the described features, structures, or characteristics may be combined in one or more embodiments in any suitable manner. In the following description, many specific details are provided to give a full understanding of the embodiments of the invention. However, those skilled in the art will appreciate that the technical solution of the invention may be practiced without one or more of the specific details, or with other methods, components, materials, devices, steps, and so on. In other cases, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail, to avoid obscuring aspects of the invention.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, or these functional entities or parts of them may be implemented in one or more software-hardened modules, or in different network and/or processor devices and/or microcontroller devices.
In the prior art, the following three classes of methods are used for medical data relationship mining:
First class: manually abstracted rules. The form of the text is examined to judge whether the medical data satisfy some pattern, and on that basis whether a relationship between the medical data holds. For example, one such rule checks whether two medical data are in the same comma-delimited clause.
The first class of methods has at least the following drawbacks: manual rules are a one-size-fits-all approach whose effect depends on how fine-grained the rules are; labor costs are high; for new data there is a risk that the rules do not cover it; and rules may conflict with or exclude one another.
Second class: methods based on syntactic dependency parsing from natural language processing. Syntactic dependency parsing is one of the classic tasks of natural language processing; it can determine whether the words in a sentence stand in grammatical relations such as subject-predicate, verb-object, or modifier relations. Based on the structure produced by dependency parsing, it is judged whether the medical data satisfy the target relationship.
The second class is in principle a fairly ideal approach, but it has at least the following drawbacks: current Chinese syntactic dependency parsing models in industry perform only moderately, and training for a specific domain incurs very high annotation costs, so the approach is rarely applied directly to clinical cases.
Third class: training a classification model for a specific medical data relationship. According to the task objective, samples of medical data relationships in clinical cases are annotated and then classified with a general-purpose machine learning classification model to judge whether the target relationship holds.
The third class is a relatively feasible approach: for a particular relationship type and a particular application field, a training corpus is annotated and a classifier judges whether the corresponding relationship holds. However, such methods require targeted annotation and training for every kind of medical term relationship and every application scenario, so the results do not scale.
In embodiments of the invention, medical data may also be referred to as clinical data and may be medical terms, that is, words that denote a clear medical concept in medical research or medical events. The definition of clinical data needs to be made in light of the specific clinical task; for example, in a particular task, words such as "father" and "mother" may also be objects of interest for that medical task, and so may also count as medical terms.
In embodiments of the invention, medical data relationship mining concerns the medical information expressed in the long text of a clinical case, which is usually represented by multiple medical terms or by combinations of medical terms and other words.
For example, given the family history "Father is in good health; mother is deceased, died of lung cancer", the key medical information is:
Relative: mother; family disease: lung cancer
Mining the medical data relationship here means determining from the text that lung cancer is the mother's disease, not the father's.
The invention proposes a medical data relationship mining method that abstracts the types of relationship between medical data and solves the recognition problem with machine learning.
In this example embodiment, a medical data relationship mining method is first provided. Referring to Fig. 1, the medical data relationship mining method comprises the following steps:
In step S110, first medical data A and second medical data B in a target text are obtained.
In embodiments of the invention, the target text may be a clinical case to be mined. A and B can be extracted from the long text of the clinical case by an entity recognition algorithm; for specific entity recognition algorithms, reference may be made to the prior art, and they are not described in detail here.
In step S120, feature extraction is performed on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data.
In an exemplary embodiment, performing feature extraction on the first medical data and the second medical data may include obtaining at least one of: the features of the first medical data itself, the features of the second medical data itself, the surrounding-text features of the first medical data and the second medical data, dependency parsing features, sentence-form features, and the like.
In an exemplary embodiment, the features of the first medical data itself may include at least one of the following: whether the first medical data is a diagnosis; whether it is an anatomical site; whether it is a symptom; whether it is a lesion word; whether it is a negation word; whether it contains a verb; whether it contains a number; whether its length exceeds a preset number of bytes; whether it contains a time word.
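The own-feature list above can be sketched as a small function that turns one extracted term into a 0/1 vector. This is only an illustration: the `Mention` record, the entity-type strings, the 12-byte threshold, and the precomputed `has_verb` / `has_time` flags are assumptions, since the text names the features but not how they are computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Mention:
    """Hypothetical record for one extracted term (not defined in the text)."""
    text: str
    etype: str  # assumed labels: "diagnosis", "anatomy", "symptom", "lesion", "negation", ...
    start: int  # offset of the first character in the target text
    end: int    # offset one past the last character


def self_features(m: Mention, max_bytes: int = 12,
                  has_verb: bool = False, has_time: bool = False) -> List[int]:
    """Binary features of the term itself, in the order listed in the text.

    has_verb / has_time are assumed to come from an external tagger.
    """
    return [
        int(m.etype == "diagnosis"),
        int(m.etype == "anatomy"),
        int(m.etype == "symptom"),
        int(m.etype == "lesion"),
        int(m.etype == "negation"),
        int(has_verb),
        int(any(ch.isdigit() for ch in m.text)),       # contains a number
        int(len(m.text.encode("utf-8")) > max_bytes),  # longer than the preset byte count
        int(has_time),
    ]
```

The same function would be applied to the second medical data, and the two vectors concatenated with the other feature groups.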
In an exemplary embodiment, the surrounding-text features may include at least one of: a text feature of the information preceding the first medical data, a text feature of the information following the second medical data, a text feature of the text between the first medical data and the second medical data, and the like.
In an exemplary embodiment, the preceding-text feature of the first medical data may include at least one of the following, each evaluated within a preset number of characters before the first medical data: whether there is a period; whether there is a comma; whether there is a space or an enumeration comma; whether there is a negation word; whether there is a negation word that only acts backward; whether the word "accompanied by" occurs; whether the word "occasionally" occurs; whether there is an ellipsis word; whether there is a verb expressing an action; whether there is a diagnosis; whether there is an anatomical site; whether there is a symptom; whether there is a lesion; whether there is a pattern of consecutive concepts separated by punctuation; whether there is a time expression; whether there is a number; whether there is a verb.
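A few of the preceding-text features can be sketched as simple checks on a fixed window of characters before the term. The window size of 10, the English negation cues, and the feature names are illustrative assumptions; features that need an entity recognizer or tagger (diagnosis, symptom, verb, and so on) are left out.

```python
# Illustrative English stand-ins for a negation lexicon (an assumption).
NEGATION_CUES = ("no ", "not ", "denies ", "without ")


def preceding_features(text: str, start: int, window: int = 10) -> dict:
    """Features of the `window` characters immediately before offset `start`."""
    ctx = text[max(0, start - window):start]
    return {
        "has_period": int("." in ctx or "。" in ctx),
        "has_comma": int("," in ctx or "，" in ctx),
        "has_space_or_enum_comma": int(" " in ctx or "、" in ctx),
        "has_negation": int(any(cue in ctx.lower() for cue in NEGATION_CUES)),
        "has_number": int(any(ch.isdigit() for ch in ctx)),
    }


note = "Patient denies chest pain, no fever."
features = preceding_features(note, note.index("fever"))
```

For `"fever"` the window is `" pain, no "`, so the comma, space, and negation features fire while the period and number features do not.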
In an exemplary embodiment, the text feature between the first medical data and the second medical data may include at least one of the following, each evaluated on the first medical data, the second medical data, and the text between them: the distance between the two; the order of the two; the number of periods between them; the number of commas between them; the number of spaces or enumeration commas between them; whether the word "accompanied by" occurs between them; whether the word "occasionally" occurs between them; whether there is a verb expressing an action between them; whether there is a negation word that only acts backward between them; whether there is an ellipsis word between them; whether there is a negation word between them; whether there is a diagnosis between them; whether there is an anatomical site between them; whether there is a symptom between them; whether there is a lesion between them; whether there is a pattern of consecutive concepts separated by punctuation between them; whether there is a number between them; whether there is a time expression between them; whether there is a verb between them.
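The purely textual between-term features (distance, order, punctuation counts) might be computed as below. Spans are assumed to be `(start, end)` character offsets with an exclusive end, and the feature names are hypothetical; the entity- and tagger-based features in the list are omitted.

```python
def between_features(text: str, a_span: tuple, b_span: tuple) -> dict:
    """Features of the text lying between term A and term B."""
    (a_start, a_end), (b_start, b_end) = a_span, b_span
    a_first = a_start <= b_start
    # The gap runs from the end of the earlier term to the start of the later one.
    left_end, right_start = (a_end, b_start) if a_first else (b_end, a_start)
    gap = text[left_end:right_start]
    return {
        "distance": len(gap),
        "a_before_b": int(a_first),
        "n_periods": gap.count(".") + gap.count("。"),
        "n_commas": gap.count(",") + gap.count("，"),
        "n_spaces_or_enum_commas": gap.count(" ") + gap.count("、"),
        "has_number": int(any(ch.isdigit() for ch in gap)),
    }


history = "Father healthy. Mother deceased, died of lung cancer."
mother = (history.index("Mother"), history.index("Mother") + len("Mother"))
cancer = (history.index("lung cancer"), history.index("lung cancer") + len("lung cancer"))
features = between_features(history, mother, cancer)
```

In this example the gap between "Mother" and "lung cancer" contains one comma and no period, whereas the gap between "Father" and "lung cancer" crosses a period.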
In an exemplary embodiment, the dependency parsing features may include at least one of the following: whether the first medical data and the second medical data are in a parent-child relationship; the length of the path between them on the dependency tree; whether the path between them contains a subject-predicate edge; whether the path contains a verb-object edge; whether the path contains an attributive-head or adverbial-head edge; whether the first edge on the path is a verb-object or subject-predicate relationship; whether the first edge on the path is an attributive-head or adverbial-head relationship; whether the last edge on the path is a verb-object or subject-predicate relationship; whether the last edge on the path is a verb-object or subject-predicate relationship.
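As a rough sketch, the path-based features can be computed from a dependency parse stored as a head array. The representation (one head index per token, `-1` for the root, one relation label per token describing the edge to its head) and the label names `SBV`, `VOB`, `ATT`, `ADV` are assumptions chosen for illustration; the parse itself is assumed to come from an external parser.

```python
def path_to_root(heads, i):
    """Token indices from i up to the root, following head links."""
    path = [i]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def dependency_features(heads, labels, a, b):
    """Features of the dependency-tree path between tokens a and b."""
    up_a, up_b = path_to_root(heads, a), path_to_root(heads, b)
    ancestors_b = set(up_b)
    # Lowest common ancestor: first node on A's upward path that is also above B.
    lca = next(n for n in up_a if n in ancestors_b)
    # An edge is identified by its child token; labels[child] is the edge label.
    edges_a = up_a[:up_a.index(lca)]
    edges_b = up_b[:up_b.index(lca)]
    path_labels = [labels[n] for n in edges_a] + [labels[n] for n in reversed(edges_b)]
    core = ("VOB", "SBV")
    return {
        "parent_child": int(heads[a] == b or heads[b] == a),
        "path_length": len(path_labels),
        "has_subject_predicate": int("SBV" in path_labels),
        "has_verb_object": int("VOB" in path_labels),
        "has_att_or_adv": int(any(l in ("ATT", "ADV") for l in path_labels)),
        "first_edge_vob_or_sbv": int(bool(path_labels) and path_labels[0] in core),
        "last_edge_vob_or_sbv": int(bool(path_labels) and path_labels[-1] in core),
    }


# Toy parse of "mother died of lung cancer" (hand-written, not parser output).
tokens = ["mother", "died", "of", "lung", "cancer"]
heads = [1, -1, 4, 4, 1]
labels = ["SBV", "HED", "ADV", "ATT", "VOB"]
features = dependency_features(heads, labels, 0, 4)
```

For "mother" and "cancer" the path runs through the verb "died", giving a path of length two with one subject-predicate edge and one verb-object edge.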
In an exemplary embodiment, the sentence-form features may include at least one of the following: whether the first medical data and the second medical data are in the same paragraph; whether they are in the same sentence; whether they are in the same clause; whether they are in the same paragraph with no other medical data of the same kind as the first medical data or the second medical data between them; whether they are in the same sentence with no other medical data of the same kind as the first medical data or the second medical data between them; whether they are in the same clause with no other medical data of the same kind as the first medical data or the second medical data between them.
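The first three sentence-form features can be approximated by checking which delimiters occur between the two terms. The delimiter sets below are assumptions (a period also matches decimals such as "3.5", which a real implementation would need to handle), and the variants that also require no same-kind term in between are not covered.

```python
import re


def same_unit(text: str, a_span: tuple, b_span: tuple, delimiters: str) -> int:
    """1 if no delimiter matching the regex occurs between the two spans."""
    first, second = sorted([a_span, b_span])
    gap = text[first[1]:second[0]]
    return int(re.search(delimiters, gap) is None)


def sentence_form_features(text: str, a_span: tuple, b_span: tuple) -> dict:
    return {
        "same_paragraph": same_unit(text, a_span, b_span, r"\n"),
        "same_sentence": same_unit(text, a_span, b_span, r"[.。!?！？\n]"),
        "same_clause": same_unit(text, a_span, b_span, r"[,，;；.。!?！？\n]"),
    }


history = "Father healthy. Mother deceased, died of lung cancer."
mother = (16, 22)   # span of "Mother"
cancer = (41, 52)   # span of "lung cancer"
features = sentence_form_features(history, mother, cancer)
```

Here "Mother" and "lung cancer" share a sentence but not a clause, while "Father" and "lung cancer" share only the paragraph.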
In step S130, the feature vector is input into the trained classification model to determine the target relationship between the first medical data and the second medical data.
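To make the step concrete, the sketch below trains a tiny perceptron on hand-made feature vectors and uses it to decide whether a relationship holds for a pair. The perceptron is only a stand-in: this passage does not name a particular classification model, and the three features and the toy labels are invented for illustration.

```python
def train_perceptron(X, y, epochs=100):
    """Train a binary perceptron; labels are 1 (relationship holds) or 0."""
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            target = 1 if yi == 1 else -1
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if target * score <= 0:          # misclassified: move the boundary
                w = [wj + target * xj for wj, xj in zip(w, xi)]
                b += target
                mistakes += 1
        if mistakes == 0:                    # converged on the training set
            break
    return w, b


def predict(model, x):
    w, b = model
    return int(sum(wj * xj for wj, xj in zip(w, x)) + b > 0)


# Toy feature vectors: [same_clause, n_commas_between, negation_between]
X = [[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 2, 0], [1, 0, 1], [0, 1, 1]]
y = [1, 1, 0, 0, 0, 0]
model = train_perceptron(X, y)
```

In practice the input would be the full concatenated feature vector for the pair (A, B), and the trained model would be whichever classifier the implementation chooses.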
In an exemplary embodiment, the target relationship may include any one of: a negation word and medical data relationship, a time and medical data relationship, a numerical value and medical data relationship, an anatomical site and medical data relationship, an action and medical data relationship, a relative and medical data relationship, and the like.
In an exemplary embodiment, a medical data relationship classification system is abstracted in advance. Starting from clinical data and medical needs, the relationship between two medical data can be abstracted into the following classes, as shown in Table 1:
Table 1: Medical data relationship classification system
It should be noted that the relationship types between medical data are not limited to those enumerated in Table 1, and the classification system may also be divided from other angles. The basic requirement is that each class has a specific semantic type and that the system covers most medical data relationships, for example by fixing one medical data and grouping by the second. A specific example is the negation relationship: A is fixed as a negation word, and the type of B is arbitrary.
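One way to read "fix one medical data and group by the second" is as a rule for generating candidate pairs: every term of the fixed type is paired with every other term. The `(text, type)` tuple format and the type names below are assumptions made for the sketch.

```python
def candidate_pairs(mentions, fixed_type):
    """Pair every mention of `fixed_type` (as A) with every other mention (as B)."""
    anchors = [m for m in mentions if m[1] == fixed_type]
    return [(a, b) for a in anchors for b in mentions if b is not a]


mentions = [("no", "negation"), ("fever", "symptom"), ("cough", "symptom")]
pairs = candidate_pairs(mentions, "negation")
```

Each resulting pair would then go through feature extraction and classification to decide whether the negation actually applies to B.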
Here, a specific semantic type means that the relationship type is an abstraction with clear semantics, such as a negation relationship, a time relationship, a numerical relationship, or an action relationship.
According to the medical data relationship mining method in this example embodiment, first medical data and second medical data in a target text are obtained; feature extraction is performed on the first medical data and the second medical data to obtain their feature vector; and the feature vector is input into a trained classification model to determine the target relationship between the first medical data and the second medical data. In this way, relationships between medical data in clinical case text can be effectively identified and the efficiency of medical data relationship mining is improved, which facilitates further statistical analysis of the data.
As shown in Fig. 2, in an embodiment of the present invention, the designed feature set of the classification model may include A/B own features, surrounding text features, syntactic dependency features, and sentence morphological features.
The A/B own features may in turn include the features of A itself and the features of B itself.
The surrounding text features may include text features to the left of A (also called the text features preceding A), text features to the right of B (also called the text features following B), and text features between A and B.
For example, the feature set may include the following information (taking two medical data items as an example, with the first medical data denoted A and the second medical data denoted B):
Table 2 Feature set
It should be noted that the way specific feature values in Table 2 above are computed may vary. For example, the second medical data B may be searched for within other distances on either side of the first medical data A; that is, the window is not limited to the 10 words in the table above. In a specific medical task, the form of the data in the table can be adjusted as needed and the specific numbers optimized; the present invention does not limit this.
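The windowed search described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `left_window_features`, the token-list input, and the small English negation lexicon are all assumptions made for the example, and the window size is a parameter rather than a fixed 10.

```python
# Hypothetical negation lexicon; a real system would use a curated medical lexicon.
NEGATION_WORDS = {"no", "denies", "without"}

def left_window_features(tokens, a_index, window=10):
    """Compute 0/1 features over the `window` tokens preceding entity A."""
    left = tokens[max(0, a_index - window):a_index]
    return {
        "has_period": int("." in left),
        "has_comma": int("," in left),
        "has_negation": int(any(t.lower() in NEGATION_WORDS for t in left)),
    }

tokens = ["Patient", "denies", "fever", ",", "reports", "cough", "."]
features = left_window_features(tokens, a_index=2, window=10)
narrow = left_window_features(tokens, a_index=5, window=2)
```

Making `window` a parameter reflects the note that the search distance can be tuned per task.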
Dependency grammar reveals the syntactic structure of a linguistic unit by analyzing the dependency relations between its components. It holds that the core verb of a sentence is the central component that governs the other components while itself not being governed by any other component, and that every governed component is subordinate to its governor through some dependency relation. Governing and being governed, and depending and being depended on, are phenomena that occur throughout Chinese linguistic units at every level that can be used independently, from words (compound words), phrases, simple sentences, and complex sentences up to sentence groups; this property is the universality of dependency relations. Dependency parsing reflects the semantic modification relations between the components of a sentence; it can capture long-distance collocation information and is independent of the physical positions of the sentence components.
The dependency parsing relation tags involved in Table 2 above and their meanings are shown in Table 3 below:
Relationship type | Tag | Description |
Subject-verb relationship | SBV | subject-verb |
Verb-object relationship | VOB | direct object, verb-object |
Attribute (modifier-head) relationship | ATT | attribute |
Adverbial (adverbial-head) structure | ADV | adverbial |
Table 3 Syntactic dependency relations
It should be noted that the prior art directly uses the syntactic structure obtained by dependency analysis and extracts the target relationship through syntactic structure templates, whereas the embodiment of the present invention uses key syntactic structures as features of the classification model, which are learned automatically in a data-driven manner.
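One way to turn a dependency relation into classifier features, rather than matching it against a template, is a one-hot encoding over the tags of Table 3. The sketch below assumes that some external dependency parser has already produced the tag linking A and B; the function name `dependency_one_hot` and the choice of one-hot encoding are illustrative assumptions.

```python
# Tags taken from Table 3; the encoding itself is an assumption for illustration.
DEPENDENCY_TAGS = ["SBV", "VOB", "ATT", "ADV"]

def dependency_one_hot(tag):
    """Encode the dependency tag between A and B as fixed-position 0/1 values."""
    return [int(tag == known) for known in DEPENDENCY_TAGS]

vob_vector = dependency_one_hot("VOB")
unknown_vector = dependency_one_hot("COO")  # a tag outside Table 3 maps to all zeros
```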
In the embodiment of the present invention, for the clinical case structuring task, medical data relationships are mined from long text, providing a method that combines effectiveness and generality. If the classification model is a binary classification model, the basic idea is to abstract the medical data relationship into a binary classification problem.
The following description takes a binary classification model as an example. As shown in Fig. 3, the medical data relationship mining method provided by the embodiment of the present invention may include the following steps.
In step S310, the target relationship is determined according to the target medical task.
In the embodiment of the present invention, the target relationship here is known: for a specific medical task, the task itself can be decomposed to obtain the target relationship.
In step S320, first training medical data and second training medical data having the target relationship are obtained from a training corpus.
In step S330, the first training medical data and the second training medical data in the training corpus are labeled.
For example, the sentence "Father has diabetes, and mother is in good health" can be labeled as follows:
"father" "diabetes" 1
"mother" "diabetes" 0
In step S340, features of the first training medical data and the second training medical data are extracted to obtain a feature vector of the first training medical data and the second training medical data.
In the embodiment of the present invention, feature extraction can be performed according to the feature set listed in Table 2 above: if a condition is satisfied, the value at the corresponding position is 1; otherwise the corresponding position is set to 0. For the A/B own features, for example, if A is a diagnosis, the first position of the feature vector is 1, and if A is not a diagnosis, the first position is 0; if A is an anatomical site, the second position is 1, and if A is not an anatomical site, the second position is 0; and so on. The feature values of all dimensions are flattened, with each specific feature value placed at a fixed position in the vector, to form the feature vector.
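The fixed-position encoding described above can be sketched as follows. The feature names in `FEATURE_ORDER` and the helper `to_vector` are hypothetical; only the idea of mapping each satisfied condition to a 1 at a fixed index comes from the text.

```python
# Hypothetical feature names; the order fixes each feature's position in the vector.
FEATURE_ORDER = [
    "a_is_diagnosis",
    "a_is_anatomical_site",
    "a_is_symptom",
    "a_is_negation_word",
]

def to_vector(flags, order=FEATURE_ORDER):
    """Place each feature at its fixed position: 1 if the condition holds, else 0."""
    return [1 if flags.get(name, False) else 0 for name in order]

diagnosis_vector = to_vector({"a_is_diagnosis": True})
site_vector = to_vector({"a_is_anatomical_site": True, "a_is_symptom": False})
```

Because positions are fixed by `FEATURE_ORDER`, vectors built at training time and at prediction time stay aligned.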
It should be noted that the value at each position of the feature vector can be set according to actual needs and is not limited to the "1" and "0" above.
In step S350, a binary classification model is trained using the feature vectors of the first training medical data and the second training medical data.
In step S360, first medical data and second medical data are obtained from a target text.
In step S370, feature extraction is performed on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data.
In step S380, the feature vector of the first medical data and the second medical data is input to the trained binary classification model to determine whether the target relationship between the first medical data and the second medical data holds.
The following further description again takes a binary classification model as an example. As shown in Fig. 4, the medical data relationship mining method provided by the embodiment of the present invention may include the following steps.
In step S410, the medical data relationship classification system is abstracted.
In the embodiment of the present invention, the classification system is defined in advance, and the target category is determined according to the specific medical task. The training corpus can subsequently be labeled according to this target.
In step S420, the feature set of the classification model is designed.
In the embodiment of the present invention, the feature set may include at least one of the own features of the medical data, surrounding text features, syntactic dependency features, sentence morphological features, and the like.
In step S430, the binary classification model is trained.
In the embodiment of the present invention, a training corpus for the target relationship is labeled based on the medical data relationship types defined in step S410 above; feature extraction is then performed on the training corpus using the features defined in step S420, giving a vectorized representation of the long text; finally, the classification model is trained on the vectorized training corpus.
In the embodiment of the present invention, any one of a decision tree model, a naive Bayes model, a support vector machine, deep learning, and the like may be used.
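As an illustration of training on 0/1 feature vectors, the sketch below implements a very small Bernoulli naive Bayes classifier in plain Python, naive Bayes being one of the model families named above. The class name, the two toy features (`same_clause`, `negation_between`), and the training rows are invented for the example and do not come from the patent.

```python
import math

class BernoulliNaiveBayes:
    """Tiny Bernoulli naive Bayes over 0/1 feature vectors (illustrative only)."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        n_features = len(X[0])
        self.log_prior = {}
        self.prob = {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            self.log_prior[c] = math.log(len(rows) / len(X))
            # Laplace smoothing keeps each probability strictly between 0 and 1.
            self.prob[c] = [
                (sum(r[j] for r in rows) + 1) / (len(rows) + 2)
                for j in range(n_features)
            ]
        return self

    def predict(self, x):
        def score(c):
            s = self.log_prior[c]
            for xj, p in zip(x, self.prob[c]):
                s += math.log(p if xj else 1 - p)
            return s
        return max(self.classes, key=score)

# Toy feature vectors: [same_clause, negation_between]; label 1 = relationship holds.
X = [[1, 0], [1, 0], [0, 0], [0, 1], [1, 1], [0, 0]]
y = [1, 1, 0, 0, 0, 0]
model = BernoulliNaiveBayes().fit(X, y)
holds = model.predict([1, 0])
does_not_hold = model.predict([0, 1])
```

The same `fit`/`predict` shape would apply to the other model families; only the learner behind it changes.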
In step S440, the medical data relationship is classified.
In the embodiment of the present invention, for new clinical data, i.e. the target text, feature extraction can be performed using the features defined in step S420 to form a vectorized representation, which is input to the classification model trained in step S430 above; binary classification then determines whether the target relationship holds.
In other embodiments, the problem itself can also be abstracted as multi-class classification, in which the classification model directly outputs the specific relationship between two given medical data items.
The following description takes a multi-class classification model as an example. As shown in Fig. 5, the medical data relationship mining method provided by the embodiment of the present invention may include the following steps.
In step S510, first training medical data and second training medical data are obtained from a training corpus.
In step S520, the first training medical data and the second training medical data in the training corpus are labeled, where the labeled content includes the target relationship between the first training medical data and the second training medical data.
In the embodiment of the present invention, because a multi-class approach is used, the feature vector of the long case text is input directly and the multi-class classification model can directly output the target relationship between A and B. The labeling of the training corpus therefore needs to change: multi-class labeling must mark the specific relationship type, rather than whether one particular relationship type holds.
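The difference between the two labeling schemes can be made concrete with a small sketch. The relationship label names (`"relative"`, `"negation"`, `"none"`) and the helper `to_binary` are hypothetical; the example only shows that a multi-class annotation carries the relationship type itself, from which a binary yes/no annotation for any one target relationship can be derived.

```python
# Multi-class annotation: each (A, B) pair carries a specific relationship type.
multi_class_examples = [
    ("father", "diabetes", "relative"),
    ("no", "fever", "negation"),
    ("mother", "diabetes", "none"),
]

def to_binary(examples, target_relation):
    """Derive binary labels: 1 if the pair has the target relationship, else 0."""
    return [(a, b, 1 if rel == target_relation else 0) for a, b, rel in examples]

relative_labels = to_binary(multi_class_examples, "relative")
```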
In step S530, features of the first training medical data and the second training medical data are extracted to obtain a feature vector of the first training medical data and the second training medical data.
In step S540, a multi-class classification model is trained using the feature vectors of the first training medical data and the second training medical data.
In step S550, first medical data and second medical data are obtained from a target text.
In step S560, feature extraction is performed on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data.
In step S570, the feature vector of the first medical data and the second medical data is input to the trained multi-class classification model, which outputs the target relationship between the first medical data and the second medical data.
With the medical data relationship mining method provided by the embodiment of the present invention, on the one hand, by designing a general medical data relationship system, the trained model is reusable, which improves the efficiency of structuring new clinical data and thus both the effectiveness and the efficiency of medical data relationship mining; data value is accumulated, and as the labeled data increases, relationship recognition becomes better and better and historical data can be accumulated. On the other hand, the type abstraction in the embodiment of the present invention is general and the model effect is scalable; it is not necessary to train one model per relationship, which greatly reduces the labeling work. Therefore, the coverage problem and rule conflict problem of traditional rule-based methods on clinical case data can be solved, as can the low structuring accuracy of techniques based on syntactic dependency analysis.
It should be noted that although the steps of the method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that particular order, or that all of the steps shown must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
Fig. 6 shows a block diagram of a medical data relationship mining device 600 according to another exemplary embodiment of the present invention.
As shown in Fig. 6, the medical data relationship mining device 600 includes a medical data acquisition module 610, a feature extraction module 620, and a target relationship determination module 630, where:
The medical data acquisition module 610 may be configured to obtain first medical data and second medical data from a target text.
The feature extraction module 620 may be configured to perform feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data.
The target relationship determination module 630 may be configured to input the feature vector to a trained classification model and determine the target relationship between the first medical data and the second medical data.
In an exemplary embodiment, the target relationship may include any one of: a negation word and medical data relationship, a time and medical data relationship, a numerical value and medical data relationship, an anatomical site and medical data relationship, an action and medical data relationship, a relative and medical data relationship, and the like.
In an exemplary embodiment, the feature extraction module 620 may further include a feature extraction unit, which may be configured to obtain at least one of: the own features of the first medical data, the own features of the second medical data, and the surrounding text features, syntactic dependency features, and sentence morphological features of the first medical data and the second medical data, and the like.
In an exemplary embodiment, the own features of the first medical data include at least one of the following: whether the first medical data is a diagnosis; whether the first medical data is an anatomical site; whether the first medical data is a symptom; whether the first medical data is a lesion word; whether the first medical data is a negation word; whether the first medical data contains a verb; whether the first medical data contains a number; whether the length of the first medical data is greater than a preset number of bytes; whether the first medical data contains a time word.
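A few of these own features can be computed directly from the entity text and its type, as in the sketch below. The function `own_features`, the type names, the UTF-8 byte count, and the 12-byte threshold are assumptions for illustration; the patent does not specify an encoding or a threshold value.

```python
def own_features(text, entity_type, max_bytes=12):
    """Compute a subset of the own features of one medical data item as 0/1 values."""
    return {
        "is_diagnosis": int(entity_type == "diagnosis"),
        "is_anatomical_site": int(entity_type == "anatomical_site"),
        "is_symptom": int(entity_type == "symptom"),
        "contains_number": int(any(ch.isdigit() for ch in text)),
        # Length is measured in UTF-8 bytes here; the encoding is an assumption.
        "longer_than_preset": int(len(text.encode("utf-8")) > max_bytes),
    }

short_entity = own_features("diabetes", "diagnosis")
long_entity = own_features("type 2 diabetes mellitus", "diagnosis")
```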
In an exemplary embodiment, the surrounding text features include at least one of: the text features preceding the first medical data, the text features following the second medical data, and the text features between the first medical data and the second medical data.
In an exemplary embodiment, the text features preceding the first medical data include at least one of the following, each evaluated over the preset number of words preceding the first medical data: whether there is a period; whether there is a comma; whether there is a space or an enumeration comma; whether there is a negation word; whether there is a negation word that acts only backward; whether there is the word "accompanied by" (伴); whether there is the word "occasionally" (偶); whether there is an ellipsis word; whether there is a verb expressing an action; whether there is a diagnosis; whether there is an anatomical site; whether there is a symptom; whether there is a lesion word; whether there is a pattern of consecutive concepts separated by punctuation; whether there is a time; whether there is a number; whether there is a verb.
In an exemplary embodiment, the text features between the first medical data and the second medical data include at least one of the following: the distance between the first medical data and the second medical data; the order of the first medical data and the second medical data; the number of periods between them; the number of commas between them; the number of spaces or enumeration commas between them; whether the word "accompanied by" (伴) occurs between them; whether the word "occasionally" (偶) occurs between them; whether there is a verb expressing an action between them; whether there is a negation word that acts only backward between them; whether there is an ellipsis word between them; whether there is a negation word between them; whether there is a diagnosis between them; whether there is an anatomical site between them; whether there is a symptom between them; whether there is a lesion word between them; whether there is a pattern of consecutive concepts separated by punctuation between them; whether there is a number between them; whether there is a time between them; whether there is a verb between them.
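Several of the between-text features are simple counts over the tokens separating A and B. The sketch below assumes the text has already been tokenized and that the token indices of A and B are known; the function name `between_features` and the English punctuation marks are illustrative assumptions.

```python
def between_features(tokens, a_index, b_index):
    """Distance, order, and punctuation counts for the tokens between A and B."""
    lo, hi = sorted((a_index, b_index))
    between = tokens[lo + 1:hi]
    return {
        "distance": len(between),
        "a_before_b": int(a_index < b_index),
        "num_periods": between.count("."),
        "num_commas": between.count(","),
    }

tokens = ["father", "has", "diabetes", ",", "mother", "is", "healthy", "."]
father_diabetes = between_features(tokens, a_index=0, b_index=2)
mother_diabetes = between_features(tokens, a_index=4, b_index=2)
```

In the second call A follows B, so the order feature is 0 and the comma separating the two clauses is counted.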
In an exemplary embodiment, the sentence morphological features include at least one of the following: whether the first medical data and the second medical data are in the same paragraph; whether they are in the same sentence; whether they are in the same clause; whether they are in the same paragraph with no other medical data of the same type as the first medical data or the second medical data between them; whether they are in the same sentence with no other medical data of the same type as the first medical data or the second medical data between them; whether they are in the same clause with no other medical data of the same type as the first medical data or the second medical data between them.
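The same-sentence and same-clause checks can be approximated by testing whether a delimiter token lies between A and B. This is a simplified sketch under that assumption: the helper names and the delimiter sets are invented for the example, and the paragraph-level and "no entity of the same type in between" variants are not covered.

```python
SENTENCE_DELIMITERS = {".", "!", "?"}
CLAUSE_DELIMITERS = SENTENCE_DELIMITERS | {",", ";"}

def same_unit(tokens, a_index, b_index, delimiters):
    """Return 1 if no delimiter token lies strictly between A and B, else 0."""
    lo, hi = sorted((a_index, b_index))
    return int(not any(t in delimiters for t in tokens[lo + 1:hi]))

def sentence_form_features(tokens, a_index, b_index):
    return {
        "same_sentence": same_unit(tokens, a_index, b_index, SENTENCE_DELIMITERS),
        "same_clause": same_unit(tokens, a_index, b_index, CLAUSE_DELIMITERS),
    }

tokens = ["father", "has", "diabetes", ",", "mother", "is", "healthy", "."]
within_clause = sentence_form_features(tokens, 0, 2)
across_clauses = sentence_form_features(tokens, 4, 2)
```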
Since each functional module of the medical data relationship mining device 600 of the example embodiment of the present invention corresponds to a step of the example embodiment of the medical data relationship mining method described above, details are not repeated here.
It should be noted that although several modules or units of the medical data relationship mining device are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the modules or units described above may be embodied in a single module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by multiple modules or units.
Fig. 7 shows a schematic diagram of an exemplary system architecture 100 to which the medical data relationship mining method or the medical data relationship mining device of an embodiment of the present invention can be applied.
As shown in Fig. 7, the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 is the medium that provides communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 7 are merely illustrative. There may be any number of terminal devices, networks, and servers according to implementation needs. For example, the server 105 may be a server cluster composed of multiple servers.
A user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 in order to receive or send messages and the like. The terminal devices 101, 102, 103 may be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, portable computers, and desktop computers.
The server 105 may be a server that provides various services. For example, a user sends a request to the server 105 using the terminal device 103 (or the terminal device 101 or 102). Based on the relevant information carried in the request, the server 105 can retrieve matching search results from a database and feed the search results back to the terminal device 103, so that the user can view the content displayed on the terminal device 103.
Fig. 8 shows a schematic structural diagram of an electronic device suitable for implementing the embodiments of the present invention.
It should be noted that the electronic device 200 shown in Fig. 8 is only an example and should not impose any limitation on the function and scope of use of the embodiments of the present invention.
As shown in Fig. 8, the electronic device 200 includes a central processing unit (CPU) 201, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage section 208 into a random access memory (RAM) 203. The RAM 203 also stores various programs and data needed for system operation. The CPU 201, the ROM 202, and the RAM 203 are connected to one another through a bus 204. An input/output (I/O) interface 205 is also connected to the bus 204.
The following components are connected to the I/O interface 205: an input section 206 including a keyboard, a mouse, and the like; an output section 207 including a cathode ray tube (CRT) or liquid crystal display (LCD) and the like, and a speaker and the like; a storage section 208 including a hard disk and the like; and a communication section 209 including a network interface card such as a LAN card or a modem. The communication section 209 performs communication processing via a network such as the Internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 210 as needed, so that a computer program read from it can be installed into the storage section 208 as needed.
In particular, according to an embodiment of the present invention, the processes described with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present invention includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowcharts. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 209, and/or installed from the removable medium 211. When the computer program is executed by the central processing unit (CPU) 201, the various functions defined in the method and/or device of the present application are performed.
It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program, which can be used by or in connection with an instruction execution system, apparatus, or device. Also in the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, which can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible implementations of methods, devices, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
In another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the method described in the embodiments above. For example, the electronic device can implement each step shown in Fig. 1.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described herein may be implemented by software, or by software combined with necessary hardware. Therefore, the technical solution according to the embodiments of the present invention may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes instructions that cause a computing device (which may be a personal computer, a server, a touch terminal, a network device, etc.) to perform the method according to the embodiments of the present invention.
Other embodiments of the present invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present invention that follow the general principles of the present invention and include common knowledge or customary technical means in the art not disclosed by the present invention. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the present invention being indicated by the claims.
It should be understood that the present invention is not limited to the precise structure already described above and shown in the accompanying drawings, and
And various modifications and changes may be made without departing from the scope thereof.The scope of the present invention is limited only by the attached claims.
Claims (11)
1. A medical data relationship mining method, characterized by comprising:
obtaining first medical data and second medical data from a target text;
performing feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data;
inputting the feature vector to a trained classification model, and determining a target relationship between the first medical data and the second medical data.
2. The medical data relationship mining method according to claim 1, characterized in that the target relationship includes any one of: a negation word and medical data relationship, a time and medical data relationship, a numerical value and medical data relationship, an anatomical site and medical data relationship, an action and medical data relationship, and a relative and medical data relationship.
3. The medical data relationship mining method according to claim 1, characterized in that performing feature extraction on the first medical data and the second medical data comprises:
obtaining at least one of: the own features of the first medical data, the own features of the second medical data, and the surrounding text features, syntactic dependency features, and sentence morphological features of the first medical data and the second medical data.
4. The medical data relationship mining method according to claim 3, characterized in that the own features of the first medical data include at least one of the following:
whether the first medical data is a diagnosis;
whether the first medical data is an anatomical site;
whether the first medical data is a symptom;
whether the first medical data is a lesion word;
whether the first medical data is a negation word;
whether the first medical data contains a verb;
whether the first medical data contains a number;
whether the length of the first medical data is greater than a preset number of bytes;
whether the first medical data contains a time word.
5. The medical data relationship mining method according to claim 3, characterized in that the surrounding text features include at least one of: the text features preceding the first medical data, the text features following the second medical data, and the text features between the first medical data and the second medical data.
6. The medical data relation mining method according to claim 5, wherein the text features of the information preceding the first medical data include at least one of the following features:
Whether there is a full stop within a preset number of words preceding the first medical data;
Whether there is a comma within the preset number of words preceding the first medical data;
Whether there is a space or an enumeration comma (、) within the preset number of words preceding the first medical data;
Whether there is a negation word within the preset number of words preceding the first medical data;
Whether there is a negation word that acts only backward within the preset number of words preceding the first medical data;
Whether "伴" ("accompanied by") appears within the preset number of words preceding the first medical data;
Whether "偶" ("occasionally") appears within the preset number of words preceding the first medical data;
Whether there is an omission word within the preset number of words preceding the first medical data;
Whether there is a verb expressing an action within the preset number of words preceding the first medical data;
Whether there is a diagnosis within the preset number of words preceding the first medical data;
Whether there is an anatomical site within the preset number of words preceding the first medical data;
Whether there is a symptom within the preset number of words preceding the first medical data;
Whether there is a lesion word within the preset number of words preceding the first medical data;
Whether there is a pattern of consecutive concepts separated by punctuation within the preset number of words preceding the first medical data;
Whether there is a time expression within the preset number of words preceding the first medical data;
Whether there is a number within the preset number of words preceding the first medical data;
Whether there is a verb within the preset number of words preceding the first medical data.
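The window-based checks of claim 6 can be illustrated with a short sketch. It covers only four of the listed features and assumes a pre-tokenized English sentence; the function name, the `window` parameter (standing in for the "preset number of words") and the negation list are hypothetical.

```python
import re

# Hypothetical negation lexicon, used only for illustration.
NEGATION_WORDS = {"no", "not", "without", "denies"}


def preceding_features(tokens, entity_index, window=3):
    """Binary features over the `window` tokens before tokens[entity_index]."""
    start = max(0, entity_index - window)
    context = tokens[start:entity_index]
    return {
        "has_full_stop": int("." in context),
        "has_comma": int("," in context),
        "has_negation": int(any(t.lower() in NEGATION_WORDS for t in context)),
        "has_number": int(any(re.search(r"\d", t) for t in context)),
    }


tokens = ["Patient", "denies", "chest", "pain", ",", "no", "fever", "."]
features = preceding_features(tokens, entity_index=6)  # entity is "fever"
```

Here the window before "fever" is `["pain", ",", "no"]`, so the comma and negation flags are set while the full-stop and number flags are not.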
7. The medical data relation mining method according to claim 5, wherein the text features between the first medical data and the second medical data include at least one of the following features:
The distance between the first medical data and the second medical data;
The order of the first medical data and the second medical data;
The number of full stops between the first medical data and the second medical data;
The number of commas between the first medical data and the second medical data;
The number of spaces or enumeration commas (、) between the first medical data and the second medical data;
Whether "伴" ("accompanied by") appears between the first medical data and the second medical data;
Whether "偶" ("occasionally") appears between the first medical data and the second medical data;
Whether there is a verb expressing an action between the first medical data and the second medical data;
Whether there is a negation word that acts only backward between the first medical data and the second medical data;
Whether there is an omission word between the first medical data and the second medical data;
Whether there is a negation word between the first medical data and the second medical data;
Whether there is a diagnosis between the first medical data and the second medical data;
Whether there is an anatomical site between the first medical data and the second medical data;
Whether there is a symptom between the first medical data and the second medical data;
Whether there is a lesion word between the first medical data and the second medical data;
Whether there is a pattern of consecutive concepts separated by punctuation between the first medical data and the second medical data;
Whether there is a number between the first medical data and the second medical data;
Whether there is a time expression between the first medical data and the second medical data;
Whether there is a verb between the first medical data and the second medical data.
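A few of the between-entity features of claim 7 can be sketched as follows. This is an illustrative reading only: distance is measured in tokens, the order is encoded as a single flag, and the function name and negation list are hypothetical.

```python
# Hypothetical negation lexicon, used only for illustration.
NEGATION_WORDS = {"no", "not", "without", "denies"}


def between_features(tokens, first_index, second_index):
    """Features of the tokens lying strictly between the two entities."""
    lo, hi = sorted((first_index, second_index))
    between = tokens[lo + 1:hi]
    return {
        "distance": hi - lo,  # token distance (an assumed unit)
        "first_before_second": int(first_index < second_index),
        "full_stop_count": between.count("."),
        "comma_count": between.count(","),
        "has_negation": int(any(t.lower() in NEGATION_WORDS for t in between)),
    }


tokens = ["pain", ",", "no", "fever", ".", "Cough", "persists"]
features = between_features(tokens, first_index=0, second_index=5)
```

In this example the tokens between "pain" and "Cough" are `[",", "no", "fever", "."]`, giving a distance of 5, one comma, one full stop and a set negation flag.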
8. The medical data relation mining method according to claim 3, wherein the sentence morphological features include at least one of the following features:
Whether the first medical data and the second medical data are in the same paragraph;
Whether the first medical data and the second medical data are in the same sentence;
Whether the first medical data and the second medical data are in the same clause;
Whether the first medical data and the second medical data are in the same paragraph, with no other medical data of the same type as the first medical data or the second medical data between them;
Whether the first medical data and the second medical data are in the same sentence, with no other medical data of the same type as the first medical data or the second medical data between them;
Whether the first medical data and the second medical data are in the same clause, with no other medical data of the same type as the first medical data or the second medical data between them.
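One way to read the six features of claim 8 is as checks on the text lying between the two entities. The sketch below assumes entities are `(start, end, label)` character spans, that a newline ends a paragraph, that `.`, `!` or `?` ends a sentence, and that `,` or `;` ends a clause. These delimiters, the `span` helper and the label names are assumptions made for illustration, and the sentence check is deliberately naive (a decimal point would be counted as a sentence end).

```python
def span(text, phrase, label):
    """Hypothetical helper: locate the first occurrence of phrase as a span."""
    start = text.index(phrase)
    return (start, start + len(phrase), label)


def sentence_features(text, first, second, others=()):
    """Six binary flags, in the order listed in claim 8."""
    a, b = sorted((first, second))  # order the two spans by start offset
    gap = text[a[1]:b[0]]           # text strictly between the two spans
    same_paragraph = "\n" not in gap
    same_sentence = same_paragraph and not any(c in gap for c in ".!?")
    same_clause = same_sentence and not any(c in gap for c in ",;")
    labels = {first[2], second[2]}
    # True when no other entity of either type lies between the two spans.
    no_rival = not any(
        a[1] <= o[0] and o[1] <= b[0] and o[2] in labels for o in others
    )
    return [
        int(same_paragraph),
        int(same_sentence),
        int(same_clause),
        int(same_paragraph and no_rival),
        int(same_sentence and no_rival),
        int(same_clause and no_rival),
    ]


text = "Pain in left knee, swelling in right knee. No fever."
pain = span(text, "Pain", "symptom")
left_knee = span(text, "left knee", "anatomy")
swelling = span(text, "swelling", "symptom")
right_knee = span(text, "right knee", "anatomy")
```

With these spans, "swelling" and "right knee" share a clause with nothing of the same type between them, whereas "Pain" and "right knee" share only a sentence and are separated by both "left knee" and "swelling".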
9. A medical data relation mining device, comprising:
A medical data acquisition module, configured to obtain first medical data and second medical data in a target text;
A feature extraction module, configured to perform feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data;
A target relationship determination module, configured to input the feature vector into a trained classification model to determine the target relationship between the first medical data and the second medical data.
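The three modules of claim 9 can be pictured as a small pipeline. The sketch below only shows how such modules might be wired together. Every name in it is hypothetical, the term list is a toy lexicon, and `stub_model` is a hand-written stand-in for the trained classification model, which the claim does not specify further.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Entity = Tuple[str, int]  # (token text, token index); assumed representation

MEDICAL_TERMS = {"pain", "knee", "fever"}  # toy lexicon for illustration


def acquire(tokens: List[str]) -> Tuple[Entity, Entity]:
    """Acquisition module: pick the first two tokens found in the lexicon."""
    hits = [(t, i) for i, t in enumerate(tokens) if t.lower() in MEDICAL_TERMS]
    if len(hits) < 2:
        raise ValueError("need at least two medical terms")
    return hits[0], hits[1]


def extract(tokens: List[str], first: Entity, second: Entity) -> List[float]:
    """Feature extraction module: a three-element feature vector."""
    between = tokens[first[1] + 1:second[1]]
    return [
        float(second[1] - first[1]),  # token distance
        float("," in between),        # comma between the entities
        float("." in between),        # full stop between the entities
    ]


def stub_model(vector: Sequence[float]) -> str:
    """Stand-in for a trained classifier; labels are hypothetical."""
    distance, _has_comma, has_full_stop = vector
    return "related" if distance <= 4 and not has_full_stop else "unrelated"


@dataclass
class RelationMiningDevice:
    acquire: Callable[[List[str]], Tuple[Entity, Entity]]
    extract: Callable[[List[str], Entity, Entity], List[float]]
    model: Callable[[Sequence[float]], str]

    def run(self, tokens: List[str]) -> str:
        first, second = self.acquire(tokens)
        vector = self.extract(tokens, first, second)
        return self.model(vector)


device = RelationMiningDevice(acquire, extract, stub_model)
```

Keeping the model behind a plain callable means any trained classifier that maps a feature vector to a label could replace `stub_model` without changing the other two modules.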
10. A computer-readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the medical data relation mining method according to any one of claims 1 to 8.
11. An electronic device, comprising:
One or more processors;
A storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the medical data relation mining method according to any one of claims 1 to 8.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111306561.0A CN113963804A (en) | 2018-11-09 | 2018-11-09 | Medical data relation mining method and device |
CN201811330207.XA CN109300550B (en) | 2018-11-09 | 2018-11-09 | Medical data relation mining method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811330207.XA CN109300550B (en) | 2018-11-09 | 2018-11-09 | Medical data relation mining method and device |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111306561.0A Division CN113963804A (en) | 2018-11-09 | 2018-11-09 | Medical data relation mining method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109300550A (en) | 2019-02-01 |
CN109300550B (en) | 2021-11-26 |
Family
ID=65145583
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811330207.XA Active CN109300550B (en) | 2018-11-09 | 2018-11-09 | Medical data relation mining method and device |
CN202111306561.0A Pending CN113963804A (en) | 2018-11-09 | 2018-11-09 | Medical data relation mining method and device |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111306561.0A Pending CN113963804A (en) | 2018-11-09 | 2018-11-09 | Medical data relation mining method and device |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN109300550B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111488400A (en) * | 2019-04-28 | 2020-08-04 | 北京京东尚科信息技术有限公司 | Data classification method, device and computer readable storage medium |
CN114334167A (en) * | 2021-12-31 | 2022-04-12 | 医渡云(北京)技术有限公司 | Medical data mining method and device, storage medium and electronic equipment |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140350965A1 (en) * | 2013-05-23 | 2014-11-27 | Stéphane Michael Meystre | Systems and methods for extracting specified data from narrative text |
CN106446526A (en) * | 2016-08-31 | 2017-02-22 | 北京千安哲信息技术有限公司 | Electronic medical record entity relation extraction method and apparatus |
CN106708959A (en) * | 2016-11-30 | 2017-05-24 | 重庆大学 | Combination drug recognition and ranking method based on medical literature database |
CN106897568A (en) * | 2017-02-28 | 2017-06-27 | 北京大数医达科技有限公司 | Method and apparatus for structuring medical records |
US20180025121A1 (en) * | 2016-07-20 | 2018-01-25 | Baidu Usa Llc | Systems and methods for finer-grained medical entity extraction |
CN107657063A (en) * | 2017-10-30 | 2018-02-02 | 合肥工业大学 | Method and device for constructing a medical knowledge graph |
CN108447534A (en) * | 2018-05-18 | 2018-08-24 | 灵玖中科软件(北京)有限公司 | Electronic health record data quality management method based on NLP |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080228769A1 (en) * | 2007-03-15 | 2008-09-18 | Siemens Medical Solutions Usa, Inc. | Medical Entity Extraction From Patient Data |
CN105389470A (en) * | 2015-11-18 | 2016-03-09 | 福建工程学院 | Method for automatically extracting Traditional Chinese Medicine acupuncture entity relationship |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140350965A1 (en) * | 2013-05-23 | 2014-11-27 | Stéphane Michael Meystre | Systems and methods for extracting specified data from narrative text |
US20180025121A1 (en) * | 2016-07-20 | 2018-01-25 | Baidu Usa Llc | Systems and methods for finer-grained medical entity extraction |
CN106446526A (en) * | 2016-08-31 | 2017-02-22 | 北京千安哲信息技术有限公司 | Electronic medical record entity relation extraction method and apparatus |
CN106708959A (en) * | 2016-11-30 | 2017-05-24 | 重庆大学 | Combination drug recognition and ranking method based on medical literature database |
CN106897568A (en) * | 2017-02-28 | 2017-06-27 | 北京大数医达科技有限公司 | Method and apparatus for structuring medical records |
CN107657063A (en) * | 2017-10-30 | 2018-02-02 | 合肥工业大学 | Method and device for constructing a medical knowledge graph |
CN108447534A (en) * | 2018-05-18 | 2018-08-24 | 灵玖中科软件(北京)有限公司 | Electronic health record data quality management method based on NLP |
Non-Patent Citations (1)
Title |
---|
韦鹏程 等: "《基于R语言数据挖掘的统计与分析》", 31 December 2017, 子科技大学出版社 * |
Also Published As
Publication number | Publication date |
---|---|
CN109300550B (en) | 2021-11-26 |
CN113963804A (en) | 2022-01-21 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||