CN109300550A - Medical data relation excavation method and device - Google Patents

Medical data relationship mining method and device

Info

Publication number
CN109300550A
Authority
CN
China
Prior art keywords
medical data
medical
data
relationship
feature
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811330207.XA
Other languages
Chinese (zh)
Other versions
CN109300550B (en)
Inventor
焦增涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tianjin Happy Life Technology Co Ltd
Tianjin Xinkai Life Technology Co Ltd
Original Assignee
Tianjin Happy Life Technology Co Ltd
Tianjin Xinkai Life Technology Co Ltd
Application filed by Tianjin Happy Life Technology Co Ltd and Tianjin Xinkai Life Technology Co Ltd
Priority to CN202111306561.0A (CN113963804A)
Priority to CN201811330207.XA (CN109300550B)
Publication of CN109300550A
Application granted
Publication of CN109300550B
Legal status: Active

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Epidemiology (AREA)
  • Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The present invention relates to a medical data relationship mining method and device, an electronic device, and a computer-readable medium. The method comprises: obtaining first medical data and second medical data in a target text; performing feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data; and inputting the feature vector into a trained classification model to determine the target relationship between the first medical data and the second medical data. The invention can efficiently identify relationships between medical data in clinical case text and improves the efficiency of medical data relationship mining, which facilitates further statistical analysis of the data.

Description

Medical data relationship mining method and device
Technical field
The present invention relates to the field of medical information extraction, and in particular to a medical data relationship mining method, a medical data relationship mining device, an electronic device, and a computer-readable medium.
Background art
In clinical case text, much information is recorded as long free text, which is inconvenient for subsequent statistical analysis. Structuring clinical cases can solve this technical problem, and mining the relationships between medical terms in long text is a very important step in structuring clinical data.
In the prior art, medical data relationship mining is carried out with manually abstracted rules and with methods based on syntactic dependency analysis of text in natural language processing.
However, in the manual-rule approach the rules are one-size-fits-all, and their effectiveness depends on how fine-grained they are. For the approach based on syntactic dependency analysis, training a model for a specific domain carries a very high annotation cost, so it is rarely applied directly to clinical cases.
It should be noted that the information disclosed in the Background section above is only intended to enhance understanding of the background of the invention, and may therefore include information that does not constitute prior art known to a person of ordinary skill in the art.
Summary of the invention
The purpose of the present invention is to provide a medical data relationship mining method and a medical data relationship mining device that can efficiently identify relationships between medical data in clinical case text and improve the efficiency of medical data relationship mining.
According to one aspect of the present invention, a medical data relationship mining method is provided, comprising: obtaining first medical data and second medical data in a target text; performing feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data; and inputting the feature vector into a trained classification model to determine the target relationship between the first medical data and the second medical data.
In an exemplary embodiment of the invention, the target relationship includes any one of: a relationship between a negation word and medical data, a relationship between a time and medical data, a relationship between a numerical value and medical data, a relationship between an anatomical site and medical data, a relationship between an action and medical data, and a relationship between a relative and medical data.
In an exemplary embodiment of the invention, performing feature extraction on the first medical data and the second medical data comprises obtaining at least one of: features of the first medical data itself, features of the second medical data itself, surrounding-text features of the first medical data and the second medical data, syntactic dependency analysis features, and sentence form features.
In an exemplary embodiment of the invention, the features of the first medical data itself include at least one of the following: whether the first medical data is a diagnosis; whether it is an anatomical site; whether it is a symptom; whether it is a lesion word; whether it is a negation word; whether it contains a verb; whether it contains a number; whether its length exceeds a preset number of bytes; and whether it contains a time word.
In an exemplary embodiment of the invention, the surrounding-text features include at least one of: text features of the information preceding the first medical data, text features of the information following the second medical data, and text features between the first medical data and the second medical data.
In an exemplary embodiment of the invention, the text features of the information preceding the first medical data include at least one of the following, each evaluated within a preset number of characters before the first medical data: whether there is a period; whether there is a comma; whether there is a space or an enumeration comma; whether there is a negation word; whether there is a negation word that acts only on the text that follows it; whether the word "accompanied by" appears; whether the word "occasional" appears; whether there is an omission word; whether there is a verb expressing an action; whether there is a diagnosis; whether there is an anatomical site; whether there is a symptom; whether there is a lesion word; whether there is a pattern of consecutive concepts separated by punctuation; whether there is a time expression; whether there is a number; and whether there is a verb.
In an exemplary embodiment of the invention, the text features between the first medical data and the second medical data include at least one of the following: the distance between the first medical data and the second medical data; the order of the first medical data and the second medical data; the number of periods between them; the number of commas between them; the number of spaces or enumeration commas between them; and whether, between them, there is the word "accompanied by", the word "occasional", a verb expressing an action, a negation word that acts only on the text that follows it, an omission word, a negation word, a diagnosis, an anatomical site, a symptom, a lesion word, a pattern of consecutive concepts separated by punctuation, a number, a time expression, or a verb.
In an exemplary embodiment of the invention, the syntactic dependency analysis features include at least one of the following: whether the first medical data and the second medical data are in a parent-child relationship; the length of the path between them on the dependency tree; whether the path between them contains a subject-predicate edge; whether the path contains a verb-object edge; whether the path contains an attributive-head or adverbial-head edge; whether the first edge on the path is a verb-object or subject-predicate relation; whether the first edge on the path is an attributive-head or adverbial-head relation; and whether the last edge on the path is a verb-object or subject-predicate relation.
In an exemplary embodiment of the invention, the sentence form features include at least one of the following: whether the first medical data and the second medical data are in the same paragraph; whether they are in the same sentence; whether they are in the same clause; whether they are in the same paragraph with no other medical data of the same type as the first or the second medical data between them; whether they are in the same sentence with no such medical data between them; and whether they are in the same clause with no such medical data between them.
According to one aspect of the present invention, a medical data relationship mining device is provided, comprising: a medical data obtaining module configured to obtain first medical data and second medical data in a target text; a feature extraction module configured to perform feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data; and a target relationship determination module configured to input the feature vector into a trained classification model and determine the target relationship between the first medical data and the second medical data.
According to one aspect of the present invention, a computer-readable medium is provided on which a computer program is stored; when executed by a processor, the program implements the medical data relationship mining method of any of the above embodiments.
According to one aspect of the present invention, an electronic device is provided, comprising: one or more processors; and a storage device configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the medical data relationship mining method of any of the above embodiments.
The medical data relationship mining method and device in an exemplary embodiment of the invention obtain first medical data and second medical data in a target text, perform feature extraction on them to obtain a feature vector, and input the feature vector into a trained classification model to determine the target relationship between the first medical data and the second medical data. This makes it possible to efficiently identify relationships between medical data in clinical case text and improves the efficiency of medical data relationship mining, which facilitates further statistical analysis of the data.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory and do not limit the present invention.
Brief description of the drawings
The above and other features and advantages of the invention will become more apparent from the detailed description of its example embodiments with reference to the accompanying drawings.
Fig. 1 shows a flowchart of a medical data relationship mining method according to an exemplary embodiment of the invention;
Fig. 2 shows a schematic diagram of the feature set of the classification model according to an exemplary embodiment of the invention;
Fig. 3 shows a flowchart of a medical data relationship mining method according to another exemplary embodiment of the invention;
Fig. 4 shows a flowchart of a medical data relationship mining method according to another exemplary embodiment of the invention;
Fig. 5 shows a flowchart of a medical data relationship mining method according to another exemplary embodiment of the invention;
Fig. 6 shows a block diagram of a medical data relationship mining device according to an exemplary embodiment of the invention;
Fig. 7 shows a schematic diagram of an exemplary system architecture to which the medical data relationship mining method or device of the embodiments of the invention can be applied;
Fig. 8 shows a structural schematic diagram of an electronic device suitable for implementing the embodiments of the invention.
Detailed description of embodiments
Example embodiments will now be described more fully with reference to the accompanying drawings. However, example embodiments can be implemented in a variety of forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that the present invention will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. Identical reference numerals in the figures denote identical or similar parts, and repeated description of them is omitted.
In addition, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, many specific details are provided to give a full understanding of the embodiments of the invention. However, those skilled in the art will appreciate that the technical solution of the invention may be practiced without one or more of the specific details, or with other methods, components, materials, devices, steps, and so on. In other cases, well-known structures, methods, devices, implementations, materials, or operations are not shown or described in detail, to avoid obscuring aspects of the invention.
The block diagrams shown in the drawings are merely functional entities and do not necessarily correspond to physically separate entities. That is, these functional entities may be implemented in software, or these functional entities or parts of them may be implemented in one or more software-hardened modules, or they may be implemented in different networks and/or processor devices and/or microcontroller devices.
In the prior art, the following three classes of methods are used for medical data relationship mining:
First class of methods: manually abstracted rules. Whether a given relationship holds between medical data is judged from the form of the text. For example, a rule may check whether two medical data items are in the same comma-delimited clause.
The first class of methods has at least the following disadvantages: manual rules are a one-size-fits-all approach whose effectiveness depends on how fine-grained the rules are; labor costs are high; for new data there is a risk that the rules do not cover it; and rules may conflict with or exclude one another.
Second class of methods: methods based on syntactic dependency analysis of text in natural language processing. Syntactic dependency analysis is a classic natural language processing task that determines whether the words in a sentence stand in grammatical relations such as subject-predicate, verb-object, and modifier relations. Based on the structure produced by the dependency analysis, it is judged whether the medical data satisfy the target relationship.
The second class of methods has at least the following disadvantages: although it is a fairly ideal approach, current Chinese syntactic dependency analysis models in industry perform only moderately, and training one for a specific domain carries a very high annotation cost, so it is rarely applied directly to clinical cases.
Third class of methods: training a classification model for a specific medical data relationship. According to the task objective, samples of medical data relationships in clinical cases are annotated and classified with a general-purpose machine learning classification model to judge whether the target relationship holds.
The third class of methods is relatively feasible: for a particular relationship type and a specific application domain, a training corpus is annotated and a classifier judges whether the corresponding relationship holds. However, such methods require targeted annotation and training for every medical term relationship and every application scenario, so the results are not scalable.
In the embodiments of the present invention, medical data may also be referred to as clinical data and may be a medical term or a traditional Chinese medicine term, that is, a word that characterizes a clear medical concept in medical research or medical events. The definition of clinical data needs to be made in combination with the specific clinical task. For example, in a particular task, words such as "father" and "mother" may also be objects of interest for that medical task, and so may also be medical terms.
In the embodiments of the present invention, medical data relationship mining concerns the medical information expressed in the long text of a clinical case, which is usually expressed by a combination of multiple medical terms, or of medical terms and other words.
For example, in the family history "Father is in good health; mother is deceased, died of lung cancer", the key medical information is:
Relative: mother; family disease: lung cancer
Mining the medical data relationship here means extracting from the text that the lung cancer was the mother's disease, not the father's.
The invention proposes a medical data relationship mining method that abstracts the relationship types between medical data and identifies them with machine learning.
In this example embodiment, a medical data relationship mining method is first provided. Referring to Fig. 1, the medical data relationship mining method comprises the following steps:
In step S110, first medical data A and second medical data B in the target text are obtained.
In the embodiments of the present invention, the target text may be a clinical case to be mined. A and B may be extracted from the long text of the clinical case by an entity recognition algorithm; for specific entity recognition algorithms, reference may be made to the prior art, which is not detailed here.
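Step S110 can be pictured with a minimal sketch. The text leaves the entity recognition algorithm to the prior art, so the dictionary lookup below is only a stand-in, and the lexicon, the function names `find_entities` and `candidate_pairs`, and the sample sentence are all assumptions introduced for illustration.

```python
from itertools import combinations

# Hypothetical lexicon mapping surface forms to entity types (an assumption;
# a real system would use a trained entity recognizer instead).
LEXICON = {"父亲": "relative", "母亲": "relative", "肺癌": "diagnosis"}


def find_entities(text, lexicon=LEXICON):
    """Return sorted (start, end, term, type) tuples for every lexicon hit."""
    found = []
    for term, kind in lexicon.items():
        start = text.find(term)
        while start != -1:
            found.append((start, start + len(term), term, kind))
            start = text.find(term, start + 1)
    return sorted(found)


def candidate_pairs(entities):
    """Enumerate every unordered pair (A, B) of recognized entities."""
    return list(combinations(entities, 2))


# Illustrative family-history sentence (father healthy; mother died of lung cancer).
text = "父亲体健,母亲已故,死于肺癌。"
entities = find_entities(text)
pairs = candidate_pairs(entities)
```

Each pair produced this way is a candidate (A, B) whose relationship the later steps would score.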
In step S120, feature extraction is performed on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data.
In an exemplary embodiment, performing feature extraction on the first medical data and the second medical data may comprise obtaining at least one of: features of the first medical data itself, features of the second medical data itself, surrounding-text features of the first medical data and the second medical data, dependency analysis features, and sentence form features.
In an exemplary embodiment, the features of the first medical data itself may include at least one of the following: whether the first medical data is a diagnosis; whether it is an anatomical site; whether it is a symptom; whether it is a lesion word; whether it is a negation word; whether it contains a verb; whether it contains a number; whether its length exceeds a preset number of bytes; and whether it contains a time word.
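The self features listed above can be sketched as a fixed-order list of binary indicators. This is a minimal illustration, not the patented implementation: the word lists, the time pattern, the 12-byte threshold, and the name `self_features` are all hypothetical.

```python
import re
from typing import List

# Hypothetical lexicons; a real system would use curated medical dictionaries.
DIAGNOSES = {"肺癌", "高血压"}
ANATOMY = {"肺", "肝", "左肺上叶"}
SYMPTOMS = {"咳嗽", "发热"}
LESION_WORDS = {"结节", "肿块"}
NEGATION_WORDS = {"无", "否认", "未见"}
VERBS = {"行", "服", "切除"}
TIME_PATTERN = re.compile(r"\d+(年|月|日|天|周)|昨日|今日")  # assumed time cues


def self_features(term: str, max_bytes: int = 12) -> List[int]:
    """Binary indicators describing a single medical data item."""
    return [
        int(term in DIAGNOSES),                      # is a diagnosis
        int(term in ANATOMY),                        # is an anatomical site
        int(term in SYMPTOMS),                       # is a symptom
        int(term in LESION_WORDS),                   # is a lesion word
        int(term in NEGATION_WORDS),                 # is a negation word
        int(any(v in term for v in VERBS)),          # contains a verb
        int(any(ch.isdigit() for ch in term)),       # contains a number
        int(len(term.encode("utf-8")) > max_bytes),  # longer than preset bytes
        int(bool(TIME_PATTERN.search(term))),        # contains a time word
    ]


features_a = self_features("肺癌")
features_b = self_features("3天")
```

The same function would be applied to B to obtain the features of the second medical data itself.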
In an exemplary embodiment, the surrounding-text features may include at least one of: text features of the information preceding the first medical data, text features of the information following the second medical data, and text features between the first medical data and the second medical data.
In an exemplary embodiment, the text features of the information preceding the first medical data may include at least one of the following, each evaluated within a preset number of characters before the first medical data: whether there is a period; whether there is a comma; whether there is a space or an enumeration comma; whether there is a negation word; whether there is a negation word that acts only on the text that follows it; whether the word "accompanied by" appears; whether the word "occasional" appears; whether there is an omission word; whether there is a verb expressing an action; whether there is a diagnosis; whether there is an anatomical site; whether there is a symptom; whether there is a lesion word; whether there is a pattern of consecutive concepts separated by punctuation; whether there is a time expression; whether there is a number; and whether there is a verb.
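A few of the preceding-text features can be sketched by slicing a fixed window of characters to the left of A. The window size, the negation list, the cue characters "伴" and "偶" (assumed to be the original cues behind "accompanied by" and "occasional"), and the name `preceding_features` are assumptions for illustration.

```python
NEGATION_WORDS = ("无", "否认", "未见")  # hypothetical negation lexicon


def preceding_features(text, a_start, window=10):
    """Indicators computed over the `window` characters before position a_start."""
    left = text[max(0, a_start - window):a_start]
    return {
        "has_period": "。" in left,
        "has_comma": any(c in left for c in ",,"),
        "has_space_or_enum_comma": any(c in left for c in " 、"),
        "has_negation": any(w in left for w in NEGATION_WORDS),
        "has_accompany": "伴" in left,    # assumed cue for "accompanied by"
        "has_occasional": "偶" in left,   # assumed cue for "occasional"
        "has_digit": any(c.isdigit() for c in left),
    }


# Illustrative sentence: "patient has no cough, occasionally has fever".
text = "患者无咳嗽,偶有发热。"
feats = preceding_features(text, text.index("发热"))
narrow = preceding_features(text, text.index("发热"), window=2)
```

Shrinking the window changes which cues are visible, which is why the window length is worth tuning per task.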
In an exemplary embodiment, the text features between the first medical data and the second medical data may include at least one of the following: the distance between the first medical data and the second medical data; the order of the first medical data and the second medical data; the number of periods between them; the number of commas between them; the number of spaces or enumeration commas between them; and whether, between them, there is the word "accompanied by", the word "occasional", a verb expressing an action, a negation word that acts only on the text that follows it, an omission word, a negation word, a diagnosis, an anatomical site, a symptom, a lesion word, a pattern of consecutive concepts separated by punctuation, a number, a time expression, or a verb.
In an exemplary embodiment, the dependency analysis features may include at least one of the following: whether the first medical data and the second medical data are in a parent-child relationship; the length of the path between them on the dependency tree; whether the path between them contains a subject-predicate edge; whether the path contains a verb-object edge; whether the path contains an attributive-head or adverbial-head edge; whether the first edge on the path is a verb-object or subject-predicate relation; whether the first edge on the path is an attributive-head or adverbial-head relation; and whether the last edge on the path is a verb-object or subject-predicate relation.
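The dependency features can be computed once a parse is available as a head array with one label per token. The sketch below assumes a single-rooted tree (root head = -1) and borrows the label names SBV, VOB, ATT, and ADV for subject-predicate, verb-object, attributive-head, and adverbial-head edges; the parse itself, the label set, and the function names are assumptions, since the text does not name a parser.

```python
def path_to_root(node, heads):
    """Nodes visited walking from `node` up to the root (head == -1)."""
    path = [node]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def dependency_features(a, b, heads, labels):
    """Path features between tokens a and b; labels[i] names the edge i -> heads[i]."""
    up_a, up_b = path_to_root(a, heads), path_to_root(b, heads)
    ancestors_b = set(up_b)
    lca = next(n for n in up_a if n in ancestors_b)  # lowest common ancestor
    a_side = up_a[:up_a.index(lca)]
    b_side = up_b[:up_b.index(lca)]
    # Edge labels in order from A up to the common ancestor, then down to B.
    edges = [labels[n] for n in a_side] + [labels[n] for n in reversed(b_side)]
    return {
        "parent_child": heads[a] == b or heads[b] == a,
        "path_length": len(edges),
        "has_subject_predicate": "SBV" in edges,
        "has_verb_object": "VOB" in edges,
        "has_att_or_adv": any(e in ("ATT", "ADV") for e in edges),
        "first_edge": edges[0] if edges else None,
        "last_edge": edges[-1] if edges else None,
    }


# Hand-written toy parse of 母亲 / 死于 / 肺癌 ("mother / died of / lung cancer").
heads = [1, -1, 1]
labels = ["SBV", "HED", "VOB"]
feats = dependency_features(0, 2, heads, labels)
```

Here the mother and the diagnosis are linked through the verb by a subject-predicate edge followed by a verb-object edge, a two-edge path.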
In an exemplary embodiment, the sentence form features may include at least one of the following: whether the first medical data and the second medical data are in the same paragraph; whether they are in the same sentence; whether they are in the same clause; whether they are in the same paragraph with no other medical data of the same type as the first or the second medical data between them; whether they are in the same sentence with no such medical data between them; and whether they are in the same clause with no such medical data between them.
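The sentence form features can be sketched by checking which delimiters occur between A and B and whether a same-type entity sits between them. The delimiter sets, the (start, end, type) tuple layout, and the name `sentence_form_features` are assumptions for illustration.

```python
SENTENCE_END = "。!?!?"                 # assumed sentence delimiters
CLAUSE_END = SENTENCE_END + ",;,;"     # assumed clause delimiters


def sentence_form_features(text, a, b, others):
    """a, b, and each item of `others` are (start, end, type) tuples."""
    first, second = sorted([a, b])
    gap = text[first[1]:second[0]]
    same_paragraph = "\n" not in gap
    same_sentence = same_paragraph and not any(c in SENTENCE_END for c in gap)
    same_clause = same_sentence and not any(c in CLAUSE_END for c in gap)
    # A "rival" is another entity of A's or B's type lying between the two.
    rival = any(
        first[1] <= s and e <= second[0] and t in (a[2], b[2])
        for s, e, t in others
    )
    return {
        "same_paragraph": same_paragraph,
        "same_sentence": same_sentence,
        "same_clause": same_clause,
        "same_paragraph_no_rival": same_paragraph and not rival,
        "same_sentence_no_rival": same_sentence and not rival,
        "same_clause_no_rival": same_clause and not rival,
    }


text = "父亲体健,母亲已故,死于肺癌。"  # illustrative sentence
father, mother, cancer = (0, 2, "relative"), (5, 7, "relative"), (12, 14, "diagnosis")
father_feats = sentence_form_features(text, father, cancer, [mother])
mother_feats = sentence_form_features(text, mother, cancer, [father])
```

In the example, the father and the diagnosis share a sentence but another relative lies between them, while the mother has no such rival; this mirrors the family-history case discussed earlier.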
In step S130, the feature vector is input into a trained classification model to determine the target relationship between the first medical data and the second medical data.
In an exemplary embodiment, the target relationship may include any one of: a relationship between a negation word and medical data, a relationship between a time and medical data, a relationship between a numerical value and medical data, a relationship between an anatomical site and medical data, a relationship between an action and medical data, a relationship between a relative and medical data, and so on.
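Step S130 can be illustrated with a small classifier. The text does not name a model, so the plain-Python logistic regression below is only a stand-in, framed as a binary decision on whether a candidate pair holds a given relationship type; the three toy features, the training data, and the hyperparameters are invented for illustration.

```python
import math


def _score(w, b, x):
    """Linear score b + w . x."""
    return b + sum(wj * xj for wj, xj in zip(w, x))


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit logistic regression with plain stochastic gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-_score(w, b, xi)))
            g = p - yi  # gradient of the log loss with respect to the score
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


def predict(w, b, x):
    """True if the pair is predicted to hold the target relationship."""
    return 1.0 / (1.0 + math.exp(-_score(w, b, x))) >= 0.5


# Toy feature vectors: [same_clause, rival_between, distance / 10] (assumed layout).
X = [[1, 0, 0.2], [1, 0, 0.5], [0, 1, 1.0], [0, 1, 0.8]]
y = [1, 1, 0, 0]
w, b = train_logistic(X, y)
holds = predict(w, b, [1, 0, 0.3])
rejected = predict(w, b, [0, 1, 0.9])
```

One such model could be trained per abstract relationship type, or a single multi-class model could be used; the text leaves that choice open.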
In an exemplary embodiment, the classification system of medical data relationships is abstracted in advance. Starting from clinical data and medical needs, the relationship between two medical data items can be abstracted into the following categories, as shown in Table 1:
Table 1: Medical data relationship classification system
It should be noted that the relationship types between medical data are not limited to those listed in Table 1. The classification system can also be divided from other angles; the basic requirement is that each class has a specific semantic type and that the classes together cover most medical data relationships, for example by fixing one medical data item and grouping the second. As a concrete example, in the negation relationship, A is fixed as a negation word and the type of B is arbitrary.
Here, a specific semantic type means that the relationship type is an abstraction with a clear semantic type, such as a negation relationship, time relationship, numerical relationship, or action relationship.
According to the medical data relationship mining method in this example embodiment, first medical data and second medical data in a target text are obtained, feature extraction is performed on them to obtain a feature vector, and the feature vector is input into a trained classification model to determine the target relationship between the first medical data and the second medical data. This makes it possible to efficiently identify relationships between medical data in clinical case text and improves the efficiency of medical data relationship mining, which facilitates further statistical analysis of the data.
As shown in Fig. 2, the classification model feature set designed in the embodiment of the present invention may include the features of A and B themselves, peripheral text features, syntactic dependency features, and sentence morphological features.
The features of A and B themselves may in turn include the features of A itself and the features of B itself.
The peripheral text features may in turn include the text features to the left of A (also called the preceding-text features of A), the text features to the right of B (also called the following-text features of B), and the features of the text between A and B.
For example, the feature set may include the following information (taking two medical data as an example, with the first medical data denoted A and the second medical data denoted B):
Table 2: Feature set
It should be noted that the way specific feature values in Table 2 are calculated may vary. For example, the second medical data B may be searched for within other distances on either side of the first medical data A; that is, the search is not limited to the 10 words given in the table. For a specific medical task, the figures in the table can be adjusted as needed to optimize the specific numbers, and the present invention does not limit this.
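The adjustable search window described above can be illustrated with a short sketch. It assumes the text has already been split into tokens; the `NEGATION_WORDS` lexicon, the function names, and the particular flags chosen are hypothetical and stand in for the fuller feature set of Table 2.

```python
# Hypothetical negation lexicon used only for this illustration.
NEGATION_WORDS = {"no", "not", "denies", "without"}


def left_window(tokens, index, window=10):
    """Tokens within `window` positions before the item at `index`."""
    return tokens[max(0, index - window):index]


def right_window(tokens, index, window=10):
    """Tokens within `window` positions after the item at `index`."""
    return tokens[index + 1:index + 1 + window]


def window_flags(window_tokens):
    """A few 0/1 peripheral-text flags computed over one window."""
    return {
        "has_period": int("." in window_tokens),
        "has_comma": int("," in window_tokens),
        "has_negation": int(any(t.lower() in NEGATION_WORDS for t in window_tokens)),
        "has_number": int(any(t.isdigit() for t in window_tokens)),
    }
```

Because the window size is a parameter rather than a constant, it can be tuned per task, which is the adjustment the paragraph above leaves open.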
Dependency grammar reveals syntactic structure by analyzing the dependency relations between the components of a linguistic unit. It holds that the core verb of a sentence is the central component that governs the other components while itself being governed by none of them, and that every governed component is subordinate to its governor through some dependency relation. Governing and being governed, depending and being depended on, are pervasive among the components of Chinese linguistic units at every level that can be used independently, from words (compounds), phrases, simple sentences, and complex sentences up to sentence groups; this is the universality of dependency relations. Dependency parsing can reflect the semantic modification relations between the components of a sentence, and it can capture long-distance collocation information regardless of the physical positions of the components.
The dependency parsing labels involved in Table 2 above, and their meanings, are listed in Table 3 below:
Relationship type              Tag    Description
Subject-verb relationship      SBV    subject-verb
Verb-object relationship       VOB    direct object, verb-object
Attributive relationship       ATT    attribute
Adverbial relationship         ADV    adverbial
Table 3: Syntactic dependency relations
It should be noted that the prior art directly uses the syntactic structure obtained by dependency analysis and extracts target relationships through syntactic-structure templates, whereas the embodiment of the present invention uses key syntactic structures as features of the classification model, which are learned automatically in a data-driven manner.
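To make the contrast concrete, the sketch below turns a dependency tag into classifier input instead of matching it against a template. It assumes a parser has already produced arcs as `(head_index, dependent_index, tag)` triples; that input format and the function name are hypothetical, and only the four tags from Table 3 are encoded.

```python
DEP_TAGS = ("SBV", "VOB", "ATT", "ADV")  # the tags listed in Table 3


def dependency_features(arcs, index_a, index_b):
    """One-hot encode the tag of a direct arc linking A and B, if any.

    arcs: iterable of (head_index, dependent_index, tag) triples, assumed
    to come from an external dependency parser.
    """
    vector = [0] * len(DEP_TAGS)
    for head, dependent, tag in arcs:
        if {head, dependent} == {index_a, index_b} and tag in DEP_TAGS:
            vector[DEP_TAGS.index(tag)] = 1
    return vector
```

The resulting positions are simply appended to the rest of the feature vector, so the classifier learns from data how much weight each syntactic relation deserves.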
In the embodiment of the present invention, for the task of structuring clinical cases, medical data relationships are mined from long text, and a method that offers both effectiveness and generality is provided. If the classification model is a binary classification model, the basic idea is to abstract the medical data relationship as a binary classification problem.
The following description takes a binary classification model as an example. As shown in Fig. 3, the medical data relationship mining method provided by the embodiment of the present invention may include the following steps.
In step S310, the target relationship is determined according to the target medical task.
In the embodiment of the present invention, the target relationship here is known: for a specific medical task, the task itself can be decomposed to obtain the target relationship.
In step S320, first training medical data and second training medical data having the target relationship are obtained from a training corpus.
In step S330, the first training medical data and the second training medical data in the training corpus are annotated.
For example, "the father has diabetes, and the mother is in good health" can be annotated as:
"father" "diabetes" 1
"mother" "diabetes" 0
In step S340, features of the first training medical data and the second training medical data are extracted to obtain the feature vector of the first training medical data and the second training medical data.
In the embodiment of the present invention, feature extraction can be performed according to the feature set listed in Table 2 above: if a condition is met, the value of the corresponding position is 1; otherwise the corresponding position is set to 0. For example, among the features of A and B themselves, if A is a diagnosis, the first position of the feature vector is 1, and if A is not a diagnosis, the first position is 0; if A is an anatomical site, the second position of the feature vector is 1, and if A is not an anatomical site, the second position is 0; and so on. The feature values of all dimensions are laid out flat, with each feature value placed at a fixed position in the vector, to form the feature vector.
It should be noted that the value at each position of the feature vector can be set according to actual needs and is not limited to the "1" and "0" above.
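The flattening into fixed positions can be sketched as below. The feature names in `FEATURE_ORDER` are hypothetical stand-ins for the entries of Table 2, and the sketch uses the 1/0 encoding only because that is the example given in the text.

```python
# Hypothetical feature names; the order fixes each feature's vector position.
FEATURE_ORDER = (
    "a_is_diagnosis",
    "a_is_anatomical_site",
    "a_is_symptom",
    "a_is_negation_word",
)


def to_vector(feature_values, order=FEATURE_ORDER):
    """Lay named feature values out flat, one fixed position per feature.

    Features missing from feature_values are treated as not satisfied (0).
    """
    return [int(bool(feature_values.get(name, 0))) for name in order]
```

Keeping the order in a single tuple means training and prediction share the same layout, which is what makes the positions "fixed".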
In step S350, the binary classification model is trained using the feature vectors of the first training medical data and the second training medical data.
In step S360, the first medical data and the second medical data in the target text are obtained.
In step S370, feature extraction is performed on the first medical data and the second medical data to obtain the feature vector of the first medical data and the second medical data.
In step S380, the feature vector of the first medical data and the second medical data is input to the trained binary classification model to determine whether the target relationship between the first medical data and the second medical data holds.
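Steps S350 and S380 can be illustrated with a small classifier written in plain Python. The sketch uses a Bernoulli naive Bayes model with Laplace smoothing because naive Bayes is one of the model families the description mentions; the class name and the toy data in the usage note are assumptions, and any other classifier could be substituted.

```python
import math


class BernoulliNaiveBayes:
    """Tiny naive Bayes classifier over 0/1 feature vectors (illustrative only)."""

    def fit(self, vectors, labels):
        self.classes = sorted(set(labels))
        n_features = len(vectors[0])
        self.log_prior = {}
        self.feature_prob = {}
        for c in self.classes:
            rows = [v for v, y in zip(vectors, labels) if y == c]
            self.log_prior[c] = math.log(len(rows) / len(vectors))
            # Laplace smoothing keeps every probability strictly between 0 and 1.
            self.feature_prob[c] = [
                (sum(row[j] for row in rows) + 1) / (len(rows) + 2)
                for j in range(n_features)
            ]
        return self

    def predict(self, vector):
        def log_score(c):
            total = self.log_prior[c]
            for x, p in zip(vector, self.feature_prob[c]):
                total += math.log(p if x else 1 - p)
            return total

        return max(self.classes, key=log_score)
```

With labels 1 and 0 the prediction answers whether the target relationship holds; the same class also accepts more than two label values, so it could serve for the multi-class variant as well.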
A further illustration follows, again taking a binary classification model as an example. As shown in Fig. 4, the medical data relationship mining method provided by the embodiment of the present invention may include the following steps.
In step S410, the medical data relationship classification system is abstracted.
In the embodiment of the present invention, a classification system is defined in advance, and the target class is determined according to the specific medical task. It can later be used to annotate the training corpus according to the target.
In step S420, the feature set of the classification model is designed.
In the embodiment of the present invention, the feature set may include at least one of the features of the medical data themselves, peripheral text features, syntactic dependency features, sentence morphological features, and the like.
In step S430, the binary classification model is trained.
In the embodiment of the present invention, the training corpus is annotated with the target relationship based on the medical data relationship types defined in step S410; feature extraction is then performed on the training corpus using the features defined in step S420, so that the long text is represented as vectors; afterwards, the classification model is trained on the vectorized training corpus.
In the embodiment of the present invention, any one of a decision tree model, a naive Bayes model, a support vector machine, a deep learning model, and the like can be used.
In step S440, the medical data relationship is classified.
In the embodiment of the present invention, for new clinical data, that is, the target text, feature extraction can be performed using the features defined in step S420 to form a vectorized representation, which is input to the classification model trained in step S430; the binary classification then judges whether the target relationship holds.
In other embodiments, the problem itself can also be abstracted as multi-class classification, in which the classification model directly outputs the specific relationship between two given medical data.
The following description takes a multi-class classification model as an example. As shown in Fig. 5, the medical data relationship mining method provided by the embodiment of the present invention may include the following steps.
In step S510, the first training medical data and the second training medical data in the training corpus are obtained.
In step S520, the first training medical data and the second training medical data in the training corpus are annotated, where the annotated content includes the target relationship between the first training medical data and the second training medical data.
In the embodiment of the present invention, because a multi-class approach is used, the feature vector of the long case text is input directly and the multi-class classification model can directly output the target relationship between A and B. The annotation scheme of the training corpus therefore needs to change: multi-class annotation must label the specific relationship type, rather than whether one particular relationship type holds.
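The difference between the two annotation schemes can be shown with a few lines of code. The relationship names, the `NO_RELATION` label for unrelated pairs, and the sample tuples are hypothetical; the helper only demonstrates that binary labels for any one relationship can be derived from multi-class labels, but not the reverse.

```python
NO_RELATION = "none"  # hypothetical label for pairs with no relationship

# Multi-class annotation: each pair carries the specific relationship type.
multi_class_samples = [
    ("father", "diabetes", "relative"),
    ("mother", "diabetes", NO_RELATION),
    ("no", "fever", "negation"),
]


def to_binary(samples, target_relation):
    """Collapse multi-class annotations into 0/1 labels for one relationship."""
    return [(a, b, int(relation == target_relation)) for a, b, relation in samples]
```

Calling `to_binary(multi_class_samples, "relative")` reproduces the style of annotation used in the binary example earlier, where each pair is marked only as holding or not holding one fixed relationship.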
In step S530, features of the first training medical data and the second training medical data are extracted to obtain the feature vector of the first training medical data and the second training medical data.
In step S540, the multi-class classification model is trained using the feature vectors of the first training medical data and the second training medical data.
In step S550, the first medical data and the second medical data in the target text are obtained.
In step S560, feature extraction is performed on the first medical data and the second medical data to obtain the feature vector of the first medical data and the second medical data.
In step S570, the feature vector of the first medical data and the second medical data is input to the trained multi-class classification model, which outputs the target relationship between the first medical data and the second medical data.
With the medical data relationship mining method provided by the embodiments of the present invention, on the one hand, because a general medical data relationship system is designed, the trained model can be reused, which improves the efficiency of structuring new clinical data and thereby improves both the effectiveness and the efficiency of medical data relationship mining; the value of the data accumulates, historical data can be built up, and relationship recognition improves as the amount of annotated data grows. On the other hand, the type abstraction in the embodiments of the present invention is general and the model's effectiveness is scalable; a separate model is not trained for each relationship, so the annotation workload is greatly reduced. The method can therefore address the coverage and rule-conflict problems that traditional rule-based methods have with clinical case data, as well as the low structuring accuracy of techniques based on syntactic dependency analysis.
It should be noted that although the steps of the method of the present invention are depicted in the drawings in a particular order, this does not require or imply that the steps must be performed in that order, or that all of the illustrated steps must be performed, to achieve the desired result. Additionally or alternatively, certain steps may be omitted, multiple steps may be merged into one step, and/or one step may be split into multiple steps.
Fig. 6 shows a block diagram of a medical data relationship mining device 600 according to another exemplary embodiment of the present invention.
As shown in Fig. 6, the medical data relationship mining device 600 includes a medical data acquisition module 610, a feature extraction module 620, and a target relationship determination module 630, where:
The medical data acquisition module 610 may be configured to obtain the first medical data and the second medical data in the target text.
The feature extraction module 620 may be configured to perform feature extraction on the first medical data and the second medical data to obtain the feature vector of the first medical data and the second medical data.
The target relationship determination module 630 may be configured to input the feature vector to a trained classification model and determine the target relationship between the first medical data and the second medical data.
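The three-module structure can be mirrored in code as a simple composition of callables. This is only a sketch of the data flow between modules 610, 620, and 630; the class name `RelationMiner` and the callable signatures are assumptions, and the stand-in functions in the usage note are placeholders rather than real components.

```python
class RelationMiner:
    """Chains acquisition (610), feature extraction (620) and determination (630)."""

    def __init__(self, acquire, extract, classify):
        self.acquire = acquire      # text -> iterable of (first, second) pairs
        self.extract = extract      # (text, first, second) -> feature vector
        self.classify = classify    # feature vector -> target relationship

    def run(self, text):
        results = []
        for first, second in self.acquire(text):
            vector = self.extract(text, first, second)
            results.append((first, second, self.classify(vector)))
        return results
```

For a quick check, the three callables can be replaced with stubs, for example an `acquire` that returns one fixed pair, an `extract` that returns a constant vector, and a `classify` that returns the first position of that vector.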
In an exemplary embodiment, the target relationship may include any one of a negation word and medical data relationship, a time and medical data relationship, a numerical value and medical data relationship, an anatomical site and medical data relationship, an action and medical data relationship, a relatives and medical data relationship, and the like.
In an exemplary embodiment, the feature extraction module 620 may further include a feature extraction unit, which may be configured to obtain at least one of the features of the first medical data itself, the features of the second medical data itself, and the peripheral text features, syntactic dependency features, sentence morphological features, and the like of the first medical data and the second medical data.
In an exemplary embodiment, the features of the first medical data itself include at least one of the following: whether the first medical data is a diagnosis; whether the first medical data is an anatomical site; whether the first medical data is a symptom; whether the first medical data is a lesion word; whether the first medical data is a negation word; whether the first medical data contains a verb; whether the first medical data contains a number; whether the length of the first medical data is greater than a preset number of bytes; and whether the first medical data contains a time word.
In an exemplary embodiment, the peripheral text features include at least one of the preceding-text features of the first medical data, the following-text features of the second medical data, and the features of the text between the first medical data and the second medical data.
In an exemplary embodiment, the preceding-text features of the first medical data include at least one of the following, each evaluated over a preset number of words immediately before the first medical data: whether there is a period; whether there is a comma; whether there is a space or an enumeration comma; whether there is a negation word; whether there is a negation word that acts only backward; whether the word "accompanied by" (伴) appears; whether the word "occasionally" (偶) appears; whether there is an omission word; whether there is a verb expressing a behavior; whether there is a diagnosis; whether there is an anatomical site; whether there is a symptom; whether there is a lesion; whether there is a pattern of consecutive concepts separated by punctuation; whether there is a time expression; whether there is a number; and whether there is a verb.
In an exemplary embodiment, the features of the text between the first medical data and the second medical data include at least one of the following: the distance between the first medical data and the second medical data; the order of the first medical data and the second medical data; the number of periods between them; the number of commas between them; the number of spaces or enumeration commas between them; whether the word "accompanied by" (伴) appears between them; whether the word "occasionally" (偶) appears between them; whether there is a verb expressing a behavior between them; whether there is a negation word that acts only backward between them; whether there is an omission word between them; whether there is a negation word between them; whether there is a diagnosis between them; whether there is an anatomical site between them; whether there is a symptom between them; whether there is a lesion between them; whether there is a pattern of consecutive concepts separated by punctuation between them; whether there is a number between them; whether there is a time expression between them; and whether there is a verb between them.
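A few of the between-text features listed above (distance, order, and punctuation counts) can be computed directly from character spans, as in the sketch below. The span representation, the function name, and the use of ASCII punctuation are assumptions for illustration; the remaining features would need lexicons or taggers that are not shown.

```python
def between_features(text, span_a, span_b):
    """Compute simple features of the text lying between A and B.

    span_a and span_b are (start, end) character offsets of the first and
    second medical data; A is not required to precede B.
    """
    first, second = sorted((span_a, span_b))
    gap = text[first[1]:second[0]]
    return {
        "distance": len(gap),                       # characters between the two items
        "a_before_b": int(span_a[0] < span_b[0]),   # order of A and B
        "periods": gap.count("."),
        "commas": gap.count(","),
    }
```

Counting over the gap string keeps the features independent of which item comes first, while the separate order flag preserves that information for the classifier.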
In an exemplary embodiment, the sentence morphological features include at least one of the following: whether the first medical data and the second medical data are in the same paragraph; whether the first medical data and the second medical data are in the same sentence; whether the first medical data and the second medical data are in the same clause; whether the first medical data and the second medical data are in the same paragraph with no other medical data similar to the first medical data or to the second medical data between them; whether the first medical data and the second medical data are in the same sentence with no other medical data similar to the first medical data or to the second medical data between them; and whether the first medical data and the second medical data are in the same clause with no other medical data similar to the first medical data or to the second medical data between them.
Since the functional modules of the medical data relationship mining device 600 of this example embodiment correspond to the steps of the example embodiments of the medical data relationship mining method described above, they are not described again here.
It should be noted that although several modules or units of the medical data relationship mining device are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the present invention, the features and functions of two or more of the modules or units described above may be embodied in one module or unit. Conversely, the features and functions of one module or unit described above may be further divided and embodied by multiple modules or units.
Fig. 7 shows a schematic diagram of an exemplary system architecture 100 to which the medical data relationship mining method or the medical data relationship mining device of the embodiments of the present invention can be applied.
As shown in Fig. 7, the system architecture 100 may include one or more of terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as the medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired links, wireless communication links, or fiber optic cables.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 7 are merely illustrative. There may be any number of terminal devices, networks, and servers, as the implementation requires. For example, the server 105 may be a server cluster composed of multiple servers.
A user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. The terminal devices 101, 102, 103 can be various electronic devices with a display screen, including but not limited to smartphones, tablet computers, portable computers, and desktop computers.
The server 105 may be a server that provides various services. For example, a user sends a request to the server 105 using the terminal device 103 (or the terminal device 101 or 102). Based on the relevant information carried in the request, the server 105 can retrieve matching search results from a database and feed them back to the terminal device 103, so that the user can view the content displayed on the terminal device 103.
Fig. 8 shows a schematic structural diagram of an electronic device suitable for implementing the embodiments of the present invention.
It should be noted that the electronic device 200 shown in Fig. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 8, the electronic device 200 includes a central processing unit (CPU) 201, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 202 or a program loaded from a storage section 208 into a random access memory (RAM) 203. The RAM 203 also stores various programs and data required for system operation. The CPU 201, the ROM 202, and the RAM 203 are connected to one another by a bus 204. An input/output (I/O) interface 205 is also connected to the bus 204.
The following components are connected to the I/O interface 205: an input section 206 including a keyboard, a mouse, and the like; an output section 207 including a cathode ray tube (CRT), a liquid crystal display (LCD), a speaker, and the like; a storage section 208 including a hard disk and the like; and a communication section 209 including a network interface card such as a LAN card or a modem. The communication section 209 performs communication processing via a network such as the Internet. A drive 210 is also connected to the I/O interface 205 as needed. A removable medium 211, such as a magnetic disk, an optical disc, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 210 as needed, so that a computer program read from it can be installed into the storage section 208 as needed.
In particular, according to embodiments of the present invention, the processes described below with reference to the flowcharts may be implemented as computer software programs. For example, an embodiment of the present invention includes a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 209 and/or installed from the removable medium 211. When the computer program is executed by the central processing unit (CPU) 201, the various functions defined in the method and/or device of the present application are performed.
It should be noted that the computer-readable medium described in the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of a computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by, or in connection with, an instruction execution system, apparatus, or device. Also in the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal can take various forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium can be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the possible architecture, functions, and operation of methods, devices, and computer program products according to various embodiments of the present invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in an order different from that shown in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should further be noted that each block in the block diagrams or flowcharts, and combinations of blocks in the block diagrams or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
As another aspect, the present application also provides a computer-readable medium, which may be included in the electronic device described in the above embodiments, or may exist separately without being assembled into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the method described in the following embodiments. For example, the electronic device can implement the steps shown in Fig. 1.
From the description of the above embodiments, those skilled in the art will readily understand that the example embodiments described here can be implemented in software, or in software combined with the necessary hardware. The technical solution according to the embodiments of the present invention can therefore be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) or on a network, and which includes instructions that cause a computing device (such as a personal computer, a server, a touch terminal, or a network device) to execute the method according to the embodiments of the present invention.
Other embodiments of the present invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of the present invention that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed herein. The specification and examples are to be regarded as illustrative only, with the true scope and spirit of the present invention indicated by the claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present invention is limited only by the appended claims.

Claims (11)

1. A medical data relationship mining method, characterized by comprising:
obtaining first medical data and second medical data in a target text;
performing feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data;
inputting the feature vector to a trained classification model, and determining a target relationship between the first medical data and the second medical data.
2. The medical data relationship mining method according to claim 1, characterized in that the target relationship includes any one of a negation word and medical data relationship, a time and medical data relationship, a numerical value and medical data relationship, an anatomical site and medical data relationship, an action and medical data relationship, and a relatives and medical data relationship.
3. The medical data relationship mining method according to claim 1, characterized in that performing feature extraction on the first medical data and the second medical data comprises:
obtaining at least one of the features of the first medical data itself, the features of the second medical data itself, and the peripheral text features, syntactic dependency features, and sentence morphological features of the first medical data and the second medical data.
4. The medical data relation mining method according to claim 3, characterized in that the feature of the first medical data itself includes at least one of the following features:
Whether the first medical data is a diagnosis;
Whether the first medical data is an anatomical site;
Whether the first medical data is a symptom;
Whether the first medical data is a lesion word;
Whether the first medical data is a negation word;
Whether the first medical data contains a verb;
Whether the first medical data contains a number;
Whether the length of the first medical data is greater than a preset number of bytes;
Whether the first medical data contains a time word.
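The features of the medical data itself listed in claim 4 are all yes/no questions, so they map naturally onto a dictionary of booleans. In the sketch below, the lexicons (`NEGATION_WORDS`, `VERBS`), the time pattern, the `kind` labels, and the 12-byte threshold are hypothetical values chosen only to make the example run.

```python
import re
from typing import Dict

# Hypothetical lexicons; a real system would use curated dictionaries.
NEGATION_WORDS = frozenset({"无", "否认", "未见"})
VERBS = frozenset({"服用", "切除", "复查"})
TIME_PATTERN = re.compile(r"\d+\s*(?:年|月|日|天|小时)")


def own_features(mention_text: str, kind: str, max_bytes: int = 12) -> Dict[str, bool]:
    """Compute the nine features of claim 4 for one piece of medical data."""
    return {
        "is_diagnosis": kind == "diagnosis",
        "is_anatomical_site": kind == "anatomical_site",
        "is_symptom": kind == "symptom",
        "is_lesion_word": kind == "lesion",
        "is_negation_word": mention_text in NEGATION_WORDS,
        "contains_verb": any(verb in mention_text for verb in VERBS),
        "contains_number": any(ch.isdigit() for ch in mention_text),
        # Length is measured in UTF-8 bytes here; the encoding is an assumption.
        "longer_than_preset_bytes": len(mention_text.encode("utf-8")) > max_bytes,
        "contains_time_word": TIME_PATTERN.search(mention_text) is not None,
    }
```

For example, `own_features("3天", "time")` marks the mention as containing both a number and a time word, and the same function applies unchanged to the second medical data.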
5. The medical data relation mining method according to claim 3, characterized in that the surrounding-text features include at least one of: a preceding-text feature of the first medical data, a following-text feature of the second medical data, and a feature of the text between the first medical data and the second medical data.
6. The medical data relation mining method according to claim 5, characterized in that the preceding-text feature of the first medical data includes at least one of the following features:
Whether there is a period within a preset number of characters before the first medical data;
Whether there is a comma within the preset number of characters before the first medical data;
Whether there is a space or a pause mark (、) within the preset number of characters before the first medical data;
Whether there is a negation word within the preset number of characters before the first medical data;
Whether there is a negation word that acts only backward within the preset number of characters before the first medical data;
Whether there is "伴" ("accompanied by") within the preset number of characters before the first medical data;
Whether there is "偶" ("occasionally") within the preset number of characters before the first medical data;
Whether there is an omission word within the preset number of characters before the first medical data;
Whether there is a verb expressing a behavior within the preset number of characters before the first medical data;
Whether there is a diagnosis within the preset number of characters before the first medical data;
Whether there is an anatomical site within the preset number of characters before the first medical data;
Whether there is a symptom within the preset number of characters before the first medical data;
Whether there is a lesion word within the preset number of characters before the first medical data;
Whether there is a pattern of consecutive concepts separated by punctuation within the preset number of characters before the first medical data;
Whether there is a time expression within the preset number of characters before the first medical data;
Whether there is a number within the preset number of characters before the first medical data;
Whether there is a verb within the preset number of characters before the first medical data.
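The preceding-text features of claim 6 all inspect a fixed-size window of characters in front of the first medical data. A minimal sketch of that windowing, covering a subset of the listed features, might look as follows; the window size of 5, the feature names, and the `NEGATION_WORDS` lexicon are assumptions for the example.

```python
from typing import Dict

NEGATION_WORDS = frozenset({"无", "否认", "未见"})  # hypothetical lexicon


def preceding_window(text: str, mention_start: int, window: int = 5) -> str:
    """Return up to `window` characters immediately before the mention."""
    return text[max(0, mention_start - window):mention_start]


def preceding_features(text: str, mention_start: int, window: int = 5) -> Dict[str, bool]:
    """Compute a subset of the claim 6 features over the preceding window."""
    ctx = preceding_window(text, mention_start, window)
    return {
        "has_period": "。" in ctx,
        "has_comma": "," in ctx or "," in ctx,
        "has_space_or_pause_mark": " " in ctx or "、" in ctx,
        "has_negation_word": any(word in ctx for word in NEGATION_WORDS),
        "has_ban": "伴" in ctx,   # "accompanied by"
        "has_ou": "偶" in ctx,    # "occasionally"
        "has_number": any(ch.isdigit() for ch in ctx),
    }


TEXT = "无发热,伴咳嗽3天"
COUGH_START = TEXT.index("咳嗽")
```

Calling `preceding_features(TEXT, COUGH_START)` examines the window "无发热,伴". The remaining features in the list (diagnosis, anatomical site, symptom, lesion word, and so on) would need the output of an entity recognizer over the same window, which is outside this sketch.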
7. The medical data relation mining method according to claim 5, characterized in that the feature of the text between the first medical data and the second medical data includes at least one of the following features:
The distance between the first medical data and the second medical data;
The order of the first medical data and the second medical data;
The number of periods between the first medical data and the second medical data;
The number of commas between the first medical data and the second medical data;
The number of spaces or pause marks (、) between the first medical data and the second medical data;
Whether there is "伴" ("accompanied by") between the first medical data and the second medical data;
Whether there is "偶" ("occasionally") between the first medical data and the second medical data;
Whether there is a verb expressing a behavior between the first medical data and the second medical data;
Whether there is a negation word that acts only backward between the first medical data and the second medical data;
Whether there is an omission word between the first medical data and the second medical data;
Whether there is a negation word between the first medical data and the second medical data;
Whether there is a diagnosis between the first medical data and the second medical data;
Whether there is an anatomical site between the first medical data and the second medical data;
Whether there is a symptom between the first medical data and the second medical data;
Whether there is a lesion word between the first medical data and the second medical data;
Whether there is a pattern of consecutive concepts separated by punctuation between the first medical data and the second medical data;
Whether there is a number between the first medical data and the second medical data;
Whether there is a time expression between the first medical data and the second medical data;
Whether there is a verb between the first medical data and the second medical data.
8. The medical data relation mining method according to claim 3, characterized in that the sentence-form features include at least one of the following features:
Whether the first medical data and the second medical data are in the same paragraph;
Whether the first medical data and the second medical data are in the same sentence;
Whether the first medical data and the second medical data are in the same clause;
Whether the first medical data and the second medical data are in the same paragraph with no other medical data of the same kind as the first medical data or the second medical data between them;
Whether the first medical data and the second medical data are in the same sentence with no other medical data of the same kind as the first medical data or the second medical data between them;
Whether the first medical data and the second medical data are in the same clause with no other medical data of the same kind as the first medical data or the second medical data between them.
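One way to evaluate the sentence-form features of claim 8 is to check whether a paragraph, sentence, or clause delimiter occurs between the two mentions, and separately whether another mention of the same kind lies between them. The delimiter sets below and the `(start, end)` span representation are assumptions; the claim itself does not specify how paragraphs, sentences, and clauses are segmented.

```python
from typing import Dict, Iterable, Tuple

Span = Tuple[int, int]  # (inclusive start, exclusive end) character offsets

# Assumed delimiters: each finer unit is also ended by the coarser breaks.
PARAGRAPH_BREAKS = ("\n",)
SENTENCE_BREAKS = PARAGRAPH_BREAKS + ("。", "!", "?")
CLAUSE_BREAKS = SENTENCE_BREAKS + (",", ",", ";", ";")


def _ordered(first: Span, second: Span) -> Tuple[Span, Span]:
    return (first, second) if first[0] <= second[0] else (second, first)


def same_unit(text: str, first: Span, second: Span, breaks: Iterable[str]) -> bool:
    """True if no delimiter from `breaks` occurs between the two spans."""
    left, right = _ordered(first, second)
    between = text[left[1]:right[0]]
    return not any(b in between for b in breaks)


def has_intervening(first: Span, second: Span, same_kind_spans: Iterable[Span]) -> bool:
    """True if any same-kind span lies entirely between the two spans."""
    left, right = _ordered(first, second)
    return any(left[1] <= s and e <= right[0] for s, e in same_kind_spans)


def sentence_form_features(text: str, first: Span, second: Span,
                           same_kind_spans: Iterable[Span] = ()) -> Dict[str, bool]:
    """Compute the six claim 8 features for a pair of spans."""
    clear = not has_intervening(first, second, list(same_kind_spans))
    features: Dict[str, bool] = {}
    for name, breaks in (("paragraph", PARAGRAPH_BREAKS),
                         ("sentence", SENTENCE_BREAKS),
                         ("clause", CLAUSE_BREAKS)):
        same = same_unit(text, first, second, breaks)
        features[f"same_{name}"] = same
        features[f"same_{name}_no_intervening"] = same and clear
    return features


TEXT = "无发热,无咳嗽。\n腹痛2天。"
FIRST_NEGATION = (0, 1)   # the first "无"
SECOND_NEGATION = (4, 5)  # the second "无"
FEVER = (1, 3)            # "发热"
COUGH = (5, 7)            # "咳嗽"
```

In this example the first "无" and "咳嗽" share a sentence but not a clause, and the second "无" lies between them, so the "no intervening" variants are false for that pair while all six features are true for the first "无" and "发热".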
9. A medical data relation mining device, characterized by comprising:
A medical data acquisition module, configured to obtain first medical data and second medical data in a target text;
A feature extraction module, configured to perform feature extraction on the first medical data and the second medical data to obtain a feature vector of the first medical data and the second medical data;
A target relationship determination module, configured to input the feature vector into a trained classification model to determine a target relationship between the first medical data and the second medical data.
10. A computer-readable medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the medical data relation mining method according to any one of claims 1 to 8.
11. An electronic device, characterized by comprising:
One or more processors;
A storage device, configured to store one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the medical data relation mining method according to any one of claims 1 to 8.
CN201811330207.XA 2018-11-09 2018-11-09 Medical data relation mining method and device Active CN109300550B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202111306561.0A CN113963804A (en) 2018-11-09 2018-11-09 Medical data relation mining method and device
CN201811330207.XA CN109300550B (en) 2018-11-09 2018-11-09 Medical data relation mining method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811330207.XA CN109300550B (en) 2018-11-09 2018-11-09 Medical data relation mining method and device

Related Child Applications (1)

Application Number Title Priority Date Filing Date
CN202111306561.0A Division CN113963804A (en) 2018-11-09 2018-11-09 Medical data relation mining method and device

Publications (2)

Publication Number Publication Date
CN109300550A (en) 2019-02-01
CN109300550B (en) 2021-11-26

Family

ID=65145583

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201811330207.XA Active CN109300550B (en) 2018-11-09 2018-11-09 Medical data relation mining method and device
CN202111306561.0A Pending CN113963804A (en) 2018-11-09 2018-11-09 Medical data relation mining method and device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202111306561.0A Pending CN113963804A (en) 2018-11-09 2018-11-09 Medical data relation mining method and device

Country Status (1)

Country Link
CN (2) CN109300550B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111488400A (en) * 2019-04-28 2020-08-04 北京京东尚科信息技术有限公司 Data classification method, device and computer readable storage medium
CN114334167A (en) * 2021-12-31 2022-04-12 医渡云(北京)技术有限公司 Medical data mining method and device, storage medium and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140350965A1 (en) * 2013-05-23 2014-11-27 Stéphane Michael Meystre Systems and methods for extracting specified data from narrative text
CN106446526A (en) * 2016-08-31 2017-02-22 北京千安哲信息技术有限公司 Electronic medical record entity relation extraction method and apparatus
CN106708959A (en) * 2016-11-30 2017-05-24 重庆大学 Combination drug recognition and ranking method based on medical literature database
CN106897568A (en) * 2017-02-28 2017-06-27 北京大数医达科技有限公司 The treating method and apparatus of case history structuring
US20180025121A1 (en) * 2016-07-20 2018-01-25 Baidu Usa Llc Systems and methods for finer-grained medical entity extraction
CN107657063A (en) * 2017-10-30 2018-02-02 合肥工业大学 The construction method and device of medical knowledge collection of illustrative plates
CN108447534A (en) * 2018-05-18 2018-08-24 灵玖中科软件(北京)有限公司 A kind of electronic health record data quality management method based on NLP

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080228769A1 (en) * 2007-03-15 2008-09-18 Siemens Medical Solutions Usa, Inc. Medical Entity Extraction From Patient Data
CN105389470A (en) * 2015-11-18 2016-03-09 福建工程学院 Method for automatically extracting Traditional Chinese Medicine acupuncture entity relationship

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
韦鹏程 等 (Wei Pengcheng et al.): 《基于R语言数据挖掘的统计与分析》 ("Statistics and Analysis of Data Mining Based on the R Language"), 31 December 2017, 子科技大学出版社 *

Also Published As

Publication number Publication date
CN109300550B (en) 2021-11-26
CN113963804A (en) 2022-01-21

Similar Documents

Publication Publication Date Title
US10698932B2 (en) Method and apparatus for parsing query based on artificial intelligence, and storage medium
US10055391B2 (en) Method and apparatus for forming a structured document from unstructured information
US7593927B2 (en) Unstructured data in a mining model language
CN110196908A (en) Data classification method, device, computer installation and storage medium
KR102179890B1 (en) Systems for data collection and analysis
KR102424085B1 (en) Machine-assisted conversation system and medical condition inquiry device and method
CN110807566A (en) Artificial intelligence model evaluation method, device, equipment and storage medium
CN109408824A (en) Method and apparatus for generating information
US20220147835A1 (en) Knowledge graph construction system and knowledge graph construction method
CN113627797B (en) Method, device, computer equipment and storage medium for generating staff member portrait
CN110534185A (en) Labeled data acquisition methods divide and examine method, apparatus, storage medium and equipment
Neidle et al. New shared & interconnected asl resources: Signstream® 3 software; dai 2 for web access to linguistically annotated video corpora; and a sign bank
CN108121699A (en) For the method and apparatus of output information
CN110162766A (en) Term vector update method and device
CN108027809A (en) The function of body design based on deep learning is related
CN110297893A (en) Natural language question-answering method, device, computer installation and storage medium
CN114360711A (en) Multi-case based reasoning by syntactic-semantic alignment and utterance analysis
CN109300550A (en) Medical data relation excavation method and device
CN115714002A (en) Depression risk detection model training method, depression state early warning method and related equipment
CN109035094A (en) Teaching method, device and terminal device based on artificial intelligence
CN115620886B (en) Data auditing method and device
Zhang et al. Business chatbots with deep learning technologies: State-of-the-art, taxonomies, and future research directions
CN106383865B (en) Artificial intelligence based recommended data acquisition method and device
CN114138928A (en) Method, system, device, electronic equipment and medium for extracting text content
KR20220079336A (en) Method and apparatus for providing a chat service including an emotional expression item

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant