CN105894088B - Medical information extraction system and method based on deep learning and distributed semantic features - Google Patents

Medical information extraction system and method based on deep learning and distributed semantic features

Info

Publication number
CN105894088B
Authority
CN
China
Prior art keywords
module
training
network
term vector
medical
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610176409.8A
Other languages
Chinese (zh)
Other versions
CN105894088A (en)
Inventor
吴永辉
王璟琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital China Health Technologies Co ltd
Shenzhou Hebote Medical Information Technology Suzhou Co Ltd
Original Assignee
Suzhou Hebta Health Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Hebta Health Information Technology Co ltd
Priority to CN201610176409.8A
Publication of CN105894088A
Application granted
Publication of CN105894088B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/088: Non-supervised learning, e.g. competitive learning

Abstract

The invention discloses a medical information extraction system and method based on deep learning and distributed semantic features. The system comprises a preprocessing module, a language-model-based word vector training module, a massive medical knowledge base reinforcement learning module, and a medical named entity recognition module based on a deep artificial neural network. A deep learning method first trains primary word vectors on large-scale medical text, taking the probability of the generative language model as the optimization objective. A second deep artificial neural network is then trained on a massive medical knowledge base; through deep reinforcement learning, the knowledge base is incorporated into the feature learning process of deep learning, yielding distributed semantic features specific to the medical domain. Finally, a deep learning method that optimizes the sentence-level maximum likelihood probability performs Chinese medical named entity recognition. Because the word vectors are generated from large amounts of unannotated data, the cumbersome feature selection and optimization process of medical natural language processing is avoided.

Description

Medical information extraction system and method based on deep learning and distributed semantic features
Technical field
The present invention relates to a medical information extraction system based on deep learning and distributed semantic features, and to a method for implementing it.
Background art
The widespread adoption of health information technology has led to an unprecedented expansion of electronic health record (EHR) data. EHR data is used not only to support clinical operations (for example, clinical decision support systems) but also to support a variety of other clinical tasks. Much important patient information is scattered across narrative medical text, yet most computer applications can only interpret structured data. Clinical natural language processing (clinical NLP) techniques, which can extract important patient information from medical text, have therefore been introduced into the medical field and have proved highly useful in many applications.
Since the Sixth Message Understanding Conference (MUC-6), named entity recognition (NER), which aims to identify the boundaries and types of named entities, has become a popular and relatively mature research direction in natural language processing. In medical text processing, named entity recognition (for example, of disease names, drug names, and test names) is likewise one of the most basic processing steps. Many existing NLP systems identify medical concepts with dictionary-based and rule-based methods. MedLEE, developed at Columbia University in the United States, is a medical concept extraction system and one of the earliest and most comprehensive medical NLP systems. MetaMap is an information extraction system for biomedical text developed by the U.S. National Library of Medicine (NLM). cTAKES is an open-source natural language processing toolkit built on the Unstructured Information Management Architecture (UIMA) and OpenNLP. In recent years, medical informatics research organizations have run several international evaluations related to entity recognition. In 2009, i2b2 (the Center of Informatics for Integrating Biology and the Bedside) organized an evaluation focused on recognizing medication entities, and in 2010 it organized another focused on recognizing symptom, treatment, and medical test entities. International evaluations such as ShARe/CLEF and Semantic Evaluation (SemEval) in 2013, 2014, and 2015 focused on recognizing disease name entities and normalizing them to the UMLS terminology. In the 2009 i2b2 medication entity recognition task, most participating teams used methods based on medical dictionaries and hand-written rules, such as the MedEx system developed at Vanderbilt University in the United States. In the 2010 i2b2 evaluation, the organizers provided a larger annotated corpus, so many participating teams, including the top five systems, used recognition methods based on machine learning. Participating teams used conditional random fields (CRFs) and structural support vector machines (SSVMs) and explored a large number of feature representations.
With the rapid growth of electronic medical record adoption in China, there is now an urgent need to extract important patient information from Chinese clinical text in order to accelerate domestic clinical research. Scholars have begun to study Chinese clinical entity recognition. Wang Shikun et al. of Xiamen University used conditional random fields to identify entities such as symptoms and pathogenesis in ancient medical case records from the Ming and Qing dynasties. In 2004, Xu Hua et al. proposed an integrated method for Chinese word segmentation and named entity recognition that performs the two tasks jointly on Chinese medical text and improves the accuracy of each. Lei Jianbo et al. of Peking University compared more comprehensively how several commonly used machine learning algorithms perform, with different types of features, at recognizing clinical entities in modern medical text; the algorithms compared include support vector machines, maximum entropy, conditional random fields, and structural support vector machines. In summary, current work on Chinese medical named entity recognition concentrates mainly on studying different machine learning algorithms and combinations of different feature types.
In recent years, natural language processing systems based on deep learning have made significant progress. Such systems use unsupervised learning to learn more effective feature representations from large amounts of unannotated text. Deep learning is an active research field within machine learning that uses deep neural networks to obtain high-level feature representations. In fields such as image processing, speech recognition, and machine translation, deep learning has outperformed other methods. With deep neural networks, NLP researchers no longer need to spend large amounts of time optimizing features for a specific task; effective features are instead obtained automatically from large amounts of unannotated text. Researchers have also found that word vector (word embedding) representations based on deep neural networks capture not only syntactic-level features but also semantic-level features, and that these features can be applied effectively to general English NLP tasks with clear benefit. For example, the deep-neural-network-based NLP system developed by Dr. Ronan Collobert achieved the highest accuracy among existing systems on tasks such as part-of-speech tagging, chunking, named entity recognition, and semantic role labeling.
Word vectors are a currently popular alternative to the traditional bag-of-words feature representation; they map each word to an array of floating-point numbers. Compared with the traditional approach, the floating-point array representation preserves more semantic information. The conventional method generates word vectors with a ranking-based approach, which treats every sequence that occurs naturally in the corpus as a positive example. For example, with a window size of 5, the following word sequence is regarded as a positive example:
$$X = \{w_{L2}, w_{L1}, w_0, w_{R1}, w_{R2}\}$$
where $w_0$ is the current word, $w_{L2}$ and $w_{L1}$ are the neighboring words to its left, and $w_{R1}$ and $w_{R2}$ are the neighboring words to its right. When the word vector generation algorithm runs, it randomly selects a word $w^*$ to replace $w_0$, forming a negative example:
$$X^* = \{w_{L2}, w_{L1}, w^*, w_{R1}, w_{R2}\}$$
The word vector generation algorithm then minimizes the following ranking criterion:
$$\max\{0,\ 1 - \mathrm{DNN}(X) + \mathrm{DNN}(X^*)\}$$
Meanwhile, traditional deep neural networks use the stochastic gradient descent algorithm and update the parameter set with the following formula:
$$\theta = \theta - \lambda \Delta_\theta$$
where $\lambda$ is the learning rate and $\Delta_\theta$ is the gradient.
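The ranking criterion and the parameter update can be illustrated with a minimal sketch. It assumes a linear scoring function as a stand-in for the network score DNN(·); the helper names (`score`, `ranking_loss`, `ranking_grad`, `sgd_step`) and the two toy feature vectors are hypothetical and are not taken from the patent.

```python
from typing import List


def score(w: List[float], x: List[float]) -> float:
    """Linear stand-in (assumption) for the network score DNN(.) of one window."""
    return sum(wi * xi for wi, xi in zip(w, x))


def ranking_loss(w: List[float], x_pos: List[float], x_neg: List[float]) -> float:
    """Ranking criterion max{0, 1 - DNN(X) + DNN(X*)}."""
    return max(0.0, 1.0 - score(w, x_pos) + score(w, x_neg))


def ranking_grad(w: List[float], x_pos: List[float], x_neg: List[float]) -> List[float]:
    """Gradient of the criterion w.r.t. w; zero once the margin is satisfied."""
    if ranking_loss(w, x_pos, x_neg) == 0.0:
        return [0.0] * len(w)
    return [n - p for p, n in zip(x_pos, x_neg)]


def sgd_step(theta: List[float], grad: List[float], lr: float) -> List[float]:
    """Parameter update theta = theta - lr * gradient."""
    return [t - lr * g for t, g in zip(theta, grad)]


# Hypothetical feature vectors for a positive window X and its corrupted copy X*.
x_pos, x_neg = [1.0, 0.0], [0.0, 1.0]
w = [0.0, 0.0]
loss_before = ranking_loss(w, x_pos, x_neg)
w = sgd_step(w, ranking_grad(w, x_pos, x_neg), lr=0.25)
loss_after = ranking_loss(w, x_pos, x_neg)
print(loss_before, loss_after)  # 1.0 0.5
```

A single update raises the score of the positive window and lowers that of the negative one, so the criterion shrinks; once the margin of 1 is met the gradient is zero and the parameters stop changing for that pair.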
Traditional neural-network-based word vector training usually uses an optimization objective based on a language model. During training, the probability that reasonable word sequences receive under the neural network model is continually maximized; the network parameters are adjusted accordingly and, through backpropagation, the vectors being trained are gradually modified, finally yielding word vectors that maximize the plausibility of text sequences. Although this training method obtains reasonable word vectors by optimizing the language model probability, it ignores existing knowledge bases. Because general domains are so diverse, no general-purpose knowledge base yet covers the existing knowledge of every field. Domain knowledge therefore cannot be used in the word vector training process.
Summary of the invention
The purpose of the present invention is to overcome the shortcomings of the prior art by providing a medical information extraction system based on deep learning and distributed semantic features, together with a method for implementing it.
The purpose of the present invention is achieved through the following technical solutions:
A medical information extraction system based on deep learning and distributed semantic features, characterized in that it comprises a preprocessing module, a language-model-based word vector training module, a massive medical knowledge base reinforcement learning module, and a medical named entity recognition module based on a deep artificial neural network. The preprocessing module cleans illegal characters from large-scale medical text, unifies the Chinese character encoding, and generates the character table used by the next module for word vector training; the character table is the list of characters that occur in all the texts;
The language-model-based word vector training module reads the preprocessed medical text and generates positive examples according to a preset window; it also generates negative examples by randomly replacing the center word of each positive example, and trains a deep neural network with the language model probability as the optimization objective to produce primary word vectors;
The massive medical knowledge base reinforcement learning module takes the primary word vectors as its starting point and uses another deep neural network to reinforce them by optimizing the prediction probability over the medical knowledge base, thereby generating distributed semantic features for the medical domain;
The medical named entity recognition module based on a deep artificial neural network uses the distributed semantic feature representation of the medical domain trained in the massive medical knowledge base reinforcement learning module to train a deep neural network for medical named entity recognition, which identifies the important named entities in medical text.
Further, in the above medical information extraction system based on deep learning and distributed semantic features, the preprocessing module comprises an illegal character filtering module, a Chinese character encoding unification module, and a character table generation module;
The illegal character filtering module traverses the text character by character and removes invalid non-visible characters;
The Chinese character encoding unification module determines the Chinese character encoding of the input text according to the settings;
The character table generation module generates the character table with the Unicode character as its unit; during subsequent word vector generation, each character in the table is mapped to a word vector in floating-point form.
Further, in the above medical information extraction system based on deep learning and distributed semantic features, the language-model-based word vector training module comprises a positive and negative example generation module, a word vector deep neural network module, and a network optimization and training error monitoring module. The positive and negative example generation module reads the input sentences and generates positive examples according to a preset window; it also generates the corresponding negative examples by randomly replacing the center word of each positive example;
The word vector deep neural network module feeds the generated positive and negative examples into the network, computes their probabilities, and adjusts the network according to the probabilities of the positive and negative examples;
The network optimization and training error monitoring module optimizes the language model probability globally and monitors the error during training; when the configured termination condition is reached, it stops training and saves the model.
Further, in the above medical information extraction system based on deep learning and distributed semantic features, the massive medical knowledge base reinforcement learning module comprises a knowledge base standardization module, a reinforcement learning deep neural network module, and a network optimization and error monitoring module. The knowledge base standardization module standardizes the representation of the entities in the knowledge base;
The reinforcement learning deep neural network module takes the entities in the knowledge base as input and the primary word vectors as features, makes predictions in the reinforcement learning network, and reinforces the primary word vectors according to how the predicted values compare with the true values in the knowledge base;
The network optimization and error monitoring module optimizes the language model probability globally and monitors the error during training; when the configured termination condition is reached, it stops training and saves the model.
Further, in the above medical information extraction system based on deep learning and distributed semantic features, the medical named entity recognition module based on a deep artificial neural network comprises a medical named entity deep neural network module and a sentence-level maximum likelihood optimization and overflow control module. The medical named entity deep neural network module reads the input sentences, represents them with the distributed semantic features, feeds them into an entity recognition network, and trains, from a small annotated corpus, a network that recognizes the various medical named entities;
The sentence-level maximum likelihood optimization and overflow control module performs approximate calculations to handle overflow errors that occur during deep neural network model training.
Further, in the above medical information extraction system based on deep learning and distributed semantic features, the sentence-level maximum likelihood optimization and overflow control module uses a maximum likelihood algorithm and prevents model training from failing because of the limited range of computer floating-point representation. The algorithm is:
First, over all inputs $x_i$, find the maximum input $x_{max} = \max_i x_i$;
Then, transform the inputs as follows:
[formula missing from the source text]
This avoids floating-point overflow during optimization of the objective function and improves the robustness and precision of the model.
Further, in the above medical information extraction system based on deep learning and distributed semantic features, a named entity recognition algorithm based on a deep neural network is used; the deep neural network comprises a convolutional layer, a nonlinear transformation layer based on the HardTanh function, and several linear layers;
When the class score of each word is computed, the context words within a window of a specific size around the target word are taken as input; for words near the start or end of a sentence, a pseudo padding word is used so that the input vector of every word has a fixed length; each word in the input window is mapped to an N-dimensional vector, where N is the word vector dimension; the convolutional layer then generates the global features corresponding to the hidden nodes; finally, the local features and global features are fed together into a standard affine network and trained with the backpropagation algorithm; the loss function is defined as the following sentence-level log-likelihood:
[formula missing from the source text]
where $S(X, T)$ is the sentence-level likelihood score when tag sequence $T$ is assigned to input $X$; $H(T_{t-1}, T_t)$ is the global transition score from tag $T_{t-1}$ to tag $T_t$; and $\mathrm{DNN}(X_t, T_t)$ is the deep neural network score when tag $T_t$ is assigned to input $X_t$.
The medical information extraction method of the present invention, based on deep learning and distributed semantic features, comprises the following steps:
generating negative examples by randomly replacing the center word of each input positive example;
training primary word vectors with a deep neural network optimized on a language model objective;
performing deep reinforcement learning on large-scale medical knowledge base data to obtain a distributed semantic representation for the medical domain;
performing Chinese medical named entity recognition with a deep neural network that optimizes the sentence-level maximum likelihood estimation probability;
applying an approximation algorithm that prevents overflow in the deep neural network model;
incorporating the massive Chinese medical knowledge base into the unsupervised learning process through deep reinforcement learning.
Still further, in the above medical information extraction method based on deep learning and distributed semantic features, the preprocessing module denoises the large-scale medical data, unifies the encoding, and generates the character table. The language-model-based word vector training module reads the medical text and, using a predefined window length, splits each input sentence into positive examples of multiple input windows; it also generates the corresponding negative examples by randomly replacing the center word. The positive and negative examples pass repeatedly through a cycle of network probability prediction and network parameter adjustment in a word vector training artificial neural network, and the primary word vectors are finally trained by maximizing the language model probability. The massive medical knowledge base reinforcement learning module is initialized with the primary word vectors and uses them to predict the entries in the knowledge base; through continued reinforcement learning the primary word vectors are adjusted, finally yielding a distributed semantic feature representation for the medical domain. The medical named entity recognition module based on a deep artificial neural network reads a small, newly hand-annotated corpus, converts the input sentences into distributed feature descriptions using the distributed semantic features, and predicts the labels of the entries; by continually adjusting the network coefficients, it achieves medical named entity recognition based on deep learning and distributed semantic features.
Still further, in the above medical information extraction method based on deep learning and distributed semantic features, within the language-model-based word vector training module, the positive and negative example generation module generates negative examples by randomly replacing the center word of each positive example; the word vector deep neural network module trains the primary word vectors by learning from the positive and negative examples; and the network optimization and training error monitoring module optimizes the model, monitors the network training error, and checks the training termination condition;
Within the massive medical knowledge base reinforcement learning module, the knowledge base standardization module reads the medical knowledge base entries and standardizes the knowledge base descriptions; the reinforcement learning deep neural network module reads the standardized entries, compares the network predictions with the true knowledge base labels to generate an error signal, and through reinforcement learning trains the primary word vectors into distributed semantic features for the medical domain;
Within the medical named entity recognition module based on a deep artificial neural network, the medical named entity deep neural network module uses a small hand-annotated corpus and, through the sentence-level maximum likelihood optimization and overflow control module, trains a network that can accurately recognize medical named entities while effectively controlling overflow during model training.
The prominent substantive features and notable advances of the technical solution of the present invention are mainly as follows:
1. Unsupervised feature learning based on neural networks and large-scale medical text greatly reduces the burden of manual feature selection; unsupervised feature learning does not require large amounts of manual annotation and so avoids a time-consuming annotation process;
2. Unsupervised feature learning on large-scale medical text improves the coverage of features in the model and brings a marked improvement in recall over traditional methods;
3. Word vectors are generated from large amounts of unannotated data, which avoids the cumbersome feature selection and optimization process of medical natural language processing; the existing massive knowledge bases of the medical domain are fully exploited, with existing knowledge incorporated into the deep learning algorithm through reinforcement learning, thereby effectively improving system performance;
4. The medical named entity recognition algorithm based on a deep neural network, evaluated on an annotated corpus of Chinese medical text, achieves higher performance than traditional methods based on sequence labeling.
Description of the drawings
Fig. 1: Schematic diagram of the architecture of the system of the present invention;
Fig. 2: Structure diagram of the deep neural network.
Detailed description of the embodiments
The present invention uses a deep learning method to train primary word vectors on large-scale medical text, taking the probability of the generative language model as the optimization objective. A second deep artificial neural network is then trained on a massive medical knowledge base; through deep reinforcement learning, the knowledge base is incorporated into the feature learning process of deep learning, yielding distributed semantic features specific to the medical domain. Finally, a deep learning method that optimizes the sentence-level maximum likelihood probability performs Chinese medical named entity recognition.
As shown in Figure 1, the medical information extraction system based on deep learning and distributed semantic features comprises a preprocessing module 1, a language-model-based word vector training module 2, a massive medical knowledge base reinforcement learning module 3, and a medical named entity recognition module 4 based on a deep artificial neural network. The preprocessing module 1 cleans illegal characters from large-scale medical text, unifies the Chinese character encoding, and generates the character table used by the next module for word vector training; the character table is the list of characters that occur in all the texts;
The language-model-based word vector training module 2 reads the preprocessed medical text and generates positive examples according to a preset window; it also generates negative examples by randomly replacing the center word of each positive example, and trains a deep neural network with the language model probability as the optimization objective to produce primary word vectors;
The massive medical knowledge base reinforcement learning module 3 takes the primary word vectors as its starting point and uses another deep neural network to reinforce them by optimizing the prediction probability over the medical knowledge base, thereby generating distributed semantic features for the medical domain;
The medical named entity recognition module 4 based on a deep artificial neural network uses the distributed semantic feature representation of the medical domain trained in the massive medical knowledge base reinforcement learning module 3 to train a deep neural network for medical named entity recognition, which identifies the important named entities in medical text.
The preprocessing module 1 comprises an illegal character filtering module 101, a Chinese character encoding unification module 102, and a character table generation module 103;
The illegal character filtering module 101 traverses the text character by character and removes invalid non-visible characters, including the control characters 0x0-0x1F of the ASCII code table;
The Chinese character encoding unification module 102 determines the Chinese character encoding of the input text according to the settings; for example, if the input text is GBK-encoded, it is converted to UTF-8; the subsequent system reads UTF-8-encoded text and uses Unicode uniformly in memory;
The character table generation module 103 generates the character table with the Unicode character as its unit; during subsequent word vector generation, each character in the table is mapped to a word vector in floating-point form.
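The three preprocessing steps can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the input is assumed to be GBK-encoded bytes, and the filter removes every code point in the range 0x00-0x1F exactly as stated above (which also drops tabs and newlines).

```python
from typing import Dict, Iterable


def to_utf8(raw: bytes, source_encoding: str = "gbk") -> bytes:
    """Unify the encoding: decode from the configured encoding, re-encode as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def strip_control_chars(text: str) -> str:
    """Remove the non-visible ASCII control characters 0x00-0x1F."""
    return "".join(ch for ch in text if ord(ch) > 0x1F)


def build_char_table(texts: Iterable[str]) -> Dict[str, int]:
    """Map each distinct Unicode character to an integer index, in order of first use."""
    table: Dict[str, int] = {}
    for text in texts:
        for ch in text:
            if ch not in table:
                table[ch] = len(table)
    return table


# Hypothetical GBK input containing two control characters.
raw = "患者\x00发热\x1f咳嗽".encode("gbk")
text = strip_control_chars(to_utf8(raw).decode("utf-8"))
table = build_char_table([text])
print(text, len(table))
```

The index stored for each character is what a later embedding lookup would use to select that character's floating-point vector.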
The language-model-based word vector training module 2 comprises a positive and negative example generation module 201, a word vector deep neural network module 202, and a network optimization and training error monitoring module 203. The positive and negative example generation module 201 reads the input sentences and generates positive examples according to a preset window; it also generates the corresponding negative examples by randomly replacing the center word of each positive example;
The word vector deep neural network module 202 feeds the generated positive and negative examples into the network, computes their probabilities, and adjusts the network according to the probabilities of the positive and negative examples;
The network optimization and training error monitoring module 203 optimizes the language model probability globally and monitors the error during training; when the configured termination condition is reached, it stops training and saves the model.
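Positive and negative example generation can be sketched as below. The sketch assumes character-level tokens, keeps only full windows (no padding), and never picks the original center token as its own replacement; these choices and the function names are assumptions made for illustration.

```python
import random
from typing import List, Sequence


def positive_windows(tokens: Sequence[str], window: int = 5) -> List[List[str]]:
    """Every run of `window` consecutive tokens is a positive example."""
    return [list(tokens[i:i + window]) for i in range(len(tokens) - window + 1)]


def corrupt_center(win: Sequence[str], vocab: Sequence[str],
                   rng: random.Random) -> List[str]:
    """Negative example: replace the center token with a random different token."""
    center = len(win) // 2
    candidates = [w for w in vocab if w != win[center]]
    negative = list(win)
    negative[center] = rng.choice(candidates)
    return negative


# Hypothetical sentence, tokenized by character.
tokens = list("患者发热咳嗽三天")
vocab = sorted(set(tokens))
rng = random.Random(0)
positives = positive_windows(tokens)
negatives = [corrupt_center(p, vocab, rng) for p in positives]
for pos, neg in zip(positives, negatives):
    print("".join(pos), "".join(neg))
```

Each positive/negative pair is what the ranking-style training objective compares: the network should score the natural window higher than its corrupted copy.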
The massive medical knowledge base reinforcement learning module 3 comprises a knowledge base standardization module 301, a reinforcement learning deep neural network module 302, and a network optimization and error monitoring module 303. The knowledge base standardization module 301 standardizes the representation of the entities in the knowledge base;
The reinforcement learning deep neural network module 302 takes the entities in the knowledge base as input and the primary word vectors as features, makes predictions in the reinforcement learning network, and reinforces the primary word vectors according to how the predicted values compare with the true values in the knowledge base;
The network optimization and error monitoring module 303 optimizes the language model probability globally and monitors the error during training; when the configured termination condition is reached, it stops training and saves the model.
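The text does not specify the architecture or the prediction target of the second network, so the following is only a loose sketch of the idea of adjusting primary vectors from knowledge base prediction errors. It assumes that each knowledge base entry is an entity string with a category label, that an entity is represented by the average of its character vectors, and that a single softmax layer predicts the category; all names, the toy entries, and the plain gradient update are assumptions.

```python
import math
from typing import Dict, List


def softmax(z: List[float]) -> List[float]:
    m = max(z)
    e = [math.exp(v - m) for v in z]
    total = sum(e)
    return [v / total for v in e]


def kb_step(emb: Dict[str, List[float]], W: List[List[float]],
            entity: str, label: int, lr: float = 0.5) -> float:
    """One gradient step on -log p(label | entity); returns the loss before the step."""
    chars = list(entity)
    dim = len(W[0])
    # Entity feature: average of the (primary) vectors of its characters.
    feat = [sum(emb[c][d] for c in chars) / len(chars) for d in range(dim)]
    probs = softmax([sum(row[d] * feat[d] for d in range(dim)) for row in W])
    # Error signal: predicted probability minus the knowledge base truth.
    err = [p - (1.0 if k == label else 0.0) for k, p in enumerate(probs)]
    grad_feat = [sum(err[k] * W[k][d] for k in range(len(W))) for d in range(dim)]
    for k in range(len(W)):
        for d in range(dim):
            W[k][d] -= lr * err[k] * feat[d]
    for c in chars:  # the error also flows back into the character vectors
        for d in range(dim):
            emb[c][d] -= lr * grad_feat[d] / len(chars)
    return -math.log(probs[label])


# Hypothetical two-entry knowledge base: category 0 = disease, 1 = drug.
kb = [("肺炎", 0), ("阿司匹林", 1)]
emb = {c: [0.1, 0.0] for c in "肺炎"}
emb.update({c: [0.0, 0.1] for c in "阿司匹林"})
W = [[0.0, 0.0], [0.0, 0.0]]
losses = [sum(kb_step(emb, W, e, y) for e, y in kb) for _ in range(50)]
print(losses[0], losses[-1])
```

The point of the sketch is that the character vectors themselves, not just the classifier weights, are updated by the error signal, so knowledge base information ends up encoded in the vectors passed on to the entity recognition stage.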
The medical named entity recognition module 4 based on a deep artificial neural network comprises a medical named entity deep neural network module 401 and a sentence-level maximum likelihood optimization and overflow control module 402. The medical named entity deep neural network module 401 reads the input sentences, represents them with the distributed semantic features, feeds them into an entity recognition network, and trains, from a small annotated corpus, a network that recognizes the various medical named entities;
The sentence-level maximum likelihood optimization and overflow control module 402 performs approximate calculations to handle overflow errors that occur during deep neural network model training.
The sentence-level maximum likelihood optimization and overflow control module 402 uses a maximum likelihood algorithm and prevents model training from failing because of the limited range of computer floating-point representation. The algorithm is:
First, over all inputs $x_i$, find the maximum input $x_{max} = \max_i x_i$;
Then, transform the inputs as follows:
[formula missing from the source text]
This avoids floating-point overflow during optimization of the objective function and improves the robustness and precision of the model.
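The transformation formula itself does not survive in this text. A standard technique that matches the description (find the maximum input first, then transform to avoid overflow) is the log-sum-exp shift, $\log \sum_i e^{x_i} = x_{max} + \log \sum_i e^{x_i - x_{max}}$. The sketch below assumes that this is the intended transformation; the function name is hypothetical.

```python
import math
from typing import Sequence


def log_sum_exp(xs: Sequence[float]) -> float:
    """Compute log(sum(exp(x_i))) by shifting every input by the maximum x_max."""
    x_max = max(xs)
    # Every exponent is <= 0 after the shift, so exp() cannot overflow.
    return x_max + math.log(sum(math.exp(x - x_max) for x in xs))


scores = [1000.0, 1001.0, 1002.0]
# The naive form math.log(sum(math.exp(x) for x in scores)) raises OverflowError,
# because exp(1000) exceeds the range of a double.
print(log_sum_exp(scores))
```

The shifted form is mathematically identical to the naive one, so nothing is lost in exactness; only the intermediate values are kept within floating-point range.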
A named entity recognition algorithm based on a deep neural network is used. The deep neural network comprises a convolutional layer, a nonlinear transformation layer based on the HardTanh function, and several linear layers, as shown in Fig. 2; this general-purpose structure is widely used for a variety of NLP tasks.
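The layer stack can be illustrated with a small forward pass over one fixed-size window of tokens: vector lookup, concatenation, a linear layer, HardTanh, and a second linear layer that produces one score per tag. This is a sketch under stated assumptions: the convolutional layer is omitted, and the weights, the two-dimensional vectors, and the function names are hypothetical toy values.

```python
from typing import Dict, List, Sequence


def hardtanh(x: float) -> float:
    """HardTanh: clip the input to the interval [-1, 1]."""
    return max(-1.0, min(1.0, x))


def linear(W: List[List[float]], b: List[float], x: List[float]) -> List[float]:
    """Affine layer y = W x + b."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def window_scores(window: Sequence[str], emb: Dict[str, List[float]],
                  W1: List[List[float]], b1: List[float],
                  W2: List[List[float]], b2: List[float]) -> List[float]:
    """Score every tag for the center token of one window."""
    x = [v for tok in window for v in emb[tok]]      # concatenate the N-dim vectors
    h = [hardtanh(v) for v in linear(W1, b1, x)]     # nonlinear transformation layer
    return linear(W2, b2, h)                         # one score per tag


# Toy 2-dimensional vectors, a window of 3 tokens, 2 hidden units, 3 tags.
emb = {"<PAD>": [0.0, 0.0], "发": [1.0, 0.0], "热": [0.0, 1.0]}
W1 = [[0.0, 0.0, 2.0, 0.0, 0.0, 2.0],
      [0.0, 0.0, -3.0, 0.0, 0.0, 0.5]]
b1 = [0.0, 0.0]
W2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b2 = [0.0, 0.0, 0.5]
scores = window_scores(["<PAD>", "发", "热"], emb, W1, b1, W2, b2)
print(scores)  # [1.0, -1.0, 0.5]
```

In the example the two hidden pre-activations are 4.0 and -2.5, which HardTanh clips to 1.0 and -1.0 before the output layer.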
When calculating the class categories score of each word, take upper and lower in the range of a specific window size of target word Cliction is by as input;For the word that neighbouring sentence-initial or sentence terminate, a pseudo- filling word is used to ensure all words Input vector is regular length;Each word in input window is mapped to N-dimensional vector, and N is term vector dimension;Then, Convolutional layer generates the globalization feature corresponding to concealed nodes;Finally, local feature and global characteristics are sent into a standard together Radial networks back-propagation algorithm to be used to be trained;Wherein, loss function is defined as following statement level log-likelihood:
Wherein, S (X, T) is Sentence-level Likelihood Score when sequence label T is endowed input X;H(Tt-1,Tt) label Tt-1 To label TtGlobal transfer score;DNN(Xt,Tt) label TtIt is endowed input XtWhen deep-neural-network score.
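The loss formula is not reproduced in this text. One reading that is consistent with the definitions of S, H and DNN is S(X, T) = Σ_t [H(T_{t-1}, T_t) + DNN(X_t, T_t)], with the log-likelihood of a label sequence given by S(X, T) minus the logarithm of the sum of exp(S(X, T')) over all label sequences T'. The Python sketch below follows that assumption and normalizes with the forward algorithm. The `start` vector (score of beginning a sentence with each label) and all function names are hypothetical.

```python
import math


def log_sum_exp(xs):
    """Overflow-safe log(sum(exp(x))) using the maximum-shift trick."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def sequence_score(dnn, trans, start, tags):
    """S(X, T): transition scores H plus network scores DNN along the path T.

    dnn[t][j]   : network score for label j at position t
    trans[i][j] : transition score H from label i to label j
    start[j]    : assumed score for starting a sentence with label j
    """
    score = start[tags[0]] + dnn[0][tags[0]]
    for t in range(1, len(tags)):
        score += trans[tags[t - 1]][tags[t]] + dnn[t][tags[t]]
    return score


def sentence_log_likelihood(dnn, trans, start, tags):
    """log p(T | X) = S(X, T) - log of the sum of exp(S(X, T')) over all paths T'."""
    n_tags = len(start)
    # alpha[j]: log-sum-exp of the scores of all partial paths ending in label j
    alpha = [start[j] + dnn[0][j] for j in range(n_tags)]
    for t in range(1, len(dnn)):
        alpha = [
            dnn[t][j] + log_sum_exp([alpha[i] + trans[i][j] for i in range(n_tags)])
            for j in range(n_tags)
        ]
    return sequence_score(dnn, trans, start, tags) - log_sum_exp(alpha)
```

Training would then maximize this quantity (or minimize its negative) over the annotated sentences. The forward recursion keeps the cost linear in sentence length instead of enumerating every label sequence.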
The medical information extraction method of the present invention, based on deep learning and distributed semantic features, includes the following steps:
Generating negative examples by randomly replacing the center word of the input positive examples;
Training primary word vectors with a deep neural network based on language model optimization;
Performing deep reinforcement learning with medical knowledge base big data to obtain a distributed semantic representation for the medical domain;
Performing Chinese medical named entity recognition with a deep neural network based on optimizing the sentence-level maximum likelihood estimation probability;
Applying an approximation algorithm that effectively prevents overflow in the deep neural network model;
Incorporating a massive Chinese medical knowledge base into the unsupervised learning process through deep reinforcement learning.
Specifically, the preprocessing module 1 denoises the medical big data, unifies the encoding and generates the word table. The language-model-based word vector training module 2 reads the medical text and, using a predefined window length, splits each input sentence into positive examples of multiple input windows; meanwhile, it generates the corresponding negative examples by randomly replacing the center word. In a word vector training artificial neural network, the positive and negative examples repeatedly pass through a cycle of network probability prediction and network parameter adjustment, and the primary word vectors are finally trained by maximizing the language model. The massive medical knowledge base reinforcement learning module 3 is initialized with the primary word vectors and uses them to predict the entries in the massive knowledge base; through continued reinforcement learning it adjusts the primary word vectors and finally obtains a distributed semantic feature representation for the medical domain. The medical named entity recognition module 4 based on a deep artificial neural network reads a small, newly and manually annotated corpus, converts the input sentences into distributed feature descriptions using the distributed semantic features, and predicts the label of each entry; by continually adjusting the network coefficients, it achieves medical named entity recognition based on deep learning and distributed semantic features.
In the language-model-based word vector training module 2, the positive and negative example generation module 201 generates negative examples by randomly replacing the center word of the positive examples; the word vector deep neural network module 202 trains the primary word vectors by learning from the positive and negative examples; the network optimization and training error monitoring module 203 optimizes the model, monitors the network training error and judges whether the training termination condition has been met;
In the massive medical knowledge base reinforcement learning module 3, the knowledge base standardization module 301 reads the medical knowledge base entries and standardizes the knowledge base descriptions; the reinforcement learning deep neural network module 302 reads the standardized entries, compares the network predictions with the true knowledge base labels to generate an error signal, and through reinforcement learning trains the primary word vectors into distributed semantic features for the medical domain;
In the medical named entity recognition module 4 based on a deep artificial neural network, the medical named entity deep neural network module 401 uses a small manually annotated corpus and, through the sentence-level maximum likelihood optimization and overflow control module 402, trains a network that can accurately identify medical named entities while effectively controlling overflow during model training.
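The generation of positive and negative examples described above can be sketched as follows. This is a minimal illustration under stated assumptions: the padding token `<PAD>`, the function names and the toy vocabulary are hypothetical, and the window length of 5 is only a default taken from the parameter settings given in this description.

```python
import random

PAD = "<PAD>"  # hypothetical pseudo padding word for sentence boundaries


def make_windows(chars, window=5):
    """Return one fixed-length window (positive example) per character.

    Windows near the start or end of the sentence are filled with PAD so
    that every example has the same length.
    """
    half = window // 2
    padded = [PAD] * half + list(chars) + [PAD] * half
    return [padded[i:i + window] for i in range(len(chars))]


def make_negative(positive, vocab, rng=random):
    """Copy a positive window and replace its center word with a different random word."""
    center = len(positive) // 2
    candidates = [w for w in vocab if w != positive[center]]
    negative = list(positive)
    negative[center] = rng.choice(candidates)
    return negative
```

For example, `make_windows("左肺结节")` yields four windows of five characters each, and `make_negative` corrupts only the center position, so the network can be trained to assign a higher probability to the original window than to the corrupted one.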
As a highly specialized field, the medical domain has knowledge bases that are highly standardized and cover a very wide range. An innovative two-step training method is developed. In the first step, intermediate word vectors are obtained with a method based on optimizing the language model probability. In the second step, starting from the word vectors of the first step, a neural network is designed to further train the existing word vectors by optimizing against the existing medical knowledge bases. The second training step uses large-scale medical knowledge bases as supervision and guidance to further optimize the medical semantic representation of the word vectors. This greatly improves the ability of the word vector matrix to express medical semantics, so that the resulting word vectors describe medical knowledge more accurately. This key technique for medical word vectors differs from other general-purpose word vector techniques.
Chinese medical knowledge is an important resource for correctly guiding the word vectors. Several general-purpose medical knowledge bases from the current medical field are collated, such as the Chinese Pharmacopoeia, which contains information on common drugs, the Chinese diagnostic term set ICD10, and the Chinese edition of the medical diagnostic term dictionary LOINC. By collating the existing medical terminology bases, a basic terminology base containing widely used medical terms is obtained.
Because Chinese medical research started late, Chinese medical knowledge bases are relatively limited. Thirty common medical knowledge bases that are widely used abroad are collated, more than 2 million related medical entries are collected, and, with the help of several domain experts, the English medical terms are translated into Chinese.
One problem with existing medical knowledge bases is insufficient coverage. Research in the medical domain shows that existing medical knowledge bases cover only about 60% of the commonly used terms of the medical domain. Because of time lag, many new terms and much new knowledge have not been added to the terminology bases. Therefore, a medical information extraction system is developed to extract a large number of medical terms in wide clinical use from large-scale Chinese medical text. With the aid of computer algorithms, the extracted medical terms are screened, corrected and merged with the existing knowledge bases. Finally, a comprehensive medical domain knowledge base is built: it is based on the existing Chinese medical knowledge bases, supplemented by several internationally common medical terminology bases, and extended with medical terms that are commonly used in clinical practice but have not yet been included.
For the medical-knowledge-guided word vector optimization method, a comprehensive Chinese medical domain knowledge base containing more than 3 million entries is collected and collated. The knowledge base covers the common terms of the medical domain, including drug names, disease names, test results, surgical procedures, treatments, adverse reactions and so on. A deep neural network is designed that uses the knowledge base to perform directed optimization of the word vectors trained in the previous stage.
The input layer of the optimization neural network is the word vector corresponding to a medical term. The input layer reads the word vectors trained in the previous stage by optimizing the language model and uses them as the input vector for the medical term. For each term, the neural network computes the probability that it belongs to each medical category (the 6 categories above), and then performs directed optimization of the word vectors by optimizing the predicted probability of the medical term category. The structure of the neural network is as follows:
1) Input layer: converts the input medical term into an input vector using the existing word vectors;
2) Convolutional layer: transforms the input vector by convolution and maps it to a fixed-length intermediate layer (300 hidden nodes);
3) Linear transformation layer: maps the intermediate layer produced by the convolution to the first hidden layer (500 hidden nodes) through a linear transformation;
4) Nonlinear transformation layer: applies the HardTanh function to its input and maps it to the second hidden layer (500 hidden nodes);
5) Linear transformation layer: maps the second hidden layer to the final output layer nodes (6);
From the output layer probabilities and the true medical term category, the corresponding error signal is calculated; the back-propagation algorithm then adjusts all parameters of the neural network and, finally, the corresponding word vectors.
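The forward pass of the network listed above can be sketched in plain Python. This is an illustration under stated assumptions rather than the patented implementation: the text does not say how the convolution output is reduced to a fixed length, so a maximum over positions is assumed; the convolution window of 3, the weight initialization and all function names are also hypothetical. The layer sizes are parameters, so the 300, 500 and 6 nodes given in the text can be supplied by the caller.

```python
import math
import random


def hard_tanh(x):
    """HardTanh: identity on [-1, 1], clipped outside that range."""
    return max(-1.0, min(1.0, x))


def linear(weights, bias, vec):
    """Affine map; `weights` holds one row of coefficients per output node."""
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weights, bias)]


def init_layer(n_out, n_in, rng):
    """Small random weights and zero bias (the initialization is an assumption)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def softmax(scores):
    """Turn output scores into category probabilities, shifting by the maximum for stability."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def classify_term(char_vectors, conv, hidden, output, window=3):
    """Convolution -> max over positions -> linear -> HardTanh -> linear -> softmax."""
    dim = len(char_vectors[0])
    half = window // 2
    padded = [[0.0] * dim] * half + char_vectors + [[0.0] * dim] * half
    # Convolution: the same affine map is applied to every window of word vectors.
    conv_out = [
        linear(conv[0], conv[1], [x for vec in padded[i:i + window] for x in vec])
        for i in range(len(char_vectors))
    ]
    # Assumed pooling: the maximum over positions gives a fixed-length layer.
    pooled = [max(column) for column in zip(*conv_out)]
    h1 = linear(hidden[0], hidden[1], pooled)   # first hidden layer
    h2 = [hard_tanh(x) for x in h1]             # second hidden layer
    return softmax(linear(output[0], output[1], h2))
```

With 50-dimensional word vectors, the sizes in the text would correspond to `init_layer(300, 3 * 50, rng)`, `init_layer(500, 300, rng)` and `init_layer(6, 500, rng)`. Back-propagation of the error signal through these layers, down to the word vectors, is not shown.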
Training method: during model training, one fifth of the unannotated training corpus is extracted as the validation set. For parameter selection, the learning rate is set to 0.01, the word vector dimension to 50, the number of hidden layer nodes to 100 (hidden node counts from 50 to 150 were tested; 100 gave the best results, and values above 100 brought no significant improvement), and the word window to 5. All deep neural network parameters are updated with the stochastic gradient descent algorithm and the back-propagation algorithm. For Chinese medical text, no word segmentation is used; instead, each individual Chinese character is treated as an independent word for which a word vector is generated.
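The parameter settings above can be collected in a small configuration object. The sketch below is illustrative only: the class, field and function names are hypothetical, and shuffling the corpus before taking the validation fifth is an assumption, since the text only states that one fifth is extracted.

```python
import random
from dataclasses import dataclass


@dataclass
class TrainConfig:
    learning_rate: float = 0.01        # learning rate given in the text
    embedding_dim: int = 50            # word vector dimension
    hidden_nodes: int = 100            # best value among the 50 to 150 tested
    window_size: int = 5               # word window
    validation_fraction: float = 0.2   # one fifth held out for validation


def split_corpus(sentences, cfg, seed=0):
    """Shuffle a copy of the corpus (assumed) and hold out the validation fraction."""
    items = list(sentences)
    random.Random(seed).shuffle(items)
    n_val = int(len(items) * cfg.validation_fraction)
    return items[n_val:], items[:n_val]


def sgd_step(params, grads, cfg):
    """One stochastic gradient descent update: p <- p - learning_rate * g."""
    return [p - cfg.learning_rate * g for p, g in zip(params, grads)]


def to_characters(sentence):
    """No word segmentation: every Chinese character is treated as one word."""
    return list(sentence)
```

The gradients passed to `sgd_step` would come from back-propagation through the network, which is outside the scope of this sketch.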
The word vectors contain not only syntactic information but also semantic information. After the word vectors have been obtained, the words most similar to each word are computed using cosine similarity. In the following example, each column lists a word in the first row, followed by the words most similar to it. The first column shows the words most similar to "One"; they are mainly numerals and measure words. The third column mainly contains medical terms related to human organs.
One             Left      Limb      Larynx
Three           Right     Jaw       Top
Two             Double    Lung      Office
Half            Two       Arm       Nose
0               On        Wall      Sinus
Two             And       States    Chamber
Number          Have      Noon      Eyelid
Have            Before    Obvious   Gorge
Compared with   Pillow    Neck      Foot
Beautiful       Under     Stern     Tears
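The similarity ranking used to produce such a list can be sketched as follows. The function names are illustrative, and the sketch assumes that no word vector is all zeros, since cosine similarity is undefined in that case.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(word, vectors, top_k=9):
    """Rank every other word by cosine similarity to `word`, highest first."""
    query = vectors[word]
    scored = [
        (cosine_similarity(query, vec), other)
        for other, vec in vectors.items()
        if other != word
    ]
    scored.sort(reverse=True)
    return [other for _, other in scored[:top_k]]
```

Here `vectors` is a dictionary from each character to its trained word vector. The default `top_k=9` matches the nine neighbors shown per column in the example.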
In conclusion, the present invention proposes a method based on deep learning and distributed semantic features for identifying 6 kinds of important information in medical text: drugs, tests, diseases, surgical procedures, treatments and adverse reactions. Compared with the conventional conditional random field (CRF) model, the method has the following characteristics: 1) it uses a large amount of unannotated data to generate word vectors, thereby avoiding the cumbersome feature selection and tuning process of medical natural language processing; 2) it makes full use of the massive knowledge bases that already exist in the medical domain, incorporating existing knowledge into the deep learning algorithm through reinforcement learning and thereby effectively improving system performance; 3) it applies a medical named entity recognition algorithm based on a deep neural network to medical text and, when evaluated on an annotated corpus of Chinese medical text, achieves higher performance than traditional methods based on sequence labeling.
It should be understood that the above is only the preferred embodiment of the present invention. Those of ordinary skill in the art can make several improvements and modifications without departing from the principle of the present invention, and these improvements and modifications shall also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A medical information extraction system based on deep learning and distributed semantic features, characterized in that it comprises a preprocessing module (1), a language-model-based word vector training module (2), a massive medical knowledge base reinforcement learning module (3) and a medical named entity recognition module (4) based on a deep artificial neural network; the preprocessing module (1) is used to clean invalid characters from the medical text big data, unify the Chinese character encoding, and generate the word table used for word vector training in the next module, the word table being the list of all words that occur in the text;
The language-model-based word vector training module (2) reads the preprocessed medical text and generates positive examples according to a preset window; meanwhile, it generates negative examples by randomly replacing the center word of the positive examples, and generates primary word vectors by training a deep neural network with optimization of the language model probability as the objective;
The massive medical knowledge base reinforcement learning module (3) takes the primary word vectors as its starting point and uses another deep neural network to perform reinforcement learning on the primary word vectors by optimizing the prediction probability over the medical knowledge base, thereby generating distributed semantic features for the medical domain;
The medical named entity recognition module (4) based on a deep artificial neural network uses the distributed semantic feature representation of the medical domain trained in the massive medical knowledge base reinforcement learning module (3) to train a deep neural network for medical named entity recognition, which identifies the important named entities in medical text.
2. The medical information extraction system based on deep learning and distributed semantic features according to claim 1, characterized in that: the preprocessing module (1) includes an invalid character filtering module (101), a Chinese character encoding unification module (102) and a word table generation module (103),
The invalid character filtering module (101) traverses the text character by character and removes the invalid non-visible characters in it;
The Chinese character encoding unification module (102) determines the Chinese character encoding of the input text according to the settings;
The word table generation module (103) generates the word table in units of Unicode characters; during subsequent word vector generation, the words in the table are mapped to word vectors in floating-point form.
3. The medical information extraction system based on deep learning and distributed semantic features according to claim 1, characterized in that: the language-model-based word vector training module (2) includes a positive and negative example generation module (201), a word vector deep neural network module (202) and a network optimization and training error monitoring module (203); the positive and negative example generation module (201) is used to read the input sentences and generate positive examples according to a preset window, and at the same time to generate the corresponding negative examples by randomly replacing the center word of the positive examples;
The word vector deep neural network module (202) inputs the generated positive and negative examples into the network, calculates their probabilities, and adjusts the network according to the probabilities of the positive and negative examples;
The network optimization and training error monitoring module (203) optimizes the probability of the language model globally and controls the error during training; when the termination condition set for training is reached, it ends training and saves the model.
4. The medical information extraction system based on deep learning and distributed semantic features according to claim 1, characterized in that: the massive medical knowledge base reinforcement learning module (3) includes a knowledge base standardization module (301), a reinforcement learning deep neural network module (302) and a network optimization and error monitoring module (303); the knowledge base standardization module (301) standardizes the representation of the entities in the knowledge base;
The reinforcement learning deep neural network module (302) takes the entities in the knowledge base as input, uses the primary word vectors as features, makes predictions in the reinforcement learning network, and reinforces the primary word vectors according to the difference between the predicted values and the actual values in the knowledge base;
The network optimization and error monitoring module (303) optimizes the probability of the language model globally and controls the error during training; when the termination condition set for training is reached, it ends training and saves the model.
5. The medical information extraction system based on deep learning and distributed semantic features according to claim 1, characterized in that: the medical named entity recognition module (4) based on a deep artificial neural network includes a medical named entity deep neural network module (401) and a sentence-level maximum likelihood optimization and overflow control module (402); the medical named entity deep neural network module (401) reads the input sentence, represents it with the distributed semantic features, feeds the representation into an entity recognition network, and trains a recognition network that identifies the various medical named entities on a small-scale annotated corpus;
The sentence-level maximum likelihood optimization and overflow control module (402) performs an approximate calculation to handle overflow errors that occur during training of the deep neural network model.
6. The medical information extraction system based on deep learning and distributed semantic features according to claim 5, characterized in that: the sentence-level maximum likelihood optimization and overflow control module (402) uses a maximum likelihood algorithm and prevents model training from failing because of the limited range of computer floating-point representation, the algorithm being:
First, find the maximum over all inputs x_i: x_max = MAX(x_i);
Then, the inputs are transformed in the following way:
This avoids floating-point overflow during optimization of the objective function and improves the robustness and precision of the model.
7. The medical information extraction system based on deep learning and distributed semantic features according to claim 1, characterized in that: a named entity recognition algorithm based on a deep neural network is used, the deep neural network comprising one convolutional layer, one nonlinear transformation layer based on the HardTanh function, and multiple linear layers;
When calculating the classification score of each word, the context words within a window of a specific size around the target word are taken as input; for words near the beginning or end of a sentence, a pseudo padding word is used to ensure that the input vectors of all words have a fixed length; each word in the input window is mapped to an N-dimensional vector, where N is the word vector dimension; then, the convolutional layer generates the global features corresponding to the hidden nodes; finally, the local features and global features are fed together into a standard affine network, which is trained with the back-propagation algorithm; the loss function is defined as the following sentence-level log-likelihood:
Here, S(X, T) is the sentence-level likelihood score when the label sequence T is assigned to the input X; H(T_{t-1}, T_t) is the global transition score from label T_{t-1} to label T_t; DNN(X_t, T_t) is the deep neural network score when label T_t is assigned to input X_t.
8. A medical information extraction method based on deep learning and distributed semantic features, implemented with the system of claim 1, characterized in that it includes the following steps:
Generating negative examples by randomly replacing the center word of the input positive examples;
Training primary word vectors with a deep neural network based on language model optimization;
Performing deep reinforcement learning with medical knowledge base big data to obtain a distributed semantic representation for the medical domain;
Performing Chinese medical named entity recognition with a deep neural network based on optimizing the sentence-level maximum likelihood estimation probability;
Applying an approximation algorithm that prevents overflow in the deep neural network model;
Incorporating a massive Chinese medical knowledge base into the unsupervised learning process through deep reinforcement learning.
9. The medical information extraction method based on deep learning and distributed semantic features according to claim 8, characterized in that: the preprocessing module (1) denoises the medical big data, unifies the encoding and generates the word table; the language-model-based word vector training module (2) reads the medical text and, using a predefined window length, splits each input sentence into positive examples of multiple input windows; meanwhile, the corresponding negative examples are generated by randomly replacing the center word; in a word vector training artificial neural network, the positive and negative examples repeatedly pass through a cycle of network probability prediction and network parameter adjustment, and the primary word vectors are finally trained by maximizing the language model; the massive medical knowledge base reinforcement learning module (3) is initialized with the primary word vectors and uses them to predict the entries in the massive knowledge base, and through continued reinforcement learning adjusts the primary word vectors and finally obtains a distributed semantic feature representation for the medical domain; the medical named entity recognition module (4) based on a deep artificial neural network reads a small, newly and manually annotated corpus, converts the input sentences into distributed feature descriptions using the distributed semantic features, and predicts the label of each entry; by continually adjusting the network coefficients, medical named entity recognition based on deep learning and distributed semantic features is achieved.
10. The medical information extraction method based on deep learning and distributed semantic features according to claim 9, characterized in that: in the language-model-based word vector training module (2), the positive and negative example generation module (201) generates negative examples by randomly replacing the center word of the positive examples; the word vector deep neural network module (202) trains the primary word vectors by learning from the positive and negative examples; the network optimization and training error monitoring module (203) optimizes the model, monitors the network training error and judges whether the training termination condition has been met;
In the massive medical knowledge base reinforcement learning module (3), the knowledge base standardization module (301) reads the medical knowledge base entries and standardizes the knowledge base descriptions; the reinforcement learning deep neural network module (302) reads the standardized entries, compares the network predictions with the true knowledge base labels to generate an error signal, and through reinforcement learning trains the primary word vectors into distributed semantic features for the medical domain;
In the medical named entity recognition module (4) based on a deep artificial neural network, the medical named entity deep neural network module (401) uses a small manually annotated corpus and, through the sentence-level maximum likelihood optimization and overflow control module (402), trains a network that accurately identifies medical named entities while controlling overflow during model training.
CN201610176409.8A 2016-03-25 2016-03-25 Based on deep learning and distributed semantic feature medical information extraction system and method Active CN105894088B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610176409.8A CN105894088B (en) 2016-03-25 2016-03-25 Based on deep learning and distributed semantic feature medical information extraction system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610176409.8A CN105894088B (en) 2016-03-25 2016-03-25 Based on deep learning and distributed semantic feature medical information extraction system and method

Publications (2)

Publication Number Publication Date
CN105894088A CN105894088A (en) 2016-08-24
CN105894088B true CN105894088B (en) 2018-06-29

Family

ID=57013869

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610176409.8A Active CN105894088B (en) 2016-03-25 2016-03-25 Based on deep learning and distributed semantic feature medical information extraction system and method

Country Status (1)

Country Link
CN (1) CN105894088B (en)

Families Citing this family (106)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106446533B (en) * 2016-09-12 2023-12-19 北京和信康科技有限公司 Human health data processing system and method thereof
CN106484674B (en) * 2016-09-20 2020-09-25 北京工业大学 Chinese electronic medical record concept extraction method based on deep learning
CN106547737B (en) * 2016-10-25 2020-05-12 复旦大学 Sequence labeling method in natural language processing based on deep learning
CN106547735B (en) * 2016-10-25 2020-07-07 复旦大学 Construction and use method of context-aware dynamic word or word vector based on deep learning
CN106682397B (en) * 2016-12-09 2020-05-19 江西中科九峰智慧医疗科技有限公司 Knowledge-based electronic medical record quality control method
CN106776501A (en) * 2016-12-13 2017-05-31 深圳爱拼信息科技有限公司 A kind of automatic method for correcting of text wrong word and server
CN108287858B (en) * 2017-03-02 2021-08-10 腾讯科技(深圳)有限公司 Semantic extraction method and device for natural language
CN107145910A (en) * 2017-05-08 2017-09-08 京东方科技集团股份有限公司 Performance generation system, its training method and the performance generation method of medical image
CN107220506A (en) * 2017-06-05 2017-09-29 东华大学 Breast cancer risk assessment analysis system based on depth convolutional neural networks
CN107391485A (en) * 2017-07-18 2017-11-24 中译语通科技(北京)有限公司 Entity recognition method is named based on the Korean of maximum entropy and neural network model
CN109284497B (en) * 2017-07-20 2021-01-12 京东方科技集团股份有限公司 Method and apparatus for identifying medical entities in medical text in natural language
CN107451295B (en) * 2017-08-17 2020-06-30 四川长虹电器股份有限公司 Method for obtaining deep learning training data based on grammar network
CN107526798B (en) * 2017-08-18 2020-09-01 武汉红茶数据技术有限公司 Entity identification and normalization combined method and model based on neural network
US20200388358A1 (en) * 2017-08-30 2020-12-10 Google Llc Machine Learning Method for Generating Labels for Fuzzy Outcomes
CN107491655B (en) * 2017-08-31 2020-08-25 上海柯棣健康管理咨询有限公司 Liver disease information intelligent consultation system based on machine learning
CN107783960B (en) * 2017-10-23 2021-07-23 百度在线网络技术(北京)有限公司 Method, device and equipment for extracting information
CN107977361B (en) * 2017-12-06 2021-05-18 哈尔滨工业大学深圳研究生院 Chinese clinical medical entity identification method based on deep semantic information representation
CN108170677B (en) * 2017-12-27 2022-01-04 北京嘉和海森健康科技有限公司 Medical term extraction method and device
CN108280061B (en) * 2018-01-17 2021-10-26 北京百度网讯科技有限公司 Text processing method and device based on ambiguous entity words
CN110162766B (en) * 2018-02-12 2023-03-24 深圳市腾讯计算机系统有限公司 Word vector updating method and device
CN108446388A (en) * 2018-03-22 2018-08-24 平安科技(深圳)有限公司 Text data quality detecting method, device, equipment and computer readable storage medium
EP3567605A1 (en) * 2018-05-08 2019-11-13 Siemens Healthcare GmbH Structured report data from a medical text report
CN109003678B (en) * 2018-06-12 2021-04-30 清华大学 Method and system for generating simulated text medical record
CN110728147B (en) * 2018-06-28 2023-04-28 阿里巴巴集团控股有限公司 Model training method and named entity recognition method
CN110737758B (en) * 2018-07-03 2022-07-05 百度在线网络技术(北京)有限公司 Method and apparatus for generating a model
CN109086268A (en) * 2018-07-13 2018-12-25 上海乐言信息科技有限公司 A kind of field syntax learning system and method based on transfer learning
CN109376250A (en) * 2018-09-27 2019-02-22 中山大学 Entity relationship based on intensified learning combines abstracting method
EP3637428A1 (en) * 2018-10-12 2020-04-15 Siemens Healthcare GmbH Natural language sentence generation for radiology reports
CN109284491B (en) * 2018-10-23 2023-08-22 北京惠每云科技有限公司 Medical text recognition method and sentence recognition model training method
CN109299467B (en) * 2018-10-23 2023-08-08 北京惠每云科技有限公司 Medical text recognition method and device and sentence recognition model training method and device
CN111180019A (en) * 2018-11-09 2020-05-19 上海云贵信息科技有限公司 Compound parameter automatic extraction method based on deep learning
CN109408626B (en) * 2018-11-09 2021-09-21 思必驰科技股份有限公司 Method and device for processing natural language
CN109471945B (en) * 2018-11-12 2021-11-23 中山大学 Deep learning-based medical text classification method and device and storage medium
CN109492233B (en) * 2018-11-14 2023-10-17 北京捷通华声科技股份有限公司 Machine translation method and device
TWI678709B (en) * 2018-11-15 2019-12-01 義守大學 Disease prediction method through a big database formed by data mining of neural network
CN109800411B (en) * 2018-12-03 2023-07-18 哈尔滨工业大学(深圳) Clinical medical entity and attribute extraction method thereof
CN109767817B (en) * 2019-01-16 2023-05-30 南通大学 Drug potential adverse reaction discovery method based on neural network language model
CN111538806B (en) * 2019-01-21 2023-04-07 阿里巴巴集团控股有限公司 Query negative case generalization method and device
CN109712680B (en) * 2019-01-24 2021-02-09 易保互联医疗信息科技(北京)有限公司 Medical data generation method and system based on HL7 standard
CN109902292B (en) * 2019-01-25 2023-05-09 网经科技(苏州)有限公司 Chinese word vector processing method and system thereof
CN109858004B (en) * 2019-02-12 2023-08-01 四川无声信息技术有限公司 Text rewriting method and device and electronic equipment
CN111563376A (en) * 2019-02-12 2020-08-21 阿里巴巴集团控股有限公司 Dish name identification method and device
US11471729B2 (en) 2019-03-11 2022-10-18 Rom Technologies, Inc. System, method and apparatus for a rehabilitation machine with a simulated flywheel
US20200289889A1 (en) 2019-03-11 2020-09-17 Rom Technologies, Inc. Bendable sensor device for monitoring joint extension and flexion
US11185735B2 (en) 2019-03-11 2021-11-30 Rom Technologies, Inc. System, method and apparatus for adjustable pedal crank
CN110134772B (en) * 2019-04-18 2023-05-12 五邑大学 Medical text relation extraction method based on pre-training model and fine tuning technology
CN110083838B (en) * 2019-04-29 2021-01-19 西安交通大学 Biomedical semantic relation extraction method based on multilayer neural network and external knowledge base
US11801423B2 (en) 2019-05-10 2023-10-31 Rehab2Fit Technologies, Inc. Method and system for using artificial intelligence to interact with a user of an exercise device during an exercise session
US11904207B2 (en) 2019-05-10 2024-02-20 Rehab2Fit Technologies, Inc. Method and system for using artificial intelligence to present a user interface representing a user's progress in various domains
US11957960B2 (en) 2019-05-10 2024-04-16 Rehab2Fit Technologies Inc. Method and system for using artificial intelligence to adjust pedal resistance
US11433276B2 (en) 2019-05-10 2022-09-06 Rehab2Fit Technologies, Inc. Method and system for using artificial intelligence to independently adjust resistance of pedals based on leg strength
CN110322959B (en) * 2019-05-24 2021-09-28 山东大学 Deep medical problem routing method and system based on knowledge
CN110276081B (en) * 2019-06-06 2023-04-25 百度在线网络技术(北京)有限公司 Text generation method, device and storage medium
CN110298040A (en) * 2019-06-20 2019-10-01 翼健(上海)信息科技有限公司 Control method and control device for labeling and recognizing a Chinese corpus
CN110458397A (en) * 2019-07-05 2019-11-15 苏州热工研究院有限公司 Method for extracting in-service performance information of nuclear materials
CN110442840B (en) * 2019-07-11 2022-12-09 新华三大数据技术有限公司 Sequence labeling network updating method, electronic medical record processing method and related device
CN110399433A (en) * 2019-07-23 2019-11-01 福建奇点时空数字科技有限公司 Data entity relation extraction method based on deep learning
US11071597B2 (en) 2019-10-03 2021-07-27 Rom Technologies, Inc. Telemedicine for orthopedic treatment
US11701548B2 (en) 2019-10-07 2023-07-18 Rom Technologies, Inc. Computer-implemented questionnaire for orthopedic treatment
US20210134432A1 (en) 2019-10-03 2021-05-06 Rom Technologies, Inc. Method and system for implementing dynamic treatment environments based on patient information
US11955220B2 (en) 2019-10-03 2024-04-09 Rom Technologies, Inc. System and method for using AI/ML and telemedicine for invasive surgical treatment to determine a cardiac treatment plan that uses an electromechanical machine
US11515028B2 (en) 2019-10-03 2022-11-29 Rom Technologies, Inc. Method and system for using artificial intelligence and machine learning to create optimal treatment plans based on monetary value amount generated and/or patient outcome
US11830601B2 (en) 2019-10-03 2023-11-28 Rom Technologies, Inc. System and method for facilitating cardiac rehabilitation among eligible users
US11961603B2 (en) 2019-10-03 2024-04-16 Rom Technologies, Inc. System and method for using AI ML and telemedicine to perform bariatric rehabilitation via an electromechanical machine
US11955223B2 (en) 2019-10-03 2024-04-09 Rom Technologies, Inc. System and method for using artificial intelligence and machine learning to provide an enhanced user interface presenting data pertaining to cardiac health, bariatric health, pulmonary health, and/or cardio-oncologic health for the purpose of performing preventative actions
US20210127974A1 (en) 2019-10-03 2021-05-06 Rom Technologies, Inc. Remote examination through augmented reality
US11282608B2 (en) 2019-10-03 2022-03-22 Rom Technologies, Inc. Method and system for using artificial intelligence and machine learning to provide recommendations to a healthcare provider in or near real-time during a telemedicine session
US20210134412A1 (en) 2019-10-03 2021-05-06 Rom Technologies, Inc. System and method for processing medical claims using biometric signatures
US11515021B2 (en) 2019-10-03 2022-11-29 Rom Technologies, Inc. Method and system to analytically optimize telehealth practice-based billing processes and revenue while enabling regulatory compliance
US11270795B2 (en) 2019-10-03 2022-03-08 Rom Technologies, Inc. Method and system for enabling physician-smart virtual conference rooms for use in a telehealth context
US20210142893A1 (en) 2019-10-03 2021-05-13 Rom Technologies, Inc. System and method for processing medical claims
US11915815B2 (en) 2019-10-03 2024-02-27 Rom Technologies, Inc. System and method for using artificial intelligence and machine learning and generic risk factors to improve cardiovascular health such that the need for additional cardiac interventions is mitigated
US11955221B2 (en) 2019-10-03 2024-04-09 Rom Technologies, Inc. System and method for using AI/ML to generate treatment plans to stimulate preferred angiogenesis
US11265234B2 (en) 2019-10-03 2022-03-01 Rom Technologies, Inc. System and method for transmitting data and ordering asynchronous data
US20210134458A1 (en) 2019-10-03 2021-05-06 Rom Technologies, Inc. System and method to enable remote adjustment of a device during a telemedicine session
US11282604B2 (en) 2019-10-03 2022-03-22 Rom Technologies, Inc. Method and system for use of telemedicine-enabled rehabilitative equipment for prediction of secondary disease
US11317975B2 (en) 2019-10-03 2022-05-03 Rom Technologies, Inc. Method and system for treating patients via telemedicine using sensor data from rehabilitation or exercise equipment
US20210134463A1 (en) 2019-10-03 2021-05-06 Rom Technologies, Inc. Systems and methods for remotely-enabled identification of a user infection
US11955222B2 (en) 2019-10-03 2024-04-09 Rom Technologies, Inc. System and method for determining, based on advanced metrics of actual performance of an electromechanical machine, medical procedure eligibility in order to ascertain survivability rates and measures of quality-of-life criteria
US11101028B2 (en) 2019-10-03 2021-08-24 Rom Technologies, Inc. Method and system using artificial intelligence to monitor user characteristics during a telemedicine session
US11069436B2 (en) 2019-10-03 2021-07-20 Rom Technologies, Inc. System and method for use of telemedicine-enabled rehabilitative hardware and for encouraging rehabilitative compliance through patient-based virtual shared sessions with patient-enabled mutual encouragement across simulated social networks
US11325005B2 (en) 2019-10-03 2022-05-10 Rom Technologies, Inc. Systems and methods for using machine learning to control an electromechanical device used for prehabilitation, rehabilitation, and/or exercise
US11075000B2 (en) 2019-10-03 2021-07-27 Rom Technologies, Inc. Method and system for using virtual avatars associated with medical professionals during exercise sessions
US11915816B2 (en) 2019-10-03 2024-02-27 Rom Technologies, Inc. Systems and methods of using artificial intelligence and machine learning in a telemedical environment to predict user disease states
US11282599B2 (en) 2019-10-03 2022-03-22 Rom Technologies, Inc. System and method for use of telemedicine-enabled rehabilitative hardware and for encouragement of rehabilitative compliance through patient-based virtual shared sessions
US11923065B2 (en) 2019-10-03 2024-03-05 Rom Technologies, Inc. Systems and methods for using artificial intelligence and machine learning to detect abnormal heart rhythms of a user performing a treatment plan with an electromechanical machine
US11756666B2 (en) 2019-10-03 2023-09-12 Rom Technologies, Inc. Systems and methods to enable communication detection between devices and performance of a preventative action
US11887717B2 (en) 2019-10-03 2024-01-30 Rom Technologies, Inc. System and method for using AI, machine learning and telemedicine to perform pulmonary rehabilitation via an electromechanical machine
US20210128080A1 (en) 2019-10-03 2021-05-06 Rom Technologies, Inc. Augmented reality placement of goniometer or other sensors
US11826613B2 (en) 2019-10-21 2023-11-28 Rom Technologies, Inc. Persuasive motivation for orthopedic treatment
CN110889282B (en) * 2019-11-28 2023-03-21 哈尔滨工程大学 Text emotion analysis method based on deep learning
CN111160012B (en) * 2019-12-26 2024-02-06 上海金仕达卫宁软件科技有限公司 Medical term identification method and device and electronic equipment
CN111259112B (en) * 2020-01-14 2023-07-04 北京百度网讯科技有限公司 Medical fact verification method and device
CN111460834B (en) * 2020-04-09 2023-06-06 北京北大软件工程股份有限公司 French semantic annotation method and device based on LSTM network
US11107591B1 (en) * 2020-04-23 2021-08-31 Rom Technologies, Inc. Method and system for describing and recommending optimal treatment plans in adaptive telemedical or other contexts
US11574128B2 (en) 2020-06-09 2023-02-07 Optum Services (Ireland) Limited Method, apparatus and computer program product for generating multi-paradigm feature representations
CN111680145B (en) * 2020-06-10 2023-08-15 北京百度网讯科技有限公司 Knowledge representation learning method, apparatus, device and storage medium
CN112309519B (en) * 2020-10-26 2021-06-08 浙江大学 Electronic medical record medication structured processing system based on multiple models
CN112270186B (en) * 2020-11-04 2024-02-02 吾征智能技术(北京)有限公司 Entropy-model-based text information matching system for spicy taste in the mouth
CN112464667B (en) * 2020-11-18 2021-11-16 北京华彬立成科技有限公司 Text entity identification method and device, electronic equipment and storage medium
CN112434756A (en) * 2020-12-15 2021-03-02 杭州依图医疗技术有限公司 Training method, processing method, device and storage medium of medical data
CN113128233B (en) * 2021-05-11 2022-07-19 济南大学 Construction method and system of mental disease knowledge map
CN113297852B (en) * 2021-07-26 2021-11-12 北京惠每云科技有限公司 Medical entity word recognition method and device
US11698934B2 (en) 2021-09-03 2023-07-11 Optum, Inc. Graph-embedding-based paragraph vector machine learning models
CN114722208B (en) * 2022-06-08 2022-11-01 成都健康医联信息产业有限公司 Automatic classification and safety level grading method for health medical texts
CN117747124A (en) * 2024-02-20 2024-03-22 浙江大学 Medical large model logic inversion method and system based on network excitation graph decomposition

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104199972A (en) * 2013-09-22 2014-12-10 中科嘉速(北京)并行软件有限公司 Named entity relation extraction and construction method based on deep learning
CN104252570A (en) * 2013-06-28 2014-12-31 上海联影医疗科技有限公司 Mass medical image data mining system and realization method thereof
CN104298651A (en) * 2014-09-09 2015-01-21 大连理工大学 Biomedicine named entity recognition and protein interactive relationship extracting on-line system based on deep learning
CN104615589A (en) * 2015-02-15 2015-05-13 百度在线网络技术(北京)有限公司 Named-entity recognition model training method and named-entity recognition method and device
CN104915386A (en) * 2015-05-25 2015-09-16 中国科学院自动化研究所 Short text clustering method based on deep semantic feature learning
CN105404632A (en) * 2014-09-15 2016-03-16 深港产学研基地 Deep neural network based biomedical text serialization labeling system and method

Family Cites Families (2)

Publication number Priority date Publication date Assignee Title
US20040122704A1 (en) * 2002-12-18 2004-06-24 Sabol John M. Integrated medical knowledge base interface system and method
US20080281868A1 (en) * 2007-02-26 2008-11-13 Connections Center Methods, apparatus and products for transferring knowledge

Non-Patent Citations (3)

Title
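Data and Knowledge in Medical Distributed Applications; Serban A et al.; IOS Press; 20141231; full text *
Research and Design of a Medical Diagnosis System Based on Artificial Intelligence; Teng Wenlong; China Master's Theses Full-text Database, Information Science and Technology; 20140415 (No. 04); full text *
Biomedical Named Entity Recognition Based on Word Representation Methods; Li Lishuang et al.; Journal of Chinese Computer Systems; 20160229; Vol. 37 (No. 2); full text *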

Also Published As

Publication number Publication date
CN105894088A (en) 2016-08-24

Similar Documents

Publication Publication Date Title
CN105894088B (en) Based on deep learning and distributed semantic feature medical information extraction system and method
Wang et al. Label-aware double transfer learning for cross-specialty medical named entity recognition
CN111192680B (en) Intelligent auxiliary diagnosis method based on deep learning and collective classification
Yu et al. Automatic ICD code assignment of Chinese clinical notes based on multilayer attention BiRNN
Wołk et al. Neural-based machine translation for medical text domain. Based on European Medicines Agency leaflet texts
CN109670179A (en) Named entity recognition method for medical record text based on iterated dilated convolutional neural networks
Liu et al. Medical-vlbert: Medical visual language bert for covid-19 ct report generation with alternate learning
CN108549639A (en) Named entity recognition method and system for traditional Chinese medicine case records based on multi-feature template correction
Hu et al. HITSZ_CNER: a hybrid system for entity recognition from Chinese clinical text
CN108509419A (en) Word segmentation and part-of-speech tagging method and system for ancient TCM books
CN112420191A (en) Traditional Chinese medicine auxiliary decision making system and method
Wen et al. Cross domains adversarial learning for Chinese named entity recognition for online medical consultation
Zhang et al. Identifying adverse drug reaction entities from social media with adversarial transfer learning model
Hou et al. Automatic report generation for chest X-ray images via adversarial reinforcement learning
CN112949308A (en) Method and system for identifying named entities of Chinese electronic medical record based on functional structure
He et al. Convolutional gated recurrent units for medical relation classification
Al-Sadi et al. Visual question answering in the medical domain based on deep learning approaches: A comprehensive study
Ke et al. Medical entity recognition and knowledge map relationship analysis of Chinese EMRs based on improved BiLSTM-CRF
Zhang et al. Using a pre-trained language model for medical named entity extraction in Chinese clinic text
Faris et al. Automatic symptoms identification from a massive volume of unstructured medical consultations using deep neural and BERT models
Wang et al. Research on named entity recognition of doctor-patient question answering community based on bilstm-crf model
CN115630649B (en) Medical Chinese named entity recognition method based on generation model
Liu et al. Cross-document attention-based gated fusion network for automated medical licensing exam
Weegar et al. Deep medical entity recognition for Swedish and Spanish
Lin et al. Research on named entity recognition of traditional Chinese medicine electronic medical records

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20181106

Address after: 100080 Room 1809, 18th Floor, No. 16 Suzhou Street (Digital China Building), Haidian District, Beijing

Co-patentee after: Suzhou Hebta Medical Information Technology Co., Ltd.

Patentee after: Digital China Health Technologies Co., Ltd.

Address before: 215021 2 Creative Industrial Park, 328 Xing Hu Street, Suzhou Industrial Park, Jiangsu

Patentee before: Suzhou Hebta Medical Information Technology Co., Ltd.

CP01 Change in the name or title of a patent holder

Address after: 100080 Room 1809, 18th Floor, No. 16 Suzhou Street (Digital China Building), Haidian District, Beijing

Co-patentee after: Shenzhou Hebote Medical Information Technology (Suzhou) Co., Ltd.

Patentee after: DIGITAL CHINA HEALTH TECHNOLOGIES Co.,Ltd.

Address before: 100080 Room 1809, 18th Floor, No. 16 Suzhou Street (Digital China Building), Haidian District, Beijing

Co-patentee before: SUZHOU HEBTA HEALTH INFORMATION TECHNOLOGY Co.,Ltd.

Patentee before: DIGITAL CHINA HEALTH TECHNOLOGIES Co.,Ltd.