CN107680689A - Potential disease inference method and system for medical text, and readable storage medium - Google Patents
- Publication number
- CN107680689A CN201710313520.1A
- Authority
- CN
- China
- Prior art keywords
- medical
- disease
- text
- vocabulary
- medical text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16Z—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS, NOT OTHERWISE PROVIDED FOR
- G16Z99/00—Subject matter not provided for in other main groups of this subclass
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Biomedical Technology (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention discloses a potential disease inference method and system for medical text, and a readable storage medium. The method includes: segmenting the received medical text into words, matching each segmented word of the medical text against a predetermined medical-domain lexicon, and extracting the medical terms among the segmented words; determining, based on a pre-built medical professional database, the disease corresponding to the medical terms in the medical text, wherein the medical professional database contains mapping relations between different types of diseases and medical terms; and outputting the determined disease as the inferred potential disease of the medical text. The invention infers the potential disease of a medical text accurately and efficiently.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a potential disease inference method and system for medical text, and a readable storage medium.
Background
In general, the first step in processing a medical text is to infer the potential disease; only then can diagnostic recommendations be made. In the prior art, the potential disease in a medical text can only be inferred manually from a doctor's personal experience, which is inefficient and cannot make effective use of existing medical data resources.
Summary of the invention
The main object of the present invention is to provide a potential disease inference method and system for medical text, and a readable storage medium, which aim to infer the potential disease of a medical text accurately and efficiently.
To achieve the above object, the present invention provides a potential disease inference method for medical text, the method including the following steps:
A. segmenting the received medical text into words, matching each segmented word of the medical text against a predetermined medical-domain lexicon, and extracting the medical terms among the segmented words;
B. determining, based on a pre-built medical professional database, the disease corresponding to the medical terms in the medical text, wherein the medical professional database contains mapping relations between different types of diseases and medical terms;
C. outputting the determined disease as the inferred potential disease of the medical text.
Preferably, before step A, the method further includes:
obtaining medical data from a predetermined data source, finding in the medical data the one or more medical terms corresponding to each disease, and building the medical professional database from the mapping relations between different types of diseases and medical terms.
Preferably, the medical terms include:
medical terms in the introduction information, symptom information, complication information, treatment drug information, or treating department information corresponding to a disease.
Preferably, the medical professional database further contains a weight for each medical term corresponding to a disease, and step B includes:
finding, based on the pre-built medical professional database, the disease corresponding to each medical term in the medical text, calculating for each disease the sum of the weights of its corresponding medical terms, and selecting the disease with the highest weight sum as the determined disease corresponding to the medical text.
Preferably, the step of segmenting the received medical text includes:
matching the medical text against the predetermined medical-domain lexicon by forward maximum matching to obtain a first matching result, the first matching result containing a first number of first phrases and a third number of single characters;
matching the medical text against the predetermined medical-domain lexicon by reverse maximum matching to obtain a second matching result, the second matching result containing a second number of second phrases and a fourth number of single characters;
if the first number equals the second number and the third number is less than or equal to the fourth number, taking the first matching result as the segmentation result of the medical text;
if the first number equals the second number and the third number is greater than the fourth number, taking the second matching result as the segmentation result of the medical text;
if the first number does not equal the second number and the first number is greater than the second number, taking the second matching result as the segmentation result of the medical text;
if the first number does not equal the second number and the first number is less than the second number, taking the first matching result as the segmentation result of the medical text.
In addition, to achieve the above object, the present invention also provides a potential disease inference system for medical text, the system including:
a segmentation and extraction module, configured to segment the received medical text into words, match each segmented word of the medical text against a predetermined medical-domain lexicon, and extract the medical terms among the segmented words;
a determining module, configured to determine, based on a pre-built medical professional database, the disease corresponding to the medical terms in the medical text, wherein the medical professional database contains mapping relations between different types of diseases and medical terms;
an output module, configured to output the determined disease as the inferred potential disease of the medical text.
Preferably, the system further includes:
a building module, configured to obtain medical data from a predetermined data source, find in the medical data the one or more medical terms corresponding to each disease, and build the medical professional database from the mapping relations between different types of diseases and medical terms.
Preferably, the medical terms include:
medical terms in the introduction information, symptom information, complication information, treatment drug information, or treating department information corresponding to a disease.
Preferably, the medical professional database further contains a weight for each medical term corresponding to a disease, and the determining module is further configured to:
find, based on the pre-built medical professional database, the disease corresponding to each medical term in the medical text, calculate for each disease the sum of the weights of its corresponding medical terms, and select the disease with the highest weight sum as the determined disease corresponding to the medical text.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium storing a potential disease inference system for medical text, the system being executable by at least one processor to cause the at least one processor to perform the steps of the potential disease inference method for medical text described above.
The potential disease inference method and system for medical text and the readable storage medium proposed by the present invention segment the received medical text, extract the medical terms among the segmented words, and determine the disease corresponding to those terms based on a pre-built medical professional database containing the mapping relations between diseases and medical terms; the determined disease is taken as the inferred potential disease of the medical text. Because the mapping relations between diseases and medical terms can be built from a variety of medical data resources, and the diseases mapped to the medical terms in a medical text can be looked up directly, the approach is more efficient and more accurate than manual inference based on a doctor's personal experience.
Brief description of the drawings
Fig. 1 is a schematic flowchart of a first embodiment of the potential disease inference method for medical text of the present invention;
Fig. 2 is a schematic flowchart of a second embodiment of the potential disease inference method for medical text of the present invention;
Fig. 3 is a schematic diagram of the operating environment of a preferred embodiment of the potential disease inference system 10 for medical text of the present invention;
Fig. 4 is a functional block diagram of a first embodiment of the potential disease inference system for medical text of the present invention;
Fig. 5 is a functional block diagram of a second embodiment of the potential disease inference system for medical text of the present invention.
The realization of the object, the functional features, and the advantages of the present invention will be further described with reference to the drawings in conjunction with the embodiments.
Detailed description of the embodiments
To make the technical problems to be solved, the technical solutions, and the advantages of the present invention clearer, the present invention is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described herein only explain the present invention and are not intended to limit it.
The present invention provides a potential disease inference method for medical text.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of an embodiment of the potential disease inference method for medical text of the present invention.
In one embodiment, the potential disease inference method for medical text includes:
Step S10: segment the received medical text into words, match each segmented word of the medical text against a predetermined medical-domain lexicon, and extract the medical terms among the segmented words.
A medical text to be diagnosed is received, for example one sent by a user through a browser or a client app. In this embodiment, after the medical text is received, it is first segmented. For example, the medical text can be split into complete sentences at punctuation marks, and each sentence is then segmented. Each sentence can be segmented with a string-matching method, such as: forward maximum matching, which segments the character string of a sentence from left to right; reverse maximum matching, which segments it from right to left; shortest-path segmentation, which requires the number of words cut from the character string to be minimal; or bidirectional maximum matching, which performs forward and reverse matching at the same time. Each sentence can also be segmented with semantic segmentation, a method based on machine understanding of language that uses syntactic and semantic information to resolve ambiguity. Each sentence can also be segmented with statistical segmentation: phrase statistics are gathered from the search history of the current user or of the general public, and if two adjacent characters are found to occur together frequently, they can be treated as one phrase during segmentation.
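The sentence splitting and forward maximum matching described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the toy lexicon, the punctuation set, and the function names `split_sentences` and `forward_max_match` are all assumptions made for the example.

```python
import re

# Hypothetical toy lexicon (headache, fever, cough, common cold);
# a real medical-domain lexicon would be far larger.
LEXICON = {"头痛", "发烧", "咳嗽", "感冒"}
MAX_LEN = max(len(word) for word in LEXICON)


def split_sentences(text):
    """Cut text into sentences at common Chinese and Western punctuation."""
    return [s for s in re.split(r"[。！？；，,.!?;]", text) if s]


def forward_max_match(sentence, lexicon=LEXICON, max_len=MAX_LEN):
    """Scan left to right, always taking the longest lexicon entry that fits.

    A character that starts no lexicon entry is emitted as a single character.
    """
    tokens, i = [], 0
    while i < len(sentence):
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            piece = sentence[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


for sentence in split_sentences("我头痛，还发烧。"):
    print(forward_max_match(sentence))
```

Reverse maximum matching follows the same pattern but scans from the end of the sentence toward the start.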
After the medical text has been segmented, each segmented word is matched against the predetermined medical-domain lexicon. This lexicon may include the drug dictionary of a general medical dictionary, as well as medical terms extracted from a large number of medical texts (such as open-source medical data on the Internet), taken from the introduction information, symptom information, complication information, treatment drug information, or treating department information corresponding to various diseases, and so on. The medical-domain lexicon may be fixed, or its medical terms may be updated periodically from the latest open-source medical data on the Internet. The segmented words that match the predetermined medical-domain lexicon are extracted as medical terms; these extracted terms are the information in the medical text most strongly correlated with its potential disease.
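A minimal sketch of the lexicon-matching step, assuming the lexicon is held as an in-memory set assembled from several sources. The function names, the sample terms, and the choice to drop repeated terms are illustrative assumptions; English words stand in for segmented Chinese words.

```python
def merge_lexicons(*sources):
    """Combine term collections (e.g. a drug dictionary and mined terms) into one set."""
    lexicon = set()
    for source in sources:
        lexicon.update(source)
    return lexicon


def extract_medical_terms(tokens, lexicon):
    """Keep tokens found in the lexicon, preserving order and dropping repeats."""
    seen, terms = set(), []
    for token in tokens:
        if token in lexicon and token not in seen:
            seen.add(token)
            terms.append(token)
    return terms


# Hypothetical sources: a drug dictionary and symptom terms mined from medical texts.
drug_dictionary = {"ibuprofen", "amoxicillin"}
mined_symptoms = {"fever", "cough"}
lexicon = merge_lexicons(drug_dictionary, mined_symptoms)

tokens = ["patient", "has", "fever", "and", "cough", "took", "ibuprofen", "fever"]
print(extract_medical_terms(tokens, lexicon))
```

Periodic updating, as the text allows, would amount to calling `lexicon.update(new_terms)` whenever newer data has been processed.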
Step S20: determine, based on a pre-built medical professional database, the disease corresponding to the medical terms in the medical text, wherein the medical professional database contains mapping relations between different types of diseases and medical terms.
After the medical terms most strongly correlated with the potential disease have been extracted from the segmented words of the medical text, the disease corresponding to those terms is determined based on the pre-built medical professional database. The database contains the mapping relations between different types of diseases and medical terms (such as terms for symptoms, drugs, examinations, and departments extracted from a large number of medical texts). For example, the medical professional database can be built from open-source online data and texts, and contains diseases together with specialized information such as their introductions, symptoms, complications, treatment drugs, and common examinations. Based on the mapping relations that have been built, the diseases mapped to the medical terms extracted from the medical text can be found.
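One way to picture the lookup in step S20 is an inverted index from medical term to diseases. The sketch below is an assumption about data layout, not something the patent specifies; the sample mapping and the function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical disease-to-terms mapping, standing in for the medical professional database.
DISEASE_TERMS = {
    "influenza": {"fever", "cough", "oseltamivir"},
    "migraine": {"headache", "nausea"},
}


def build_term_index(disease_terms):
    """Invert disease -> terms into term -> set of diseases."""
    index = defaultdict(set)
    for disease, terms in disease_terms.items():
        for term in terms:
            index[term].add(disease)
    return index


def candidate_diseases(terms, index):
    """Return every disease mapped to at least one of the extracted terms."""
    found = set()
    for term in terms:
        found |= index.get(term, set())
    return found


index = build_term_index(DISEASE_TERMS)
print(candidate_diseases(["fever", "headache"], index))
```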
Step S30: output the determined disease as the inferred potential disease of the medical text.
After the disease corresponding to the medical terms extracted from the medical text has been determined, it is output as the inferred potential disease of the medical text, so that subsequent diagnostic recommendations can be based on it. According to statistics on potential disease inference for medical texts in practical applications, the disease labels obtained by the method of this embodiment reach an accuracy of about 85% (judged by manual review as containing no obvious error), which effectively improves the accuracy of potential disease inference for medical texts.
This embodiment segments the received medical text, extracts the medical terms among the segmented words, and determines the disease corresponding to those terms based on a pre-built medical professional database containing the mapping relations between diseases and medical terms; the determined disease is taken as the inferred potential disease of the medical text. Because the mapping relations between diseases and medical terms can be built from a variety of medical data resources, and the diseases mapped to the medical terms in a medical text can be looked up directly, the approach is more efficient and more accurate than manual inference based on a doctor's personal experience.
As shown in Fig. 2, a second embodiment of the present invention provides a potential disease inference method for medical text. On the basis of the above embodiment, the method further includes, before step S10:
Step S40: obtain medical data from a predetermined data source, find in the medical data the one or more medical terms corresponding to each disease, and build the medical professional database from the mapping relations between different types of diseases and medical terms.
In this embodiment, before potential disease inference is performed on a medical text, medical data is first obtained from a predetermined data source, and the medical professional database is built from the mapping relations between different types of diseases and medical terms in that data. The medical data may be authoritative descriptions of various diseases obtained from existing medical databases, including specialized information such as the corresponding introduction, symptoms, complications, treatment drugs, and common examinations, or medical information corresponding to various drugs, such as the types of disease a drug is indicated for. The medical data may also be specific types of information (for example, the treatment plans, drugs, departments, and clinical manifestations corresponding to various diseases) obtained in real time or at scheduled times by tools such as web crawlers from open-source medical data sources on the Internet (for example, Q&A and discussions about various diseases on forums, or the latest medical cases and medical Q&A texts). Once the one or more medical terms corresponding to each disease have been found in the obtained medical data, the medical professional database can be built from the mapping relations between the diseases and those terms, so that potential diseases can subsequently be inferred from it.
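Step S40 can be sketched as folding source records into a disease-to-terms mapping. The record layout below (a `disease` key plus optional list fields) and the field names are assumptions chosen to mirror the categories named in the text; the patent does not prescribe a schema.

```python
# Hypothetical field names mirroring the categories named in the text.
TERM_FIELDS = ("symptoms", "complications", "drugs", "examinations", "departments")


def build_database(records):
    """Merge records into {disease: set of medical terms}.

    Several records may describe the same disease (e.g. one from a medical
    database, one from crawled Q&A text); their terms are unioned.
    """
    database = {}
    for record in records:
        terms = database.setdefault(record["disease"], set())
        for field in TERM_FIELDS:
            terms.update(record.get(field, []))
    return database


records = [
    {"disease": "influenza", "symptoms": ["fever", "cough"], "drugs": ["oseltamivir"]},
    {"disease": "influenza", "departments": ["respiratory medicine"]},
    {"disease": "migraine", "symptoms": ["headache"]},
]
print(build_database(records))
```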
Further, in other embodiments, the medical professional database also contains a weight for each medical term corresponding to a disease, and step S20 may include:
finding, based on the pre-built medical professional database, the disease corresponding to each medical term in the medical text, calculating for each disease the sum of the weights of its corresponding medical terms, and selecting the disease with the highest weight sum as the determined disease corresponding to the medical text.
In this embodiment, it is taken into account that one disease may correspond to one or more medical terms, and one medical term may correspond to one or more diseases; for example, the same symptom may map to several diseases, and the same drug may treat several diseases. Therefore, in the medical professional database that is built, different medical terms are assigned different weights. When the medical terms in the medical text are found, based on the database, to map to several diseases, the weights of the medical terms corresponding to each disease are summed, and the disease with the highest weight sum is selected as the determined disease corresponding to the medical text. For example, the weight sum obtained for a mapped disease can serve as the confidence of inferring that disease, and the disease with the highest confidence is selected as the final result, which further improves the accuracy of potential disease inference for medical texts.
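The weighted selection can be sketched as below. The weight values, the nested-dictionary layout, and the decision to count each distinct term once are assumptions for illustration; the text does not say how weights are obtained or how ties are broken (here the first disease reaching the maximum is returned).

```python
from collections import defaultdict

# Hypothetical per-disease term weights.
WEIGHTS = {
    "influenza": {"fever": 0.4, "cough": 0.3, "sore throat": 0.2},
    "pneumonia": {"fever": 0.3, "cough": 0.3, "chest pain": 0.4},
}


def infer_disease(terms, weights):
    """Sum, per disease, the weights of the extracted terms and pick the highest.

    Returns (best_disease, scores); best_disease is None when no term matches.
    """
    scores = defaultdict(float)
    for disease, term_weights in weights.items():
        for term in set(terms):  # assumption: each distinct term counts once
            if term in term_weights:
                scores[disease] += term_weights[term]
    if not scores:
        return None, {}
    best = max(scores, key=scores.get)
    return best, dict(scores)


best, scores = infer_disease(["fever", "cough", "chest pain"], WEIGHTS)
print(best, scores)
```

The returned score plays the role of the confidence mentioned in the text: the weight sum for each candidate disease.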
Further, in other embodiments, the step of segmenting the received medical text in step S10 includes:
matching the character string to be processed in the medical text against the predetermined medical-domain lexicon (for example, a general medical dictionary or an extensible, learning-based medical dictionary) by forward maximum matching to obtain a first matching result;
matching the character string to be processed in the medical text against the predetermined medical-domain lexicon (for example, a general medical dictionary or an extensible, learning-based medical dictionary) by reverse maximum matching to obtain a second matching result. The first matching result contains a first number of first phrases and a third number of single characters; the second matching result contains a second number of second phrases and a fourth number of single characters.
If the first number equals the second number and the third number is less than or equal to the fourth number, the first matching result (including phrases and single characters) is output for the medical text;
if the first number equals the second number and the third number is greater than the fourth number, the second matching result (including phrases and single characters) is output for the medical text;
if the first number does not equal the second number and the first number is greater than the second number, the second matching result (including phrases and single characters) is output for the medical text;
if the first number does not equal the second number and the first number is less than the second number, the first matching result (including phrases and single characters) is output for the medical text.
In this embodiment, the medical text is segmented by bidirectional matching: forward and reverse matching are performed at the same time to analyze how tightly adjacent content in the character string being processed binds together. Under normal circumstances a phrase is more likely than a single character to carry core information, that is, a phrase is more likely to be a medical term in the medical text. Therefore, forward and reverse matching are run together to find the matching result with fewer single characters and more phrases, which is taken as the segmentation result of the medical text. This improves segmentation accuracy, so that the medical terms in the medical text are extracted more accurately.
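The four selection conditions can be sketched as follows. The sketch applies them literally as listed: with equal phrase counts the result with fewer single characters wins, and with unequal phrase counts the result with fewer phrases wins. Treating any token longer than one character as a "phrase", the toy lexicon, and all function names are assumptions for the example.

```python
def forward_mm(text, lexicon, max_len):
    """Forward maximum matching: longest lexicon entry first, left to right."""
    out, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                out.append(piece)
                i += size
                break
    return out


def backward_mm(text, lexicon, max_len):
    """Reverse maximum matching: longest lexicon entry first, right to left."""
    out, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                out.append(piece)
                j -= size
                break
    return out[::-1]


def choose(first, second):
    """Apply the four conditions to the forward (first) and reverse (second) results."""
    phrases1 = sum(len(tok) > 1 for tok in first)
    phrases2 = sum(len(tok) > 1 for tok in second)
    singles1, singles2 = len(first) - phrases1, len(second) - phrases2
    if phrases1 == phrases2:
        return first if singles1 <= singles2 else second
    return second if phrases1 > phrases2 else first


lexicon = {"糖尿病", "病人"}  # hypothetical entries: "diabetes", "patient"
fwd = forward_mm("糖尿病人", lexicon, 3)
bwd = backward_mm("糖尿病人", lexicon, 3)
print(fwd, bwd, choose(fwd, bwd))
```

In this example both directions yield one phrase, but the forward result has one single character against two for the reverse result, so the forward result is kept.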
The present invention further provides a potential disease inference system for medical text. Referring to Fig. 3, Fig. 3 is a schematic diagram of the operating environment of a preferred embodiment of the potential disease inference system 10 for medical text of the present invention.
In this embodiment, the potential disease inference system 10 for medical text is installed and runs in an electronic device 1. The electronic device 1 may include, but is not limited to, a memory 11, a processor 12, and a display 13. Fig. 3 shows only the electronic device 1 with components 11-13; it should be understood that not all of the illustrated components are required, and that more or fewer components may be implemented instead.
In some embodiments the memory 11 may be an internal storage unit of the electronic device 1, such as a hard disk or internal memory of the electronic device 1. In other embodiments the memory 11 may be an external storage device of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the electronic device 1. Further, the memory 11 may include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 is used to store the application software installed on the electronic device 1 and various kinds of data, such as the program code of the potential disease inference system 10 for medical text. The memory 11 may also be used to temporarily store data that has been output or is about to be output.
In some embodiments the processor 12 may be a central processing unit (CPU), a microprocessor, or another data processing chip, used to run the program code stored in the memory 11 or to process data, for example to execute the potential disease inference system 10 for medical text.
In some embodiments the display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch device, or the like. The display 13 is used to show the information processed in the electronic device 1 and to present a visual user interface, for example to display the medical terms extracted from a medical text and the inferred potential disease of the medical text. The components 11-13 of the electronic device 1 communicate with one another over a system bus.
Referring to Fig. 4, Fig. 4 is a functional block diagram of a first embodiment of the potential disease inference system 10 for medical text of the present invention. In this embodiment, the potential disease inference system 10 for medical text may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to carry out the present invention. For example, in Fig. 4 the potential disease inference system 10 for medical text may be divided into a segmentation and extraction module 01, a determining module 02, and an output module 03. A module, as the term is used in the present invention, is a series of computer program instruction segments that accomplishes a specific function, and is better suited than a program for describing the execution of the potential disease inference system 10 for medical text in the electronic device 1. The functions of the segmentation and extraction module 01, the determining module 02, and the output module 03 are described in detail below.
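The three-module division might be organized as in the following sketch. The class names, the `run` methods, the whitespace segmenter standing in for real word segmentation, and the sample data are all hypothetical; the patent defines the modules only by function.

```python
class SegmentationExtractionModule:
    """Module 01: segment the text and keep the tokens found in the lexicon."""

    def __init__(self, lexicon, segmenter=str.split):
        self.lexicon = lexicon
        self.segmenter = segmenter  # stand-in; real text needs a proper segmenter

    def run(self, text):
        return [tok for tok in self.segmenter(text) if tok in self.lexicon]


class DeterminingModule:
    """Module 02: map the extracted medical terms to diseases."""

    def __init__(self, term_to_diseases):
        self.term_to_diseases = term_to_diseases

    def run(self, terms):
        diseases = set()
        for term in terms:
            diseases.update(self.term_to_diseases.get(term, ()))
        return sorted(diseases)


class OutputModule:
    """Module 03: output the determined diseases as the inference result."""

    def run(self, diseases):
        return list(diseases)


class InferenceSystem:
    """Chains the three modules in the order 01 -> 02 -> 03."""

    def __init__(self, lexicon, term_to_diseases):
        self.modules = (
            SegmentationExtractionModule(lexicon),
            DeterminingModule(term_to_diseases),
            OutputModule(),
        )

    def run(self, text):
        result = text
        for module in self.modules:
            result = module.run(result)
        return result


mapping = {"fever": ["influenza", "pneumonia"], "cough": ["influenza"]}
system = InferenceSystem(set(mapping), mapping)
print(system.run("patient reports fever and cough"))
```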
The segmentation and extraction module 01 is configured to segment the received medical text into words, match each segmented word of the medical text against a predetermined medical-domain lexicon, and extract the medical terms among the segmented words.
A medical text to be diagnosed is received, for example one sent by a user through a browser or a client app. In this embodiment, after the medical text is received, it is first segmented. For example, the medical text can be split into complete sentences at punctuation marks, and each sentence is then segmented. Each sentence can be segmented with a string-matching method, such as: forward maximum matching, which segments the character string of a sentence from left to right; reverse maximum matching, which segments it from right to left; shortest-path segmentation, which requires the number of words cut from the character string to be minimal; or bidirectional maximum matching, which performs forward and reverse matching at the same time. Each sentence can also be segmented with semantic segmentation, a method based on machine understanding of language that uses syntactic and semantic information to resolve ambiguity. Each sentence can also be segmented with statistical segmentation: phrase statistics are gathered from the search history of the current user or of the general public, and if two adjacent characters are found to occur together frequently, they can be treated as one phrase during segmentation.
After word segmentation of the medical text is complete, each segmented word of the medical text is matched
against a predetermined medical-domain lexicon. The predetermined medical-domain lexicon may include the
medical vocabulary of a general medical dictionary, as well as medical vocabulary extracted from a large
number of medical texts (such as open-source medical data on the Internet), for example vocabulary in the
profile information, symptom information, complication information, treatment drug information, or treating
department information of various diseases. The medical-domain lexicon may be fixed, or its medical
vocabulary may be updated periodically from the latest open-source medical data on the Internet. The
segmented words that match the predetermined medical-domain lexicon are extracted as medical vocabulary;
these extracted words are the information in the medical text that is most strongly correlated with its
potential disease.
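As a rough sketch of the lexicon-matching step, the segmented words can be filtered against a set of known medical terms. The lexicon contents and the name `extract_medical_terms` are hypothetical, and dropping repeated terms is a design choice of this example rather than something the text specifies.

```python
# Hypothetical medical-domain lexicon (fever, cough, amoxicillin,
# respiratory medicine department).
MEDICAL_LEXICON = {"发烧", "咳嗽", "阿莫西林", "呼吸内科"}


def extract_medical_terms(tokens, lexicon=MEDICAL_LEXICON):
    """Keep the tokens found in the lexicon, in order, without duplicates."""
    seen = set()
    terms = []
    for token in tokens:
        if token in lexicon and token not in seen:
            seen.add(token)
            terms.append(token)
    return terms


tokens = ["患者", "发烧", "咳嗽", "三", "天", "发烧"]
medical_terms = extract_medical_terms(tokens)
```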
Determining module 02, configured to determine, based on a pre-built medical professional database, the
disease corresponding to the medical vocabulary in the medical text, wherein the medical professional
database contains mapping relationships between different types of diseases and medical vocabulary.
After the medical vocabulary strongly correlated with the potential disease has been extracted from the
segmented words of the medical text, the disease corresponding to that medical vocabulary is determined based
on the pre-built medical professional database. The medical professional database contains mapping
relationships between different types of diseases and medical vocabulary (such as vocabulary for symptoms,
drugs, examinations, and departments extracted from a large number of medical texts). For example, the
medical professional database may be built from online open-source data and texts, and may contain
professional information such as diseases and their corresponding profiles, symptoms, complications,
treatment drugs, and common examinations. Based on the mapping relationships between diseases and medical
vocabulary, the disease mapped to the medical vocabulary extracted from the medical text can be found.
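One way to picture the lookup is an inverted index from each medical term to the diseases that list it. The sketch below assumes a small in-memory mapping; `DISEASE_TO_TERMS`, `build_term_index`, and `candidate_diseases` are illustrative names, and the entries are toy data with no clinical meaning.

```python
from collections import defaultdict

# Hypothetical disease -> medical vocabulary mapping
# (cold, pneumonia, gastritis and a few symptom words).
DISEASE_TO_TERMS = {
    "感冒": {"发烧", "咳嗽", "流鼻涕"},
    "肺炎": {"发烧", "咳嗽", "胸痛"},
    "胃炎": {"胃痛", "恶心"},
}


def build_term_index(disease_to_terms):
    """Invert the mapping so each term points to the diseases that list it."""
    index = defaultdict(set)
    for disease, terms in disease_to_terms.items():
        for term in terms:
            index[term].add(disease)
    return index


def candidate_diseases(medical_terms, term_index):
    """Return every disease mapped to at least one of the given terms."""
    found = set()
    for term in medical_terms:
        found |= term_index.get(term, set())
    return found


TERM_INDEX = build_term_index(DISEASE_TO_TERMS)
candidates = candidate_diseases(["胸痛", "恶心"], TERM_INDEX)
```

A term shared by several diseases returns all of them, which is why a further selection step is needed when more than one candidate is found.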
Output module 03, configured to output the determined disease as the inferred potential disease of the medical text.
After the disease corresponding to the medical vocabulary extracted from the medical text has been
determined, the determined disease is output as the inferred potential disease of the medical text, so that
subsequent diagnostic recommendations can be made on that basis. According to statistics on potential disease
inference for medical texts in practical application, the accuracy of the disease labels obtained by the
potential disease inference method of this embodiment (that is, labels in which manual review finds no
obvious error) can reach about 85%, which effectively improves the accuracy of inferring potential diseases
from medical texts.
In this embodiment, the received medical text is segmented and the medical vocabulary is extracted from its
segmented words; then, based on a pre-built medical professional database containing mapping relationships
between different diseases and medical vocabulary, the disease corresponding to the medical vocabulary in the
medical text is determined and taken as the inferred potential disease of the medical text. Because the
mapping relationships between diseases and medical vocabulary can be built from a variety of medical data
resources, and the disease mapped to the medical vocabulary in a medical text can be looked up from them,
this approach is more efficient and more accurate than manual inference based on a doctor's personal
experience.
As shown in Fig. 5, a second embodiment of the present invention provides a potential disease inference
system for medical text which, on the basis of the above embodiment, further includes:
Establishing module 04, configured to obtain medical data from a predetermined data source, find the one or
more medical vocabulary items corresponding to each disease in the medical data, and establish the medical
professional database according to the mapping relationships between different types of diseases and medical
vocabulary.
In this embodiment, before potential disease inference is performed on a medical text, medical data is first
obtained from a predetermined data source, and the medical professional database is established from the
mapping relationships between the different types of diseases and the medical vocabulary in that data. The
medical data may be authoritative descriptions of various diseases obtained from existing medical databases,
including professional information such as the corresponding profiles, symptoms, complications, treatment
drugs, and common examinations, or medical information corresponding to various drugs, such as the types of
disease a drug is indicated for. The medical data may also be specific types of information (for example, the
treatment plans, drugs, departments, and clinical manifestations corresponding to different diseases)
obtained in real time or on a schedule, by tools such as web crawlers, from open-source medical data sources
on the Internet (for example, questions, answers, and discussions about different diseases on major forums,
or the latest medical cases and medical question-and-answer texts). Once the one or more medical vocabulary
items corresponding to each disease have been found in the obtained medical data, the medical professional
database can be established according to the mapping relationships between each disease and its one or more
medical vocabulary items, so that potential diseases can subsequently be inferred based on the established
database.
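A minimal sketch of how such a database might be populated is shown below, using Python's built-in `sqlite3` module. The record layout, the `disease_term` table, and the helper names are assumptions made for illustration; the text does not prescribe a storage format, and the records stand in for data that would actually come from a medical database or a crawler.

```python
import sqlite3

# Hypothetical records (cold; gastritis) with a few vocabulary categories.
RECORDS = [
    {"disease": "感冒", "symptom": ["发烧", "咳嗽"], "department": ["呼吸内科"]},
    {"disease": "胃炎", "symptom": ["胃痛"], "drug": ["奥美拉唑"]},
]


def build_database(records):
    """Store one (disease, term, category) row per mapping relationship."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE disease_term ("
        "disease TEXT, term TEXT, category TEXT, "
        "PRIMARY KEY (disease, term))"
    )
    for record in records:
        disease = record["disease"]
        for category, terms in record.items():
            if category == "disease":
                continue  # the disease name itself is not a vocabulary list
            for term in terms:
                conn.execute(
                    "INSERT OR IGNORE INTO disease_term VALUES (?, ?, ?)",
                    (disease, term, category),
                )
    conn.commit()
    return conn


def diseases_for_term(conn, term):
    """Look up every disease mapped to the given medical term."""
    rows = conn.execute(
        "SELECT disease FROM disease_term WHERE term = ? ORDER BY disease",
        (term,),
    )
    return [row[0] for row in rows]


conn = build_database(RECORDS)
```

Re-running the build on refreshed data would correspond to the periodic update described above; `INSERT OR IGNORE` simply skips mappings that are already stored.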
Further, in other embodiments, the medical professional database also contains a weight for each medical
vocabulary item corresponding to a disease, and the determining module 02 may also be configured to:
Based on the pre-built medical professional database, find the diseases corresponding to each medical
vocabulary item in the medical text, calculate for each disease the sum of the weights of its corresponding
medical vocabulary items, and select the disease with the highest weight sum as the determined disease
corresponding to the medical text.
In this embodiment, it is taken into account that one disease may correspond to one or more medical
vocabulary items, and one medical vocabulary item may likewise correspond to one or more diseases; for
example, the same symptom may map to several diseases, and the same drug may treat several diseases.
Therefore, in the medical professional database, different medical vocabulary items are also assigned
different weights. When the medical vocabulary items of the medical text map to multiple diseases in the
database, the sum of the weights of the medical vocabulary items corresponding to each disease is calculated,
and the disease with the highest weight sum is selected as the determined disease corresponding to the
medical text. For example, the weight sum obtained for a disease may be used as the confidence of inferring
that disease, and the disease with the highest confidence is selected as the final result, which further
improves the accuracy of potential disease inference for medical text.
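The weighted selection can be sketched as a sum per disease followed by picking the maximum. The weights below are invented integers for illustration only, and the names `WEIGHTS`, `score_diseases`, and `infer_disease` are assumptions. The text does not say how ties are broken; this sketch keeps the first disease encountered.

```python
from collections import defaultdict

# Hypothetical weights: WEIGHTS[disease][term] (cold, pneumonia).
WEIGHTS = {
    "感冒": {"发烧": 3, "咳嗽": 4, "流鼻涕": 3},
    "肺炎": {"发烧": 4, "咳嗽": 3, "胸痛": 3},
}


def score_diseases(medical_terms, weights=WEIGHTS):
    """Sum, for each disease, the weights of the terms found in the text."""
    scores = defaultdict(int)
    for disease, term_weights in weights.items():
        for term in medical_terms:
            if term in term_weights:
                scores[disease] += term_weights[term]
    return dict(scores)


def infer_disease(medical_terms, weights=WEIGHTS):
    """Return the disease with the highest weight sum and that sum."""
    scores = score_diseases(medical_terms, weights)
    if not scores:
        return None, 0
    best = max(scores, key=scores.get)
    return best, scores[best]


disease, confidence = infer_disease(["发烧", "咳嗽", "胸痛"])
```

Here the weight sum doubles as the confidence value mentioned above; normalizing it to a fixed range would be a further design decision.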
Further, in other embodiments, the word segmentation and extraction module 01 is further configured to:
Match the character string to be processed in the medical text against the predetermined medical-domain
lexicon (for example, a general medical dictionary or an expandable, learning-type medical dictionary)
according to the forward maximum matching method, to obtain a first matching result;
Match the character string to be processed in the medical text against the predetermined medical-domain
lexicon (for example, a general medical dictionary or an expandable, learning-type medical dictionary)
according to the reverse maximum matching method, to obtain a second matching result. The first matching
result contains a first quantity of first phrases, and the second matching result contains a second quantity
of second phrases; the first matching result contains a third quantity of single characters, and the second
matching result contains a fourth quantity of single characters.
If the first quantity is equal to the second quantity, and the third quantity is less than or equal to the
fourth quantity, output the first matching result (including phrases and single characters) for the medical text;
If the first quantity is equal to the second quantity, and the third quantity is greater than the fourth
quantity, output the second matching result (including phrases and single characters) for the medical text;
If the first quantity is not equal to the second quantity, and the first quantity is greater than the second
quantity, output the second matching result (including phrases and single characters) for the medical text;
If the first quantity is not equal to the second quantity, and the first quantity is less than the second
quantity, output the first matching result (including phrases and single characters) for the medical text.
In this embodiment, the medical text is segmented with the bidirectional matching method: forward and reverse
segmentation matching are performed at the same time to analyze how strongly adjacent content in the
character string to be processed sticks together. Because a phrase is generally more likely than a single
character to carry core information, a phrase is more likely to be medical vocabulary in the medical text.
Therefore, by performing forward and reverse segmentation matching at the same time, the matching result with
fewer single characters and more phrases is found and used as the segmentation result of the medical text,
which improves segmentation accuracy and allows the medical vocabulary in the medical text to be extracted
more accurately.
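The four selection rules can be transcribed into code roughly as follows. This sketch makes two assumptions that the text does not state explicitly: a "phrase" is counted as any token longer than one character, and unmatched characters fall back to single-character tokens. The branches follow the rules exactly as they are written above. Function names are illustrative, and Latin letters stand in for characters so that the matching is easy to trace.

```python
def forward_max_match(text, lexicon, max_len):
    """Left-to-right maximum matching."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, lexicon, max_len):
    """Right-to-left maximum matching; tokens are returned in text order."""
    tokens, end = [], len(text)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                end -= size
                break
    tokens.reverse()
    return tokens


def count_phrases_and_singles(tokens):
    """Assumption: a phrase is any token longer than one character."""
    singles = sum(1 for token in tokens if len(token) == 1)
    return len(tokens) - singles, singles


def bidirectional_segment(text, lexicon):
    """Choose between the two results using the four rules stated above."""
    max_len = max((len(word) for word in lexicon), default=1)
    first = forward_max_match(text, lexicon, max_len)
    second = backward_max_match(text, lexicon, max_len)
    n1, n3 = count_phrases_and_singles(first)    # first and third quantity
    n2, n4 = count_phrases_and_singles(second)   # second and fourth quantity
    if n1 == n2:
        return first if n3 <= n4 else second
    return second if n1 > n2 else first


result = bidirectional_segment("abcd", {"ab", "bcd"})
```

In this toy input the two directions find the same number of phrases, so the result with fewer single characters is kept.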
In addition, the present invention also provides a computer-readable storage medium that stores a potential
disease inference system for medical text. The potential disease inference system for medical text can be
executed by at least one processor, so that the at least one processor performs the steps of the potential
disease inference method for medical text described in the above embodiments. The specific implementation of
steps such as S10, S20, and S30 of the method is as described above and is not repeated here.
It should be noted that, herein, the terms "include", "comprise", or any other variant thereof are intended
to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of
elements includes not only those elements but also other elements not expressly listed, or elements inherent
to such a process, method, article, or device. Without further limitation, an element defined by the phrase
"including a ..." does not exclude the presence of other identical elements in the process, method, article,
or device that includes that element.
From the description of the above embodiments, those skilled in the art can clearly understand that the
methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware
platform, or of course by hardware, although in many cases the former is the better implementation. Based on
this understanding, the technical solution of the present invention, in essence or in the part that
contributes to the prior art, may be embodied in the form of a software product. The computer software
product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes a
number of instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an
air conditioner, a network device, or the like) to perform the methods described in the embodiments of the
present invention.
The preferred embodiments of the present invention have been described above, which does not thereby limit
the scope of the rights of the present invention. The serial numbers of the above embodiments are for
description only and do not indicate the relative merits of the embodiments. In addition, although a logical
order is shown in the flowcharts, in some cases the steps shown or described may be performed in an order
different from that shown here.
Those skilled in the art can implement the present invention in many variations without departing from its
scope and essence; for example, a feature of one embodiment may be used in another embodiment to obtain yet
another embodiment. Any modification, equivalent replacement, or improvement made within the technical
concept of the present invention shall fall within the scope of the rights of the present invention.
Claims (10)
1. A potential disease inference method for medical text, characterized in that the method comprises the following steps:
A. segmenting the received medical text into words, matching each segmented word of the medical text against
a predetermined medical-domain lexicon, and extracting the medical vocabulary from the segmented words of the medical text;
B. determining, based on a pre-built medical professional database, the disease corresponding to the medical
vocabulary in the medical text, wherein the medical professional database contains mapping relationships
between different types of diseases and medical vocabulary;
C. outputting the determined disease as the inferred potential disease of the medical text.
2. The potential disease inference method for medical text according to claim 1, characterized in that, before step A, the method further comprises:
obtaining medical data from a predetermined data source, finding the one or more medical vocabulary items
corresponding to each disease in the medical data, and establishing the medical professional database
according to the mapping relationships between different types of diseases and medical vocabulary.
3. The potential disease inference method for medical text according to claim 1, characterized in that the medical vocabulary comprises:
medical vocabulary in the profile information, symptom information, complication information, treatment drug
information, or treating department information corresponding to a disease.
4. The potential disease inference method for medical text according to any one of claims 1-3, characterized
in that the medical professional database further contains a weight for each medical vocabulary item
corresponding to a disease, and step B comprises:
based on the pre-built medical professional database, finding the diseases corresponding to each medical
vocabulary item in the medical text, calculating for each disease the sum of the weights of its corresponding
medical vocabulary items, and selecting the disease with the highest weight sum as the determined disease
corresponding to the medical text.
5. The potential disease inference method for medical text according to any one of claims 1-3, characterized
in that the step of segmenting the received medical text comprises:
matching the medical text against the predetermined medical-domain lexicon according to the forward maximum
matching method to obtain a first matching result, the first matching result containing a first quantity of
first phrases and a third quantity of single characters;
matching the medical text against the predetermined medical-domain lexicon according to the reverse maximum
matching method to obtain a second matching result, the second matching result containing a second quantity
of second phrases and a fourth quantity of single characters;
if the first quantity is equal to the second quantity, and the third quantity is less than or equal to the
fourth quantity, taking the first matching result as the segmentation result of the medical text;
if the first quantity is equal to the second quantity, and the third quantity is greater than the fourth
quantity, taking the second matching result as the segmentation result of the medical text;
if the first quantity is not equal to the second quantity, and the first quantity is greater than the second
quantity, taking the second matching result as the segmentation result of the medical text;
if the first quantity is not equal to the second quantity, and the first quantity is less than the second
quantity, taking the first matching result as the segmentation result of the medical text.
6. An electronic device, characterized in that the electronic device comprises a memory, a processor, and a
potential disease inference system for medical text that is stored in the memory and can run on the
processor, wherein the following steps are implemented when the potential disease inference system for
medical text is executed by the processor:
A. segmenting the received medical text into words, matching each segmented word of the medical text against
a predetermined medical-domain lexicon, and extracting the medical vocabulary from the segmented words of the medical text;
B. determining, based on a pre-built medical professional database, the disease corresponding to the medical
vocabulary in the medical text, wherein the medical professional database contains mapping relationships
between different types of diseases and medical vocabulary;
C. outputting the determined disease as the inferred potential disease of the medical text.
7. The electronic device according to claim 6, characterized in that, before step A, the processor is further
configured to execute the potential disease inference system for medical text to implement the following step:
obtaining medical data from a predetermined data source, finding the one or more medical vocabulary items
corresponding to each disease in the medical data, and establishing the medical professional database
according to the mapping relationships between different types of diseases and medical vocabulary.
8. The electronic device according to claim 6, characterized in that the medical vocabulary comprises:
medical vocabulary in the profile information, symptom information, complication information, treatment drug
information, or treating department information corresponding to a disease.
9. The electronic device according to any one of claims 6-8, characterized in that the medical professional
database further contains a weight for each medical vocabulary item corresponding to a disease, and step B comprises:
based on the pre-built medical professional database, finding the diseases corresponding to each medical
vocabulary item in the medical text, calculating for each disease the sum of the weights of its corresponding
medical vocabulary items, and selecting the disease with the highest weight sum as the determined disease
corresponding to the medical text.
10. A computer-readable storage medium storing a potential disease inference system for medical text, wherein
the potential disease inference system for medical text can be executed by at least one processor, so that
the at least one processor performs the steps of the potential disease inference method for medical text
according to any one of claims 1-5.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710313520.1A CN107680689A (en) | 2017-05-05 | 2017-05-05 | Potential disease estimating method, system and the readable storage medium storing program for executing of medical text |
PCT/CN2018/076149 WO2018201772A1 (en) | 2017-05-05 | 2018-02-10 | Method and system for inferring potential disease from medical text, and readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710313520.1A CN107680689A (en) | 2017-05-05 | 2017-05-05 | Potential disease estimating method, system and the readable storage medium storing program for executing of medical text |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107680689A true CN107680689A (en) | 2018-02-09 |
Family
ID=61134116
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710313520.1A Pending CN107680689A (en) | 2017-05-05 | 2017-05-05 | Potential disease estimating method, system and the readable storage medium storing program for executing of medical text |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN107680689A (en) |
WO (1) | WO2018201772A1 (en) |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2018201772A1 (en) * | 2017-05-05 | 2018-11-08 | 平安科技(深圳)有限公司 | Method and system for inferring potential disease from medical text, and readable storage medium |
CN109036506A (en) * | 2018-07-25 | 2018-12-18 | 平安科技(深圳)有限公司 | Monitoring and managing method, electronic device and the readable storage medium storing program for executing of internet medical treatment interrogation |
CN109192321A (en) * | 2018-09-26 | 2019-01-11 | 北京理工大学 | The construction method and calculating storage device of drug knowledge mapping |
CN109192300A (en) * | 2018-08-17 | 2019-01-11 | 百度在线网络技术(北京)有限公司 | Intelligent way of inquisition, system, computer equipment and storage medium |
CN109215754A (en) * | 2018-09-10 | 2019-01-15 | 平安科技(深圳)有限公司 | Medical record data processing method, device, computer equipment and storage medium |
CN109616165A (en) * | 2018-11-07 | 2019-04-12 | 平安科技(深圳)有限公司 | Medical information methods of exhibiting and device |
CN109698018A (en) * | 2018-12-24 | 2019-04-30 | 广州天鹏计算机科技有限公司 | Medical text handling method, device, computer equipment and storage medium |
WO2020034810A1 (en) * | 2018-08-14 | 2020-02-20 | 平安医疗健康管理股份有限公司 | Search method and apparatus, computer device and storage medium |
WO2020103469A1 (en) * | 2018-05-29 | 2020-05-28 | 平安医疗健康管理股份有限公司 | Method and device for establishing medical mapping database, computer apparatus, and storage medium |
WO2020177230A1 (en) * | 2019-03-07 | 2020-09-10 | 平安科技(深圳)有限公司 | Medical data classification method and apparatus based on machine learning, and computer device and storage medium |
CN112002416A (en) * | 2020-08-23 | 2020-11-27 | 吾征智能技术(北京)有限公司 | Disease symptom prediction system based on urine character self-learning |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102915299A (en) * | 2012-10-23 | 2013-02-06 | 海信集团有限公司 | Word segmentation method and device |
CN104102816A (en) * | 2014-06-20 | 2014-10-15 | 周晋 | Symptom match and machine learning-based automatic diagnosis system and method |
CN104484845A (en) * | 2014-12-30 | 2015-04-01 | 天津迈沃医药技术有限公司 | Disease self-analysis method based on medical ontology database |
CN104915413A (en) * | 2015-06-05 | 2015-09-16 | 广东顺德中山大学卡内基梅隆大学国际联合研究院 | Health monitoring method and health monitoring system |
CN105139237A (en) * | 2015-09-25 | 2015-12-09 | 百度在线网络技术(北京)有限公司 | Information push method and apparatus |
CN106372439A (en) * | 2016-09-21 | 2017-02-01 | 北京大学 | Method for acquiring and processing disease symptoms and weight knowledge thereof based on case library |
CN106557653A (en) * | 2016-11-15 | 2017-04-05 | 合肥工业大学 | A kind of portable medical intelligent medical guide system and method |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8145664B2 (en) * | 2008-08-15 | 2012-03-27 | Siemens Aktiengesellschaft | Disease oriented user interfaces |
CN105138829B (en) * | 2015-08-13 | 2018-01-12 | 易保互联医疗信息科技(北京)有限公司 | A kind of natural language processing method and system of Chinese medical information |
CN105095665B (en) * | 2015-08-13 | 2018-07-06 | 易保互联医疗信息科技(北京)有限公司 | A kind of natural language processing method and system of Chinese medical diagnosis on disease information |
CN106095913A (en) * | 2016-06-08 | 2016-11-09 | 广州同构医疗科技有限公司 | A kind of electronic health record text structure method |
CN107680689A (en) * | 2017-05-05 | 2018-02-09 | 平安科技(深圳)有限公司 | Potential disease estimating method, system and the readable storage medium storing program for executing of medical text |
- 2017-05-05: CN application CN201710313520.1A, published as CN107680689A; status: active, Pending
- 2018-02-10: WO application PCT/CN2018/076149, published as WO2018201772A1; status: active, Application Filing
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102915299A (en) * | 2012-10-23 | 2013-02-06 | 海信集团有限公司 | Word segmentation method and device |
CN104765724A (en) * | 2012-10-23 | 2015-07-08 | 海信集团有限公司 | Word segmenting method and device |
CN104102816A (en) * | 2014-06-20 | 2014-10-15 | 周晋 | Symptom match and machine learning-based automatic diagnosis system and method |
CN104484845A (en) * | 2014-12-30 | 2015-04-01 | 天津迈沃医药技术有限公司 | Disease self-analysis method based on medical ontology database |
CN104915413A (en) * | 2015-06-05 | 2015-09-16 | 广东顺德中山大学卡内基梅隆大学国际联合研究院 | Health monitoring method and health monitoring system |
CN105139237A (en) * | 2015-09-25 | 2015-12-09 | 百度在线网络技术(北京)有限公司 | Information push method and apparatus |
CN106372439A (en) * | 2016-09-21 | 2017-02-01 | 北京大学 | Method for acquiring and processing disease symptoms and weight knowledge thereof based on case library |
CN106557653A (en) * | 2016-11-15 | 2017-04-05 | 合肥工业大学 | A kind of portable medical intelligent medical guide system and method |
Also Published As
Publication number | Publication date |
---|---|
WO2018201772A1 (en) | 2018-11-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107680689A (en) | Potential disease estimating method, system and the readable storage medium storing program for executing of medical text | |
CN108629043B (en) | Webpage target information extraction method, device and storage medium | |
CN113051356B (en) | Open relation extraction method and device, electronic equipment and storage medium | |
CN113821622B (en) | Answer retrieval method and device based on artificial intelligence, electronic equipment and medium | |
CN111695354A (en) | Text question-answering method and device based on named entity and readable storage medium | |
CN112988963B (en) | User intention prediction method, device, equipment and medium based on multi-flow nodes | |
CN113378970B (en) | Sentence similarity detection method and device, electronic equipment and storage medium | |
CN113360654B (en) | Text classification method, apparatus, electronic device and readable storage medium | |
CN111723870A (en) | Data set acquisition method, device, equipment and medium based on artificial intelligence | |
CN115238670B (en) | Information text extraction method, device, equipment and storage medium | |
CN113657105A (en) | Medical entity extraction method, device, equipment and medium based on vocabulary enhancement | |
CN113626704A (en) | Method, device and equipment for recommending information based on word2vec model | |
CN116450829A (en) | Medical text classification method, device, equipment and medium | |
CN113344125B (en) | Long text matching recognition method and device, electronic equipment and storage medium | |
CN114840684A (en) | Map construction method, device and equipment based on medical entity and storage medium | |
CN116821373A (en) | Map-based prompt recommendation method, device, equipment and medium | |
CN113918704A (en) | Question-answering method and device based on machine learning, electronic equipment and medium | |
CN112632264A (en) | Intelligent question and answer method and device, electronic equipment and storage medium | |
CN116341646A (en) | Pretraining method and device of Bert model, electronic equipment and storage medium | |
CN116739001A (en) | Text relation extraction method, device, equipment and medium based on contrast learning | |
CN116468025A (en) | Electronic medical record structuring method and device, electronic equipment and storage medium | |
CN116628162A (en) | Semantic question-answering method, device, equipment and storage medium | |
CN115346095A (en) | Visual question answering method, device, equipment and storage medium | |
CN114595321A (en) | Question marking method and device, electronic equipment and storage medium | |
CN113204962A (en) | Word sense disambiguation method, device, equipment and medium based on graph expansion structure |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180209 |