CN106844325B - Medical information processing method and medical information processing apparatus - Google Patents

Publication number
CN106844325B
Authority
CN
China
Prior art keywords
medical
words
association
texts
medical texts
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510886242.XA
Other languages
Chinese (zh)
Other versions
CN106844325A (en
Inventor
王宏波
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University Medical Information Technology Co ltd
Original Assignee
Peking University Medical Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University Medical Information Technology Co ltd
Priority to CN201510886242.XA
Publication of CN106844325A
Application granted
Publication of CN106844325B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F19/34

Abstract

The invention provides a medical information processing method and a medical information processing apparatus. The method comprises: performing word segmentation on a plurality of medical texts and clustering the plurality of medical texts; determining the association degree of every two medical texts according to the words of every two medical texts in the same category; judging, according to the association degree of every two medical texts, whether the words of any two medical texts in the same category have an association relationship; and, when the judgment result is yes, storing the words having an association relationship in association. With this technical solution, the words having an association relationship in the medical texts can be mined more accurately and comprehensively, so that a medical word bank built from these words is more accurate and comprehensive.

Description

Medical information processing method and medical information processing apparatus
Technical Field
The present invention relates to the field of information processing technologies, and in particular, to a medical information processing method and a medical information processing apparatus.
Background
At present, the informatization of medical services is an international trend. With the rapid development of information technology, more and more hospitals in China are accelerating overall construction based on an informatization platform and a Hospital Information System (HIS) in order to improve their service level and core competitiveness. Medical informatization not only improves doctors' working efficiency, giving them more time to serve patients, but also improves patients' satisfaction and trust, and in doing so builds the hospital's image as a technologically advanced institution. Therefore, the gradual integration of medical service applications with the basic network platform is becoming a new direction for the informatization of domestic hospitals, especially large and medium-sized ones.
In the medical informatization process, building a medical word bank is important, fundamental work: it supports the digitization of medical records, the analysis of the large volume of unstructured medical text on the Internet, and the intelligent analysis of patients' medical records. Although well-established medical word bank systems exist abroad, they are not suitable for domestic use, where Chinese is the native language. English-Chinese parallel corpora, Chinese medicine and pharmacy lexicons, and the like have also been built domestically; however, the words in domestic medical lexicons are neither comprehensive nor always correct.
Therefore, how to construct a more accurate and comprehensive medical word stock becomes a problem to be solved urgently.
Disclosure of Invention
In view of the above problems, the invention provides a new technical solution that can mine words having an association relationship in medical texts more accurately and comprehensively, so that a medical word bank built from these words is more accurate and comprehensive.
In view of the above, an aspect of the present invention provides a medical information processing method, including: performing word segmentation on a plurality of medical texts, and clustering the plurality of medical texts; determining the association degree of every two medical texts according to the words of every two medical texts in the medical texts of the same category; judging whether words of any two medical texts in the medical texts of the same category have an association relation or not according to the association degree of every two medical texts; and when the judgment result is yes, performing association storage on the words with the association relation.
In this technical solution, the association degree of every two medical texts is determined according to the words of every two medical texts in the same category; whether an association relationship exists between the words of any two medical texts in that category is judged according to these association degrees; and the words having an association relationship are stored in association, for example in a medical word bank, so as to build a more complete medical word bank. For example, the words in medical text A are "cold" and "fever", the words in medical text B are "pyrexia" and "cough", and the words in medical text C are "cough" and "chill". A and B contain similar words ("fever" and "pyrexia"), and the association degree between A and B is 30%; B and C contain the same word ("cough"), and the association degree between B and C is 50%. A and C share no identical or similar words, but because A is associated with B and B is associated with C, it can be determined that A and C are also associated, that is, an association relationship exists between the words of A and the words of C. The invention can therefore also mine words with an implicit association relationship, so that the associated words in the medical texts are mined more accurately and comprehensively. Furthermore, a search engine for medical information can be built on the associated words, or automatic analysis of medical text information can be implemented, making it easier for outpatient doctors and patients to look up diseases and symptoms.
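The chained reasoning in this example (A is associated with B, B with C, hence A with C) can be sketched as a graph traversal. This is a minimal illustration under stated assumptions, not the patented implementation: the 30% and 50% figures come from the example, while "pyrexia" and "chill" are assumed renderings of the near-synonymous terms in B and C, and the threshold `MIN_DEGREE`, the function names, and the use of connected components are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Words per text and pairwise degrees, adapted from the example above.
TEXT_WORDS = {
    "A": {"cold", "fever"},
    "B": {"pyrexia", "cough"},
    "C": {"cough", "chill"},
}
PAIR_DEGREE = {("A", "B"): 0.30, ("B", "C"): 0.50}

MIN_DEGREE = 0.10  # assumed threshold; the source does not specify one


def associated_groups(pair_degree, min_degree=MIN_DEGREE):
    """Group texts that are linked directly or through intermediate texts."""
    graph = defaultdict(set)
    for (left, right), degree in pair_degree.items():
        if degree >= min_degree:
            graph[left].add(right)
            graph[right].add(left)
    seen, groups = set(), []
    for start in sorted(graph):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:  # depth-first walk over one connected component
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        groups.append(group)
    return groups


def associated_word_pairs(text_words, groups):
    """Pair up distinct words drawn from two different texts in the same group."""
    pairs = set()
    for group in groups:
        for left, right in combinations(sorted(group), 2):
            for w1 in text_words[left]:
                for w2 in text_words[right]:
                    if w1 != w2:
                        pairs.add(frozenset((w1, w2)))
    return pairs


groups = associated_groups(PAIR_DEGREE)
pairs = associated_word_pairs(TEXT_WORDS, groups)
```

Here A and C land in the same group only because both are linked to B, so the pair "cold"/"chill" is reported even though A and C share no words.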
Preferably, the plurality of medical texts may be electronic medical records from a hospital's medical system, or may be obtained from professional medical websites by a crawler program. Because the volume of medical texts is large, they may be stored in a distributed file system.
In the above technical solution, preferably, the step of storing the words having an association relationship in association further includes: determining the association degree of the words in any two medical texts according to the association degree of those two medical texts; and storing the association degree of the words in the two medical texts.
In this technical solution, the association degree of the words in any two medical texts is determined according to the association degree of those two medical texts. Specifically, the association degree of the two medical texts may be used directly as the association degree of their words; alternatively, the association degree of the words may be calculated according to a preset algorithm, so that the stored value reflects how closely the words are related more accurately and intuitively. For example, the words in medical text A are "cold" and "fever", and the words in medical text C are "cough" and "chill"; if the association degree between A and C is 10%, the association degree between "cold" and "cough" is 10%.
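The simpler of the two options, reusing the text-level degree for every cross-text word pair, might look like the sketch below. The function names and the policy of keeping the larger value when a pair is seen twice are assumptions; the source does not say how repeated pairs are handled, and "chill" is an assumed rendering of the second word of text C.

```python
from itertools import product


def word_degrees(words_a, words_b, text_degree):
    """Assign the text-level degree to every pair of distinct cross-text words."""
    return {
        frozenset((w1, w2)): text_degree
        for w1, w2 in product(words_a, words_b)
        if w1 != w2
    }


def store(lexicon, new_degrees):
    """Merge degrees into the lexicon, keeping the larger value (assumed policy)."""
    for pair, degree in new_degrees.items():
        lexicon[pair] = max(degree, lexicon.get(pair, 0.0))


lexicon = {}
store(lexicon, word_degrees({"cold", "fever"}, {"cough", "chill"}, 0.10))
```

After the call, `lexicon[frozenset(("cold", "cough"))]` holds 0.10, matching the 10% figure in the example.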
In any one of the above technical solutions, preferably, the step of performing word segmentation on the plurality of medical texts specifically includes: performing word segmentation on the plurality of medical texts according to a dictionary and the parts of speech of the words in the plurality of medical texts.
In this technical solution, the plurality of medical texts may be segmented according to the words in a dictionary (preferably a medical dictionary) and their parts of speech. Specifically, a medical text is segmented according to the words in the dictionary; if a word in the medical text is not in the dictionary, its part of speech is used to judge whether it is associated with the preceding and following words and whether they need to be merged into a new word. This effectively avoids wrongly segmented and omitted words, and further ensures the accuracy and comprehensiveness of word segmentation.
In any one of the above technical solutions, preferably, the step of clustering the plurality of medical texts specifically includes: clustering the plurality of medical texts according to the International Classification of Diseases and a K-means algorithm.
In this technical solution, the plurality of medical texts may be clustered according to the International Classification of Diseases (ICD) and a K-means algorithm. Because medical texts clustered into the same category concern the same disease, their words are highly likely to be associated; restricting further processing to medical texts of the same category therefore keeps processing fast.
In any one of the above technical solutions, preferably, the step of storing the words having an association relationship in association specifically includes: storing the words having an association relationship according to the attributes of those words.
In this technical solution, the words having an association relationship are stored according to their attributes. For example, the attributes of a word include: body part (such as head and limbs), predicate (such as pain and strain), disease (such as fever and heart disease), medicine (such as Gregorian tablets and glucose injection), treatment means (such as intravenous drip and anesthesia), and ignored word (such as "home" and "patient", which contribute nothing to information extraction). Storing the words in this way keeps the associated words better organized.
Another aspect of the present invention provides a medical information processing apparatus including: the processing unit is used for segmenting a plurality of medical texts and clustering the medical texts; the first determination unit is used for determining the association degree of every two medical texts according to the words of every two medical texts in the medical texts of the same category; the judging unit is used for judging whether words of any two medical texts in the medical texts of the same category have an association relation or not according to the association degree of every two medical texts; and the storage unit is used for associating and storing the words with the association relation when the judgment result is yes.
In this technical solution, the association degree of every two medical texts is determined according to the words of every two medical texts in the same category; whether an association relationship exists between the words of any two medical texts in that category is judged according to these association degrees; and the words having an association relationship are stored in association, for example in a medical word bank, so as to build a more complete medical word bank. For example, the words in medical text A are "cold" and "fever", the words in medical text B are "pyrexia" and "cough", and the words in medical text C are "cough" and "chill". A and B contain similar words ("fever" and "pyrexia"), and the association degree between A and B is 30%; B and C contain the same word ("cough"), and the association degree between B and C is 50%. A and C share no identical or similar words, but because A is associated with B and B is associated with C, it can be determined that A and C are also associated, that is, an association relationship exists between the words of A and the words of C. The invention can therefore also mine words with an implicit association relationship, so that the associated words in the medical texts are mined more accurately and comprehensively. Furthermore, a search engine for medical information can be built on the associated words, or automatic analysis of medical text information can be implemented, making it easier for outpatient doctors and patients to look up diseases and symptoms.
Preferably, the plurality of medical texts may be electronic medical records from a hospital's medical system, or may be obtained from professional medical websites by a crawler program. Because the volume of medical texts is large, they may be stored in a distributed file system.
In the above technical solution, preferably, the storage unit includes: a second determining unit, configured to determine the association degree of the words in any two medical texts according to the association degree of those two medical texts; the storage unit is specifically configured to store the association degree of the words in the two medical texts.
In this technical solution, the association degree of the words in any two medical texts is determined according to the association degree of those two medical texts. Specifically, the association degree of the two medical texts may be used directly as the association degree of their words; alternatively, the association degree of the words may be calculated according to a preset algorithm, so that the stored value reflects how closely the words are related more accurately and intuitively. For example, the words in medical text A are "cold" and "fever", and the words in medical text C are "cough" and "chill"; if the association degree between A and C is 10%, the association degree between "cold" and "cough" is 10%.
In any one of the above technical solutions, preferably, the processing unit includes: a word segmentation unit, configured to perform word segmentation on the plurality of medical texts according to a dictionary and the parts of speech of the words in the plurality of medical texts.
In this technical solution, the plurality of medical texts may be segmented according to the words in a dictionary (preferably a medical dictionary) and their parts of speech. Specifically, a medical text is segmented according to the words in the dictionary; if a word in the medical text is not in the dictionary, its part of speech is used to judge whether it is associated with the preceding and following words and whether they need to be merged into a new word. This effectively avoids wrongly segmented and omitted words, and further ensures the accuracy and comprehensiveness of word segmentation.
In any one of the above technical solutions, preferably, the processing unit includes: a clustering unit, configured to cluster the plurality of medical texts according to the International Classification of Diseases and a K-means algorithm.
In this technical solution, the plurality of medical texts may be clustered according to the International Classification of Diseases (ICD) and a K-means algorithm. Because medical texts clustered into the same category concern the same disease, their words are highly likely to be associated; restricting further processing to medical texts of the same category therefore keeps processing fast.
In any of the foregoing technical solutions, preferably, the storage unit is specifically configured to store the words having an association relationship according to the attributes of those words.
In this technical solution, the words having an association relationship are stored according to their attributes. For example, the attributes of a word include: body part (such as head and limbs), predicate (such as pain and strain), disease (such as fever and heart disease), medicine (such as Gregorian tablets and glucose injection), treatment means (such as intravenous drip and anesthesia), and ignored word (such as "home" and "patient", which contribute nothing to information extraction). Storing the words in this way keeps the associated words better organized.
With the technical solution of the invention, the words having an association relationship in the medical texts can be mined more accurately and comprehensively, so that a medical word bank built from these words is more accurate and comprehensive.
Drawings
Fig. 1 shows a flow diagram of a medical information processing method according to an embodiment of the invention;
Fig. 2 shows a schematic configuration diagram of a medical information processing apparatus according to an embodiment of the present invention;
Fig. 3 shows a schematic diagram of a medical information processing apparatus according to an embodiment of the invention.
Detailed Description
So that the manner in which the above recited objects, features and advantages of the present invention can be understood in detail, a more particular description of the invention, briefly summarized above, may be had by reference to the embodiments thereof which are illustrated in the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.
Fig. 1 shows a flow diagram of a medical information processing method according to an embodiment of the present invention.
As shown in Fig. 1, a medical information processing method according to an embodiment of the present invention includes:
Step 102, performing word segmentation on a plurality of medical texts, and clustering the plurality of medical texts;
Step 104, determining the association degree of every two medical texts according to the words of every two medical texts in the medical texts of the same category;
Step 106, judging whether the words of any two medical texts in the medical texts of the same category have an association relationship according to the association degree of every two medical texts; if so, proceeding to step 108; otherwise, ending the process;
Step 108, storing the words having an association relationship in association.
In this technical solution, the association degree of every two medical texts is determined according to the words of every two medical texts in the same category; whether an association relationship exists between the words of any two medical texts in that category is judged according to these association degrees; and the words having an association relationship are stored in association, for example in a medical word bank, so as to build a more complete medical word bank. For example, the words in medical text A are "cold" and "fever", the words in medical text B are "pyrexia" and "cough", and the words in medical text C are "cough" and "chill". A and B contain similar words ("fever" and "pyrexia"), and the association degree between A and B is 30%; B and C contain the same word ("cough"), and the association degree between B and C is 50%. A and C share no identical or similar words, but because A is associated with B and B is associated with C, it can be determined that A and C are also associated, that is, an association relationship exists between the words of A and the words of C. The invention can therefore also mine words with an implicit association relationship, so that the associated words in the medical texts are mined more accurately and comprehensively. Furthermore, a search engine for medical information can be built on the associated words, or automatic analysis of medical text information can be implemented, making it easier for outpatient doctors and patients to look up diseases and symptoms.
Preferably, the plurality of medical texts may be electronic medical records from a hospital's medical system, or may be obtained from professional medical websites by a crawler program. Because the volume of medical texts is large, they may be stored in a distributed file system.
In the above technical solution, preferably, step 108 further includes: determining the association degree of the words in any two medical texts according to the association degree of those two medical texts; and storing the association degree of the words in the two medical texts.
In this technical solution, the association degree of the words in any two medical texts is determined according to the association degree of those two medical texts. Specifically, the association degree of the two medical texts may be used directly as the association degree of their words; alternatively, the association degree of the words may be calculated according to a preset algorithm, so that the stored value reflects how closely the words are related more accurately and intuitively. For example, the words in medical text A are "cold" and "fever", and the words in medical text C are "cough" and "chill"; if the association degree between A and C is 10%, the association degree between "cold" and "cough" is 10%.
In any one of the above technical solutions, preferably, the step of performing word segmentation on the plurality of medical texts specifically includes: performing word segmentation on the plurality of medical texts according to a dictionary and the parts of speech of the words in the plurality of medical texts.
In this technical solution, the plurality of medical texts may be segmented according to the words in a dictionary (preferably a medical dictionary) and their parts of speech. Specifically, a medical text is segmented according to the words in the dictionary; if a word in the medical text is not in the dictionary, its part of speech is used to judge whether it is associated with the preceding and following words and whether they need to be merged into a new word. This effectively avoids wrongly segmented and omitted words, and further ensures the accuracy and comprehensiveness of word segmentation. Preferably, the words obtained by segmenting the medical texts are medical words, so that irrelevant words (such as "every day", "patient", and "home") do not interfere with determining the association degree of the medical texts.
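A minimal sketch of dictionary-driven segmentation follows, using forward maximum matching on unspaced Chinese text. Everything concrete here is an assumption made for illustration: the dictionary entries, the sample sentence, and the function names are invented, and merging runs of adjacent out-of-dictionary characters is only a stand-in for the part-of-speech test, which the source does not specify.

```python
# Toy dictionary: word -> (part of speech, is_medical). All entries are invented.
DICTIONARY = {
    "患者": ("n", False),  # patient (non-medical filler)
    "每天": ("t", False),  # every day (non-medical filler)
    "咳嗽": ("v", True),   # cough
    "发热": ("v", True),   # fever
    "无痰": ("a", True),   # no phlegm
}
MAX_WORD_LEN = max(len(word) for word in DICTIONARY)


def segment(text):
    """Forward maximum matching; unmatched characters become 1-char tokens."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(MAX_WORD_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in DICTIONARY:
                tokens.append((piece, True))
                i += size
                break
        else:  # no dictionary word starts at position i
            tokens.append((text[i], False))
            i += 1
    return tokens


def merge_unknown(tokens):
    """Join runs of adjacent out-of-dictionary characters into candidate new words."""
    merged = []
    for piece, known in tokens:
        if not known and merged and not merged[-1][1]:
            merged[-1] = (merged[-1][0] + piece, False)
        else:
            merged.append((piece, known))
    return merged


def medical_words(text):
    """Segment, merge unknown runs, and drop known non-medical words."""
    words = []
    for piece, known in merge_unknown(segment(text)):
        if known and not DICTIONARY[piece][1]:
            continue
        words.append(piece)
    return words


result = medical_words("患者每天咳嗽咽痒无痰")
```

In this run the two characters of "咽痒" (itchy throat) are absent from the toy dictionary, so they are merged into one candidate word, while "患者" and "每天" are filtered out as non-medical.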
In any one of the above technical solutions, preferably, the step of clustering the plurality of medical texts specifically includes: clustering the plurality of medical texts according to the International Classification of Diseases and a K-means algorithm.
In this technical solution, the plurality of medical texts may be clustered according to the International Classification of Diseases (ICD) and a K-means algorithm. Because medical texts clustered into the same category concern the same disease, their words are highly likely to be associated; restricting further processing to medical texts of the same category therefore keeps processing fast.
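The source does not say how the ICD and K-means are combined. One plausible reading, sketched below in plain Python, is to seed each K-means centroid with typical words for a disease category and then run the standard iteration on bag-of-words vectors. The category names, seed words, sample texts, and function names are all hypothetical.

```python
def vectorize(words, vocabulary):
    """Bag-of-words count vector over a fixed vocabulary."""
    return [float(words.count(term)) for term in vocabulary]


def squared_distance(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))


def k_means(vectors, seeds, max_iter=20):
    """Lloyd's algorithm started from caller-supplied seed centroids."""
    centroids = [list(seed) for seed in seeds]
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda k: squared_distance(vector, centroids[k]))
            for vector in vectors
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for k in range(len(centroids)):
            members = [v for v, label in zip(vectors, labels) if label == k]
            if members:  # an empty cluster keeps its previous centroid
                centroids[k] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Segmented texts (invented) and seed words per assumed disease category.
texts = [
    ["cough", "fever", "sore throat"],
    ["cough", "phlegm"],
    ["stomach ache", "nausea"],
    ["nausea", "vomiting", "stomach ache"],
]
category_seed_words = {"respiratory": ["cough"], "digestive": ["stomach ache"]}

vocabulary = sorted({word for text in texts for word in text})
vectors = [vectorize(text, vocabulary) for text in texts]
seeds = [vectorize(words, vocabulary) for words in category_seed_words.values()]
labels = k_means(vectors, seeds)
```

With these inputs the first two texts fall into the cluster seeded by "cough" and the last two into the cluster seeded by "stomach ache", so later pairwise comparison only needs to run within each cluster.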
In any of the above technical solutions, preferably, step 108 specifically includes: storing the words having an association relationship according to the attributes of those words.
In this technical solution, the words having an association relationship are stored according to their attributes. For example, the attributes of a word include: body part (such as head and limbs), predicate (such as pain and strain), disease (such as fever and heart disease), medicine (such as Gregorian tablets and glucose injection), treatment means (such as intravenous drip and anesthesia), and ignored word (such as "home" and "patient", which contribute nothing to information extraction). Storing the words in this way keeps the associated words better organized.
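Attribute-based storage could be modelled as below. The six attribute names follow the list above; the `Lexicon` class, its methods, and the rule that ignored or unknown words are never stored in an association are assumptions for illustration.

```python
from collections import defaultdict
from enum import Enum


class Attribute(Enum):
    BODY_PART = "body part"
    PREDICATE = "predicate"
    DISEASE = "disease"
    MEDICINE = "medicine"
    TREATMENT = "treatment"
    IGNORED = "ignored"


class Lexicon:
    """In-memory word bank indexed by attribute (hypothetical structure)."""

    def __init__(self):
        self.words_by_attribute = defaultdict(set)
        self.attribute_of = {}
        self.associations = {}  # frozenset({w1, w2}) -> association degree

    def add_word(self, word, attribute):
        self.attribute_of[word] = attribute
        self.words_by_attribute[attribute].add(word)

    def associate(self, w1, w2, degree):
        """Store an association unless a word is unknown, ignored, or repeated."""
        if w1 == w2:
            return False
        for word in (w1, w2):
            if self.attribute_of.get(word, Attribute.IGNORED) is Attribute.IGNORED:
                return False
        self.associations[frozenset((w1, w2))] = degree
        return True

    def related(self, word, attribute=None):
        """Words associated with `word`, optionally filtered by attribute."""
        result = {}
        for pair, degree in self.associations.items():
            if word in pair:
                (other,) = pair - {word}
                if attribute is None or self.attribute_of[other] is attribute:
                    result[other] = degree
        return result


lexicon = Lexicon()
lexicon.add_word("head", Attribute.BODY_PART)
lexicon.add_word("pain", Attribute.PREDICATE)
lexicon.add_word("fever", Attribute.DISEASE)
lexicon.add_word("patient", Attribute.IGNORED)
stored = lexicon.associate("head", "pain", 0.5)
skipped = lexicon.associate("fever", "patient", 0.3)
```

Indexing by attribute lets a query such as `lexicon.related("head", Attribute.PREDICATE)` return only the symptoms linked to a body part.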
Fig. 2 shows a schematic configuration diagram of a medical information processing apparatus according to an embodiment of the present invention.
As shown in fig. 2, a medical information processing apparatus 200 according to an embodiment of the present invention includes: the processing unit 202 is configured to perform word segmentation on a plurality of medical texts and perform clustering on the plurality of medical texts; the first determining unit 204 is configured to determine, according to words of every two medical texts in the medical texts of the same category, a degree of association between every two medical texts; the judging unit 206 is configured to judge whether words of any two medical texts in the medical texts of the same category have an association relationship according to the association degree of each two medical texts; and a storage unit 208, configured to, if the determination result is yes, associate and store the words having an association relationship.
In the technical scheme, the association degree of every two medical texts is determined according to the words in every two medical texts in the medical texts of the same category, whether an association relationship exists between any two words in the medical texts of the same category is judged according to the association degree of every two medical texts, and the words with the association relationship are stored in an association manner, for example, in a medical word stock, so as to construct a more perfect medical word stock. For example, the words in the a-medical text are: cold and fever, the words in the B medical text are: fever and cough, the words in the C medical text are: cough and cold, it can be seen that a and B have similar words: fever and fever, 30% correlation between a and B, with the same words in B and C: in the cough, the association degree between B and C is 50%, and A and C do not have the same or similar words, but because A and B have an association, the association between A and C can be determined, that is, the association between the words of A and C exists. Therefore, the method and the device can further dig out the words with the implicit association relationship, so that the words with the association relationship in the medical text can be more accurately and comprehensively duout. Furthermore, a search engine of medical treatment information can be constructed according to the words with the incidence relation, or automatic analysis of medical treatment text information is realized, and convenience is provided for outpatients doctors and patients to inquire diseases and symptoms.
Preferably, the plurality of medical texts may be electronic medical records in a medical system of a hospital, or may be obtained from a medical professional website by using a crawler program. Because the scale of the medical texts is larger, the distributed file system can store the medical texts.
In the above technical solution, preferably, the storage unit 208 includes: the second determining unit 2082, configured to determine association degrees of words in any two medical texts according to the association degrees of any two medical texts; the storage unit 208 is specifically configured to store the association degrees of the words in any two medical texts.
In the technical scheme, the association degree of the words in any two medical texts is determined according to the association degree of any two medical texts, specifically, the association degree of any two medical texts can be used as the association degree of the words in any two medical texts, and the association degree of the words in any two medical texts can be calculated according to a preset algorithm, so that the association degree of the words can be reflected more accurately and intuitively according to the association degree of the words. For example, the words in the a-medical text are: cold and fever, the words in the C medical text are: cough and coolness, the degree of association between a and C is 10%, and the degree of association between cold and cough is 10%.
In any of the above technical solutions, preferably, the processing unit 202 includes: the word segmentation unit 2022 is configured to segment words of the plurality of medical texts according to the dictionary and parts of speech of the words in the plurality of medical texts.
In the technical scheme, the words of the medical texts can be cut according to words and parts of speech in a dictionary (preferably a medical dictionary), specifically, the words of the medical texts are cut according to the words in the dictionary, if the words in the medical texts do not exist in the dictionary, whether the words are associated with front and rear words or not is judged according to the parts of speech of the words, and whether new words need to be combined or not is judged, so that the situations of word miscut and word omission are effectively avoided, and the accuracy and the comprehensiveness of word cutting are further ensured. Preferably, the words obtained by segmenting the medical text are medical words, so as to avoid interference of irrelevant words (such as every day, patients, home) in determining the relevance of the medical text.
In any of the above technical solutions, preferably, the processing unit 202 includes: a clustering unit 2024, configured to cluster the plurality of medical texts according to international disease classification and K-means algorithm.
In the technical scheme, the plurality of medical texts can be clustered according to International Classification of Disease (International Classification of Disease) and a K-means algorithm, and since the medical texts of the same category obtained by clustering have the same Disease, the possibility of association among words of the medical texts of the same category obtained by clustering is high, and then the medical texts of the same category are further processed to ensure the processing speed.
In any of the foregoing technical solutions, preferably, the storage unit 208 is specifically configured to store the words having an association relationship according to the attribute of the words having an association relationship.
In this technical scheme, the words having an association relationship are stored according to their attributes. For example, the attributes of a word include: body parts (such as head and limbs), predicates (such as pain and strain), diseases (such as fever and heart disease), medicines (such as Gregorian tablets and glucose injection), treatment means (such as drip and anesthesia), and ignored words (such as "home" and "patient", which do not contribute to information extraction). Storing by attribute keeps the associated words better organized.
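Attribute-based storage can be pictured as a small keyed store. The sketch below is an assumption about the data structure only: the class name `WordStock` and its methods are hypothetical, and the attribute names simply mirror the categories listed above.

```python
from collections import defaultdict

# Attribute names taken from the categories listed in the description.
ATTRIBUTES = ("body part", "predicate", "disease", "medicine", "treatment", "ignored")


class WordStock:
    """Hypothetical in-memory word stock keyed by word attribute."""

    def __init__(self):
        self._by_attribute = defaultdict(set)

    def store(self, word, attribute):
        if attribute not in ATTRIBUTES:
            raise ValueError(f"unknown attribute: {attribute!r}")
        self._by_attribute[attribute].add(word)

    def words(self, attribute):
        # Sorted so that lookups return a stable, readable order.
        return sorted(self._by_attribute.get(attribute, ()))


stock = WordStock()
stock.store("head", "body part")
stock.store("fever", "disease")
stock.store("heart disease", "disease")
stock.store("patient", "ignored")
print(stock.words("disease"))  # ['fever', 'heart disease']
```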
Fig. 3 shows a schematic diagram of a medical information processing apparatus according to an embodiment of the invention.
As shown in fig. 3, the medical information processing apparatus 300 first obtains medical texts from medical professional websites by using crawler technology and obtains electronic medical records from the medical system of a hospital. Because the amount of information obtained from the websites and the medical system is large, the crawled medical texts and the electronic medical records are stored together in a distributed file system as a plurality of medical texts. Word segmentation and clustering are performed on the plurality of medical texts, and the association degree of every two medical texts in the same category is then calculated by the Jaccard method according to the words in the two texts. For example, for two medical texts A and B, the words obtained by segmenting A are "patient", "sore throat and itchy throat", "no phlegm", "stomach distension" and "lumbago", and the words obtained by segmenting B are "dry cough", "sore throat and itchy throat", "no phlegm", "stomachache", "waist soreness" and "fear of cold". Calculation yields the identical word pairs "sore throat and itchy throat" / "sore throat and itchy throat" and "no phlegm" / "no phlegm", and the highly similar word pairs "stomach distension" / "stomachache" and "lumbago" / "waist soreness". A vector cosine method is then used to determine whether any two medical texts in the same category have an association relationship, which yields word associations that cannot be obtained by calculating similarity with the Jaccard method alone.
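The two similarity measures named above can be illustrated with the example words of texts A and B. This is a minimal sketch: the throat term is written identically in both lists so that it counts as the same word, the cosine is computed on plain term-frequency vectors (the patent does not specify the weighting), and the function names are illustrative.

```python
import math
from collections import Counter


def jaccard(words_a, words_b):
    """Jaccard coefficient: shared words divided by all distinct words."""
    a, b = set(words_a), set(words_b)
    return len(a & b) / len(a | b) if a | b else 0.0


def cosine(words_a, words_b):
    """Cosine of the angle between term-frequency vectors of the two texts."""
    fa, fb = Counter(words_a), Counter(words_b)
    dot = sum(fa[w] * fb[w] for w in fa.keys() & fb.keys())
    norm = (math.sqrt(sum(v * v for v in fa.values()))
            * math.sqrt(sum(v * v for v in fb.values())))
    return dot / norm if norm else 0.0


text_a = ["patient", "sore throat and itchy throat", "no phlegm",
          "stomach distension", "lumbago"]
text_b = ["dry cough", "sore throat and itchy throat", "no phlegm",
          "stomachache", "waist soreness", "fear of cold"]

print(round(jaccard(text_a, text_b), 3))  # 0.222  (2 shared / 9 distinct words)
print(round(cosine(text_a, text_b), 3))   # 0.365  (2 / sqrt(5 * 6))
```

Exact-match measures like these treat "stomach distension" and "stomachache" as unrelated; how the near-synonym pairs mentioned in the example are detected is not spelled out in the description.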
For example, given the two medical texts A and B above and another medical text C, calculation shows that medical records A and C have an association relationship, so the words in A and the words in C have an association relationship (for example, an association with "tonsil inflammation"). The words having the association relationship are then stored in a medical word stock, thereby constructing a medical word stock oriented to actual medical scenarios.
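The step from associated texts to associated words can be sketched as a thresholded pairwise pass over one category. Everything specific here is an assumption for illustration: the threshold value, the rule of pairing only the words that the two texts do not share, the sample word lists, and the helper names are not taken from the patent.

```python
from itertools import combinations


def overlap(words_a, words_b):
    """Simple Jaccard overlap, used here as a stand-in similarity measure."""
    a, b = set(words_a), set(words_b)
    return len(a & b) / len(a | b) if a | b else 0.0


def build_word_associations(category_texts, similarity=overlap, threshold=0.3):
    """Return {frozenset({word1, word2}): degree} for texts judged associated.

    category_texts: dict mapping a text id to its list of segmented words,
    all taken from the same cluster. The threshold is a hypothetical value.
    """
    associations = {}
    for (_, words_a), (_, words_b) in combinations(category_texts.items(), 2):
        degree = similarity(words_a, words_b)
        if degree < threshold:
            continue  # texts are not considered associated
        # Pair the words that differ between the two associated texts.
        for wa in set(words_a) - set(words_b):
            for wb in set(words_b) - set(words_a):
                key = frozenset((wa, wb))
                associations[key] = max(degree, associations.get(key, 0.0))
    return associations


# Illustrative word lists; they are not the patent's texts A and C.
category = {
    "A": ["sore throat", "no phlegm", "fever"],
    "C": ["sore throat", "no phlegm", "tonsil inflammation"],
    "D": ["lumbago"],
}
word_stock = build_word_associations(category)
print(word_stock[frozenset(("fever", "tonsil inflammation"))])  # 0.5
```

Using a `frozenset` as the key makes the association symmetric, so looking up ("fever", "tonsil inflammation") and the reverse order gives the same entry.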
The technical scheme of the present invention has been explained in detail above with reference to the accompanying drawings. By analyzing real data (i.e. medical records) in the medical system of a hospital together with medical texts from medical professional websites, words having association relationships in the medical texts can be mined more accurately and comprehensively, so that a medical word stock oriented to actual medical scenarios is constructed.
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention, and various modifications and changes may be made by those skilled in the art. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (8)

1. A medical information processing method characterized by comprising:
performing word segmentation on a plurality of medical texts, and clustering the plurality of medical texts;
determining the association degree of every two medical texts according to the words of every two medical texts in the medical texts of the same category;
judging whether words of any two medical texts in the medical texts of the same category have an association relation or not according to the association degree of every two medical texts;
if so, performing association storage on the words with the association relation;
the step of performing association storage on the words with association relations specifically includes:
and storing the words with the association relation according to the attributes of the words with the association relation.
2. The medical information processing method according to claim 1, wherein the step of storing the words having the association relationship in association further includes:
determining the association degree of words in any two medical texts according to the association degree of any two medical texts;
and storing the association degree of the words in any two medical texts.
3. The medical information processing method according to claim 1, wherein the step of segmenting the plurality of medical texts specifically includes:
and performing word segmentation on the medical texts according to the dictionary and the parts of speech of the words in the medical texts.
4. The medical information processing method according to claim 1, wherein the step of clustering the plurality of medical texts specifically includes:
clustering the plurality of medical texts according to international disease classification and K-means algorithm.
5. A medical information processing apparatus characterized by comprising:
the processing unit is used for segmenting a plurality of medical texts and clustering the medical texts;
the first determination unit is used for determining the association degree of every two medical texts according to the words of every two medical texts in the medical texts of the same category;
the judging unit is used for judging whether words of any two medical texts in the medical texts of the same category have an association relation or not according to the association degree of every two medical texts;
the storage unit is used for performing association storage on the words with the association relation when the judgment result is yes;
the storage unit is specifically configured to store the words with the association relationship according to the attributes of the words with the association relationship.
6. The medical information processing apparatus according to claim 5, wherein the storage unit includes:
the second determining unit is used for determining the association degree of the words in any two medical texts according to the association degree of any two medical texts;
the storage unit is specifically configured to store the association degrees of the words in any two medical texts.
7. The medical information processing apparatus according to claim 5, wherein the processing unit includes:
and the word cutting unit is used for cutting words of the medical texts according to the dictionary and the parts of speech of the words in the medical texts.
8. The medical information processing apparatus according to claim 5, wherein the processing unit includes:
and the clustering unit is used for clustering the plurality of medical texts according to the international disease classification and the K-means algorithm.
CN201510886242.XA 2015-12-04 2015-12-04 Medical information processing method and medical information processing apparatus Active CN106844325B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510886242.XA CN106844325B (en) 2015-12-04 2015-12-04 Medical information processing method and medical information processing apparatus

Publications (2)

Publication Number Publication Date
CN106844325A CN106844325A (en) 2017-06-13
CN106844325B true CN106844325B (en) 2022-01-25

Family

ID=59150575

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510886242.XA Active CN106844325B (en) 2015-12-04 2015-12-04 Medical information processing method and medical information processing apparatus

Country Status (1)

Country Link
CN (1) CN106844325B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110019826B (en) * 2017-07-27 2023-02-28 北大医疗信息技术有限公司 Construction method, construction device, equipment and storage medium of medical knowledge map
CN109192258B (en) * 2018-08-14 2023-06-20 深圳平安医疗健康科技服务有限公司 Medical data conversion method, medical data conversion device, computer equipment and storage medium
CN110766004B (en) * 2019-10-23 2022-05-13 泰康保险集团股份有限公司 Medical identification data processing method and device, electronic equipment and readable medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2005045695A1 (en) * 2003-10-27 2005-05-19 Educational Testing Service Method and system for determining text coherence
CN101079026A (en) * 2007-07-02 2007-11-28 北京百问百答网络技术有限公司 Text similarity, acceptation similarity calculating method and system and application system
CN102982125A (en) * 2012-11-14 2013-03-20 百度在线网络技术(北京)有限公司 Method and device for identifying texts with same meaning
CN103123618A (en) * 2011-11-21 2013-05-29 北京新媒传信科技有限公司 Text similarity obtaining method and device
CN103942339A (en) * 2014-05-08 2014-07-23 深圳市宜搜科技发展有限公司 Synonym mining method and device
CN104978347A (en) * 2014-04-11 2015-10-14 中国中医科学院中医临床基础医学研究所 Data mining method and data mining system for sensitive keywords in Chinese biomedical literature database

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07319906A (en) * 1994-05-27 1995-12-08 Fujitsu Ltd Synonym retrieving processing system and character string retrieving system

Also Published As

Publication number Publication date
CN106844325A (en) 2017-06-13

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PP01 Preservation of patent right

Effective date of registration: 20240202

Granted publication date: 20220125
