WO2018120447A1 - Method, device and equipment for processing medical record information - Google Patents

Info

Publication number
WO2018120447A1
Authority
WO
WIPO (PCT)
Prior art keywords
text
target
feature
medical
information
Prior art date
Application number
PCT/CN2017/077125
Other languages
English (en)
Chinese (zh)
Inventor
银磊
李明修
卜海亮
魏世嘉
Original Assignee
北京搜狗科技发展有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京搜狗科技发展有限公司
Publication of WO2018120447A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references

Definitions

  • the present invention relates to the field of information processing technologies, and in particular to a method, an apparatus and a device for processing medical record information.
  • the medical record information can reflect the patient's medical treatment
  • the medical record information can be used by doctors and patients to understand the patient's historical conditions, treatment, etc., and can also be used to analyze data on the conditions and treatment of a large number of patients.
  • the medical record information that can be obtained directly is usually chaotic, that is, various information contents are pieced together indiscriminately. Therefore, on the one hand, when such medical record information is displayed to the user, the user can neither read it smoothly nor quickly find the required information content. On the other hand, such medical record information is not conducive to the search and identification of information content, so it is also difficult to use for data collation and analysis.
  • the technical problem to be solved by the present invention is to provide a method, an apparatus and a device for processing medical record information, so that the various information contents in the medical record information can be distinguished according to a certain structural format, thereby structuring the medical record information. This not only makes it easy for users to read and quickly find the required information content, but also facilitates data collation and analysis.
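  • an embodiment of the present invention provides a method for processing medical record information, including: obtaining an original medical text; dividing the original medical text into at least one target text unit; determining a target category corresponding to the text feature of the target text unit; and generating a target medical text, wherein the target text unit is embodied in the target medical text as text information under the target category.
  • determining the target category corresponding to the text feature of the target text unit may include: determining the target category according to a first machine learning model, wherein the first machine learning model is obtained by training on the correspondence between the text features of historical medical texts included in a training sample set and preset categories.
  • the target category may be a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom identification information, a category for describing medical order information, or a category for describing prescription information.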
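  • the method may further include: extracting, from the original medical text, a target feature word for describing a first feature item, wherein the target feature word is embodied in the target medical text as text information belonging to the first feature item.
  • extracting the target feature word for describing the first feature item from the original medical text may include: extracting the target feature word from the text information under the target category to which the first feature item belongs in the original medical text.
  • extracting the target feature word for describing the first feature item from the original medical text may alternatively include: analyzing the original medical text to obtain an initial feature word for describing the first feature item; and matching the initial feature word in a standard feature vocabulary to obtain a standard feature word that matches the initial feature word, as the target feature word for describing the first feature item.
  • analyzing the original medical text to obtain an initial feature word for describing the first feature item may include: performing lexical analysis and/or syntactic analysis on the original medical text based on a medical-specific vocabulary to obtain the initial feature word.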
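  • the method may further include: establishing a correspondence between the initial feature word and the target feature word for describing the first feature item, the correspondence being embodied in the target medical text.
  • the method may further include: determining an inferred feature word corresponding to the original medical text under a second feature item, wherein the inferred feature word is a feature word for describing the second feature item that is not recorded in the original medical text, and the inferred feature word is embodied in the target medical text as text information belonging to the second feature item.
  • determining the inferred feature word corresponding to the original medical text under the second feature item may include: determining the inferred feature word according to a second machine learning model, wherein the second machine learning model is obtained by training on the correspondence between historical medical texts included in a training sample set and preset inferred feature words for describing the second feature item.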
  • the method may further include: finding a preset medical text matching the target medical text, wherein the text information of the preset medical text under the target category is the same as or similar to the target text unit, and the target category includes a category for describing a patient's personal information and/or a category for describing a patient's symptoms; and extracting the text information under the category for describing diagnosis information in the preset medical text, to be embodied as reference diagnosis information in the target medical text.
  • obtaining the original medical text may include: acquiring medical record information in voice form and performing voice recognition on it to obtain the original medical text; or acquiring medical record information in image form and performing image recognition on it to obtain the original medical text.
  • an embodiment of the present invention provides an apparatus for processing medical record information, including:
  • an obtaining unit configured to obtain the original medical text;
  • a dividing unit configured to divide the original medical text into at least one target text unit;
  • a first determining unit configured to determine a target category corresponding to the text feature of the target text unit;
  • a generating unit configured to generate the target medical text, wherein the target text unit is embodied in the target medical text as text information under the target category.
  • the first determining unit may include:
  • a target category determining subunit configured to determine, according to a first machine learning model, the target category corresponding to the text feature of the target text unit, wherein the first machine learning model is obtained by training on the correspondence between the text features of historical medical texts included in a training sample set and preset categories.
  • the target category is a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom identification information, a category for describing medical order information, or a category for describing prescription information.
  • the device may further include:
  • a first extracting unit configured to extract, from the original medical text, a target feature word for describing the first feature item, wherein the target feature word is embodied in the target medical text as text information belonging to the first feature item.
  • the first extracting unit may include:
  • a target feature word extracting sub-unit configured to extract the target feature word for describing the first feature item from the text information under the target category to which the first feature item belongs in the original medical text.
  • the first extracting unit may specifically include: an analyzing subunit and a matching subunit;
  • the analysis subunit is configured to analyze the original medical text to obtain an initial feature word for describing the first feature item
  • the matching subunit is configured to match the initial feature word in a standard feature vocabulary to obtain a standard feature word that matches the initial feature word, as the target feature word for describing the first feature item.
  • the analysis subunit may specifically include:
  • an initial feature word extraction subunit configured to perform lexical analysis and/or syntactic analysis on the original medical text based on a medical-specific vocabulary to obtain the initial feature word for describing the first feature item.
  • the device may further include:
  • an establishing unit configured to establish a correspondence between the initial feature word and the target feature word for describing the first feature item, the correspondence being embodied in the target medical text.
  • the device may further include:
  • a second determining unit configured to determine an inferred feature word corresponding to the original medical text under the second feature item, wherein the inferred feature word is a feature word for describing the second feature item that is not recorded in the original medical text;
  • the inferred feature word is embodied in the target medical text as text information belonging to the second feature item.
  • the second determining unit may include:
  • a feature word determining subunit configured to determine, according to a second machine learning model, the inferred feature word corresponding to the original medical text under the second feature item, wherein the second machine learning model is obtained by training on the correspondence between historical medical texts included in a training sample set and preset inferred feature words for describing the second feature item.
  • the device may further include: a searching unit and a second extracting unit;
  • the searching unit is configured to search for a preset medical text matching the target medical text, wherein the text information of the preset medical text under the target category is the same as or similar to the target text unit, and the target category includes a category for describing a patient's personal information and/or a category for describing a patient's symptoms;
  • the second extracting unit is configured to extract text information in a category for describing diagnostic information in the preset medical text for generating the target medical text.
  • the obtaining unit may include: a first obtaining subunit and a first identification subunit;
  • the first obtaining subunit is configured to obtain medical record information in voice form;
  • the first identification subunit is configured to perform voice recognition on the medical record information to obtain the original medical text.
  • the obtaining unit may alternatively include: a second obtaining subunit and a second identification subunit;
  • the second obtaining subunit is configured to obtain medical record information in image form;
  • the second identification subunit is configured to perform image recognition on the medical record information to obtain the original medical text.
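  • an embodiment of the present invention provides a device, including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations: obtaining an original medical text; dividing the original medical text into at least one target text unit; determining a target category corresponding to the text feature of the target text unit; and generating a target medical text, wherein the target text unit is embodied in the target medical text as text information under the target category.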
  • the embodiment of the invention has the following advantages:
  • with the method, apparatus and device provided by the embodiments of the present invention, for an unstructured original medical text, a structured target medical text can be generated by dividing the original medical text into at least one target text unit and determining, for each target text unit, the target category corresponding to its text feature, so that each target text unit is embodied in the target medical text as text information under the target category to which it belongs. Since different information contents are classified into corresponding categories in the structured target medical text, on the one hand, when the target medical text is displayed to the user, the user can not only read it more smoothly but also find the required information content more quickly; on the other hand, the categorized text content in the target medical text facilitates the search and identification of information content, which also makes the target medical text more conducive to data collation and analysis.
  • FIG. 1 is a schematic diagram of a framework of an exemplary application scenario according to an embodiment of the present invention
  • FIG. 2 is a schematic flow chart of a method for processing medical record information according to an embodiment of the present invention
  • FIG. 3 is a schematic structural diagram of a processing device for processing medical records according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of a device according to an embodiment of the present invention.
  • FIG. 5 is a schematic structural diagram of a server according to an embodiment of the present invention.
  • the inventors have found through research that the medical information that can usually be directly obtained, such as the medical information input by the user, is usually disorderly.
  • the information content used to describe different features is pieced together indiscriminately.
  • such disorderly medical record information is not conducive to the user's search for and identification of information content.
  • in the embodiments of the present invention, the original medical text is divided into at least one target text unit, the target category corresponding to the text feature of each target text unit is determined, and a structured target medical text is generated accordingly, such that each target text unit is embodied in the target medical text as text information under its target category.
  • the embodiment of the present invention can be applied to the scenario shown in FIG. 1, where the user terminal 102 and the server 101 interact through the network 103.
  • the server 101 obtains the original medical text transmitted by the user terminal 102.
  • the server 101 divides the original medical text into at least one target text unit, determines a target category corresponding to the text feature of the target text unit, and generates a target medical text, wherein the target text unit is embodied in the target medical text as text information under the target category.
  • the server 101 can transmit the target medical text information to the user terminal 102 for display.
  • the user terminal 102 can be any user device, whether existing, under development, or developed in the future, that is capable of interacting with the server 101 through any form of wired and/or wireless connection (e.g., Wi-Fi, LAN, cellular, coaxial cable), including but not limited to smartphones, non-smart phones, tablets, laptop personal computers, desktop personal computers, minicomputers, midrange computers, mainframe computers, etc.
  • the server 101 is merely one example of a device, whether existing, under development, or developed in the future, that is capable of providing medical record information processing functions to a user.
  • Embodiments of the invention are not subject to any limitation in this regard.
  • referring to FIG. 2, a schematic flowchart of a method for processing medical record information in an embodiment of the present invention is shown.
  • the method may include, for example, the following steps:
  • step 201: obtaining the original medical text to be structured.
  • medical information can be obtained in a variety of ways.
  • the medical record information may be information input by the user.
  • the medical record information can also be information stored in a database.
  • the originally obtained medical record information may be in text form, image form, or voice form. Since this embodiment structures an original medical text in text form, when the originally obtained medical record information is in text form, the original medical text may be the medical record information itself; when the originally obtained medical record information is in a non-text form, the original medical text may be the text obtained by converting the medical record information into text.
  • the step 201 may include: acquiring medical record information in a voice form; performing voice recognition on the medical record information to obtain the original medical record text.
  • alternatively, step 201 may include: acquiring medical record information in image form; performing image recognition on the medical record information to obtain the original medical record text.
  • the original medical record information may sometimes contain information about multiple diagnoses of a patient.
  • in that case, the information about multiple diagnoses in the medical record information can be divided into separate pieces, each covering a single diagnosis, and the information about each single diagnosis is then used as the original medical text for structuring. That is, the original medical text may be medical text information related to one diagnosis of one patient.
  • for example, if the originally obtained medical record information includes information about a first diagnosis and information about a second diagnosis, it can be divided into the information about the first diagnosis and the information about the second diagnosis according to the time of consultation, and each of the two is then used as an original medical text for the subsequent steps.
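As a rough illustration of dividing a multi-visit record by consultation time, the following Python sketch splits a record wherever a visit date appears. The date format (`YYYY-MM-DD` at the start of each visit) and the function name `split_by_visit` are assumptions made for this example; the source does not specify how consultation times are recorded.

```python
import re
from typing import List, Tuple

# Assumption: each visit begins with a date such as 2016-12-01.
VISIT_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

def split_by_visit(record: str) -> List[Tuple[str, str]]:
    """Split a multi-visit record into (visit_date, text) pairs."""
    matches = list(VISIT_DATE.finditer(record))
    visits = []
    for i, m in enumerate(matches):
        # The text of one visit runs up to the next date, or to the end.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(record)
        visits.append((m.group(), record[m.end():end].strip()))
    return visits

record = "2016-12-01 Headache for two days. 2016-12-15 Headache relieved."
visits = split_by_visit(record)
```

Each resulting pair could then be treated as one original medical text and passed to the subsequent steps.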
  • step 202: dividing the original medical text into at least one target text unit. For example, the original medical text can be divided into sentences, that is, each divided target text unit is a text sentence.
  • alternatively, the original medical text may be divided into units of words, phrases, paragraphs, and the like.
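A minimal Python sketch of sentence-level division is shown below. The set of sentence terminators (Western and Chinese punctuation) and the name `split_into_units` are assumptions for illustration; note that this naive rule would also split on decimal points, so a real system would need a more careful segmenter.

```python
import re

# Assumption: a sentence ends at one of these terminators.
_SENTENCE = re.compile(r"[^.!?;。！？；]+[.!?;。！？；]?")

def split_into_units(text):
    """Divide text into sentence-level target text units."""
    return [s.strip() for s in _SENTENCE.findall(text) if s.strip()]

units = split_into_units("The head is very painful. Angelica 10 g.")
```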
  • step 203: determining a target category corresponding to the text feature of the target text unit. For each target text unit divided from the original medical text, a target category matching the text feature of the target text unit may be searched for among a plurality of preset categories applicable to medical record information, thereby determining a corresponding target category for each target text unit. It can be understood that, if the text feature of the target text unit matches the target category, the target category is a category for describing the target text unit.
  • the plurality of preset categories applicable to medical record information may include, for example, any of the following: a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom identification information, a category for describing medical order information, a category for describing prescription information, and the like. That is, the target category corresponding to any one target text unit may be, for example, any one of these categories.
  • the patient information may include, for example, a patient name, a patient gender, a patient's age, a visit time, and the like.
  • the symptom statement information may also be referred to as chief complaint information.
  • the symptom identification information may be syndrome differentiation information in the traditional Chinese medicine (TCM) sense, or interpretation information of test results in the Western medicine sense.
  • a machine learning model can be employed to determine a corresponding target category for the target text unit.
  • step 203 may be specifically: determining, according to a first machine learning model, the target category corresponding to the text feature of the target text unit, wherein the first machine learning model is obtained by training on the correspondence between the text features of historical medical texts included in a training sample set and preset categories, each historical medical text being text information under a preset category.
  • the training process of the first machine learning model may be specifically: once a historical medical text has been determined to be text information under a certain preset category, the first machine learning model is trained with the text feature of the historical medical text as input and the preset category to which the historical medical text belongs as output.
  • the historical medical texts used for training may include text information under each of the preset categories applicable to medical record information, so that the trained first machine learning model covers all preset categories applicable to medical record information.
  • the historical medical text may be sentence text, that is, the text information of one sentence is used as one historical medical text in each training pass.
  • the historical medical text may also be paragraph text, that is, the text information of one paragraph is used as one historical medical text in each training pass.
  • it can be understood that, after training, the first machine learning model can represent the correspondence between text features and preset categories; therefore, when the text feature of the target text unit is input to the trained first machine learning model, the target category output by the model is the category to which the target text unit belongs.
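The source does not name a specific model, so the following Python sketch stands in a tiny multinomial naive Bayes classifier for the first machine learning model. The class name `CategoryModel`, the bag-of-words text feature, and the training samples are all hypothetical and only illustrate the train-then-predict flow described above.

```python
import math
from collections import Counter, defaultdict

class CategoryModel:
    """Tiny multinomial naive Bayes over word counts (illustrative only)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()              # category -> number of samples
        self.vocab = set()

    @staticmethod
    def features(text):
        # Assumption: the text feature is simply the lowercase word list.
        return text.lower().replace(":", " ").replace(".", " ").split()

    def train(self, samples):
        # samples: (historical medical text, preset category) pairs
        for text, category in samples:
            words = self.features(text)
            self.word_counts[category].update(words)
            self.doc_counts[category] += 1
            self.vocab.update(words)

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for w in self.features(text):
                score += math.log((counts[w] + 1) / denom)  # Laplace smoothing
            if score > best_score:
                best, best_score = category, score
        return best

# Hypothetical training sample set.
samples = [
    ("patient Zhang San male 45 years old", "patient information"),
    ("head is very painful for two days", "chief complaint"),
    ("angelica 10 g amoxicillin 20 g", "prescription"),
    ("rest and drink more water", "medical order"),
]
model = CategoryModel()
model.train(samples)
category = model.predict("the head is painful")
```

With realistic data, any text classifier could fill the same role; the point is only that text features go in and a preset category comes out.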
  • step 204: generating a target medical text, wherein the target text unit is embodied in the target medical text as text information under the target category.
  • each target text unit divided from the original medical text can be organized according to the target category to which it belongs, thereby generating the target medical text.
  • the target medical text may be used, for example, for feedback to the user, that is, after step 204, the embodiment may, for example, further comprise: presenting the target medical text.
  • each target text unit in the target medical text is saved in association with its target category, so that the target medical text reflects which target category each target text unit belongs to. For example, suppose the target text unit is "the head is very painful" and the target category to which it belongs is "chief complaint"; the information reflected in the target medical text can then be "chief complaint: the head is very painful".
  • individual feature items can also be set in the target medical text to reflect important feature words.
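One possible way to organize classified units into a structured text is sketched below in Python. The `category: text` line layout follows the example in the text; the function name `generate_target_text` and the choice to join units of the same category with a space are assumptions.

```python
from collections import OrderedDict

def generate_target_text(classified_units):
    """Group (unit, category) pairs by category, keeping first-seen order."""
    grouped = OrderedDict()
    for unit, category in classified_units:
        grouped.setdefault(category, []).append(unit)
    return "\n".join(
        f"{category}: {' '.join(units)}" for category, units in grouped.items()
    )

target_text = generate_target_text([
    ("The head is very painful.", "chief complaint"),
    ("Angelica 10 g.", "prescription"),
    ("Dizzy when standing.", "chief complaint"),
])
```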
  • the embodiment may further include, for example, extracting, from the original medical text, a target feature word for describing the first feature item.
  • the target feature word in the target medical text is embodied as text information belonging to the first feature item.
  • the target feature word is saved in association with its first feature item, so the target medical text can reflect that the target feature word belongs to that first feature item. For example, if the target feature word is "angelica" and the first feature item to which it belongs is "medicinal material", the information reflected in the target medical text may be "medicinal material: angelica".
  • the target feature words under the first feature item are text information recorded in the original medical text.
  • the first feature item may be a feature item for describing a patient name, that is, the target feature word may be information for describing a patient name.
  • for example, assuming the target feature word is "Zhang San", the first feature item and the target feature word can be embodied in the target medical text as "patient name: Zhang San".
  • the first feature item may be a feature item for describing a medicine, that is, the target feature word may be information for describing a medicine.
  • the medicine may be a Chinese medicine material or a western medicine product.
  • for example, assuming the target feature word is "amoxicillin", the first feature item and the target feature word can be embodied in the target medical text as "drug: amoxicillin".
  • for another example, assuming the target feature word is "angelica", the first feature item and the target feature word can be embodied in the target medical text as "medicinal material: angelica".
  • the first feature item may be a feature item for describing a dose, that is, the target feature word may be information for describing a dose.
  • for example, assuming the target feature word is "10 grams", the first feature item and the target feature word can be embodied in the target medical text as "dose: 10 grams".
  • the first feature item may be a feature item for describing a symptom, that is, the target feature word may be information for describing a symptom.
  • for example, assuming the target feature word is "headache", the first feature item and the target feature word can be embodied in the target medical text as "symptom: headache".
  • the feature words of the same meaning may be normalized so that the same feature word is used in the target medical text to describe the same meaning.
  • the process of extracting the target feature word may include, for example: analyzing the original medical text to obtain an initial feature word for describing the first feature item; and matching the initial feature word in the standard feature vocabulary to obtain a standard feature word that matches the initial feature word, as the target feature word for describing the first feature item.
  • the standard feature vocabulary specifies one standard feature word for the multiple feature words that describe the same meaning, and also records the correspondence between the non-standard feature words of that meaning and the standard feature word. If the initial feature word is a non-standard feature word in the standard feature vocabulary, the standard feature word corresponding to it in the vocabulary can be used as the target feature word. If the initial feature word is itself a standard feature word in the standard feature vocabulary, the initial feature word can be used as the target feature word. For example, "head pain" and "headache" can both be normalized into "headache", that is, "head pain" is a non-standard feature word and "headache" is the standard feature word.
  • the method may further include: establishing a correspondence between the initial feature word and the target feature word for describing the first feature item, the correspondence being embodied in the target medical text. That is, the mutually corresponding initial feature word and target feature word may both be included in the target medical text. For example, if the initial feature word is "head pain" and the target feature word is "headache", they can be embodied in the target medical text as "original word: head pain; standard symptom: headache". For another example, if the initial feature word is "twenty g" and the target feature word is "20 g", they can be embodied in the target medical text as "original word: twenty g; standard dose: 20 g".
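The normalization against a standard feature vocabulary can be sketched as a simple lookup, as in the Python example below. The vocabulary entries, including the non-standard form "head pain", and the function names `normalize` and `render` are hypothetical and chosen only to illustrate the mapping and the "original word / standard word" pairing.

```python
# Hypothetical standard feature vocabulary: non-standard word -> standard word.
# Standard words map to themselves.
STANDARD_VOCAB = {
    "head pain": "headache",
    "headache": "headache",
    "twenty g": "20 g",
    "20 g": "20 g",
}

def normalize(initial_word):
    """Return (initial_word, standard_word); unknown words map to themselves."""
    standard = STANDARD_VOCAB.get(initial_word.strip().lower(), initial_word)
    return initial_word, standard

def render(initial_word, label):
    """Show both the initial feature word and its standard form."""
    original, standard = normalize(initial_word)
    return f"original word: {original}; standard {label}: {standard}"

line = render("twenty g", "dose")
```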
  • the analysis of the original medical texts can be combined with lexical analysis and syntactic analysis by means of a medical special vocabulary, so that the extraction of feature words is more accurate.
  • in order to obtain an initial feature word, the original medical text may be subjected to lexical analysis and/or syntactic analysis based on a medical-specific vocabulary, to obtain the initial feature word for describing the first feature item. For example, suppose the original medical text records that "the head is very painful". "Head" is a noun and the subject, and denotes a body part; "pain" is a verb and the predicate, and denotes the state of that body part. Based on this, the initial feature word can be determined to be "headache".
  • the target feature words can be identified based on corresponding specific rules.
  • the target feature word may be extracted based on an age recognition rule (eg, the feature word includes “number + year” or “number + ten”).
  • the target feature word may be extracted based on a time recognition rule (e.g., the feature word includes "year", "month", or "day", or contains a separator such as "." or "/").
  • the target feature words can also be identified by a specific recognition technique. For example, for the first feature item "patient name", the target feature word can be extracted based on a named entity recognition technique from natural language processing.
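The rule-based recognition described above can be approximated with regular expressions, as in this Python sketch. The exact patterns (a number followed by "years old" or "岁" for age; a date with ".", "/" or "-" separators, or with 年/月/日 markers, for time) are assumptions for illustration rather than the rules actually used in the source.

```python
import re

# Assumed age rule: a number followed by "years old" (or "year old") or 岁.
AGE_RULE = re.compile(r"(\d{1,3})\s*(?:years? old|岁)")
# Assumed time rule: separator-based dates, or dates with year/month/day markers.
DATE_RULE = re.compile(
    r"\d{4}[./-]\d{1,2}[./-]\d{1,2}|\d{4}年\d{1,2}月\d{1,2}日"
)

def extract_age(text):
    """Return the age as an int, or None when no age pattern is found."""
    m = AGE_RULE.search(text)
    return int(m.group(1)) if m else None

def extract_visit_time(text):
    """Return the first date-like string, or None when there is none."""
    m = DATE_RULE.search(text)
    return m.group() if m else None

age = extract_age("Zhang San, male, 45 years old")
visit_time = extract_visit_time("Visit on 2016/12/30.")
```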
  • the first feature item is a feature belonging to one or several target categories, that is, the target feature words under the first feature item are present in the text information under the target category.
  • the target feature word for describing the first feature item may be specifically extracted from the text information under the target category to which the first feature item belongs. That is, after step 203, a target feature word for describing the first feature item is extracted from the text information under the target category to which the first feature item belongs in the original medical text.
  • the text information under the target category includes all target text units corresponding to the target category.
  • the first feature item "drug” is a feature belonging to the target category "prescription", that is, the related information corresponding to the first feature item "drug” exists in the text information belonging to the category “prescription”. Therefore, after the text information belonging to the category “prescription” is determined by classifying the original medical text, the target feature word corresponding to the first feature “drug” can be searched for and extracted in the text information belonging to the category “prescription”.
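To illustrate restricting extraction to one category, the following Python sketch searches for drug names only in units classified as "prescription". The drug lexicon, the dictionary layout `units_by_category`, and the function name `extract_drugs` are hypothetical.

```python
# Hypothetical drug lexicon; a real system would use a curated medical vocabulary.
DRUG_LEXICON = {"angelica", "amoxicillin"}

def extract_drugs(units_by_category, category="prescription"):
    """Look for drug names only in units classified under `category`."""
    found = []
    for unit in units_by_category.get(category, []):
        for word in unit.lower().replace(",", " ").replace(".", " ").split():
            if word in DRUG_LEXICON and word not in found:
                found.append(word)
    return found

units_by_category = {
    "chief complaint": ["Took amoxicillin last week, head still painful."],
    "prescription": ["Angelica 10 g."],
}
drugs = extract_drugs(units_by_category)
```

Here "amoxicillin" is not returned because it appears only in the chief complaint, which shows how scoping the search to the relevant category narrows the candidates.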
  • the target feature words of the first feature item may also be searched and extracted from all the text information of the original medical text.
  • the embodiment may further include: determining an inferred feature word corresponding to the original medical text under the second feature item, wherein the inferred feature word is a feature word for describing the second feature item that is not recorded in the original medical text; the inferred feature word is embodied in the target medical text as text information belonging to the second feature item.
  • the inferred feature word is text information belonging to the corresponding second feature item. For example, if the original medical text does not record the patient's gender but it can be inferred from the original medical text that the patient is female, the inferred feature word is "female" and the second feature item is "patient gender", so the information reflected in the target medical text can be "patient gender: female".
  • the inferred feature words belonging to the second feature item are text information not directly recorded in the original medical text.
  • the second feature item may be a feature item for describing the gender of the patient, that is, the inferred feature word may be a feature word for describing the gender of the patient. Assuming that the inferred feature word is "male”, the second feature item and the corresponding inferred feature word in the target medical text can be embodied as "patient gender: male".
  • the second feature item may be a feature item for describing the age of the patient, that is, the inferred feature word may be a feature word for describing the age of the patient. Assuming that the inferred feature word is "middle age”, the second feature item and the corresponding inferred feature word in the target medical text can be embodied as "patient age: middle age”.
  • the manner of determining the inferred feature word may include: determining, according to a second machine learning model, the inferred feature word corresponding to the original medical text under the second feature item, wherein the second machine learning model is obtained by training on the correspondence between the historical medical texts included in the training sample set and the preset inferred feature words for describing the second feature item, each preset inferred feature word being one that can be inferred from the corresponding historical medical text.
  • the training process of the second machine learning model may specifically be: for a historical medical text from which a definite feature word is difficult to extract, once the inferred feature word corresponding to that historical medical text has been determined, the second machine learning model is trained with the historical medical text as input and the inferred feature word as the expected output. It can be understood that after training on a certain number of historical medical texts and their corresponding inferred feature words, the second machine learning model can represent the correspondence between medical text and inferred feature words. Therefore, when the original medical text is input to the trained second machine learning model, the inferred feature word output by the model is a feature that the original medical text can reflect.
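As a rough illustration of training a model on pairs of historical medical text and inferred feature word, the sketch below uses a very small naive Bayes text classifier. The source does not name a model type, so the classifier, the class name `InferredFeatureModel`, and the four training samples are all assumptions chosen only to keep the example self-contained.

```python
import math
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple


class InferredFeatureModel:
    """Tiny multinomial naive Bayes over whitespace tokens (illustrative stand-in)."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)
        self.label_counts: Counter = Counter()
        self.vocabulary: Set[str] = set()

    def train(self, samples: List[Tuple[str, str]]) -> None:
        # Each sample pairs a historical medical text with its preset inferred feature word.
        for text, label in samples:
            tokens = text.lower().split()
            self.word_counts[label].update(tokens)
            self.label_counts[label] += 1
            self.vocabulary.update(tokens)

    def infer(self, text: str) -> str:
        tokens = text.lower().split()
        total_samples = sum(self.label_counts.values())
        best_label, best_score = "", -math.inf
        for label, count in self.label_counts.items():
            # Log prior plus Laplace-smoothed log likelihood of each token.
            score = math.log(count / total_samples)
            denominator = sum(self.word_counts[label].values()) + len(self.vocabulary)
            for token in tokens:
                score += math.log((self.word_counts[label][token] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training sample set for the second feature item "patient gender".
training_set = [
    ("irregular menstruation with abdominal pain", "female"),
    ("pregnancy week twelve with nausea", "female"),
    ("prostate enlargement with frequent urination", "male"),
    ("prostate discomfort and lower back pain", "male"),
]

model = InferredFeatureModel()
model.train(training_set)
inferred = model.infer("patient reports nausea during pregnancy")
```

Here the query text never states the gender, yet the model returns an inferred feature word from the vocabulary it saw during training. A real system would need far more data and a properly evaluated model.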
  • preset medical texts whose symptoms, patient information, etc. are the same as or similar to those in the original medical text provided by the user may be found, and the text content describing diagnostic information in such a preset medical text may be extracted and embodied in the target medical text as reference diagnostic information for the user to consult. The user can thus obtain recommended reference diagnostic information by inputting patient information, which realizes a "self-diagnosis" function.
  • the embodiment may further include, for example: searching for a preset medical text matching the target medical text, wherein the text information of the preset medical text under the target category is the same as or similar to the target text unit, and the target category includes a category for describing patient personal information and/or a category for describing the patient's symptoms; and extracting the text information under the category for describing diagnostic information in the preset medical text and embodying it in the target medical text as reference diagnostic information.
  • the category for describing the diagnosis information may be, for example, a category for describing the prescription information, a category for describing the condition discrimination information, and/or a category for describing the medical order information.
  • the preset medical text may be, for example, pre-collected classic medical information or medical information provided by a medical expert.
  • the text information used to match the original medical text and the preset medical text may be text information under a target category, or may be text information under multiple target categories.
  • different matching weights can be set for different target categories to measure the degree of matching between the original medical text and the preset medical text. For example, the text information used to match the original medical text and the preset medical text may be the text information under the four target categories "disorder", "patient age", "patient gender", and "visiting time".
  • since "visiting time" has a relatively small impact on the diagnostic information, "disorder", "patient age", and "patient gender" can adopt relatively large matching weights and "visiting time" can adopt a relatively small matching weight.
  • the matching result may then be that the original medical text matches the preset medical text. If, however, the original medical text and the preset medical text are largely consistent in the text information for "disorder" and "visiting time" but inconsistent in "patient gender", the matching result may be that the original medical text does not match the preset medical text.
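The weighted matching described above can be sketched as a weighted sum over agreeing categories compared against a threshold. The specific weights, the threshold of 0.8, and the use of exact string equality as the agreement test are assumptions; the source only states that some categories carry larger weights than others.

```python
from typing import Dict

# Hypothetical weights: "visiting time" counts less than the other categories.
MATCH_WEIGHTS: Dict[str, float] = {
    "disorder": 0.4,
    "patient age": 0.25,
    "patient gender": 0.25,
    "visiting time": 0.1,
}
MATCH_THRESHOLD = 0.8  # assumed cut-off, not specified in the source


def match_score(original: Dict[str, str], preset: Dict[str, str]) -> float:
    """Sum the weights of the categories whose text agrees in both records."""
    return sum(
        weight
        for category, weight in MATCH_WEIGHTS.items()
        if category in original and original[category] == preset.get(category)
    )


def is_match(original: Dict[str, str], preset: Dict[str, str]) -> bool:
    """Decide whether the preset record matches the original record."""
    return match_score(original, preset) >= MATCH_THRESHOLD


original = {"disorder": "common cold", "patient age": "middle age",
            "patient gender": "female", "visiting time": "winter"}
# Differs only in the low-weight category.
preset_a = dict(original, **{"visiting time": "summer"})
# Agrees on disorder and visiting time but differs in patient gender.
preset_b = dict(original, **{"patient gender": "male"})

result_a = is_match(original, preset_a)
result_b = is_match(original, preset_b)
```

With these assumed numbers, a disagreement on "visiting time" alone still yields a match, while a disagreement on "patient gender" does not, mirroring the example in the text.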
  • the medical text of the original medical text and the target medical text may be a medical text of a Chinese medicine, or may be a medical text of a Western medicine.
  • the original medical text is divided into at least one target text unit, and the target category corresponding to the text feature of each target text unit is determined.
  • a structured target medical text can thus be generated, in which each target text unit is embodied as text information under its target category. Because different information contents are classified into their corresponding categories in the structured target medical text, the user can read the target medical text more smoothly when it is displayed and can also find the desired information content more quickly.
  • in addition, categorizing the text content in the target medical text facilitates the search and identification of information content, which makes the target medical text better suited to data collation and analysis.
  • the device may specifically include:
  • the obtaining unit 301 is configured to obtain the original medical text
  • a dividing unit 302 configured to divide the original medical text into at least one target text unit
  • a first determining unit 303 configured to determine a target category corresponding to the text feature of the target text unit
  • the generating unit 304 is configured to generate target medical text, wherein the target text unit is embodied as text information under the target category in the target medical text.
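The division, category determination, and generation steps performed by units 302 to 304 can be sketched as a small pipeline. The sentence splitter and the keyword rules below are assumptions that merely stand in for the first machine learning model, whose form the source does not specify.

```python
import re
from typing import Dict, List

# Hypothetical keyword rules standing in for the first machine learning model.
CATEGORY_KEYWORDS: Dict[str, List[str]] = {
    "patient information": ["name", "aged", "years old"],
    "symptom statement": ["complains", "reports", "pain", "cough"],
    "prescription": ["mg", "take", "daily"],
}


def divide(original_text: str) -> List[str]:
    """Dividing step: split the original text into sentence-level text units."""
    units = re.split(r"(?<=[.;!?])\s+", original_text.strip())
    return [unit for unit in units if unit]


def determine_category(text_unit: str) -> str:
    """Determining step: pick the category with the most keyword hits."""
    lowered = text_unit.lower()
    hits = {
        category: sum(keyword in lowered for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(hits, key=lambda category: hits[category])
    return best if hits[best] > 0 else "uncategorized"


def generate(original_text: str) -> Dict[str, List[str]]:
    """Generating step: group each text unit under its target category."""
    structured: Dict[str, List[str]] = {}
    for unit in divide(original_text):
        structured.setdefault(determine_category(unit), []).append(unit)
    return structured


raw = ("Name: Zhang, 45 years old. Patient reports a dry cough and chest pain. "
       "Amoxicillin 500 mg, take three times daily.")
target = generate(raw)
```

The resulting dictionary plays the role of the structured target medical text: each key is a target category and each value lists the text units embodied under it.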
  • the first determining unit 303 may include:
  • a target category determining subunit configured to determine, according to a first machine learning model, the target category corresponding to the text feature of the target text unit, wherein the first machine learning model is obtained by training on the correspondence between the text features of the historical medical texts included in the training sample set and the preset categories.
  • the target category is a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom identification information, a category for describing medical information, or a category for describing prescription information.
  • the device may further include:
  • a first extracting unit configured to extract, from the original medical text, a target feature word for describing the first feature item; wherein, in the target medical text, the target feature word is embodied as text information belonging to the first feature item.
  • the first extracting unit may include:
  • a target feature word extracting sub-unit configured to extract the target feature word for describing the first feature item from the text information under the target category to which the first feature item belongs in the original medical text.
  • the first extracting unit may specifically include: an analyzing subunit and a matching subunit;
  • the analysis subunit is configured to analyze the original medical text to obtain an initial feature word for describing the first feature item
  • the matching subunit is configured to match the initial feature word in a standard feature vocabulary to obtain a standard feature word that matches the initial feature word, as the target feature word for describing the first feature item.
  • the analysis subunit may include:
  • an initial feature word extraction subunit configured to perform lexical analysis and/or syntax analysis on the original medical text based on a specialized medical vocabulary to obtain the initial feature word for describing the first feature item.
  • the device may further include:
  • an establishing unit configured to establish a correspondence between the initial feature word and the target feature word for describing the first feature item, the correspondence being embodied in the target medical text.
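Matching initial feature words against a standard feature vocabulary and recording the correspondence can be sketched with Python's standard `difflib` module. The vocabulary, the alias table, the similarity cutoff of 0.8, and the function names are assumptions for illustration; the source does not say how the match is computed.

```python
import difflib
from typing import Dict, List, Optional

# Hypothetical standard feature vocabulary and alias table for the feature item "drug".
STANDARD_VOCABULARY: List[str] = ["acetaminophen", "amoxicillin", "ibuprofen"]
KNOWN_ALIASES: Dict[str, str] = {"paracetamol": "acetaminophen"}


def to_standard_word(initial_word: str, cutoff: float = 0.8) -> Optional[str]:
    """Map an initial feature word to a standard feature word, if one is close enough."""
    key = initial_word.strip().lower()
    if key in KNOWN_ALIASES:
        return KNOWN_ALIASES[key]
    close = difflib.get_close_matches(key, STANDARD_VOCABULARY, n=1, cutoff=cutoff)
    return close[0] if close else None


def build_correspondence(initial_words: List[str]) -> Dict[str, str]:
    """Record initial word -> target (standard) word for every word that matched."""
    correspondence: Dict[str, str] = {}
    for word in initial_words:
        standard = to_standard_word(word)
        if standard is not None:
            correspondence[word] = standard
    return correspondence


mapping = build_correspondence(["Paracetamol", "amoxicilin", "ibuprofen", "unknownium"])
```

Keeping the correspondence alongside the standardized word lets the target medical text show both what was originally written and the normalized term.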
  • the first feature item may be a feature item for describing a patient name, a feature item for describing a medicine, a feature item for describing a dose, or a feature item for describing a symptom.
  • the device may further include:
  • a second determining unit configured to determine an inferred feature word corresponding to the original medical text under the second feature item, wherein the inferred feature word is a feature word for describing the second feature item that is not recorded in the original medical text;
  • the inferred feature word is embodied in the target medical text as text information belonging to the second feature item.
  • the second determining unit may include:
  • a feature word determining subunit configured to determine, according to a second machine learning model, an inferred feature word corresponding to the original medical text under the second feature item, wherein the second machine learning model is obtained by training on the correspondence between the historical medical texts included in the training sample set and the preset inferred feature words for describing the second feature item.
  • the second feature item may be a feature item for describing a gender of the patient or a feature item for describing the age of the patient.
  • the device may further include: a searching unit and a second extracting unit;
  • the searching unit is configured to search for a preset medical text matching the target medical text, wherein the text information of the preset medical text under the target category is the same as or similar to the target text unit, and the target category includes a category for describing patient personal information and/or a category for describing patient symptoms;
  • the second extracting unit is configured to extract text information in a category for describing diagnostic information in the preset medical text for generating the target medical text.
  • the original medical text may be medical text information related to one diagnosis for one patient.
  • the obtaining unit 301 may include: a first acquiring subunit and a first identifying subunit;
  • the first obtaining subunit is configured to obtain medical record information in a voice form
  • the first identification subunit is configured to perform voice recognition on the medical record information to obtain the original medical record text.
  • the obtaining unit 301 may include: a second acquiring subunit and a second identifying subunit;
  • the second obtaining subunit is configured to acquire medical record information in an image form
  • the second identification subunit is configured to perform image recognition on the medical record information to obtain the original medical record text.
  • the device may further include:
  • a presentation unit for presenting the target medical text.
  • the original medical text is divided into at least one target text unit, and the target category corresponding to the text feature of each target text unit is determined.
  • a structured target medical text can thus be generated, in which each target text unit is embodied as text information under its target category. Because different information contents are classified into their corresponding categories in the structured target medical text, the user can read the target medical text more smoothly when it is displayed and can also find the desired information content more quickly.
  • in addition, categorizing the text content in the target medical text facilitates the search and identification of information content, which makes the target medical text better suited to data collation and analysis.
  • apparatus 1800 can include one or more of the following components: processing component 1802, memory 1804, power component 1806, multimedia component 1806, audio component 1810, input/output (I/O) interface 1812, sensor component 1814, and communication component 1816.
  • Processing component 1802 typically controls the overall operation of device 1800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations.
  • Processing component 1802 can include one or more processors 1820 to execute instructions to perform all or part of the steps described above.
  • processing component 1802 can include one or more modules to facilitate interaction between component 1802 and other components.
  • processing component 1802 can include a multimedia module to facilitate interaction between multimedia component 1806 and processing component 1802.
  • Memory 1804 is configured to store various types of data to support operation at device 1800. Examples of such data include instructions for any application or method operating on device 1800, contact data, phone book data, messages, pictures, videos, and the like. Memory 1804 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable programmable read only memory (EPROM), programmable read only memory (PROM), read only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
  • Power component 1806 provides power to various components of device 1800.
  • Power component 1806 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 1800.
  • Multimedia component 1806 includes a screen that provides an output interface between the device 1800 and the user.
  • the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user.
  • the touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation.
  • the multimedia component 1806 includes a front camera and/or a rear camera. When the device 1800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.
  • the audio component 1810 is configured to output and/or input an audio signal.
  • audio component 1810 includes a microphone (MIC) that is configured to receive an external audio signal when device 1800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode.
  • the received audio signal may be further stored in memory 1804 or transmitted via communication component 1816.
  • the audio component 1810 also includes a speaker for outputting an audio signal.
  • the I/O interface 1812 provides an interface between the processing component 1802 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
  • Sensor assembly 1814 includes one or more sensors for providing device 1800 with a status assessment of various aspects.
  • sensor assembly 1814 can detect the open/closed state of device 1800 and the relative positioning of components, such as the display and keypad of device 1800; sensor assembly 1814 can also detect a change in position of device 1800 or of one component of device 1800, the presence or absence of user contact with device 1800, the orientation or acceleration/deceleration of device 1800, and a temperature change of device 1800.
  • Sensor assembly 1814 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact.
  • Sensor assembly 1814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 1814 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 1816 is configured to facilitate wired or wireless communication between device 1800 and other devices.
  • the device 1800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof.
  • communication component 1816 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel.
  • the communication component 1816 also includes a near field communication (NFC) module to facilitate short range communication.
  • the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
  • device 1800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.
  • FIG. 5 is a schematic structural diagram of a server in an embodiment of the present invention.
  • the server 1900 can vary considerably depending on configuration or performance, and can include one or more central processing units (CPUs) 1922 (e.g., one or more processors), memory 1932, and one or more storage media 1930 (e.g., one or more mass storage devices) storing programs 1942 or data 1944.
  • the memory 1932 and the storage medium 1930 may be short-term storage or persistent storage.
  • the program stored on storage medium 1930 may include one or more modules (not shown), each of which may include a series of instruction operations in the server.
  • central processor 1922 can be configured to communicate with storage medium 1930 and to execute, on server 1900, the series of instruction operations in storage medium 1930.
  • Server 1900 may also include one or more power sources 1926, one or more wired or wireless network interfaces 1950, one or more input and output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, for example Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.
  • An embodiment of the present invention provides an apparatus.
  • the device includes a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:
  • Target text unit in the target medical text is embodied as text information under the target category.
  • the device may be specifically the foregoing device 1800
  • the memory may be specifically the memory 1804 in the foregoing device 1800
  • the processor may be specifically the processor 1820 in the foregoing device 1800.
  • the device may be specifically the foregoing server 1900
  • the processor may be specifically the central processor 1922 in the foregoing server 1900
  • the memory may be specifically the storage medium 1930 in the foregoing server 1900.
  • the processor can specifically execute the following operations:
  • the target category is a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom identification information, a category for describing medical information, or a category for describing prescription information.
  • the processor may further execute an instruction of:
  • the target feature word in the target medical text is embodied as text information belonging to the first feature item.
  • the processor may specifically execute an instruction of:
  • the processor may specifically execute an instruction of:
  • the initial feature words are matched in a standard feature vocabulary to obtain a standard feature word that matches the initial feature word as the target feature word for describing the first feature item.
  • the processor may specifically execute an instruction of:
  • the processor may further execute an instruction of:
  • the first feature item may be a feature item for describing a patient name, a feature item for describing a medicine, a feature item for describing a dose, or a feature item for describing a symptom.
  • the processor may further execute an instruction of:
  • the inferred feature word is a feature word for describing the second feature item that is not recorded in the original medical text;
  • the inferred feature word is embodied in the target medical text as text information belonging to the second feature item.
  • the processor may specifically execute an instruction of:
  • the second feature item may be a feature item for describing a gender of the patient or a feature item for describing the age of the patient.
  • the processor may further execute an instruction of:
  • the target category includes categories for describing patient personal information and/or categories for describing patient symptoms;
  • extracting the text information under the category for describing diagnostic information in the preset medical text, and embodying it in the target medical text as reference diagnostic information.
  • the original medical text may be medical text information related to one diagnosis for one patient.
  • the processor may specifically execute the following operations:
  • the processor may specifically perform the following operations:
  • the processor may further execute an instruction of:
  • Embodiments of the present invention also provide a non-transitory computer readable storage medium including instructions, such as a memory 1804 including instructions executable by the processor 1820 of the apparatus 1800 to perform the above methods, or a storage medium 1930 including instructions executable by the central processor 1922 of the server 1900 to perform the above method.
  • the non-transitory computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.
  • a non-transitory computer readable storage medium when instructions in the storage medium are executed by a processor of an electronic device, enabling the electronic device to perform a method of communication, the method comprising:
  • Target text unit in the target medical text is embodied as text information under the target category.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Epidemiology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a method for processing medical record information. The method comprises the following steps: acquiring an original medical record text and dividing the original medical record text into at least one target text unit; determining a target category corresponding to a text feature of the target text unit; and generating a target medical record text, the target text unit in the target medical record text being represented as text information belonging to the target category. With the method provided by the embodiments of the present invention, the different information contents in a structured target medical record text are divided into their corresponding categories, which not only allows a user to read more easily but also allows the user to find desired information content more quickly, so that the target medical record text is better suited to data organization and analysis. In addition, the present invention also relates to a device and equipment for processing medical record information.
PCT/CN2017/077125 2016-12-28 2017-03-17 Method, device and equipment for processing medical record information WO2018120447A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611236257.2A CN108257676B (zh) 2016-12-28 2016-12-28 一种医案信息的处理方法、装置和设备
CN201611236257.2 2016-12-28

Publications (1)

Publication Number Publication Date
WO2018120447A1 true WO2018120447A1 (fr) 2018-07-05

Family

ID=62707727

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/077125 WO2018120447A1 (fr) 2016-12-28 2017-03-17 Method, device and equipment for processing medical record information

Country Status (2)

Country Link
CN (1) CN108257676B (fr)
WO (1) WO2018120447A1 (fr)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284353A (zh) * 2018-09-10 2019-01-29 平安科技(深圳)有限公司 医案检索方法、装置、计算机设备和存储介质
CN111177117A (zh) * 2019-12-17 2020-05-19 山东中医药大学第二附属医院 一种中医医案数据处理方法
CN111209924A (zh) * 2018-11-19 2020-05-29 零氪科技(北京)有限公司 一种用于对医嘱进行自动提取的系统及应用
CN116646046A (zh) * 2023-07-27 2023-08-25 中日友好医院(中日友好临床医学研究所) 一种基于互联网诊疗的电子病历处理方法和系统

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111125100A (zh) * 2019-12-12 2020-05-08 东软集团股份有限公司 数据存储方法、装置、存储介质及电子设备
CN112131862B (zh) * 2020-07-20 2021-12-03 中国中医科学院中医药信息研究所 一种中医医案数据处理方法、装置及电子设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103020453A (zh) * 2012-12-15 2013-04-03 中国科学院深圳先进技术研究院 基于本体技术的结构化电子病历生成方法
CN103678281A (zh) * 2013-12-31 2014-03-26 北京百度网讯科技有限公司 对文本进行自动标注的方法和装置
CN103886034A (zh) * 2014-03-05 2014-06-25 北京百度网讯科技有限公司 一种建立索引及匹配用户的查询输入信息的方法和设备
CN104899260A (zh) * 2015-05-20 2015-09-09 东华大学 一种中文病理文本结构化处理方法
CN105808712A (zh) * 2016-03-07 2016-07-27 陈宽 将文本类医疗报告转换为结构化数据的智能系统及方法

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109284353A (zh) * 2018-09-10 2019-01-29 平安科技(深圳)有限公司 医案检索方法、装置、计算机设备和存储介质
CN109284353B (zh) * 2018-09-10 2023-10-03 平安科技(深圳)有限公司 医案检索方法、装置、计算机设备和存储介质
CN111209924A (zh) * 2018-11-19 2020-05-29 零氪科技(北京)有限公司 一种用于对医嘱进行自动提取的系统及应用
CN111209924B (zh) * 2018-11-19 2023-04-18 零氪科技(北京)有限公司 一种用于对医嘱进行自动提取的系统及应用
CN111177117A (zh) * 2019-12-17 2020-05-19 山东中医药大学第二附属医院 一种中医医案数据处理方法
CN111177117B (zh) * 2019-12-17 2023-06-16 山东中医药大学第二附属医院 一种中医医案数据处理方法
CN116646046A (zh) * 2023-07-27 2023-08-25 中日友好医院(中日友好临床医学研究所) 一种基于互联网诊疗的电子病历处理方法和系统
CN116646046B (zh) * 2023-07-27 2023-11-17 中日友好医院(中日友好临床医学研究所) 一种基于互联网诊疗的电子病历处理方法和系统

Also Published As

Publication number Publication date
CN108257676A (zh) 2018-07-06
CN108257676B (zh) 2020-03-03

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17887578

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17887578

Country of ref document: EP

Kind code of ref document: A1