WO2018120447A1

WO2018120447A1 - Method, device and equipment for processing medical record information

Info

Publication number: WO2018120447A1
Application number: PCT/CN2017/077125
Authority: WO
Inventors: 银磊; 李明修; 卜海亮; 魏世嘉
Original assignee: 北京搜狗科技发展有限公司
Priority date: 2016-12-28
Filing date: 2017-03-17
Publication date: 2018-07-05
Also published as: CN108257676A; CN108257676B

Abstract

Disclosed by the present invention is a method for processing medical record information. The method comprises: acquiring an original medical record text and dividing the original medical record text into at least one target text unit; determining a target category corresponding to a text feature of the target text unit; and generating a target medical record text, the target text unit in the target medical record text being embodied as text information under the target category. By means of the method provided by the embodiments of the present invention, different information content in a structured target medical record text is respectively divided into corresponding categories, which not only may enable a user to read more smoothly, but may also enable the user to find desired information content more quickly so that the target medical record text is more favorable for data organization and analysis. In addition, also disclosed by the present invention are a device and equipment for processing medical record information.

Description

Method, device and equipment for processing medical record information

The present application claims priority to Chinese Patent Application No. 201611236257.2, filed on December 28, 2016 and entitled "Processing, Apparatus and Apparatus for Processing Medical Information", the entire contents of which are incorporated into this application by reference.

Technical field

The present invention relates to the field of information processing technologies, and in particular, to a method, device and equipment for processing medical record information.

Background art

At present, medical record information has become a very common object of information processing. Since medical record information reflects a patient's medical treatment, it can be used by doctors and patients to understand the patient's historical conditions and treatment, and can also be used for data analysis of the conditions and treatment of a large number of patients.

However, the medical record information that can be obtained directly is usually disorganized, that is, various information contents are pieced together indiscriminately. Therefore, on the one hand, when such medical record information is displayed to a user, the user can neither read it smoothly nor quickly find the required information content. On the other hand, such medical record information is not conducive to the search and identification of information content, and is therefore difficult to use for data collation and analysis.

Summary of the invention

The technical problem to be solved by the present invention is to provide a method, device and equipment for processing medical record information, so that the various information contents in the medical record information can be distinguished according to a certain structural format, thereby structuring the medical record information. This not only makes it easy for users to read and quickly find the required information content, but also facilitates data collation and analysis.

In a first aspect, an embodiment of the present invention provides a method for processing medical record information, including:

Obtaining the original medical text and dividing the original medical text into at least one target text unit;

Determining a target category corresponding to a text feature of the target text unit;

Generating a target medical text, wherein the target text unit in the target medical text is embodied as text information under the target category.

Optionally, the determining the target category corresponding to the text feature of the target text unit may include:

Determining, according to a first machine learning model, the target category corresponding to the text feature of the target text unit, wherein the first machine learning model is obtained by training on the correspondence between text features of historical medical texts included in a training sample set and preset categories.

Optionally, the target category may be a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom identification information, a category for describing medical order information, or a category for describing prescription information.

Optionally, the method may further include:

Extracting a target feature word for describing the first feature item from the original medical text;

The target feature word in the target medical text is embodied as text information belonging to the first feature item.

Optionally, the extracting the target feature words for describing the first feature item from the original medical text may include:

Extracting the target feature words for describing the first feature item from the text information under the target category to which the first feature item belongs in the original medical text.

Optionally, the extracting the target feature word for describing the first feature item from the original medical text may include:

Analyzing the original medical text to obtain an initial feature word for describing the first feature item;

Matching the initial feature word in a standard feature vocabulary to obtain a standard feature word that matches the initial feature word, as the target feature word for describing the first feature item.

Optionally, the analyzing the original medical text to obtain an initial feature word for describing the first feature item may include:

Performing lexical analysis and/or syntax analysis on the original medical text based on a medical-specific vocabulary to obtain the initial feature words for describing the first feature item.

Optionally, the method may further include:

Establishing a correspondence between the initial feature word and the target feature word for describing the first feature item, and embodying the correspondence in the target medical text.

Optionally, the method may further include:

Determining an inferred feature word corresponding to the original medical text under a second feature item, wherein the inferred feature word is a feature word for describing the second feature item that is not recorded in the original medical text;

The inferred feature word is embodied in the target medical text as text information belonging to the second feature item.

Optionally, the determining the inferred feature word corresponding to the original medical text in the second feature item may include:

Determining, according to a second machine learning model, the inferred feature word corresponding to the original medical text under the second feature item, wherein the second machine learning model is obtained by training on the correspondence between historical medical texts included in a training sample set and preset inferred feature words for describing the second feature item.

Optionally, after the generating the target medical text, the method may further include:

Finding a preset medical text matching the target medical text, wherein the text information of the preset medical text under the target category is the same as or similar to the target text unit, and the target category includes a category used to describe a patient's personal information and/or a category used to describe a patient's symptoms;

Extracting, from the preset medical text, the text information under the category for describing diagnosis information, and embodying it in the target medical text as reference diagnostic information.

Optionally, the obtaining the original medical text may include:

Obtaining medical record information in a voice form; performing voice recognition on the medical record information to obtain the original medical record text;

or,

Obtaining medical record information in the form of an image; performing image recognition on the medical record information to obtain the original medical record text.

In a second aspect, an embodiment of the present invention provides a device for processing medical record information, including:

An acquisition unit for obtaining the original medical text;

a dividing unit, configured to divide the original medical text into at least one target text unit;

a first determining unit, configured to determine a target category corresponding to the text feature of the target text unit;

a generating unit, configured to generate the target medical text, wherein the target text unit is embodied as text information under the target category in the target medical text.

Optionally, the first determining unit may include:

a target category determining subunit, configured to determine, according to a first machine learning model, the target category corresponding to the text feature of the target text unit, wherein the first machine learning model is obtained by training on the correspondence between text features of historical medical texts included in a training sample set and preset categories.

Optionally, the target category is a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom identification information, a category for describing medical order information, or a category for describing prescription information.

Optionally, the device may further include:

a first extracting unit, configured to extract, from the original medical text, a target feature word for describing a first feature item; wherein, in the target medical text, the target feature word is embodied as text information belonging to the first feature item.

Optionally, the first extracting unit may include:

a target feature word extracting subunit, configured to extract the target feature word for describing the first feature item from the text information under the target category to which the first feature item belongs in the original medical text.

Optionally, the first extracting unit may specifically include: an analyzing subunit and a matching subunit;

The analysis subunit is configured to analyze the original medical text to obtain an initial feature word for describing the first feature item;

The matching subunit is configured to match the initial feature word in a standard feature vocabulary to obtain a standard feature word that matches the initial feature word, as the target feature word for describing the first feature item.

Optionally, the matching subunit may specifically include:

an initial feature word extracting subunit, configured to perform lexical analysis and/or syntax analysis on the original medical text based on a medical-specific vocabulary to obtain the initial feature word for describing the first feature item.

Optionally, the device may further include:

an establishing unit, configured to establish a correspondence between the initial feature word and the target feature word for describing the first feature item, the correspondence being embodied in the target medical text.

Optionally, the device may further include:

a second determining unit, configured to determine an inferred feature word corresponding to the original medical text under a second feature item, wherein the inferred feature word is a feature word for describing the second feature item that is not recorded in the original medical text.

Optionally, the second determining unit may include:

an inferred feature word determining subunit, configured to determine, according to a second machine learning model, the inferred feature word corresponding to the original medical text under the second feature item, wherein the second machine learning model is obtained by training on the correspondence between historical medical texts included in a training sample set and preset inferred feature words for describing the second feature item.

Optionally, the device may further include: a searching unit and a second extracting unit;

The searching unit is configured to search for a preset medical text matching the target medical text, wherein the text information of the preset medical text under the target category is the same as or similar to the target text unit, and the target category includes a category for describing patient personal information and/or a category for describing patient symptoms;

The second extracting unit is configured to extract text information in a category for describing diagnostic information in the preset medical text for generating the target medical text.

Optionally, the acquisition unit may include: a first acquiring subunit and a first identifying subunit;

The first acquiring subunit is configured to acquire medical record information in a voice form;

The first identification subunit is configured to perform voice recognition on the medical record information to obtain the original medical record text.

Optionally, the acquisition unit may include: a second acquiring subunit and a second identifying subunit;

The second acquiring subunit is configured to acquire medical record information in an image form;

The second identification subunit is configured to perform image recognition on the medical record information to obtain the original medical record text.

In a third aspect, an embodiment of the present invention provides equipment including a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:

Compared with the prior art, the embodiments of the present invention have the following advantages:

According to the method, device and equipment of the embodiments of the present invention, for an unstructured original medical text, a structured target medical text can be generated by dividing the original medical text into at least one target text unit and determining, for each target text unit, the target category corresponding to its text feature, so that each target text unit in the target medical text is embodied as text information under the target category to which it belongs. Since different information contents are classified into corresponding categories in the structured target medical text, on the one hand, when the target medical text is displayed to the user, the user can not only read it more smoothly but also find the required information content more quickly; on the other hand, the categorized text content in the target medical text facilitates the search and identification of information content, which makes the target medical text more suitable for data collation and analysis.

Brief description of the drawings

In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or of the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without inventive effort.

FIG. 1 is a schematic diagram of a framework of an exemplary application scenario according to an embodiment of the present invention;

FIG. 2 is a schematic flow chart of a method for processing medical record information according to an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a device for processing medical record information according to an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of equipment according to an embodiment of the present invention;

FIG. 5 is a schematic structural diagram of a server according to an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of the present invention.

The inventors have found through research that the medical record information that can be obtained directly, such as medical record information input by a user, is usually disorganized: information content describing different features is pieced together indiscriminately. On the one hand, it is difficult for the user to read such disorganized medical record information smoothly. On the other hand, the disorganized medical record information is not conducive to the search and identification of information content.

In order to solve the above problem, in the embodiments of the present invention, the original medical text is divided into at least one target text unit, the target category corresponding to the text feature of each target text unit is determined, and a structured target medical text is generated accordingly, so that each target text unit in the target medical text is embodied as text information under the target category to which it belongs. Since different information contents are classified into corresponding categories in the structured target medical text, on the one hand, when the target medical text is displayed to the user, the user can not only read it more smoothly but also find the required information content more quickly; on the other hand, the categorized text content in the target medical text facilitates the search and identification of information content, which makes the target medical text more suitable for data collation and analysis.

For example, the embodiments of the present invention can be applied to the scenario shown in FIG. 1, where the user terminal 102 and the server 101 interact through the network 103. In this scenario, the server 101 obtains the original medical text transmitted by the user terminal 102. The server 101 then divides the original medical text into at least one target text unit, determines a target category corresponding to the text feature of the target text unit, and generates a target medical text, wherein the target text unit is embodied in the target medical text as text information under the target category. The server 101 can then transmit the target medical text to the user terminal 102 for display.

It will be appreciated that the user terminal 102 can be any user device, whether existing, under development, or developed in the future, that is capable of interacting with the server 101 through any form of wired and/or wireless connection (e.g., Wi-Fi, LAN, cellular, coaxial cable, etc.), including but not limited to existing, under-development, or future smartphones, non-smart phones, tablets, laptop personal computers, desktop personal computers, minicomputers, midrange computers, mainframe computers, etc.

Further, the server 101 is merely an example of an existing, under-development or future device capable of providing medical record information processing functions to a user. Embodiments of the invention are not subject to any limitation in this regard.

It will be understood that in the above scenario, although the actions of the embodiments of the present invention are described as being performed by the server 101, these actions may also be performed partially by the user terminal 102 and partially by the server 101, or performed entirely by the user terminal 102. The present invention is not limited in terms of the executing entity, as long as the actions disclosed in the embodiments of the present invention are performed.

It should be noted that the above application scenarios are only for the purpose of facilitating understanding of the present invention, and embodiments of the present invention are not limited in this respect. Rather, embodiments of the invention may be applied to any scenario that is applicable.

Various non-limiting embodiments of the present invention are described in detail below with reference to the drawings.

Exemplary method

Referring to FIG. 2, a schematic flowchart of a method for processing medical record information in an embodiment of the present invention is shown. In this embodiment, the method may include the following steps, for example:

201. Obtain the original medical text.

In the specific implementation, based on the obtained medical record information, the original medical text to be structured can be obtained. Among them, medical information can be obtained in a variety of ways. For example, the medical record information may be information input by the user. As another example, the medical record information can also be information stored in a database.

It can be understood that the originally obtained medical record information may take many forms. For example, it may be information in the form of text, in the form of an image, or in the form of voice. Since this embodiment structures an original medical text in text form, where the originally obtained medical record information is in text form, the original medical text may be the medical record information itself; where the originally obtained medical record information is in a non-text form, the original medical text may be the text obtained by converting the medical record information. For example, where the medical record information is in voice form, step 201 may include: acquiring medical record information in voice form; performing voice recognition on the medical record information to obtain the original medical text. For another example, where the medical record information is in the form of an image, step 201 may include: acquiring medical record information in the form of an image; performing image recognition on the medical record information to obtain the original medical text.
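The choice among the three input forms in step 201 can be sketched as a small dispatch function. This is only an illustrative sketch: the function name `obtain_original_text`, the form labels, and the two recognizer callables are hypothetical, and the recognizers are passed in by the caller because the embodiment does not name any particular voice or image recognition engine.

```python
from typing import Callable, Optional, Union


def obtain_original_text(
    record: Union[str, bytes],
    form: str,
    speech_to_text: Optional[Callable[[bytes], str]] = None,
    image_to_text: Optional[Callable[[bytes], str]] = None,
) -> str:
    """Return the original medical text for a record given in text, voice or image form.

    The recognizers are hypothetical callables supplied by the caller.
    """
    if form == "text":
        # Text-form record: the record itself is the original medical text.
        return record
    if form == "voice" and speech_to_text is not None:
        # Voice-form record: convert by voice recognition.
        return speech_to_text(record)
    if form == "image" and image_to_text is not None:
        # Image-form record: convert by image recognition.
        return image_to_text(record)
    raise ValueError(f"no recognizer available for form {form!r}")
```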

It should be noted that the originally obtained medical record information may sometimes contain information about multiple visits of a patient. In order to make the medical record information obtained by the structuring process uniform, the information about multiple visits can be divided into separate pieces, each containing the information about a single visit, and each piece is then used as an original medical text for structuring. That is, the original medical text may be the medical record text related to one visit of one patient. For example, suppose that the originally obtained medical record information includes information about a first visit and information about a second visit. The medical record information can be divided, according to the visit time, into the information about the first visit and the information about the second visit, each of which is used as an original medical text for the subsequent steps.
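A minimal sketch of dividing a multi-visit record by visit time is given below. It assumes, purely for illustration, that each visit begins with a line starting with the visit date in `YYYY-MM-DD` form; the embodiment does not prescribe how the visit time is recorded, and the name `split_by_visit` is hypothetical.

```python
import re

# Hypothetical input convention: a visit starts with a line prefixed by its date.
DATE_PREFIX = re.compile(r"^(\d{4}-\d{2}-\d{2})\s*[:：]?\s*(.*)$")


def split_by_visit(raw_record: str) -> dict:
    """Group the lines of a raw record into one original medical text per visit date."""
    visits = {}
    current = None
    for line in raw_record.splitlines():
        line = line.strip()
        if not line:
            continue
        match = DATE_PREFIX.match(line)
        if match:
            # A dated line opens (or continues) the visit with that date.
            current = match.group(1)
            visits.setdefault(current, [])
            if match.group(2):
                visits[current].append(match.group(2))
        elif current is not None:
            # Undated lines belong to the most recently opened visit.
            visits[current].append(line)
    return {date: " ".join(parts) for date, parts in visits.items()}
```

Each value of the returned dictionary can then be treated as one original medical text for the subsequent steps.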

202. Divide the original medical text into at least one target text unit.

In a specific implementation, the original medical text can be divided into sentences, that is, each divided target text unit is a sentence. Of course, in this embodiment, the original medical text may also be divided in units of phrases, paragraphs, and the like.
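Sentence-level division as in step 202 can be sketched as follows. The set of sentence-ending punctuation marks and the name `split_into_units` are assumptions made for illustration; this naive rule would also split inside a decimal such as "37.5", which a practical implementation would need to handle.

```python
import re

# Assumed sentence terminators (Western and Chinese punctuation).
_UNIT = re.compile(r"[^.!?;。！？；]+[.!?;。！？；]?")


def split_into_units(original_text: str) -> list:
    """Divide the original medical text into sentence-level target text units."""
    return [u.strip() for u in _UNIT.findall(original_text) if u.strip()]
```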

203. Determine a target category corresponding to the text feature of the target text unit.

In a specific implementation, for each target text unit divided from the original medical text, a target category matching the text feature of that target text unit may be searched for among a plurality of preset categories that can be used to describe medical record information, thereby determining a corresponding target category for each target text unit. It can be understood that, if the text feature of a target text unit matches a target category, that target category is a category for describing the target text unit.

It can be understood that the plurality of preset categories applicable to medical record information may include, for example, any of the following: a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom identification information, a category for describing medical order information, a category for describing prescription information, and the like. That is, for any target text unit, the corresponding target category may be, for example, a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom identification information, a category for describing medical order information, or a category for describing prescription information. The patient information may include, for example, the patient's name, gender and age, the visit time, and the like. The symptom statement information may also be referred to as chief complaint information. The symptom identification information may be syndrome differentiation information in the sense of traditional Chinese medicine (TCM), or may be information interpreting test results in the sense of Western medicine.

In this embodiment, for example, a machine learning model can be employed to determine the corresponding target category for the target text unit. Specifically, step 203 may be: determining, according to a first machine learning model, the target category corresponding to the text feature of the target text unit, wherein the first machine learning model is obtained by training on the correspondence between text features of historical medical texts included in a training sample set and preset categories, each historical medical text being text information under a preset category. The training process of the first machine learning model may be as follows: where a historical medical text is known to be text information under a certain preset category, the first machine learning model is trained with the text feature of the historical medical text as input and the preset category to which the historical medical text belongs as output. The historical medical texts used for training may include text information under each of the preset categories applicable to medical record information, so that the trained first machine learning model can cover all preset categories applicable to medical record information. In addition, a historical medical text may be a sentence, that is, each training pass uses the text information of one sentence as one historical medical text. Alternatively, a historical medical text may be a paragraph, that is, each training pass uses the text information of one paragraph as one historical medical text.

It can be understood that, after training on a certain number of historical medical texts and their corresponding preset categories, the first machine learning model can represent the correspondence between text features and preset categories. Therefore, when the text feature of the target text unit is input to the trained first machine learning model, the target category output by the first machine learning model is the category to which the target text unit belongs.
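The train-then-classify flow of step 203 can be illustrated with a small text classifier. The embodiment does not specify the type of the first machine learning model or the form of the text feature; the sketch below swaps in a multinomial Naive Bayes model over bag-of-words features as an assumption, and the class name `CategoryClassifier` and the sample data are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    # Assumed text feature: lowercase alphanumeric word tokens (bag of words).
    return re.findall(r"[a-z0-9]+", text.lower())


class CategoryClassifier:
    """Multinomial Naive Bayes with add-one smoothing (stand-in for the first model)."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # category -> token counts
        self.doc_counts = Counter()              # category -> number of samples
        self.vocab = set()

    def train(self, samples):
        # samples: iterable of (historical medical text, preset category)
        for text, category in samples:
            self.doc_counts[category] += 1
            for token in tokenize(text):
                self.word_counts[category][token] += 1
                self.vocab.add(token)

    def predict(self, target_text_unit):
        total = sum(self.doc_counts.values())
        tokens = tokenize(target_text_unit)
        best, best_score = None, -math.inf
        for category, n_docs in self.doc_counts.items():
            # Log prior plus smoothed log likelihood of each token.
            score = math.log(n_docs / total)
            denom = sum(self.word_counts[category].values()) + len(self.vocab)
            for token in tokens:
                score += math.log((self.word_counts[category][token] + 1) / denom)
            if score > best_score:
                best, best_score = category, score
        return best


# Hypothetical training sample set: one sentence per historical medical text.
clf = CategoryClassifier()
clf.train([
    ("patient name zhang san male age 45", "patient information"),
    ("patient li si female age 30", "patient information"),
    ("severe headache and dizziness for three days", "chief complaint"),
    ("cough and fever since yesterday", "chief complaint"),
    ("amoxicillin 10 grams twice daily", "prescription"),
    ("angelica 20 grams decoction", "prescription"),
])
```

With this toy sample set, `clf.predict` maps a target text unit to one of the three categories seen in training; a real model would need samples for every preset category.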

204. Generate a target medical text, wherein the target text unit is embodied as text information under the target category in the target medical text.

In the specific implementation, each target text unit that is divided into the original medical text can be organized according to the target category to which it belongs, and the target medical text is generated. The target medical text may be used, for example, for feedback to the user, that is, after step 204, the embodiment may, for example, further comprise: presenting the target medical text.

It can be understood that all target text units divided from the original medical text are included in the target medical text. In addition, each target text unit in the target medical text is saved together with its corresponding target category, so that the target medical text reflects which target category each target text unit belongs to. For example, suppose the target text unit is “the head is very painful” and the target category to which it belongs is “chief complaint”; the information embodied in the target medical text can then be “chief complaint: the head is very painful”.
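Organizing the classified target text units into a target medical text can be sketched as follows. The "category: text" line layout follows the example above; joining several units of one category with "; " and the name `generate_target_text` are assumptions for illustration.

```python
def generate_target_text(classified_units):
    """Build the target medical text from (target text unit, target category) pairs."""
    grouped = {}
    for unit, category in classified_units:
        # Keep every unit and store it under its target category.
        grouped.setdefault(category, []).append(unit)
    return "\n".join(
        f"{category}: {'; '.join(units)}" for category, units in grouped.items()
    )
```

Categories appear in the order in which they are first encountered; a fixed display order could be imposed instead if required.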

It should be noted that some feature words describing important features may be recorded in the original medical text. Since these important feature words are mixed with other text content under the target category, in order to enable the user to identify them more clearly, in some implementations of this embodiment separate feature items can be set in the target medical text to present these important feature words. Specifically, before step 204, the embodiment may further include, for example, extracting from the original medical text a target feature word for describing a first feature item, wherein the target feature word is embodied in the target medical text as text information belonging to the first feature item. In the target medical text, the target feature word is saved together with its corresponding first feature item, so the target medical text reflects that the target feature word belongs to that first feature item. For example, if the target feature word is “angelica” and the first feature item to which it belongs is “medicinal material”, the information embodied in the target medical text may be “medicinal material: angelica”.

It can be understood that the target feature word under the first feature item is text information recorded in the original medical text. For example, the first feature item may be a feature item for describing a patient name, that is, the target feature word may be information describing a patient name. Assuming that the target feature word is “Zhang San”, the first feature item and the target feature word can be embodied in the target medical text as “patient name: Zhang San”. For another example, the first feature item may be a feature item for describing a medicine, that is, the target feature word may be information describing a medicine. The medicine may be a Chinese medicinal material or a Western medicine. Assuming that the target feature word is “amoxicillin”, the first feature item and the target feature word can be embodied in the target medical text as “drug: amoxicillin”. Assuming that the target feature word is “angelica”, they can be embodied as “medicinal material: angelica”. For another example, the first feature item may be a feature item for describing a dose, that is, the target feature word may be information describing a dose. Assuming that the target feature word is “10 grams”, the first feature item and the target feature word can be embodied in the target medical text as “dose: 10 grams”. As yet another example, the first feature item may be a feature item for describing a symptom, that is, the target feature word may be information describing a symptom. Assuming that the target feature word is “headache”, the first feature item and the target feature word can be embodied in the target medical text as “symptom: headache”.
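Extraction of target feature words for a few first feature items can be sketched with a small lexicon lookup and a dose pattern. The lexicon contents, the unit list in the pattern, and the name `extract_feature_words` are hypothetical; the embodiment itself refers to lexical and/or syntax analysis based on a medical-specific vocabulary rather than to this particular rule.

```python
import re

# Hypothetical medical-specific vocabulary: feature word -> first feature item.
MEDICINE_LEXICON = {"amoxicillin": "drug", "angelica": "medicinal material"}

# Assumed dose pattern: a number followed by a unit.
DOSE_PATTERN = re.compile(r"\b\d+(?:\.\d+)?\s*(?:grams?|g|mg|ml)\b", re.IGNORECASE)


def extract_feature_words(original_text):
    """Return (first feature item, target feature word) pairs found in the text."""
    found = []
    lowered = original_text.lower()
    for word, feature_item in MEDICINE_LEXICON.items():
        if re.search(rf"\b{re.escape(word)}\b", lowered):
            found.append((feature_item, word))
    for match in DOSE_PATTERN.finditer(original_text):
        found.append(("dose", match.group(0)))
    return found
```

Each returned pair can be rendered in the target medical text in the "feature item: feature word" form used in the examples above.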

Understandably, different medical texts sometimes use different feature words to describe the same meaning, which is not conducive to the statistical analysis of medical information. To this end, in some implementations of this embodiment, feature words with the same meaning may be normalized so that the same feature word is used in the target medical text to describe the same meaning. Specifically, the process of extracting the target feature word may include, for example: analyzing the original medical text to obtain an initial feature word for describing the first feature item; and matching the initial feature word in a standard feature vocabulary to obtain a standard feature word that matches the initial feature word, as the target feature word for describing the first feature item. The standard feature vocabulary specifies one standard feature word for a plurality of feature words that describe the same meaning, and also records the correspondence between the non-standard feature words of that meaning and the standard feature word. If the initial feature word is a non-standard feature word in the standard feature vocabulary, its corresponding standard feature word in the vocabulary can be used as the target feature word. If the initial feature word is a standard feature word in the standard feature vocabulary, the initial feature word itself can be used as the target feature word. For example, two different wordings of "headache" can be normalized into one standard wording: the variant wording is a non-standard feature word, and the chosen wording is the standard feature word.
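The vocabulary lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the names `STANDARD_WORDS`, `NON_STANDARD_TO_STANDARD` and `normalize_feature_word` are hypothetical, the entry "head pain" is invented for the example, and returning `None` for a word that is absent from the vocabulary is a choice the text does not specify.

```python
# Hypothetical standard feature vocabulary. STANDARD_WORDS holds the standard
# feature words; NON_STANDARD_TO_STANDARD records the correspondence between
# non-standard feature words and their standard feature words.
# All entries are illustrative assumptions.
STANDARD_WORDS = {"headache", "20 g"}
NON_STANDARD_TO_STANDARD = {
    "head pain": "headache",
    "twenty g": "20 g",
}


def normalize_feature_word(initial_word):
    """Map an initial feature word to its target (standard) feature word.

    A standard feature word is returned unchanged, a non-standard one is
    replaced by its standard counterpart, and an unknown word yields None
    (an assumption; the text does not cover this case).
    """
    word = initial_word.strip().lower()
    if word in STANDARD_WORDS:
        return word
    return NON_STANDARD_TO_STANDARD.get(word)
```

A plain dictionary keeps the sketch short; a real vocabulary would likely need fuzzier matching than exact, case-insensitive lookup.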

So that the user can follow the normalization of feature words and is not left unable to understand a standard feature word that appears in the target medical text, in some implementations of this embodiment the method may further include, for example: establishing a correspondence between the initial feature word for describing the first feature item and the target feature word for describing the first feature item, and embodying the correspondence in the target medical text. That is, the initial feature word and the target feature word that correspond to each other may both be included in the target medical text. For example, if the initial feature word is “headache” and the target feature word is “headache”, the initial feature word and the target feature word in the target medical text can be embodied as “original word: headache; standard symptom: headache”. For another example, if the initial feature word is “twenty g” and the target feature word is “20 g”, the initial feature word and the target feature word in the target medical text can be embodied as “original word: twenty g; standard dose: 20 grams”.

Since feature words in medical text may be specialized medical terms, the original medical text can be analyzed by lexical analysis and syntactic analysis with the help of a medical-specific vocabulary, so that feature words are extracted more accurately. Specifically, in some implementations of this embodiment, the original medical text may be subjected to lexical analysis and/or syntactic analysis based on a medical-specific vocabulary to obtain the initial feature word for describing the first feature item. For example, suppose the original medical text records that “the head is very painful”. Through lexical analysis and syntactic analysis, it can be recognized that “head” is a noun and the subject and denotes a body part, and that “pain” is a verb and the predicate and indicates the state of that body part. Based on this, the initial feature word can be determined to be “headache”.

In addition, for some first feature items that follow specific rules, the target feature words can be identified based on the corresponding rules. For example, for the first feature item "patient age", the target feature word may be extracted based on an age recognition rule (e.g., the feature word includes "number + year" or "number + ten"). For another example, for the first feature item "visiting time", the target feature word may be extracted based on a time recognition rule (e.g., the feature word includes "year", "month", or "day", or has a separator such as "." or "/"). Furthermore, for certain first feature items, the target feature words can be identified by a specific recognition technique. For example, for the first feature item "patient name", the target feature word can be extracted using named entity recognition, a natural language processing technique.
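As a rough illustration of rule-based extraction, the sketch below uses regular expressions as English-language stand-ins for the age and time recognition rules. The patterns and the function names `extract_patient_age` and `extract_visit_time` are assumptions made for this example and are not taken from the embodiment.

```python
import re

# Assumed stand-in for the age recognition rule: a number followed by an
# age marker such as "years old".
AGE_PATTERN = re.compile(r"\b(\d{1,3})\s*(?:years?\s*old|yrs?|y/o)\b", re.IGNORECASE)

# Assumed stand-in for the time recognition rule: year, month and day joined
# by a separator such as "." or "/".
DATE_PATTERN = re.compile(r"\b(\d{4})[./-](\d{1,2})[./-](\d{1,2})\b")


def extract_patient_age(text):
    """Return the first age found in the text as an int, or None."""
    match = AGE_PATTERN.search(text)
    return int(match.group(1)) if match else None


def extract_visit_time(text):
    """Return the first date found in the text as YYYY-MM-DD, or None.

    The date is only reformatted, not validated (month 13 would pass).
    """
    match = DATE_PATTERN.search(text)
    if not match:
        return None
    year, month, day = (int(part) for part in match.groups())
    return f"{year:04d}-{month:02d}-{day:02d}"
```

For instance, given the text "Patient Zhang San, 45 years old, visited on 2016/12/28 with a headache.", the first function yields the age 45 and the second yields the visiting time in normalized form.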

Sometimes the first feature item is a feature item belonging to one or several target categories, that is, the target feature words under the first feature item are present in the text information under those target categories. Based on this, in some implementations of this embodiment, the target feature word for describing the first feature item may be extracted specifically from the text information under the target category to which the first feature item belongs. That is, after 203, a target feature word for describing the first feature item is extracted from the text information, in the original medical text, under the target category to which the first feature item belongs. The text information under a target category includes all target text units corresponding to that target category. For example, the first feature item "drug" is a feature item belonging to the target category "prescription", that is, the information corresponding to the first feature item "drug" exists in the text information belonging to the category "prescription". Therefore, after the text information belonging to the category “prescription” has been determined by classifying the original medical text, the target feature word corresponding to the first feature item “drug” can be searched for and extracted in that text information. Of course, the target feature words of the first feature item may also be searched for and extracted from all the text information of the original medical text.
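The category-scoped search can be pictured with a minimal sketch. It assumes the classified text is held as a dictionary from category name to target text units; the lexicon `DRUG_LEXICON` and the function `extract_feature_words` are hypothetical names, and simple substring matching stands in for whatever extraction technique is actually used.

```python
# Hypothetical lexicon for the first feature item "drug".
DRUG_LEXICON = ("amoxicillin", "angelica")


def extract_feature_words(categorized_text, category, lexicon):
    """Search only the target text units filed under `category`.

    categorized_text maps a category name to its list of target text units.
    Returns the lexicon words found, in order of first appearance.
    """
    found = []
    for unit in categorized_text.get(category, []):
        lowered = unit.lower()
        for word in lexicon:
            if word in lowered and word not in found:
                found.append(word)
    return found
```

Restricting the search to one category means a drug name mentioned elsewhere, for example in the patient's symptom statement, is not mistaken for a prescribed drug.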

It should be noted that some feature words that are not directly recorded in the original medical text can sometimes be inferred from the text information that is recorded there. In some implementations of this embodiment, separate feature items may be provided in the target medical text to present the inferred feature words. Specifically, before 204, the embodiment may further include: determining an inferred feature word corresponding to the original medical text under the second feature item, wherein the inferred feature word is a feature word for describing the second feature item that is not recorded in the original medical text; the inferred feature word is embodied in the target medical text as text information belonging to the second feature item. Because the inferred feature word is saved in the target medical text together with its second feature item, the target medical text can show that the inferred feature word is text information belonging to that second feature item. For example, if the original medical text does not record the gender of the patient but it can be inferred from the original medical text that the patient is female, the inferred feature word is “female” and the second feature item is “patient gender”, and the information presented in the target medical text can be “gender: female”.

It can be understood that the inferred feature words belonging to the second feature item are text information not directly recorded in the original medical text. For example, the second feature item may be a feature item for describing the gender of the patient, that is, the inferred feature word may be a feature word for describing the gender of the patient. Assuming that the inferred feature word is "male", the second feature item and the corresponding inferred feature word in the target medical text can be embodied as "patient gender: male". For another example, the second feature item may be a feature item for describing the age of the patient, that is, the inferred feature word may be a feature word for describing the age of the patient. Assuming that the inferred feature word is "middle age", the second feature item and the corresponding inferred feature word in the target medical text can be embodied as "patient age: middle age".

It should be noted that the inferred feature word may be determined, for example, by a machine learning model. Specifically, determining the inferred feature word may include: determining, according to a second machine learning model, the inferred feature word corresponding to the original medical text under the second feature item, wherein the second machine learning model is obtained by training on the correspondence between the historical medical texts included in a training sample set and preset inferred feature words for describing the second feature item, each preset inferred feature word being a feature word that can be inferred from its historical medical text. The training process may be, specifically: for a historical medical text from which a definite feature word is difficult to extract, once the inferred feature word corresponding to that historical medical text has been determined, the second machine learning model is trained with the historical medical text as input and that inferred feature word as the expected output. It can be understood that, after training on a certain number of historical medical texts and their corresponding inferred feature words, the second machine learning model can represent the correspondence between medical text and inferred feature words. Therefore, when the original medical text is input to the trained second machine learning model, the inferred feature word output by the model is a feature that the original medical text can reflect.
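The embodiment does not say which kind of model is used, so the following sketch substitutes a small multinomial naive Bayes classifier with add-one smoothing purely as an illustration. The functions `tokenize`, `train` and `infer` are hypothetical names, and a real system would train on a far larger set of historical medical texts.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def train(samples):
    """Train on (historical_medical_text, inferred_feature_word) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in samples:
        label_counts[label] += 1
        for token in tokenize(text):
            word_counts[label][token] += 1
            vocabulary.add(token)
    return label_counts, word_counts, vocabulary


def infer(model, text):
    """Return the inferred feature word with the highest log-probability."""
    label_counts, word_counts, vocabulary = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for token in tokenize(text):
            if token in vocabulary:  # tokens never seen in training are skipped
                score += math.log((word_counts[label][token] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Training takes pairs of historical medical text and its preset inferred feature word; inference then returns the inferred feature word for a new original medical text, which can be written into the target medical text under the second feature item.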

In some implementations of this embodiment, where the user provides an original medical text containing text content such as symptoms and patient information, the text content of the diagnostic information may be extracted from a preset medical text that has the same or similar symptoms, patient information, and so on, and embodied in the target medical text as reference diagnostic information for the user to consult. The user can thus obtain recommended reference diagnostic information by inputting patient information, which realizes a "self-diagnosis" function. Specifically, after 203, the embodiment may further include, for example: searching for a preset medical text matching the target medical text, wherein the text information of the preset medical text under the target category is the same as or similar to the target text unit, and the target category includes a category for describing patient personal information and/or a category for describing a patient's symptoms; and extracting the text information under a category for describing diagnostic information in the preset medical text and embodying it in the target medical text as reference diagnostic information. The category for describing diagnostic information may be, for example, a category for describing prescription information, a category for describing condition discrimination information, and/or a category for describing medical order information. In addition, the preset medical text may be, for example, pre-collected classic medical record information or medical record information provided by a medical expert.

It can be understood that the text information used to match the original medical text against the preset medical text may be the text information under one target category or under multiple target categories. When text information under multiple target categories is used, different matching weights can be set for different target categories to measure the degree of matching between the original medical text and the preset medical text. For example, the text information used for matching may be the text information under the four target categories “disease”, “patient age”, “patient gender”, and “visiting time”. Considering that “visiting time” has a relatively small impact on the diagnostic information, “disease”, “patient age”, and “patient gender” can be given relatively large matching weights, and “visiting time” a relatively small matching weight. In that case, if the original medical text and the preset medical text are largely consistent in the text information under “disease”, “patient age”, and “patient gender” but inconsistent under “visiting time”, the matching result may be that the original medical text matches the preset medical text. If the two are largely consistent under “disease” and “visiting time” but inconsistent under “patient gender”, the matching result may be that the original medical text does not match the preset medical text.
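One way to picture the weighted matching is the sketch below. The specific weights, the threshold, and the use of exact string equality per category are all assumptions made for illustration; the text only states that “visiting time” carries less weight than the other three categories.

```python
# Assumed integer matching weights per target category.
WEIGHTS = {
    "disease": 40,
    "patient age": 25,
    "patient gender": 25,
    "visiting time": 10,
}
MATCH_THRESHOLD = 0.8  # hypothetical cut-off for declaring a match


def match_score(original, preset, weights=WEIGHTS):
    """Weighted share of categories whose text information agrees exactly.

    `original` and `preset` map a category name to its text information.
    """
    total = sum(weights.values())
    agreed = sum(
        weight
        for category, weight in weights.items()
        if category in original and original[category] == preset.get(category)
    )
    return agreed / total


def is_match(original, preset, threshold=MATCH_THRESHOLD):
    """Decide whether the preset medical text matches the original one."""
    return match_score(original, preset) >= threshold
```

With these assumed numbers, a disagreement only on “visiting time” gives a score of 0.9 and still counts as a match, while a disagreement on “patient gender” gives 0.75 and does not, mirroring the two cases described above.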

In this embodiment, the original medical text and the target medical text may be medical text of Chinese medicine or medical text of Western medicine.

In this embodiment, for an unstructured original medical text, dividing the original medical text into at least one target text unit and determining, for each target text unit, the target category corresponding to its text feature makes it possible to generate a structured target medical text in which each target text unit is embodied as text information under its target category. Because different information content is assigned to corresponding categories in the structured target medical text, on the one hand, when the target medical text is displayed to the user, the user can not only read it more smoothly but also find the desired information content more quickly. On the other hand, categorized text content is easier to search and identify, which also makes the target medical text more conducive to data collation and analysis.
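The overall flow (divide, classify, generate) can be sketched end to end as follows. This is a simplified illustration, not the embodiment's implementation: a keyword-cue lookup stands in for the first machine learning model, splitting on sentence punctuation is an assumed way of forming target text units, and the names `CATEGORY_CUES`, `split_into_units`, `classify_unit` and `generate_target_text` are hypothetical.

```python
import re

# Keyword cues standing in for the first machine learning model.
# The cue words are illustrative assumptions.
CATEGORY_CUES = {
    "patient information": ("patient", "years old", "male", "female"),
    "symptom statement": ("pain", "headache", "cough", "fever"),
    "prescription": ("grams", "decoct", "tablet"),
}


def split_into_units(original_text):
    """Divide the text into target text units at sentence punctuation.

    Splitting on "." would also break decimal numbers; acceptable for a sketch.
    """
    return [u.strip() for u in re.split(r"[.;。；]", original_text) if u.strip()]


def classify_unit(unit):
    """Pick the category whose cue words occur most often in the unit."""
    lowered = unit.lower()
    scores = {
        category: sum(cue in lowered for cue in cues)
        for category, cues in CATEGORY_CUES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "other"


def generate_target_text(original_text):
    """Group the units by category and render one line per category."""
    grouped = {}
    for unit in split_into_units(original_text):
        grouped.setdefault(classify_unit(unit), []).append(unit)
    return "\n".join(
        f"{category}: {'; '.join(units)}" for category, units in grouped.items()
    )
```

Each output line has the form "category: text information", so the same structure that helps a reader scan the record also makes it straightforward to index the record by category.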

Exemplary device

Referring to FIG. 3, a schematic structural diagram of a processing apparatus for medical record information in an embodiment of the present invention is shown. In this embodiment, the device may specifically include:

The obtaining unit 301 is configured to obtain the original medical text;

a dividing unit 302, configured to divide the original medical text into at least one target text unit;

a first determining unit 303, configured to determine a target category corresponding to the text feature of the target text unit;

The generating unit 304 is configured to generate target medical text, wherein the target text unit is embodied as text information under the target category in the target medical text.

Optionally, the first determining unit 303 may include:

Optionally, the device may further include:

Optionally, the first extracting unit may include:

Optionally, the matching subunit may include:

Optionally, the device may further include:

an establishing unit, configured to establish a correspondence between the initial feature word and the target feature word for describing the first feature item, and to embody the correspondence in the target medical text.

Optionally, the first feature item may be a feature item for describing a patient name, a feature item for describing a medicine, a feature item for describing a dose, or a feature item for describing a symptom.

Optionally, the device may further include:

Optionally, the second determining unit may include:

Optionally, the second feature item may be a feature item for describing a gender of the patient or a feature item for describing the age of the patient.

Optionally, the original medical text may be medical text information related to one diagnosis for one patient.

Optionally, the obtaining unit 301 may include: a first acquiring subunit and a first identifying subunit;

Optionally, the obtaining unit 301 may include: a second acquiring subunit and a second identifying subunit;

Optionally, the device may further include:

a presentation unit for presenting the target medical text.

In this embodiment, for an unstructured original medical text, dividing the original medical text into at least one target text unit and determining, for each target text unit, the target category corresponding to its text feature makes it possible to generate a structured target medical text in which each target text unit is embodied as text information under its target category. Because different information content is assigned to corresponding categories in the structured target medical text, on the one hand, when the target medical text is displayed to the user, the user can not only read it more smoothly but also find the desired information content more quickly. On the other hand, categorized text content is easier to search and identify, which also makes the target medical text more conducive to data collation and analysis.

Referring to FIG. 4, apparatus 1800 can include one or more of the following components: processing component 1802, memory 1804, power component 1806, multimedia component 1806, audio component 1810, input/output (I/O) interface 1812, sensor component 1814, and communication component 1816.

Processing component 1802 typically controls the overall operation of device 1800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. Processing component 1802 can include one or more processors 1820 to execute instructions to perform all or part of the steps described above. Moreover, processing component 1802 can include one or more modules to facilitate interaction between component 1802 and other components. For example, processing component 1802 can include a multimedia module to facilitate interaction between multimedia component 1806 and processing component 1802.

Memory 1804 is configured to store various types of data to support operation at device 1800. Examples of such data include instructions for any application or method operating on device 1800, contact data, phone book data, messages, pictures, videos, and the like. Memory 1804 can be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.

Power component 1806 provides power to various components of device 1800. Power component 1806 can include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for device 1800.

Multimedia component 1806 includes a screen between the device 1800 and the user that provides an output interface. In some embodiments, the screen can include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen can be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may sense not only the boundary of the touch or sliding action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1806 includes a front camera and/or a rear camera. When the device 1800 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera can receive external multimedia data. Each front and rear camera can be a fixed optical lens system or have focal length and optical zoom capabilities.

The audio component 1810 is configured to output and/or input an audio signal. For example, audio component 1810 includes a microphone (MIC) that is configured to receive an external audio signal when device 1800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may be further stored in memory 1804 or transmitted via communication component 1816. In some embodiments, the audio component 1810 also includes a speaker for outputting an audio signal.

The I/O interface 1812 provides an interface between the processing component 1802 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.

Sensor assembly 1814 includes one or more sensors for providing device 1800 with status assessments of various aspects. For example, sensor assembly 1814 can detect an open/closed state of device 1800 and the relative positioning of components, such as the display and keypad of device 1800. Sensor assembly 1814 can also detect a change in position of device 1800 or of one component of device 1800, the presence or absence of user contact with device 1800, the orientation or acceleration/deceleration of device 1800, and a temperature change of device 1800. Sensor assembly 1814 can include a proximity sensor configured to detect the presence of nearby objects without any physical contact. Sensor assembly 1814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1814 can also include an acceleration sensor, a gyro sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

Communication component 1816 is configured to facilitate wired or wireless communication between device 1800 and other devices. The device 1800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, communication component 1816 receives broadcast signals or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1816 also includes a near field communication (NFC) module to facilitate short range communication. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, device 1800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above methods.

FIG. 5 is a schematic structural diagram of a server in an embodiment of the present invention. The server 1900 can vary considerably depending on configuration or performance, and can include one or more central processing units (CPUs) 1922 (e.g., one or more processors), memory 1932, and one or more storage media 1930 (e.g., one or more mass storage devices) storing applications 1942 or data 1944. The memory 1932 and the storage medium 1930 may be transient storage or persistent storage. The program stored on storage medium 1930 may include one or more modules (not shown), each of which may include a series of instruction operations for the server. Still further, central processor 1922 can be configured to communicate with storage medium 1930 and to execute, on server 1900, the series of instruction operations in storage medium 1930.

Server 1900 may also include one or more power sources 1926, one or more wired or wireless network interfaces 1950, one or more input and output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

An embodiment of the present invention provides a device. The device includes a memory and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs including instructions for performing the following operations:

In some implementations of this embodiment, the device may specifically be the foregoing device 1800, the memory may specifically be the memory 1804 in the foregoing device 1800, and the processor may specifically be the processor 1820 in the foregoing device 1800.

In other implementations of this embodiment, the device may specifically be the foregoing server 1900, the processor may specifically be the central processor 1922 in the foregoing server 1900, and the memory may specifically be the storage medium 1930 in the foregoing server 1900.

Optionally, in order to determine a target category corresponding to the text feature of the target text unit, the processor may specifically execute instructions for the following operations:

Optionally, the processor may further execute an instruction of:

Optionally, in order to extract a target feature word for describing the first feature item from the original medical text, the processor may specifically execute an instruction of:

Optionally, in order to analyze the original medical text to obtain an initial feature word for describing the first feature item, the processor may specifically execute an instruction of:

Optionally, the processor may further execute an instruction of:

Optionally, the first feature item may be a feature item for describing a patient name, a feature item for describing a medicine, a feature item for describing a dose, or a feature item for describing a symptom.

Optionally, the processor may further execute an instruction of:

Optionally, in order to determine the inferred feature word corresponding to the original medical text under the second feature item, the processor may specifically execute an instruction of:

Optionally, the processor may further execute an instruction of:

After generating the target medical text, searching for a preset medical text matching the target medical text, wherein the text information of the preset medical text under the target category is the same as or similar to the target text unit, and the target category includes a category for describing patient personal information and/or a category for describing a patient's symptoms;

Optionally, in order to obtain the original medical text, the processor may specifically execute the following operations:

Obtain medical information in the form of speech;

Performing voice recognition on the medical record information to obtain the original medical text.

Optionally, in order to obtain the original medical text, the processor may specifically execute instructions for the following operations:

Obtain medical record information in the form of images;

Performing image recognition on the medical record information to obtain the original medical text.

Optionally, the processor may further execute an instruction of:

Presenting the target medical text.

Embodiments of the present invention also provide a non-transitory computer readable storage medium including instructions, such as the memory 1804 including instructions that are executable by the processor 1820 of the apparatus 1800 to perform the above method, or the storage medium 1930 including instructions that are executable by the central processor 1922 of the server 1900 to perform the above method. For example, the non-transitory computer readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, or an optical data storage device.

A non-transitory computer readable storage medium, when instructions in the storage medium are executed by a processor of an electronic device, enabling the electronic device to perform a method of communication, the method comprising:

Other embodiments of the invention will be apparent to those skilled in the art. The present invention is intended to cover any variations, uses, or adaptations of the present invention that follow its general principles and include common general knowledge or customary technical means in the art that are not disclosed in the present disclosure. The specification and examples are to be considered as illustrative only.

It is to be understood that the invention is not limited to the details described above. The scope of the invention is limited only by the appended claims.

The above are only the preferred embodiments of the present invention and are not intended to limit the present invention. Any modifications, equivalents, improvements, etc. made within the spirit and principles of the present invention should be included within the scope of protection of the present invention.

Claims

A method for processing medical record information, comprising:

Obtaining the original medical text and dividing the original medical text into at least one target text unit;

Determining a target category corresponding to a text feature of the target text unit;

Generating a target medical text, wherein the target text unit in the target medical text is embodied as text information under the target category.
The method according to claim 1, wherein the determining a target category corresponding to the text feature of the target text unit comprises:

Determining, according to a first machine learning model, the target category corresponding to the text feature of the target text unit, wherein the first machine learning model is obtained by training on the correspondence between text features of historical medical texts included in a training sample set and preset categories.
The method according to claim 1, wherein the target category is a category for describing patient information, a category for describing a disease name, a category for describing symptom statement information, a category for describing symptom discrimination information, a category for describing medical order information, or a category for describing prescription information.
The method of claim 1 further comprising:

Extracting a target feature word for describing the first feature item from the original medical text;

The target feature word in the target medical text is embodied as text information belonging to the first feature item.
The method according to claim 4, wherein the extracting the target feature words for describing the first feature item from the original medical text includes:

Extracting the target feature words for describing the first feature item from the text information under the target category to which the first feature item belongs in the original medical text.
The method according to claim 4, wherein the extracting the target feature words for describing the first feature item from the original medical text includes:

Performing analysis on the original medical text to obtain an initial feature word for describing the first feature item;

The initial feature words are matched in a standard feature vocabulary to obtain a standard feature word that matches the initial feature word as the target feature word for describing the first feature item.
The method according to claim 6, wherein the analyzing the original medical text to obtain an initial feature word for describing the first feature item comprises:

Performing lexical analysis and/or syntax analysis on the original medical text based on a medical-specific vocabulary to obtain the initial feature words for describing the first feature item.
The method according to claim 6, further comprising:

Establishing a correspondence between the initial feature word and the target feature word for describing the first feature item, and embodying the correspondence in the target medical record text.
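The matching of initial feature words against a standard feature vocabulary, and the recording of the resulting correspondence, can be sketched as follows. This is only a sketch: the vocabulary `STANDARD_VOCABULARY`, the function `normalize_feature_words`, and the similarity cutoff are assumptions made for illustration, and fuzzy string matching from Python's standard `difflib` module stands in for whatever matching method an actual implementation would use.

```python
import difflib

# Hypothetical standard feature vocabulary; a real one would come from a
# curated medical terminology.
STANDARD_VOCABULARY = ["fever", "cough", "headache", "abdominal pain"]


def normalize_feature_words(initial_words, cutoff=0.6):
    """Map each initial feature word to its closest standard feature word.

    Returns a dict {initial feature word: target feature word}. Words with
    no sufficiently similar standard entry are left out of the mapping.
    """
    correspondence = {}
    for word in initial_words:
        matches = difflib.get_close_matches(
            word.lower(), STANDARD_VOCABULARY, n=1, cutoff=cutoff
        )
        if matches:
            correspondence[word] = matches[0]
    return correspondence


if __name__ == "__main__":
    print(normalize_feature_words(["coughing", "head ache", "zzz"]))
```

The returned dictionary keeps both the wording found in the original text and the standardized term, so a generated record can display the two side by side.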
The method according to claim 1, further comprising:

Determining an inferred feature word corresponding to the original medical text under a second feature item, wherein the inferred feature word is a feature word for describing the second feature item that is not recorded in the original medical text;

The inferred feature word is embodied in the target medical text as text information belonging to the second feature item.
The method according to claim 9, wherein the determining the inferred feature word corresponding to the original medical text under the second feature item comprises:

Determining, according to a second machine learning model, the inferred feature word corresponding to the original medical text under the second feature item, wherein the second machine learning model is obtained by training on the correspondence between historical medical texts included in a training sample set and preset inferred feature words for describing the second feature item.
The method according to claim 1, wherein after the generating the target medical text, the method further comprises:

Finding a preset medical text matching the target medical text, wherein the text information of the preset medical text under the target category is the same as or similar to the target text unit, and the target category includes a category used to describe a patient's personal information and/or a category used to describe a patient's symptoms;

Extracting text information under a category for describing diagnosis information from the preset medical text, and embodying the extracted text information as reference diagnostic information in the target medical text.
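The lookup of a matching preset medical text and the reuse of its diagnosis as reference information can be sketched as below. The sketch rests on several assumptions: records are plain dictionaries keyed by category name, the category keys `"symptoms"` and `"diagnosis"` are hypothetical, "the same as or similar to" is approximated by a character-level similarity ratio from `difflib.SequenceMatcher`, and the threshold of 0.7 is arbitrary.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Character-level similarity in [0, 1], ignoring letter case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def reference_diagnosis(target_record, preset_records,
                        category="symptoms", threshold=0.7):
    """Return the diagnosis text of the most similar preset record.

    Records are compared on the text stored under `category`. None is
    returned when no preset record reaches the similarity threshold.
    """
    best_score, best_record = 0.0, None
    for preset in preset_records:
        score = similarity(target_record.get(category, ""),
                           preset.get(category, ""))
        if score > best_score:
            best_score, best_record = score, preset
    if best_record is not None and best_score >= threshold:
        return best_record.get("diagnosis")
    return None


if __name__ == "__main__":
    presets = [
        {"symptoms": "cough and fever for three days",
         "diagnosis": "acute bronchitis"},
        {"symptoms": "chest pain on exertion", "diagnosis": "angina"},
    ]
    target = {"symptoms": "cough and fever for two days"}
    print(reference_diagnosis(target, presets))
```

A production system would more likely compare records on extracted feature words than on raw characters; the raw-text comparison is used here only to keep the example self-contained.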
The method according to claim 1, wherein the obtaining the original medical text comprises:

Obtaining medical record information in a voice form; performing voice recognition on the medical record information to obtain the original medical record text;

or,

Obtaining medical record information in the form of an image; performing image recognition on the medical record information to obtain the original medical record text.
A processing device for medical record information, comprising:

An acquisition unit for obtaining the original medical text;

a dividing unit, configured to divide the original medical text into at least one target text unit;

a first determining unit, configured to determine a target category corresponding to the text feature of the target text unit;

and a generating unit, configured to generate the target medical text, wherein the target text unit is embodied as text information under the target category in the target medical text.
An apparatus, comprising: a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by one or more processors, the one or more programs containing instructions for performing the following operations:

Obtaining the original medical text and dividing the original medical text into at least one target text unit;

Determining a target category corresponding to a text feature of the target text unit;

Generating a target medical text, wherein the target text unit in the target medical text is embodied as text information under the target category.