WO2018201772A1

WO2018201772A1 - Method and system for inferring potential disease from medical text, and readable storage medium

Info

Publication number: WO2018201772A1
Application number: PCT/CN2018/076149
Authority: WO
Inventors: 赵清源; 韦邕; 吕梓燊; 徐亮; 肖京
Original assignee: 平安科技（深圳）有限公司
Priority date: 2017-05-05
Filing date: 2018-02-10
Publication date: 2018-11-08
Also published as: CN107680689A

Abstract

Provided are a method and system for inferring a potential disease from a medical text, and a readable storage medium. The method comprises: performing segmentation on a received medical text, performing matching between respective words corresponding to the medical text and a pre-determined medical-specific terminology base, and extracting a medical terminology from the respective words corresponding to the medical text (S10); determining, on the basis of a pre-established medical specialty database, a disease corresponding to the medical terminology in the medical text (S20); and outputting the determined disease as a potential disease inferred from the medical text (S30). The method enables accurate and highly efficient inference of a potential disease from a medical text.

Description

Potential disease inference method, system and readable storage medium for medical text

Priority claim

The present application claims priority to Chinese Patent Application No. CN2017103135201, entitled "Potential Disease Inferring Methods, Systems and Readable Storage Media for Medical Texts", filed on May 5, 2017, the entire content of which is incorporated herein by reference.

Technical field

The present application relates to the field of computer technology, and in particular, to a potential disease inference method, system, and readable storage medium for medical text.

Background technique

In general, the first step in processing a medical text is to infer potential diseases so that diagnostic recommendations can be made in the next step. In the prior art, potential diseases in a medical text can only be inferred manually, according to the doctor's personal experience. This is inefficient, and existing medical data resources cannot be used to infer potential diseases effectively.

Summary of the invention

The main purpose of the present application is to provide a potential disease inference method, system and readable storage medium for medical texts, which aim to accurately and efficiently infer potential diseases of medical texts.

In order to achieve the above object, a first aspect of the present application provides a method for inferring a potential disease of a medical text, the method comprising the following steps:

A. Segmenting the received medical text, and matching each word segment corresponding to the medical text against a predetermined medical field-specific vocabulary to extract the medical vocabulary from the word segments corresponding to the medical text;

B. determining, according to a pre-built medical professional database, a disease corresponding to the medical vocabulary in the medical text; wherein the medical professional database includes a mapping relationship between different types of diseases and medical vocabulary;

C. Outputting the determined disease as a presumed potential disease of the medical text.

In addition, in order to achieve the above object, the second aspect of the present application further provides a potential disease inference system for a medical text, where the potential disease inference system of the medical text includes:

a word segmentation module, configured to segment the received medical text, and match each word segment corresponding to the medical text with a predetermined medical field-specific vocabulary to extract a medical vocabulary in each word segment corresponding to the medical text;

a determining module, configured to determine a disease corresponding to the medical vocabulary in the medical text based on a pre-built medical professional database; wherein the medical professional database includes a mapping relationship between different types of diseases and medical vocabulary;

An output module for outputting the determined disease as an inferred potential disease of the medical text.

Further, in order to achieve the above object, a third aspect of the present application provides a computer readable storage medium storing a potential disease inference system of medical text, the potential disease inference system of the medical text being executable by at least one processor to cause the at least one processor to perform the steps of the potential disease inference method of the medical text as described above.

The potential disease inference method, system and readable storage medium for medical text proposed by the present application segment the received medical text and extract the medical vocabulary from the word segments corresponding to the medical text; then, based on a pre-built medical professional database containing mapping relationships between different diseases and medical vocabulary, they determine the disease corresponding to the medical vocabulary in the medical text as the inferred potential disease of the medical text. Because the mapping relationship between different diseases and medical vocabulary can be constructed from various medical data resources, and the mapped disease can be found from the medical vocabulary in the medical text, the approach is more efficient and more accurate than manual inference based on a doctor's personal experience.

Brief description of the drawings

FIG. 1 is a schematic flow chart of a first embodiment of a method for inferring a potential disease of a medical text according to the present application;

FIG. 2 is a schematic flow chart of a second embodiment of a method for inferring a potential disease of a medical text according to the present application;

FIG. 3 is a schematic diagram of an operating environment of a preferred embodiment of the potential disease inference system 10 of the medical text of the present application;

FIG. 4 is a schematic diagram of functional modules of a first embodiment of a potential disease inference system for medical text of the present application;

FIG. 5 is a schematic diagram of functional modules of a second embodiment of a potential disease inference system for medical text of the present application.

The implementation, functional features and advantages of the present application will be further described with reference to the accompanying drawings.

Detailed description

In order to make the technical problems to be solved, the technical solutions and the beneficial effects of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely illustrate the application and are not intended to limit it.

The present application provides a method for inferring a potential disease of a medical text.

Referring to FIG. 1, FIG. 1 is a schematic flow chart of an embodiment of a method for estimating a potential disease of a medical text according to the present application.

In an embodiment, the potential disease inference method of the medical text includes:

Step S10: Segment the received medical text, match each word segment corresponding to the medical text against a predetermined medical field-specific vocabulary, and extract the medical vocabulary from the word segments corresponding to the medical text.

The medical text to be diagnosed is received, for example medical text sent by a user through a browser, a client app, or the like. In this embodiment, after the medical text is received, it is first subjected to word segmentation. For example, the medical text can be split into complete sentences at punctuation marks, and word segmentation is then performed on each sentence. A string-matching segmentation method can be used for each sentence, such as: the forward maximum matching method, which segments the string in a sentence from left to right; the reverse maximum matching method, which segments the string from right to left; the shortest-path segmentation method, which requires the number of words cut from the string to be minimal; or the bidirectional maximum matching method, which performs segmentation matching in both the forward and reverse directions at the same time. A semantic segmentation method can also be used, which is a machine-judgment segmentation method that uses syntactic and semantic information to resolve ambiguity. A statistical segmentation method can also be used: phrase statistics are gathered from the historical search records of the current user or of users in general, and if two adjacent words are found to occur together frequently, they can be treated as a phrase during segmentation.
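As a minimal sketch of the sentence splitting and forward maximum matching described above, the following Python example may help. The function names, the punctuation set, and the toy lexicon are assumptions made for illustration; the application does not specify an implementation.

```python
import re

def split_sentences(text):
    """Split text into sentences on common Chinese and Western punctuation."""
    parts = re.split(r"[。！？；，,.;!?]", text)
    return [p.strip() for p in parts if p.strip()]

def forward_max_match(sentence, lexicon, max_len=None):
    """Scan left to right, always taking the longest lexicon entry that fits."""
    if max_len is None:
        max_len = max((len(w) for w in lexicon), default=1)
    tokens, i = [], 0
    while i < len(sentence):
        # Try the longest candidate first and shrink until a lexicon hit;
        # fall back to a single character when nothing matches.
        for size in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens

# Hypothetical toy lexicon: patient, cough, fever, headache.
lexicon = {"患者", "咳嗽", "发热", "头痛"}
text = "患者咳嗽发热，伴有头痛。"
segmented = [forward_max_match(s, lexicon) for s in split_sentences(text)]
print(segmented)
```

Reverse maximum matching would follow the same pattern while scanning from the end of the sentence toward its start.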

After word segmentation of the medical text is complete, each word segment corresponding to the medical text is matched against the predetermined medical field-specific vocabulary. The predetermined medical field-specific vocabulary may include medical terms from a general medical dictionary, as well as medical terms found in the profile information, symptom information, complication information, therapeutic drug information, or treatment department information of various diseases, extracted from a large number of medical texts (such as open source medical data on the Internet). The medical field-specific vocabulary can be fixed, or its medical terms can be updated regularly from the latest open source medical data on the Internet. The word segments that match the predetermined medical field-specific vocabulary are extracted; these extracted terms are the medical vocabulary in the medical text that is related to the potential disease.
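The matching step above can be sketched as a simple membership test against the vocabulary. This is only an illustration: `extract_medical_terms`, the sample tokens, and the sample vocabulary are hypothetical and not taken from the application.

```python
def extract_medical_terms(tokens, medical_lexicon):
    """Keep tokens found in the lexicon, preserving order and dropping repeats."""
    seen, terms = set(), []
    for token in tokens:
        if token in medical_lexicon and token not in seen:
            seen.add(token)
            terms.append(token)
    return terms

# Hypothetical vocabulary and word segments, for illustration only.
medical_lexicon = {"cough", "fever", "amoxicillin"}
tokens = ["patient", "reports", "cough", "and", "fever", "cough"]
terms = extract_medical_terms(tokens, medical_lexicon)
print(terms)

# A periodic vocabulary update can be modeled as a set union.
medical_lexicon |= {"headache"}
```

Storing the vocabulary as a set keeps each lookup at constant time on average, which matters when the vocabulary is large.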

Step S20: Determine a disease corresponding to the medical vocabulary in the medical text based on the pre-built medical professional database; wherein the medical professional database includes a mapping relationship between different types of diseases and medical vocabulary.

After the medical vocabulary related to the potential disease has been extracted from the word segments corresponding to the medical text, the disease corresponding to that medical vocabulary is determined based on the pre-built medical professional database. The medical professional database contains mapping relationships between different types of diseases and medical vocabulary (such as terms for symptoms, drugs, examinations, departments and other information extracted from a large number of medical texts). For example, a medical professional database can be built from open source data and texts, containing diseases and their corresponding profiles, symptoms, complications, treatments, common examinations and other professional information. Based on the constructed mapping relationship between different diseases and medical vocabulary, the disease mapped to the medical vocabulary in the medical text can be found.
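One way to picture the lookup in step S20 is an inverted index from medical term to diseases. The sketch below assumes the database is held as an in-memory dictionary; the entries are toy data chosen for illustration, not clinical guidance, and all names are hypothetical.

```python
from collections import defaultdict

# Hypothetical disease-to-vocabulary mapping standing in for the database.
DISEASE_TO_TERMS = {
    "influenza": {"fever", "cough", "oseltamivir"},
    "pneumonia": {"fever", "cough", "chest pain"},
    "migraine": {"headache", "nausea"},
}

def build_term_index(disease_to_terms):
    """Invert the mapping so each term points at the diseases it maps to."""
    index = defaultdict(set)
    for disease, terms in disease_to_terms.items():
        for term in terms:
            index[term].add(disease)
    return index

def find_diseases(terms, term_index):
    """Collect every disease mapped to at least one extracted term."""
    found = set()
    for term in terms:
        found |= term_index.get(term, set())
    return found

term_index = build_term_index(DISEASE_TO_TERMS)
candidates = find_diseases(["chest pain", "headache"], term_index)
print(sorted(candidates))
```

Because one term can map to several diseases, the lookup returns a set of candidates rather than a single answer.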

Step S30: Output the determined disease as the inferred potential disease of the medical text.

After the corresponding disease has been determined from the medical vocabulary extracted from the medical text, the determined disease can be output as the inferred potential disease of the medical text, so that subsequent diagnostic recommendations can be made based on it. According to statistics on potential disease inference for medical texts in practical application, the disease label accuracy obtained by the method of this embodiment (meaning no obvious error on manual inspection) can reach about 85%, which effectively improves the accuracy of potential disease inference for medical texts.

In this embodiment, the received medical text is segmented and the medical vocabulary is extracted from the word segments corresponding to the medical text; then, based on a pre-built medical professional database containing mapping relationships between different diseases and medical vocabulary, the disease corresponding to the medical vocabulary in the medical text is determined as the inferred potential disease of the medical text. Because the mapping relationship between different diseases and medical vocabulary can be constructed from various medical data resources, and the mapped disease can be found from the medical vocabulary in the medical text, the method is more efficient and more accurate than manual inference based on a doctor's personal experience.

As shown in FIG. 2, the second embodiment of the present application provides a method for inferring a potential disease of a medical text. On the basis of the foregoing embodiment, before the step S10, the method further includes:

Step S40: Obtain medical data from a predetermined data source, find one or more medical vocabularies corresponding to each disease from the medical data, and establish a medical professional database according to a mapping relationship between different types of diseases and medical vocabulary.

In this embodiment, before potential disease inference is performed on the medical text, medical data is first acquired from the predetermined data source, so that a medical professional database can be established according to the mapping relationship between different types of diseases and medical vocabulary in the medical data. The medical data may be authoritative descriptions of various diseases obtained from an existing medical database, including the corresponding profiles, symptoms, complications, therapeutic drugs, common examinations and similar information, or medical information corresponding to various drugs, such as the types of disease a drug treats. The medical data may also be specific types of information (for example, treatment plans, therapeutic drugs, departments, and clinical manifestations for different diseases) obtained in real time or at regular intervals, using tools such as web crawlers, from open source medical data sources on the Internet (for example, questions and answers about different diseases in various forums, or the latest medical cases and medical question-and-answer texts). One or more medical terms corresponding to each disease are found in the obtained medical data, and the medical professional database is established according to the mapping relationship between each disease and its one or more medical terms, for subsequent inference of potential diseases.
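Step S40 can be sketched as folding acquired records into a disease-to-vocabulary mapping. The record layout, field names, and `build_database` function below are assumptions for illustration; the application does not define a schema, and data acquisition (for example by a crawler) is outside this sketch.

```python
# Hypothetical field names for the kinds of information the text lists.
DEFAULT_FIELDS = ("symptoms", "complications", "drugs", "exams", "departments")

def build_database(records, fields=DEFAULT_FIELDS):
    """Merge all records into one set of medical terms per disease."""
    database = {}
    for record in records:
        terms = database.setdefault(record["disease"], set())
        for field in fields:
            # Records from different sources may omit some fields.
            terms.update(record.get(field, []))
    return database

# Toy records standing in for data from an existing database or the Internet.
records = [
    {"disease": "influenza", "symptoms": ["fever", "cough"], "drugs": ["oseltamivir"]},
    {"disease": "influenza", "departments": ["respiratory medicine"]},
    {"disease": "migraine", "symptoms": ["headache"]},
]
database = build_database(records)
print({disease: sorted(terms) for disease, terms in database.items()})
```

Merging by disease name lets records from several sources accumulate into one entry, which matches the idea of refreshing the database as new data arrives.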

Further, in other embodiments, the medical professional database further includes the weight of each medical vocabulary corresponding to the disease, and the step S20 may include:

Based on the pre-built medical professional database, finding each disease corresponding to the medical vocabulary in the medical text, calculating the sum of the weights of the medical vocabulary corresponding to each disease, and selecting the disease with the highest weight sum as the determined disease corresponding to the medical text.

In this embodiment, one disease may correspond to one or more medical terms, and one medical term may correspond to one or more diseases. For example, the same symptom may map to multiple diseases, and the same type of drug may treat a variety of diseases. Therefore, in the constructed medical professional database, different medical terms are given different weights. When the medical vocabulary in the medical text is found to correspond to multiple diseases in the constructed database, the sum of the weights of the medical vocabulary corresponding to each disease can be calculated, and the disease with the highest weight sum is selected as the determined disease corresponding to the medical text. For example, the sum of the weights mapped to a disease can be taken as the confidence of that disease, and the disease with the highest confidence is selected as the final result, which further improves the accuracy of potential disease inference for the medical text.
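The weighted selection can be sketched as follows. The weight values and all names here are hypothetical; the application does not state how weights are assigned, so the numbers only illustrate the summation and the choice of the highest total.

```python
from collections import defaultdict

# Hypothetical per-disease term weights, for illustration only.
WEIGHTS = {
    "influenza": {"fever": 0.3, "cough": 0.3, "oseltamivir": 0.4},
    "pneumonia": {"fever": 0.3, "cough": 0.2, "chest pain": 0.5},
}

def score_diseases(terms, weights):
    """Sum, per disease, the weights of the extracted terms it maps to."""
    scores = defaultdict(float)
    for disease, term_weights in weights.items():
        for term in set(terms):  # count each distinct term once
            if term in term_weights:
                scores[disease] += term_weights[term]
    return dict(scores)

def infer_disease(terms, weights):
    """Return the disease with the highest weight sum, or None if no match."""
    scores = score_diseases(terms, weights)
    if not scores:
        return None
    return max(scores, key=scores.get)

extracted = ["fever", "cough", "chest pain"]
scores = score_diseases(extracted, WEIGHTS)
best = infer_disease(extracted, WEIGHTS)
print(scores, best)
```

In this sketch a tie is resolved in favor of the disease encountered first; a real system would need an explicit tie-breaking policy.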

Further, in other embodiments, the step of performing word segmentation processing on the received medical text in the above step S10 includes:

According to the forward maximum matching method, the character string to be processed in the medical text is matched against the predetermined medical field-specific vocabulary (for example, the medical field-specific vocabulary may be a general medical professional vocabulary, or may be a learnable, extensible medical vocabulary) to obtain the first matching result;

According to the reverse maximum matching method, the character string to be processed in the medical text is matched against the predetermined medical field-specific vocabulary (for example, the medical field-specific vocabulary may be a general medical professional vocabulary, or may be a learnable, extensible medical vocabulary) to obtain the second matching result. The first matching result includes a first quantity of first phrases and a third quantity of single words; the second matching result includes a second quantity of second phrases and a fourth quantity of single words.

If the first quantity is equal to the second quantity, and the third quantity is less than or equal to the fourth quantity, outputting the first matching result (including a phrase and a single word) corresponding to the medical text;

If the first quantity is equal to the second quantity, and the third quantity is greater than the fourth quantity, outputting the second matching result (including a phrase and a single word) corresponding to the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is greater than the second quantity, outputting the second matching result (including a phrase and a single word) corresponding to the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is less than the second quantity, outputting the first matching result (including a phrase and a single word) corresponding to the medical text.

In this embodiment, the bidirectional matching method is used to segment the medical text. Segmentation matching is performed in the forward and reverse directions at the same time in order to analyze the cohesion of the combined content in the character string to be processed. Under normal circumstances a phrase is more likely to carry the core information, that is, a phrase is more likely to be medical vocabulary in the medical text. Therefore, through simultaneous forward and reverse segmentation matching, the matching result with fewer single words and more phrases is found and used as the segmentation result of the medical text, which improves the accuracy of the segmentation so that the medical vocabulary in the medical text can be extracted more accurately.
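The four selection conditions listed above can be written out as a short sketch. It takes the forward and reverse matching results as already computed token lists. Treating a token of more than one character as a phrase and a one-character token as a single word is an assumption of this example, as are the function names.

```python
def count_phrases_and_singles(tokens):
    """Count multi-character tokens (phrases) and one-character tokens."""
    phrases = sum(1 for t in tokens if len(t) > 1)
    singles = sum(1 for t in tokens if len(t) == 1)
    return phrases, singles

def choose_segmentation(forward, reverse):
    """Apply the four conditions to pick the forward or reverse result."""
    first_qty, third_qty = count_phrases_and_singles(forward)
    second_qty, fourth_qty = count_phrases_and_singles(reverse)
    if first_qty == second_qty:
        # Equal phrase counts: prefer the result with fewer single words,
        # keeping the forward result when the counts are also equal.
        return forward if third_qty <= fourth_qty else reverse
    # Unequal phrase counts: as listed, the result with fewer phrases is output.
    return reverse if first_qty > second_qty else forward

# Hand-written token lists standing in for real matching results.
print(choose_segmentation(["ab", "c", "d"], ["a", "bcd"]))
print(choose_segmentation(["ab", "cd"], ["a", "bcd"]))
```

Keeping the selection rule separate from the matching functions makes it easy to test the rule on its own or to swap in a different matcher.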

The application further provides a potential disease inference system for medical text. Please refer to FIG. 3, which is a schematic diagram of an operating environment of a preferred embodiment of the underlying disease inference system 10 of the medical text of the present application.

In the present embodiment, the potential disease inference system 10 of the medical text is installed and runs in the electronic device 1. The electronic device 1 may include, but is not limited to, a memory 11, a processor 12, and a display 13. FIG. 3 shows only the electronic device 1 with components 11-13, but it should be understood that not all of the illustrated components are required, and more or fewer components may be implemented instead.

The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a hard disk or memory of the electronic device 1. In other embodiments the memory 11 may be an external storage device of the electronic device 1, such as a plug-in hard disk equipped on the electronic device 1, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc. Further, the memory 11 may include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 is used to store application software installed on the electronic device 1 and various types of data, such as the program code of the potential disease inference system 10 of the medical text. The memory 11 can also be used to temporarily store data that has been output or is about to be output.

The processor 12, in some embodiments, may be a central processing unit (CPU), a microprocessor or another data processing chip for running program code or processing data stored in the memory 11, for example executing the potential disease inference system 10 of the medical text.

The display 13 may in some embodiments be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch sensor or the like. The display 13 is used to display information processed in the electronic device 1 and to display a visual user interface, for example showing the medical vocabulary extracted from the medical text, the inferred potential disease of the medical text, and the like. The components 11-13 of the electronic device 1 communicate with one another via a system bus.

Please refer to FIG. 4, which is a functional block diagram of a first embodiment of the potential disease inference system 10 of the medical text of the present application. In this embodiment, the potential disease inference system 10 of the medical text may be divided into one or more modules, which are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to complete the present application. For example, in FIG. 4, the potential disease inference system 10 of the medical text may be divided into a word segmentation module 01, a determining module 02, and an output module 03. A module as referred to in this application is a series of computer program instructions capable of performing a particular function, and is more suitable than a program for describing the execution of the potential disease inference system 10 of the medical text in the electronic device 1. The functions of the word segmentation module 01, the determining module 02, and the output module 03 are described in detail below.

The word segmentation module 01 is configured to segment the received medical text, and match each word segment corresponding to the medical text against a predetermined medical field-specific vocabulary to extract the medical vocabulary from the word segments corresponding to the medical text;

After word segmentation of the medical text is complete, each word segment corresponding to the medical text is matched against the predetermined medical field-specific vocabulary. The predetermined medical field-specific vocabulary may include medical terms from a general medical dictionary, as well as medical terms found in the profile information, symptom information, complication information, therapeutic drug information, or treatment department information of various diseases, extracted from a large number of medical texts (such as open source medical data on the Internet). The medical field-specific vocabulary can be fixed, or its medical terms can be updated regularly from the latest open source medical data on the Internet. The word segments that match the predetermined medical field-specific vocabulary are extracted; these extracted terms are the medical vocabulary in the medical text that is related to the potential disease.

The determining module 02 is configured to determine, according to a pre-built medical professional database, a disease corresponding to the medical vocabulary in the medical text; wherein the medical professional database includes a mapping relationship between different types of diseases and medical vocabulary;

The output module 03 is configured to output the determined disease as the inferred potential disease of the medical text.

As shown in FIG. 5, the second embodiment of the present application provides a potential disease inference system for a medical text. On the basis of the foregoing embodiment, the system further includes:

The establishing module 04 is configured to obtain medical data from a predetermined data source, find one or more medical terms corresponding to each disease from the medical data, and establish a medical professional database according to a mapping relationship between different types of diseases and medical vocabulary.

In this embodiment, before potential disease inference is performed on the medical text, medical data is first acquired from a predetermined data source, so that a medical professional database can be established according to the mapping relationship between different types of diseases and medical vocabulary in the medical data. The medical data may be authoritative descriptions of various diseases obtained from an existing medical database, including the corresponding profiles, symptoms, complications, therapeutic drugs, common examinations and similar information, or medical information corresponding to various drugs, such as the types of disease a drug treats. The medical data may also be specific types of information (for example, treatment plans, therapeutic drugs, departments, and clinical manifestations for different diseases) obtained in real time or at regular intervals, using tools such as web crawlers, from open source medical data sources on the Internet (for example, questions and answers about different diseases in various forums, or the latest medical cases and medical question-and-answer texts). One or more medical terms corresponding to each disease are found in the obtained medical data, and the medical professional database is established according to the mapping relationship between each disease and its one or more medical terms, for subsequent inference of potential diseases.

Further, in other embodiments, the medical professional database further includes the weight of each medical vocabulary corresponding to the disease, and the determining module 02 may further be used to:

Further, in other embodiments, the word segmentation module 01 is further configured to:

Moreover, the present application also provides a computer readable storage medium storing a potential disease inference system of medical text, the potential disease inference system of the medical text being executable by at least one processor such that the at least one processor performs the steps of the potential disease inference method of the medical text in the above embodiments. The specific implementation of steps S10, S20, S30, etc. of the method is as described above and is not repeated here.

It is to be understood that the term "comprises", "comprising", or any other variants thereof, is intended to encompass a non-exclusive inclusion, such that a process, method, article, or device comprising a series of elements includes those elements. It also includes other elements that are not explicitly listed, or elements that are inherent to such a process, method, article, or device. An element that is defined by the phrase "comprising a ..." does not exclude the presence of additional equivalent elements in the process, method, item, or device that comprises the element.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the foregoing embodiments can be implemented by means of software plus a necessary general hardware platform, and can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the methods described in the various embodiments of the present application.

The preferred embodiments of the present application have been described above with reference to the drawings, and are not intended to limit the scope of the application. The serial numbers of the embodiments of the present application are merely for the description, and do not represent the advantages and disadvantages of the embodiments. Additionally, although logical sequences are shown in the flowcharts, in some cases the steps shown or described may be performed in a different order than the ones described herein.

A person skilled in the art can implement the present application in various variants without departing from the scope and spirit of the present application. For example, the features as one embodiment can be used in another embodiment to obtain another embodiment. Any modifications, equivalent substitutions and improvements made within the technical concept of the application should be within the scope of the application.

Claims

A potential disease inference method for medical text, characterized in that the method comprises the following steps:

A. Segmenting the received medical text, and matching each word segment corresponding to the medical text against a predetermined medical field-specific vocabulary to extract the medical vocabulary from the word segments corresponding to the medical text;

B. determining, according to a pre-built medical professional database, a disease corresponding to the medical vocabulary in the medical text; wherein the medical professional database includes a mapping relationship between different types of diseases and medical vocabulary;

C. Outputting the determined disease as a presumed potential disease of the medical text.
The method for inferring a potential disease of a medical text according to claim 1, wherein before the step A, the method further comprises:

Obtaining medical data from a predetermined data source, finding one or more medical vocabularies corresponding to each disease from the medical data, and establishing a medical professional database according to a mapping relationship between different types of diseases and medical vocabulary.
The potential disease inference method of medical text according to claim 1, wherein the medical vocabulary comprises:

Medical vocabulary in the profile information, symptom information, complication information, therapeutic drug information, or treatment department information of a disease.
The potential disease inference method of the medical text according to claim 1, wherein the medical professional database further includes weights of respective medical vocabularies corresponding to the disease, and the step B includes:

Finding, based on the pre-built medical professional database, the diseases corresponding to the medical vocabulary in the medical text, calculating for each disease the sum of the weights of its corresponding medical vocabulary, and selecting the disease with the highest weight sum as the determined disease corresponding to the medical text.
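The weight-based selection in this claim can be sketched as follows. The layout of the weight table (term mapped to per-disease weights), the sample weights, and the choice to count each distinct term once are all assumptions for illustration; the claim does not fix any of them.

```python
from collections import defaultdict


def infer_disease(terms, term_weights):
    """Return the disease whose matched terms have the highest weight sum.

    `term_weights` is a hypothetical table: term -> {disease: weight}.
    Each distinct term is counted once (an assumption).
    Returns None when no extracted term appears in the table.
    """
    scores = defaultdict(float)
    for term in set(terms):
        for disease, weight in term_weights.get(term, {}).items():
            scores[disease] += weight
    if not scores:
        return None
    return max(scores, key=scores.get)


# Made-up weights, for illustration only.
sample_weights = {
    "cough": {"bronchitis": 0.5, "influenza": 0.25},
    "fever": {"influenza": 0.5},
    "wheezing": {"bronchitis": 0.25},
}
result = infer_disease(["cough", "fever"], sample_weights)
```

With these sample weights, "cough" and "fever" give influenza a sum of 0.75 against 0.5 for bronchitis, so influenza is selected.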
The potential disease inference method of the medical text according to claim 2, wherein the medical professional database further includes weights of respective medical vocabularies corresponding to the disease, and the step B includes:

Finding, based on the pre-built medical professional database, the diseases corresponding to the medical vocabulary in the medical text, calculating for each disease the sum of the weights of its corresponding medical vocabulary, and selecting the disease with the highest weight sum as the determined disease corresponding to the medical text.
The potential disease inference method of the medical text according to claim 3, wherein the medical professional database further includes weights of respective medical vocabularies corresponding to the disease, and the step B includes:

Finding, based on the pre-built medical professional database, the diseases corresponding to the medical vocabulary in the medical text, calculating for each disease the sum of the weights of its corresponding medical vocabulary, and selecting the disease with the highest weight sum as the determined disease corresponding to the medical text.
The method for inferring a potential disease of a medical text according to claim 1, wherein the step of performing word segmentation on the received medical text comprises:

Matching the medical text with a predetermined medical domain-specific vocabulary according to a forward maximum matching method to obtain a first matching result, where the first matching result includes a first quantity of first phrases and a third quantity of words;

Matching the medical text with a predetermined medical domain-specific vocabulary according to the inverse maximum matching method to obtain a second matching result, where the second matching result includes a second quantity of second phrases and a fourth quantity of words;

If the first quantity is equal to the second quantity, and the third quantity is less than or equal to the fourth quantity, the first matching result is used as a word segmentation result of the medical text;

If the first quantity is equal to the second quantity, and the third quantity is greater than the fourth quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is greater than the second quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is less than the second quantity, the first matching result is used as a word segmentation result of the medical text.
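The four selection rules above can be sketched as a bidirectional maximum matching routine. This is a sketch under stated assumptions, not the applicant's implementation: a "phrase" is taken to be a multi-character token found in the vocabulary, a "word" is taken to be a single-character token, and all function names are hypothetical.

```python
def forward_max_match(text, vocab, max_len):
    """Greedy longest-match scan from left to right."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when nothing longer matches.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def inverse_max_match(text, vocab, max_len):
    """Greedy longest-match scan from right to left."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.insert(0, piece)
                j -= size
                break
    return tokens


def count_phrases_and_words(tokens):
    """Return (multi-character phrase count, single-character word count)."""
    phrases = sum(1 for t in tokens if len(t) > 1)
    return phrases, len(tokens) - phrases


def segment(text, vocab):
    """Pick the forward or inverse result using the claimed rules."""
    max_len = max([len(w) for w in vocab] + [1])
    first_result = forward_max_match(text, vocab, max_len)
    second_result = inverse_max_match(text, vocab, max_len)
    first_qty, third_qty = count_phrases_and_words(first_result)
    second_qty, fourth_qty = count_phrases_and_words(second_result)
    if first_qty == second_qty:
        # Same phrase count: prefer the result with fewer single words.
        return first_result if third_qty <= fourth_qty else second_result
    # Different phrase counts: prefer the result with fewer phrases.
    return second_result if first_qty > second_qty else first_result


# Latin letters stand in for characters of the medical text.
example = segment("abcd", {"ab", "bcd"})
```

In this example the forward scan yields `ab / c / d` and the inverse scan yields `a / bcd`; both contain one phrase, and the inverse result has fewer single words, so it is chosen.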
The method for inferring a potential disease of a medical text according to claim 2, wherein the step of performing word segmentation on the received medical text comprises:

Matching the medical text with a predetermined medical domain-specific vocabulary according to a forward maximum matching method to obtain a first matching result, where the first matching result includes a first quantity of first phrases and a third quantity of words;

Matching the medical text with a predetermined medical domain-specific vocabulary according to the inverse maximum matching method to obtain a second matching result, where the second matching result includes a second quantity of second phrases and a fourth quantity of words;

If the first quantity is equal to the second quantity, and the third quantity is less than or equal to the fourth quantity, the first matching result is used as a word segmentation result of the medical text;

If the first quantity is equal to the second quantity, and the third quantity is greater than the fourth quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is greater than the second quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is less than the second quantity, the first matching result is used as a word segmentation result of the medical text.
The method for inferring a potential disease of a medical text according to claim 3, wherein the step of performing word segmentation processing on the received medical text comprises:

Matching the medical text with a predetermined medical domain-specific vocabulary according to a forward maximum matching method to obtain a first matching result, where the first matching result includes a first quantity of first phrases and a third quantity of words;

Matching the medical text with a predetermined medical domain-specific vocabulary according to the inverse maximum matching method to obtain a second matching result, where the second matching result includes a second quantity of second phrases and a fourth quantity of words;

If the first quantity is equal to the second quantity, and the third quantity is less than or equal to the fourth quantity, the first matching result is used as a word segmentation result of the medical text;

If the first quantity is equal to the second quantity, and the third quantity is greater than the fourth quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is greater than the second quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is less than the second quantity, the first matching result is used as a word segmentation result of the medical text.
An electronic device, comprising: a memory, a processor, and a potential disease inference system for medical text stored on the memory and operable on the processor, wherein the following steps are implemented when the potential disease inference system for medical text is executed by the processor:

Step 1: performing word segmentation on the received medical text, and matching each word segment corresponding to the medical text with a predetermined medical field-specific vocabulary to extract the medical vocabulary from the word segments corresponding to the medical text;

Step 2: determining, according to a pre-built medical professional database, a disease corresponding to the medical vocabulary in the medical text; wherein the medical professional database includes a mapping relationship between different types of diseases and medical vocabulary;

Step 3: Output the determined disease as the inferred potential disease of the medical text.
The electronic device according to claim 10, wherein prior to said step 1, said processor is further configured to execute said potential disease inference system of said medical text to implement the following steps:

Obtaining medical data from a predetermined data source, finding one or more medical vocabularies corresponding to each disease from the medical data, and establishing a medical professional database according to a mapping relationship between different types of diseases and medical vocabulary.
The electronic device of claim 10, wherein the medical vocabulary comprises:

Medical vocabulary found in disease information, symptom information, complication information, treatment drug information, or treatment department information.
The electronic device according to claim 10, wherein the medical professional database further comprises weights of respective medical vocabularies corresponding to the disease, and the step 2 comprises:

Finding, based on the pre-built medical professional database, the diseases corresponding to the medical vocabulary in the medical text, calculating for each disease the sum of the weights of its corresponding medical vocabulary, and selecting the disease with the highest weight sum as the determined disease corresponding to the medical text.
The electronic device according to claim 11, wherein the medical professional database further comprises weights of respective medical vocabularies corresponding to the disease, and the step 2 comprises:

Finding, based on the pre-built medical professional database, the diseases corresponding to the medical vocabulary in the medical text, calculating for each disease the sum of the weights of its corresponding medical vocabulary, and selecting the disease with the highest weight sum as the determined disease corresponding to the medical text.
The electronic device according to claim 12, wherein the medical professional database further comprises weights of respective medical vocabularies corresponding to the disease, and the step 2 comprises:

Finding, based on the pre-built medical professional database, the diseases corresponding to the medical vocabulary in the medical text, calculating for each disease the sum of the weights of its corresponding medical vocabulary, and selecting the disease with the highest weight sum as the determined disease corresponding to the medical text.
The electronic device according to claim 10, wherein the step of performing word segmentation on the received medical text comprises:

Matching the medical text with a predetermined medical domain-specific vocabulary according to a forward maximum matching method to obtain a first matching result, where the first matching result includes a first quantity of first phrases and a third quantity of words;

Matching the medical text with a predetermined medical domain-specific vocabulary according to the inverse maximum matching method to obtain a second matching result, where the second matching result includes a second quantity of second phrases and a fourth quantity of words;

If the first quantity is equal to the second quantity, and the third quantity is less than or equal to the fourth quantity, the first matching result is used as a word segmentation result of the medical text;

If the first quantity is equal to the second quantity, and the third quantity is greater than the fourth quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is greater than the second quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is less than the second quantity, the first matching result is used as a word segmentation result of the medical text.
The electronic device according to claim 11, wherein the step of performing word segmentation on the received medical text comprises:

Matching the medical text with a predetermined medical domain-specific vocabulary according to a forward maximum matching method to obtain a first matching result, where the first matching result includes a first quantity of first phrases and a third quantity of words;

Matching the medical text with a predetermined medical domain-specific vocabulary according to the inverse maximum matching method to obtain a second matching result, where the second matching result includes a second quantity of second phrases and a fourth quantity of words;

If the first quantity is equal to the second quantity, and the third quantity is less than or equal to the fourth quantity, the first matching result is used as a word segmentation result of the medical text;

If the first quantity is equal to the second quantity, and the third quantity is greater than the fourth quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is greater than the second quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is less than the second quantity, the first matching result is used as a word segmentation result of the medical text.
The electronic device according to claim 12, wherein the step of performing word segmentation on the received medical text comprises:

Matching the medical text with a predetermined medical domain-specific vocabulary according to a forward maximum matching method to obtain a first matching result, where the first matching result includes a first quantity of first phrases and a third quantity of words;

Matching the medical text with a predetermined medical domain-specific vocabulary according to the inverse maximum matching method to obtain a second matching result, where the second matching result includes a second quantity of second phrases and a fourth quantity of words;

If the first quantity is equal to the second quantity, and the third quantity is less than or equal to the fourth quantity, the first matching result is used as a word segmentation result of the medical text;

If the first quantity is equal to the second quantity, and the third quantity is greater than the fourth quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is greater than the second quantity, the second matching result is used as a word segmentation result of the medical text;

If the first quantity is not equal to the second quantity, and the first quantity is less than the second quantity, the first matching result is used as a word segmentation result of the medical text.
A computer readable storage medium storing a potential disease inference system for medical text, the system being executable by at least one processor to cause the at least one processor to perform the following steps:

Performing word segmentation on the received medical text, matching each word segment corresponding to the medical text with a predetermined medical field-specific vocabulary, and extracting the medical vocabulary from the word segments corresponding to the medical text;

Determining a disease corresponding to the medical vocabulary in the medical text based on the pre-built medical professional database; wherein the medical professional database includes a mapping relationship between different types of diseases and medical vocabulary;

The determined disease is output as an inferred potential disease of the medical text.
The computer readable storage medium according to claim 19, wherein, before performing word segmentation on the received medical text, matching each word segment corresponding to the medical text with a predetermined medical domain-specific vocabulary, and extracting the medical vocabulary from the word segments corresponding to the medical text, the at least one processor is further configured to execute the potential disease inference system for medical text to implement the following steps:

Obtaining medical data from a predetermined data source, finding one or more medical vocabularies corresponding to each disease from the medical data, and establishing a medical professional database according to a mapping relationship between different types of diseases and medical vocabulary.