CN114093506B

CN114093506B - System for assisting disease reasoning and storage medium

Info

Publication number: CN114093506B
Application number: CN202111400154.6A
Authority: CN
Inventors: 李景阳
Original assignee: Beijing Ouying Information Technology Co ltd
Current assignee: Beijing Allin Technology Co ltd
Priority date: 2021-11-19
Filing date: 2021-11-19
Publication date: 2022-12-23
Anticipated expiration: 2041-11-19
Also published as: CN114093506A

Abstract

The invention relates to a system for assisting disease reasoning, comprising: a storage unit configured to store a multigraph; an acquisition unit configured to acquire an initial evidence set; and a processing unit comprising an evidence expansion module, a diagnosis prior probability module, a maximum population probability module, a maximum clinical performance probability module, an objective function calculation module, an optimization result set partitioning module, a diagnosis set determination module and a final diagnosis set determination module. The invention also relates to a storage medium implementing the functionality of the system for assisting disease reasoning of the present application. The system can ease the shortage of medical resources, provides high accuracy, and allows information such as the user's symptoms to be organized in advance.

Description

System and storage medium for assisting disease reasoning

Technical Field

The invention relates to a system and a storage medium for assisting disease reasoning.

Background

Online healthcare is a new business form arising from the combination of the internet and the medical industry, and research on it is attracting growing attention among management scholars. From known data, such a system predicts, through rigorous and complex calculation, the names and probabilities of the diseases a user may have, and gives the user a set of evidence to be clarified further together with treatment suggestions, where the evidence includes, but is not limited to, examination names (CT, X-ray and the like), treatment departments, and information on how critical the disease is.

At present, most users who want to know about their condition register at a hospital and then, through repeated consultation with a doctor and examinations, learn which diseases they may have.

Direct registration and consultation at a hospital has at least the following disadvantages:

1. because existing medical resources are scarce, it is difficult to register and difficult to see a doctor;

2. because users cannot clearly understand or look up their own symptoms, they register with the wrong department, which wastes the valuable time of both the user and the doctor;

3. some systems also perform disease reasoning, but most use a decision tree as the core algorithm, and the disadvantages of decision trees are obvious: they ignore the correlation between attributes; when the numbers of samples per category are unequal, the information gain is biased towards features with more values; when there are too many categories, the error grows quickly; and so on. These drawbacks make the samples difficult to handle and the results unsatisfactory.

Therefore, there is a need for a system for assisting disease reasoning that can ease the shortage of medical resources while providing high accuracy, so that information on possible diseases can be obtained accurately in advance, without going to a hospital and communicating with a doctor directly.

Disclosure of Invention

According to one aspect of the invention, the application relates to a system for assisting disease reasoning, the system comprising: a storage unit configured to store a multigraph; an acquisition unit configured to acquire an initial evidence set including at least a population information set of a user and a symptom set of the user; and a processing unit comprising: an evidence expansion module configured to expand the symptom set acquired by the acquisition unit based on the multigraph stored in the storage unit, thereby obtaining an expanded symptom set, and to obtain an associated diagnosis set based on the expanded symptom set; a diagnosis prior probability module configured to calculate a logarithm of a sum of the diagnosis prior probabilities of the diagnoses in a diagnosis set; a maximum population probability module configured to calculate a logarithm of a sum of the maximum population probabilities of the population information in the user's population information set with respect to the diagnoses in the diagnosis set; a maximum clinical performance probability module configured to calculate a logarithm of a sum of the maximum clinical performance probabilities of the symptoms in the symptom set with respect to the diagnoses in the diagnosis set; an objective function calculation module configured to calculate an optimization result value of an objective function; an optimization result set partitioning module configured to obtain all subsets of the associated diagnosis set and to feed all of the subsets to the diagnosis prior probability module, the maximum population probability module, the maximum clinical performance probability module and the objective function calculation module to obtain the corresponding optimization result for each subset, thereby obtaining an optimization result set; a diagnosis set determination module configured to rank the optimization results in the optimization result set by result value; and a final diagnosis set determination module configured to output the final diagnosis set corresponding to the maximum result value, or to output one or more final diagnosis sets corresponding to the top-ranked result values.
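The interplay of the optimization result set partitioning module and the diagnosis set determination module can be illustrated with a minimal Python sketch. It assumes that only non-empty subsets are scored and that the objective function is available as a callable; the names `rank_diagnosis_subsets`, `objective`, `top_k` and the toy scores are hypothetical and not taken from the application.

```python
from itertools import combinations


def rank_diagnosis_subsets(associated_diagnoses, objective, top_k=3):
    """Score every non-empty subset of the associated diagnosis set and rank them.

    `objective` is a hypothetical callable standing in for the objective
    function calculation module; it maps a frozenset of diagnoses to a value.
    """
    items = sorted(associated_diagnoses)
    results = []
    for size in range(1, len(items) + 1):
        for subset in combinations(items, size):
            candidate = frozenset(subset)
            results.append((candidate, objective(candidate)))
    # Highest result value first, mirroring the ranking by result value.
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results[:top_k]


# Toy per-diagnosis scores, invented purely for illustration.
toy_scores = {"d1": -1.0, "d2": -2.5, "d3": -4.0}
ranking = rank_diagnosis_subsets(
    toy_scores, lambda subset: sum(toy_scores[d] for d in subset)
)
best_set, best_value = ranking[0]
```

Because a set of n diagnoses has 2^n - 1 non-empty subsets, exhaustive enumeration of this kind is only practical when the associated diagnosis set is small.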

Preferably, the system for assisting disease reasoning further comprises a human-machine interaction interface and a diagnosis interaction interface, wherein the human-machine interaction interface is configured to receive initial information input by the user, including the user's population information set and symptom set, in any form available to the user, and wherein the diagnosis interaction interface is configured to display the one or more final diagnosis sets to the user.

Preferably, the processing unit further comprises: a cold start clarification module configured, if the symptom set acquired by the acquisition unit does not include any valid symptom, to ask the user to add new symptoms until a valid symptom is obtained.

Preferably, the processing unit further comprises: a threshold comparison module configured to compare a maximum result value of the optimized result set with a predetermined threshold and to output a final diagnosis set corresponding to the maximum result value or output one or more final diagnosis sets corresponding to the top-ranked result values by the final diagnosis set determination module when it is determined that the maximum result value is greater than the predetermined threshold.

Preferably, the processing unit further comprises: a problem domain module configured, when the threshold comparison module has determined that no result value is greater than the predetermined threshold, to obtain a problem domain based on the multigraph, i.e., to obtain all symptoms associated with each diagnosis in the diagnosis set.

Preferably, the processing unit further comprises: a query clarification module configured to query the user, based on the obtained problem domain, about the most sensitive valid symptom of the diagnoses in the set or subset having the largest result value in the most recently obtained optimization result set, to add the newly obtained valid symptom to the most recently obtained symptom set to obtain a new symptom set, and to feed the new symptom set back to the evidence expansion module.

Preferably, the processing unit further comprises a counting module configured to count the number of query clarifications made by the query clarification module and, when that number exceeds 5, to deactivate the threshold comparison module and output one or more final diagnosis sets with the top-ranked result values.

Preferably, the system for assisting in disease reasoning comprises: a medical history interface configured to link with a hospital medical history system such that medical history information related to the user can be obtained from the hospital medical history system and inserted into the user's initial set of evidence as needed.

Preferably, the system for assisting in disease reasoning further comprises a physician-side interactive interface configured to present the evidence obtained from the user and the outputted set of one or more final diagnoses to a physician, wherein the physician-side interactive interface is capable of retrieving the stored set of symptoms and the set of demographic information from a storage unit of the system.
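The control flow formed by the threshold comparison module, the query clarification module and the counting module described above can be sketched as follows. This is only an illustration under stated assumptions: `infer` and `ask_user` are hypothetical callables standing in for the inference pipeline and the user dialogue, and the function name and return shape are not part of the application.

```python
def reason_with_clarification(symptoms, infer, ask_user, threshold, max_queries=5):
    """Repeat inference and query clarification until a result is confident enough.

    `infer` (symptom set -> list of (diagnosis_set, value), best first) and
    `ask_user` (diagnosis_set -> one new valid symptom) are hypothetical
    callables standing in for the modules described in the text.
    """
    symptoms = set(symptoms)
    queries = 0
    while True:
        ranked = infer(symptoms)
        best_set, best_value = ranked[0]
        # Counting module: once more than `max_queries` clarifications have
        # been made, the threshold comparison is bypassed and results are output.
        if best_value > threshold or queries > max_queries:
            return ranked, queries
        symptoms.add(ask_user(best_set))
        queries += 1


# Stub modules for demonstration: the result value never reaches the threshold.
asked = []


def stub_infer(current_symptoms):
    return [(frozenset({"d1"}), -10.0)]


def stub_ask(best_set):
    asked.append(best_set)
    return f"symptom-{len(asked)}"


ranked, queries = reason_with_clarification(
    {"s0"}, stub_infer, stub_ask, threshold=-1.0
)
```

With the stubs above the threshold is never exceeded, so the loop stops only because the clarification count has passed 5.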

According to another aspect of the invention, the application also relates to a storage medium storing instructions which, when executed, implement the process steps to be performed by the above-described system for assisting disease reasoning.

The system for assisting disease reasoning according to the present invention can ease the shortage of medical resources while providing high accuracy, so that information on possible diseases can be obtained accurately in advance without going to a hospital and communicating with a doctor directly. In addition, when the user does go to a hospital, the information collected and inferred by the system can help the doctor learn about the user's individual circumstances, such as the symptom set and the population information set, so that the doctor has more accurate information for the actual diagnosis. Organizing information on the user's physical condition in advance in this way significantly reduces the time the doctor needs to spend questioning the user.

Drawings

Other significant features and advantages of the invention result from the following non-limiting description provided for illustrative purposes with reference to the following drawings, in which:

FIG. 1 shows a flow diagram of a flow performed by a system for assisting disease inference according to an embodiment of the invention;

FIG. 2 shows a flow diagram of a flow performed by a system for assisting disease inference according to another embodiment of the invention;

FIG. 3 shows a block diagram of a system for assisting disease inference, according to an embodiment of the invention; and

FIG. 4 shows a general block diagram of a system for assisting disease inference according to an embodiment of the present invention.

Detailed Description

The embodiments of the application relate to a computer-implemented technical solution.

The process performed by the system for assisting disease reasoning of the present application deduces possible "diagnoses" (also referred to herein as "diseases") from known "evidence" (which includes symptom evidence and population-characteristic evidence), and further optimizes the relevant "diagnoses" based on further "clarification" by the user. It should be understood that, in the context of this document, a "diagnosis" means "the possibility of having a certain disease" and cannot replace a full medical diagnosis by a doctor; a "diagnosis" in the present application thus helps users understand their own physical condition and facilitates information collection for a subsequent diagnosis by a doctor.

In the system for assisting disease reasoning according to the present application, the following known information needs to be utilized and may be contained in a multigraph, which is described and defined in detail below:

"disease-symptom" association information, i.e., the association between symptoms (the set of which is defined as $M_0 = \{m_i\}$, where $m$ denotes a symptom element, the subscript $i$ denotes the $i$-th symptom, and $i$ is a positive integer) and diseases (or diagnoses) (the set of which is defined as $D_0 = \{d_j\}$, where $d$ denotes a disease element, the subscript $j$ denotes the $j$-th disease, and $j$ is a positive integer), together with the "sensitivity" and the "specificity", which respectively denote the conditional probability $P(m_i \mid d_j)$ of the symptom given the disease and the conditional probability $P(d_j \mid m_i)$ of the disease given the symptom; and

"disease-demographics" association information, i.e., the demographic characteristics of a disease, which describe the relationship between the disease and population characteristics (the set of which is defined as $G_0 = \{g_k\}$, where $g$ denotes a population-characteristic element, the subscript $k$ denotes the $k$-th population characteristic, and $k$ is a positive integer). These relationships include:

(a) the incidence of the disease in the general population, i.e., the prior probability of the disease, $P(d_j)$;

(b) the incidence of the disease in a given population, $P(d_j \mid g_k)$, or the population distribution of patients, $P(g_k \mid d_j)$;

(c) the prior probability of a population characteristic, $P(g_k)$, which may be obtained from information sources such as the national statistics bureau.

"disease-disease" association information, i.e., the relationship between diseases. As is known, there are certain probabilistic relationships between diseases, such as a comorbidity relationship (no causal relationship, but caused by common factors) or a complication relationship (causal relationship). For two diseases $d_1$, $d_2$ in $D_0$ that have a probabilistic relationship, a probabilistic relationship $P(d_2 \mid d_1)$ between them can be set, which means that $d_2$ is secondary to $d_1$ with probability $P(d_2 \mid d_1)$, i.e., given that the patient has disease $d_1$, the probability of also having disease $d_2$ is $P(d_2 \mid d_1)$. It is to be understood that, as used herein, the term "secondary" only means that a certain probabilistic relationship exists between $d_1$ and $d_2$ and does not mean that a direct causal relationship exists; when the multigraph is constructed, causality need not be considered, only the relevant probabilistic relationship.

"symptom-symptom" association information. Between the various symptoms in the symptom set there may also be probabilistic relationships, which reflect non-independence between the symptoms and certain secondary-manifestation relationships. For two symptoms $m_1$, $m_2$ in $M_0$ that have a probabilistic relationship, $P(m_2 \mid m_1) > 0$ can be set, which means that, given that the patient exhibits symptom $m_1$, the probability of also exhibiting symptom $m_2$ is $P(m_2 \mid m_1)$.

A Bayesian network is a suitable way of expressing the relationships between symptoms or population characteristics and the corresponding diseases. The nodes of the Bayesian network represent random variables (events), which within the scope of the invention are evidence (symptoms $m_i$ or population characteristics $g_k$) or diagnoses (the associated diseases $d_j$), and the edges between nodes represent associations between the nodes. It should be noted that, in the context of inferential diagnosis of disease, it is difficult to establish a complete Bayesian network because the correlation between certain symptoms, such as headache and back pain, is unknown.

Although a Bayesian network describes events and the associations between them, in the context of disease reasoning human-computer interaction is usually conveyed through concepts, and an event is often composed of several concepts; e.g., the symptom "lumbar pain" is composed of the two concepts "lumbar" and "pain". A knowledge graph describing concepts and the relationships between them is therefore also needed to enable efficient interaction. Concepts may be related in various ways; in the context of disease reasoning, for example, a composite relationship (composite-of) can be defined between concepts/events, for example:

composite relationship (1):

and

composite relationship (2):

Composite relationship (1) contains both "lumbar pain", which is an event in the Bayesian network, and "lumbar" and "pain", which are non-event concepts. "lumbar_{body part}" and "pain_{kind of symptom}" are therefore called the constituent elements of "lumbar pain_{symptom}".

Composite relationship (2), by contrast, is actually a logical combination of two events, which may be referred to as a "combined event".

It should be appreciated that in addition to composite relationships, there are also probabilistic relationships in a bayesian network between these events.

$P(\text{lumbar pain} \mid \text{lumbar pain and fever}) = 1$, $P(\text{fever} \mid \text{lumbar pain and fever}) = 1$

As previously defined, the probabilistic relationships in the Bayesian network also include at least the "disease-symptom", "disease-population characteristic", "disease-disease" and "symptom-symptom" probabilistic relationships.

Further, in a bayesian network, when the probability of a conceptual probability relationship is 1, it is equivalent to a deductive relationship, that is,

To represent the use of the Bayesian network and the knowledge graph in the present application more intuitively, a multigraph G = (V, R, I) is employed to represent the Bayesian network and the knowledge graph simultaneously, where:

- $V = \{v_k\}$ denotes the vertex set, which comprises evidence events such as symptoms and population characteristics, diagnosis events such as diseases, disease groups or syndromes, and non-event concepts such as body parts, symptom kinds and symptom qualifiers;

-R represents a set of relationships comprising:

(i) Probabilistic relationships between events, for example,

when $P(\text{numbness of lower limbs} \mid \text{lumbar disc herniation}) = 0.7$, the relationship is abbreviated as

Further, when no relationship type is specifically labeled between events, the relationship defaults to the probabilistic relationship between events, i.e., $\langle v_1 \rightarrow v_2 \rangle$;

(ii) Part-whole relationships (part-of) between concepts of the same kind, e.g.

(iii) The composite relationship (composite-of) between concepts, as described above;

(iv) The hypernym-hyponym relationship (kind-of) between concepts of the same kind, e.g.

When the concepts are all events, the hypernym-hyponym relationship also implies a deductive relationship, i.e.

Or

From the above description, the relationship set $R$ can be represented as the union of a conceptual relationship set $R_K$ and a probabilistic relationship set $R_P$, i.e., $R = R_K \cup R_P$.

- $I$ denotes the association function between concepts or events. For example, for two events $v_1$, $v_2$, if the relationship $r$ between them satisfies $r \in R_P$, the association of the two events with the relationship $r$ can be represented as $I(r) = (v_1, v_2)$, and the weight of the relationship $r$ is defined as the conditional probability $P(v_2 \mid v_1)$.
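The definition of the multigraph G = (V, R, I) can be made concrete with a small Python sketch. It is an illustrative data structure only: the class and method names are hypothetical, and the direction chosen for composite-of edges (from the composite event to its constituent concept) is an assumption of this sketch rather than something stated in the text.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relation:
    kind: str            # "prob", "part-of", "composite-of" or "kind-of"
    source: str          # v1
    target: str          # v2
    weight: float = 1.0  # for "prob" relations: P(target | source)


@dataclass
class MultiGraph:
    vertices: set = field(default_factory=set)
    relations: list = field(default_factory=list)

    def add(self, kind, source, target, weight=1.0):
        """Add a relation; parallel relations between the same pair are allowed."""
        relation = Relation(kind, source, target, weight)
        self.vertices.update((source, target))
        self.relations.append(relation)
        return relation

    def incidence(self, relation):
        """I(r) = (v1, v2)."""
        return (relation.source, relation.target)

    def prob(self, source, target):
        """Return P(target | source) if a probabilistic relation exists, else None."""
        for r in self.relations:
            if r.kind == "prob" and (r.source, r.target) == (source, target):
                return r.weight
        return None


g = MultiGraph()
# Value from the example in the text:
# P(numbness of lower limbs | lumbar disc herniation) = 0.7
r = g.add("prob", "lumbar disc herniation", "numbness of lower limbs", 0.7)
# Assumed edge direction: composite event -> constituent concept.
g.add("composite-of", "lumbar pain", "lumbar")
g.add("composite-of", "lumbar pain", "pain")
```

Storing relations in a list rather than a dictionary keyed by vertex pair keeps the structure a true multigraph, so a conceptual relationship and a probabilistic relationship can coexist between the same two vertices.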

In order to introduce as little redundant information as possible, the conceptual relationship set $R_K$ and the probabilistic relationship set $R_P$ can be used together to represent combined evidence equivalently, without recording logical operations in node attributes.

As described above, the AND relation $e_3 = e_1 \wedge e_2$ between pieces of evidence can be expressed as

Correspondingly, the OR relation $e_3 = e_1 \vee e_2$ between pieces of evidence can be expressed as

For the AND relation, when all child nodes are true, the derivation can be reversed, i.e., if

then

However, the derivation between pieces of evidence may be ambiguous; for example, if $e_1 = e_3 \vee e_4$ and $e_2 = e_3 \vee e_5$, it can easily be determined that

However, it is not limited to

Therefore, in order to avoid ambiguity, the conceptual relationship set $R_K$ needs to be introduced to make the distinction, i.e., when

it can be determined that $e_3 = e_1 \wedge e_2$, and it can further be determined that

To further illustrate the relationships between nodes in the multiple graph, some examples of node relationships within the scope of the present application are shown herein:

Example one: symptom + qualifier

where "lumbar pain" is composed of the following elements:

Example two: symptom combination (AND relationship)

Example three: symptom combination (OR relationship)

Example four: disease group

Alternatively or additionally, in a preferred embodiment of the present application, the relationship set $R$ of the multigraph may contain, besides the relationships disclosed above, some ambiguity probability relationships that account for the ambiguity of the user's description of symptoms and sites.

By way of example, these ambiguity probability relationships may include:

(1) Proximity relationship of sites: for example, when the user describes the neck, since the back of the neck, the top of the neck and the neck-shoulder region are all adjacent to the neck, a proximity probability relationship is set between these adjacent sites to cover errors in the user's description more fully, for example

(where $P$ merely denotes a probabilistic relationship and is not limited to any particular value; it is contemplated that the value of $P$ correlates with the anatomical proximity of the sites, e.g., the closer the sites, the greater the value of $P$), e.g., $P(\text{neck} \mid \text{shoulder}) = 0.2$;

(2) Similarity relationship of symptoms: for example, when the user describes the symptom "soreness", since "soreness" and "distending pain" are similar to a certain degree, a similarity probability relationship is set between such similar symptoms to cover the possible diagnoses more fully, such as

for example, $P(\text{distending pain} \mid \text{soreness}) = 0.3$;

(3) Inclusion relationship of sites: although the part-whole relationship (part-of) between concepts of the same kind has been described above, the inclusion relationship between sites may additionally include a probabilistic relationship between sites (i.e., the probability of a more detailed site relative to a broader site). For example, the broader site "upper limb" includes the more detailed sites shoulder, upper arm, elbow, forearm, wrist and hand; in addition to the

part-of relationships, there may also be

probabilistic relationships, e.g., $P(\text{shoulder} \mid \text{upper limb}) = 0.2$;

(4) Inclusion relationship of symptoms: although the hypernym-hyponym relationship (kind-of) between concepts of the same kind has been described above, the inclusion relationship between symptoms may additionally include a probabilistic relationship between symptoms (i.e., the probability of a more narrowly qualified symptom relative to a broader symptom). For example, between the more narrowly qualified symptom "severe lumbar pain" and the broader symptom "lumbar pain" there is, in addition to the

hypernym-hyponym relationship, also the

probabilistic relationship, for example, $P(\text{severe lumbar pain} \mid \text{lumbar pain}) = 0.5$; as a further example, between the more narrowly qualified symptom "pain aggravated after exertion" and the broader symptom "pain" there is, in addition to the

hypernym-hyponym relationship, also the

probabilistic relationship, for example, $P(\text{pain aggravated after exertion} \mid \text{pain}) = 0.1$, etc.

Those skilled in the art should understand that, although some specific values are given above for the probabilistic relationships, these values are only examples, apart from the probabilities 1 and 0, which determine a relationship definitively. They are not meant to be applied directly when constructing the multigraph; rather, those skilled in the art can calculate and optimize actual effective values from currently known medical and statistical data and apply them to the probabilistic relationships of the multigraph, so that subsequent inference calculations are more accurate.

It should be understood by those of ordinary skill in the art that the above description of the relationship between nodes in the multi-graph is merely exemplary and should not be construed as limiting the relationship of all nodes in the multi-graph of the present application.

Based on the above, given a user's population information set $G$ and symptom set $M$, the set of diagnoses (diseases) $D$ that the user may have can be inferred and expressed as:

$$\hat{D} = \arg\max_{D} P(D \mid MG)$$

where $P(D \mid MG)$ represents the probability of having $D$ given the union of $M$ and $G$ in the Bayesian network. Evidently, the diagnosis set $D$ with the highest probability can be understood as the set of diseases the user is most likely to have given the information currently available.

According to Bayes' formula, $P(D \mid MG)$ can be rewritten as:

$$P(D \mid MG) = \frac{P(MG \mid D)\,P(D)}{P(MG)}$$

Considering that, for a given user, $M$ and $G$ are fixed and the direction of disease inference is $G \rightarrow D \rightarrow M$, the determination of the diagnosis set $D$ can be rewritten as:

$$\hat{D} = \arg\max_{D} P(D)\,P(G \mid D)\,P(M \mid D)$$

On this basis, the above reasoning process can be understood as finding the maximum of the following objective function:

$$f(D) = \log P(D) + \log P(G \mid D) + \log P(M \mid D) \qquad (2)$$

If the diagnoses $d$ in the diagnosis set $D$, the items of population information $g$ in the population information set $G$ and the symptoms $m$ in the symptom set $M$ are mutually independent in the probabilistic sense, the optimization result of equation (2) can be expressed as:

where the terms of formula (3) are as follows: term a represents the logarithm of the sum of the disease prior probabilities of each disease $d$ in a given diagnosis set $D$; term b represents the logarithm of the sum, over each item of population information $g$ in the user's population information set $G$, of its maximum population probability with respect to a diagnosis $d$ in the diagnosis set $D$; and term c represents the logarithm of the sum, over each symptom $m$ in the user's symptom set $M$, of its maximum clinical performance probability with respect to a diagnosis $d$ in the diagnosis set $D$.
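The independence-based objective can be illustrated with a short Python sketch. Since formula (3) itself is not reproduced here, the sketch adopts one reading as an explicit assumption: under independence, $\log P(D)$ is the logarithm of a product and therefore decomposes into a sum of per-diagnosis logarithms, and terms b and c take, for each item of evidence, the maximum probability over the diagnoses in $D$. The function name, the table layout and the `floor` parameter are hypothetical.

```python
import math


def naive_objective(diagnoses, population, symptoms, prior, p_g_given_d,
                    p_m_given_d, floor=1e-6):
    """Independence-based objective f(D) for a non-empty diagnosis set.

    prior[d] = P(d); p_g_given_d[(g, d)] = P(g | d); p_m_given_d[(m, d)] = P(m | d).
    `floor` is an assumed small probability for pairs missing from the tables.
    """
    # Term a: log prior of the diagnosis set (log of a product of priors).
    term_a = sum(math.log(prior[d]) for d in diagnoses)
    # Term b: for each population item, the best-matching diagnosis in D.
    term_b = sum(
        math.log(max(p_g_given_d.get((g, d), floor) for d in diagnoses))
        for g in population
    )
    # Term c: for each symptom, the best-matching diagnosis in D.
    term_c = sum(
        math.log(max(p_m_given_d.get((m, d), floor) for d in diagnoses))
        for m in symptoms
    )
    return term_a + term_b + term_c


# Illustrative numbers only, not taken from the source.
value = naive_objective(
    diagnoses={"d1"},
    population={"male"},
    symptoms={"lumbar pain"},
    prior={"d1": 0.1},
    p_g_given_d={("male", "d1"): 0.5},
    p_m_given_d={("lumbar pain", "d1"): 0.7},
)
```

For the single-diagnosis example the value equals $\log(0.1 \times 0.5 \times 0.7)$, which makes the sketch easy to check by hand.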

However, as described above with respect to the multigraph, diseases, symptoms and population information are not completely independent of one another, so the above optimization result needs to be corrected.

Fig. 1 shows a flow chart of a procedure performed by a system for assisting disease inference according to an embodiment of the present invention. As shown in fig. 1, the flow executed by the system for assisting disease inference described in the present invention includes at least the steps described in detail below.

At step S100, an initial set of evidence is acquired.

The initial evidence set includes at least basic user information and user symptoms. The basic user information includes at least a set $G$ of basic population information such as the user's gender and age. The user symptoms include at least the initial chief-complaint symptoms $M$.

At step S102, the user's initial chief-complaint symptoms $M$ are expanded based on the multigraph to obtain an expanded symptom set M1 and an associated diagnosis set $D$ based on the expanded symptom set M1.

In this step, the user's symptoms, including the initial chief-complaint symptoms, are expanded based on the multigraph. Since there may be composite-of relationships between symptoms, such as:

and

Therefore, when the combined symptom "lumbar pain and fever" is obtained, it can be deduced from the multigraph that the two symptoms "lumbar pain" and "fever" are both present, so expanding the user's symptoms prevents loss of evidence information and inaccurate reasoning results. After the user's symptoms have been expanded based on the multigraph, the initial chief-complaint symptoms are updated to the expanded symptom set. Further, the set of diagnoses associated with the expanded symptom set can be obtained from the multigraph. For example, when the symptom "lumbar pain and fever" is expanded to "lumbar pain" and "fever", the set of diagnoses (diseases) associated with lumbar pain and the set of diagnoses (diseases) associated with fever can each be obtained from the multigraph. Expanding the user's symptom "lumbar pain and fever" therefore actually yields the union of these two sets, so that all diseases associated with that symptom are considered without omission.
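The expansion of a combined symptom and the collection of the associated diagnosis set can be sketched in Python as below. The lookup tables are hypothetical stand-ins for queries against the multigraph, and the diagnosis names `d_a`, `d_b`, `d_c` are placeholders rather than medical associations.

```python
def expand_symptoms(symptoms, components):
    """Expand combined symptoms into their parts via composite-of relations.

    `components` is a hypothetical lookup: combined symptom -> list of parts.
    """
    expanded = set(symptoms)
    stack = list(symptoms)
    while stack:
        current = stack.pop()
        for part in components.get(current, ()):
            if part not in expanded:
                expanded.add(part)
                stack.append(part)
    return expanded


def associated_diagnoses(expanded, diagnoses_by_symptom):
    """Union of the diagnosis sets linked to each expanded symptom."""
    result = set()
    for symptom in expanded:
        result |= diagnoses_by_symptom.get(symptom, set())
    return result


components = {"lumbar pain and fever": ["lumbar pain", "fever"]}
# Placeholder diagnosis names; real associations would come from the multigraph.
diagnoses_by_symptom = {"lumbar pain": {"d_a", "d_b"}, "fever": {"d_b", "d_c"}}
expanded = expand_symptoms({"lumbar pain and fever"}, components)
diagnoses = associated_diagnoses(expanded, diagnoses_by_symptom)
```

The expansion is written as a worklist so that nested combinations (a combined symptom whose parts are themselves combined) are also resolved.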

Furthermore, in this step, because the user's description of symptoms and sites may be ambiguous, the sites and symptoms in the user's initial chief-complaint symptoms can be expanded further using the ambiguity probability relationships, so that the full range of diagnoses that may be involved is obtained from the user's symptoms and sites.

Note that if the expanded evidence set contains two pieces of evidence such that

then $e_2$ is removed from the expanded evidence set. The reason is that, as a description of an event, $e_1$ necessarily contains more information than $e_2$ (composite relationship), and thus, for any diagnosis $d_j$, if both

exist, then $p_1$ necessarily expresses the probabilistic relationship between the evidence and the diagnosis more accurately than $p_2$ (note that $p_1$ is not necessarily greater than $p_2$), e.g.

Diagnosis using the former is necessarily more accurate than using the latter.

At step S104, the logarithm of the sum of the diagnosis prior probabilities of the diagnoses $d$ in the diagnosis set $D$ is calculated.

For the associated diagnosis set $D$ obtained in step S102, as described above, there may be a relationship between disease $d_1$ and disease $d_2$, i.e., a certain probabilistic relationship between diseases, such as a comorbidity relationship (no causal relationship, but caused by a common factor) or a complication relationship (causal relationship). For the diseases $d_1$ and $d_2$ in $D = \{d_1, d_2\}$, if $d_1$ and $d_2$ are independent of each other (i.e., there is no additional probabilistic relationship between them), the probability that the user has the diagnosis set $D$ is $P(D) = P(d_1) \times P(d_2)$ (i.e., the probability that the user has both diseases at the same time). However, if $d_2$ is secondary to $d_1$ with probability $P(d_2 \mid d_1)$, the probability that the user has the diagnosis set $D$ is corrected to $P(D) = P(d_1) \times P(d_2 \mid d_1)$. In other words, the log-likelihood gain that the relationship $d_1 \rightarrow d_2$ brings to the diagnosis set $D$ can be expressed as:

This is equivalent to replacing the probability $P(d_2)$ of node $d_2$ with the probability $P(d_2 \mid d_1)$ of the relationship $d_1 \rightarrow d_2$. It can thus be understood that, if for each diagnosis $d_t$ only the relationship $d_i \rightarrow d_t$ with the maximum probability is retained in the calculation ($i$ and $t$ are subscripts of the disease elements in the diagnosis set $D$, as is conventional in the art), the most likely set of diseases can be determined. However, when calculating the prior probability $P(D)$, such a retention (or replacement) first requires that there be no loop between the diseases (i.e., that a partial order exists).
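As a worked step, the gain follows directly from the two expressions for $P(D)$ given above for $D = \{d_1, d_2\}$. The numbers in the second line are illustrative assumptions, not values from the source, and natural logarithms are assumed:

```latex
\begin{aligned}
\Delta_{d_1 \rightarrow d_2}
  &= \log\bigl[P(d_1)\,P(d_2 \mid d_1)\bigr] - \log\bigl[P(d_1)\,P(d_2)\bigr]
   = \log P(d_2 \mid d_1) - \log P(d_2) \\
  &= \log 0.5 - \log 0.05 = \log 10 \approx 2.30
   \quad \text{for } P(d_2 \mid d_1) = 0.5,\ P(d_2) = 0.05
\end{aligned}
```

The gain is positive exactly when $P(d_2 \mid d_1) > P(d_2)$, i.e., when having $d_1$ makes $d_2$ more likely than it is in the general population.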

In order to retain the disease associations that most affect the outcome of this calculation, a valid relationship needs to be selected for each disease in the diagnosis set, as follows:

Starting from any one diagnosis d_t:

i.) select the valid relationship d_i→d_t that maximizes log P(D), i.e. arg max over d_i→d_t of log P(D);

ii.) delete all other relationships pointing to d_t, keeping only the largest d_i→d_t, i.e. only the largest P(d_t|d_i) remains, and set it as P′(d_t);

iii.) delete the relationships from d_t to all already selected nodes, retaining only the relationships to unselected nodes, thereby avoiding loops;

iv.) repeat steps i.) to iii.) until every relationship in D has been either selected or deleted.
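The selection procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the present application: the function names, the dictionary layout and the example values are hypothetical, and the loop check (skipping a relation whose source already depends on d_t through previously kept relations) is only one possible reading of step iii.).

```python
def _depends_on(parent, start, target):
    """Return True if `target` is reachable from `start` via kept relations."""
    node, seen = start, set()
    while node is not None and node not in seen:
        if node == target:
            return True
        seen.add(node)
        node = parent.get(node)
    return False


def select_valid_relations(priors, cond):
    """Keep at most one incoming relation per diagnosis (assumed data layout).

    priors: {d: P(d)}                   prior probability of each diagnosis
    cond:   {(d_i, d_t): P(d_t | d_i)}  probability that d_t is secondary to d_i
    Returns {d: P'(d)}, with P'(d) = P(d) when no relation is kept for d.
    """
    adjusted = dict(priors)
    parent = {}  # d_t -> d_i for every kept relation d_i -> d_t
    for d_t in priors:
        # Candidate relations into d_t that would not close a loop.
        candidates = {
            d_i: p
            for (d_i, dst), p in cond.items()
            if dst == d_t and d_i in priors and not _depends_on(parent, d_i, d_t)
        }
        if candidates:
            best = max(candidates, key=candidates.get)
            parent[d_t] = best                # keep only the largest relation
            adjusted[d_t] = candidates[best]  # P'(d_t) = P(d_t | d_i)
    return adjusted
```

With the hypothetical inputs priors = {"a": 0.1, "b": 0.05, "c": 0.2} and relations a→b (0.4), c→b (0.3), b→a (0.5), the relation b→a is kept for a, and a→b is then skipped for b because it would close a loop, so c→b is kept instead.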

Therefore, the term a in equation (3) can be modified to:

It is understood that for certain diseases d in the diagnosis set D, P′(d) may not exist, in which case P′(d) = P(d).

At step S106, the logarithm of the sum of the maximum population probabilities of the population information g in the user's population information set G for the diagnoses d in the diagnosis set D is calculated.

As mentioned above, P(g|d) represents the proportion of sufferers of a disease who belong to each type of population. The most important population information is age and gender. Since the correlation between age and gender is expected to be weak, the calculation of the maximum population probability for age and gender is left unchanged.

However, population information relating to occupation and medical history has a significant influence on the diagnosis of a disease, although the prior probability (or population probability) of such population information is generally not known in advance. What those skilled in the art can readily obtain is knowledge in a form similar to "the probability of lung cancer in long-term heavy smokers is 10-20 times that of non-smokers". As can be seen from equation (1), when comparing different d ∈ D, the meaningful gain (risk) can be expressed in the form log r(g, d), where r(g, d) means that the prevalence of d in the population with population information g is r times that in the population without population information g (the population without population information g can be regarded as the whole population, because the population with population information g accounts for only a small proportion of the whole population).

For the sake of distinction, population information g of this kind is referred to as a personal history feature h ∈ H (i.e. there is no definite prior probability, but it is known that this population information (the personal history feature h) significantly increases the probability of suffering from the disease d). Of course, if the risk-increasing relationship between the personal history feature h and the disease d is unclear, r(h, d) may be set to 1 (i.e. the personal history feature h has no evident effect on the probability of the disease d).
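The risk-ratio gain described above can be illustrated with a short Python sketch. It assumes that the gains log r(h, d) of several personal history features are simply summed, which is an assumption made here for illustration; the function name, the lookup table and the value 10.0 (the lower end of the smoking example) are likewise hypothetical.

```python
import math


def log_risk_gain(features, disease, risk_ratio):
    """Sum log r(h, d) over the user's personal history features h.

    risk_ratio: {(h, d): r(h, d)}; a missing pair defaults to r = 1,
    i.e. the feature contributes a gain of log 1 = 0.
    """
    return sum(math.log(risk_ratio.get((h, disease), 1.0)) for h in features)
```

For a user whose only personal history feature is long-term heavy smoking and a table entry r = 10.0 for lung cancer, the gain is log 10; a feature with no known risk relationship leaves the value unchanged.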

Based on this, the term b in equation (3) can be modified to:

At step S108, the logarithm of the sum of the maximum clinical manifestation probabilities of the symptoms m in the symptom set M for the diagnoses d in the diagnosis set D is calculated.

For the diagnoses d in the diagnosis set D, the symptoms m in the symptom set M have negative symptom expressions and positive symptom expressions. A positive symptom expression means that the symptom is expressed as long as any one diagnosis in the diagnosis set D supports the occurrence of the symptom; by contrast, a negative symptom expression is a symptom expression that must be supported by all diagnoses in the diagnosis set D. Thus, the symptom set M can be represented as the union of the positive and negative symptom expression sets, M = M⁺ ∪ M⁻, wherein

Thus, considering the non-independence of the diagnoses in the diagnosis set D, the term c in the above equation (3) can be modified as:

However, the symptom set M may contain a symptom m ∈ M for which P(m|d) = 0, while there is another symptom m′, other than the chief-complaint symptom m, such that

In other words, although the conditional probability between the symptom m and the diagnosis d in the diagnosis set D is 0, a probabilistic relationship between the diagnosis d and the symptom m can be established indirectly through the probabilistic relationship between m and another symptom m′ in the symptom set. The probabilistic relationship between the diagnosis d and the symptom m can therefore be expressed as:

P″(m|d)＝P(m|m′)×P(m′|d) (4)
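Equation (4) can be illustrated with the following Python sketch. It is a hedged example only: the function name and the dictionary layout are hypothetical, and taking the maximum over all candidate intermediate symptoms m′ is an assumption made here, since equation (4) itself is stated for a single m′.

```python
def clinical_probability(m, d, p_m_given_d, p_m_given_m, symptoms):
    """Return P(m|d) if it is non-zero, otherwise the indirect P''(m|d).

    p_m_given_d: {(m, d): P(m | d)}
    p_m_given_m: {(m, m_prime): P(m | m_prime)}
    symptoms:    iterable of symptoms in the symptom set M
    """
    direct = p_m_given_d.get((m, d), 0.0)
    if direct > 0.0:
        return direct
    # Equation (4): P''(m|d) = P(m|m') * P(m'|d), here maximized over m'.
    return max(
        (
            p_m_given_m.get((m, other), 0.0) * p_m_given_d.get((other, d), 0.0)
            for other in symptoms
            if other != m
        ),
        default=0.0,
    )
```

If no intermediate symptom links m to d, the sketch returns 0, which corresponds to the case where no probabilistic relationship can be established.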

based on the above equation (4), the term c in the above equation (3) can be further modified to:

At step S110, the optimization result value of the objective function is calculated, i.e. the sum of the logarithm of the sum of the diagnosis prior probabilities, the logarithm of the sum of the maximum population probabilities, and the logarithm of the sum of the maximum clinical performance probabilities.

For the objective function of equation (3), based on the correction calculations in steps S104-S108, the result of the objective function of equation (3) can be optimized as:

At step S112, all subsets Di of the associated diagnosis set D (here, proper subsets) are obtained, and steps S104-S110 are repeated for each subset Di, yielding an optimized result value f(Di) for every subset.

At step S114, the optimized result values in the optimized result set F, which comprises the result value of the associated diagnosis set and the result values of all its subsets, are sorted by value.

At step S116, the final diagnostic set corresponding to the largest result value is output or one or more final diagnostic sets corresponding to the top ranked (e.g., top three, but this is not limiting) result values are output.
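Steps S112-S116 can be sketched in Python as follows. This is an illustrative sketch only: `objective` is a stand-in for the optimized objective function of steps S104-S110 and must be supplied by the caller, the function name and the `top_k` parameter are hypothetical, and excluding the empty subset is an assumption.

```python
from itertools import combinations


def rank_diagnosis_sets(diagnoses, objective, top_k=3):
    """Score D and all of its non-empty proper subsets, best first.

    diagnoses: iterable of diagnosis names (the associated diagnosis set D)
    objective: callable mapping a frozenset of diagnoses to a result value
    Returns the top_k (value, diagnosis_set) pairs in descending order.
    """
    items = sorted(diagnoses)
    candidates = [
        frozenset(combo)
        for size in range(1, len(items) + 1)  # size == len(items) yields D itself
        for combo in combinations(items, size)
    ]
    scored = sorted(
        ((objective(c), c) for c in candidates),
        key=lambda pair: pair[0],
        reverse=True,
    )
    return scored[:top_k]
```

Note that the number of subsets grows as 2^n with the size n of the associated diagnosis set, so this exhaustive enumeration is only practical for small sets.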

Although the steps of the implementation of the processes performed by the system of the present invention are described above and in the drawings in a certain order, it is contemplated that these orders are not presented in a limiting manner, but are merely exemplary. Alternatively, within the scope of the present application, one or more of the above-described steps may be changed in their order in the process or may be performed simultaneously, as long as the functions of the system of the present invention can be achieved.

By way of example and not limitation, all subsets Di of the associated diagnosis set D may be obtained directly after step S102 (step S112), and steps S104-S110 may then be performed for each subset Di and for the associated diagnosis set D itself, thereby obtaining the optimized result set F corresponding to the subsets Di and the associated diagnosis set D.

As shown in fig. 2, according to a preferred embodiment of the present application, a cold-start clarification is required because the symptom set acquired at step S100 may not include valid symptoms directly associated with a disease. Specifically, at optional step S118, it is determined whether the symptom set acquired at step S100 includes a directly valid symptom. If so, the flow executed by the system for assisting disease inference proceeds to step S102. If no valid symptom is included, the flow proceeds to step S120, at which the user is asked to add new symptoms until a valid symptom is obtained.

According to a preferred embodiment of the present application, a predetermined threshold may be set for the maximum result value in the optimized result set F to improve the accuracy of the inference. Specifically, after step S114, a step S122 may be provided in which the result values in the optimized result set F are compared with the predetermined threshold. If the maximum result value is greater than the predetermined threshold, the diagnosis set is considered sufficiently accurate and the process proceeds to step S116. However, if no result value is greater than the predetermined threshold, the current diagnosis combination is considered not accurate enough and the most likely diagnosis set cannot be inferred; in this case the currently acquired symptom set requires further clarification to improve the accuracy of the diagnosis. Specifically, when no result value is greater than the predetermined threshold, the flow performed by the system for assisting in disease inference of the present application proceeds to steps S124-S126, which may be referred to collectively as the inference clarification steps.

At step S124, all symptoms associated with each diagnosis in the associated diagnosis set D are acquired based on the multi-graph to obtain a problem domain, wherein all diagnoses in the associated diagnosis set together with all symptoms associated with each diagnosis are referred to as the problem domain. It is envisioned that the problem domain also includes additional queries needed to resolve ambiguity in body part and symptom.

From the above definitions of the problem domain and valid symptoms, it follows that all subordinate symptoms of a valid symptom are also in the problem domain, and thus the problem domain is exactly the Markov blanket of the valid symptom. When there are multiple valid symptoms, the problem domain is the union of the Markov blankets of the individual valid symptoms.
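The construction of the problem domain in step S124 can be sketched as a simple set union. The sketch assumes, for illustration only, that the "disease-symptom" associations of the multi-graph are available as a plain dictionary; the function name and the data are hypothetical.

```python
def problem_domain(diagnoses, symptoms_of):
    """Collect the diagnoses and every symptom associated with any of them.

    diagnoses:   iterable of diagnoses in the associated diagnosis set D
    symptoms_of: {diagnosis: iterable of associated symptoms}
    Returns (set of diagnoses, set of associated symptoms).
    """
    domain_symptoms = set()
    for d in diagnoses:
        # Union of the symptoms linked to each diagnosis in the graph.
        domain_symptoms |= set(symptoms_of.get(d, ()))
    return set(diagnoses), domain_symptoms
```

A diagnosis without any recorded symptom contributes nothing to the symptom part of the domain but still belongs to the domain itself.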

At step S126, based on the obtained problem domain, the user is queried about the valid symptom with the highest sensitivity for the diagnoses in the diagnosis set having the largest result value in the most recently obtained optimized result set. The symptom so obtained is added to the most recently obtained symptom set M to form a new symptom set, which is fed back to step S102 to continue the loop.
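The choice of the next symptom to query in step S126 can be sketched as follows. The sketch assumes that the sensitivity of a symptom m for a diagnosis d is given by P(m|d) and that symptoms already known are skipped; both are assumptions, and the function name and data layout are hypothetical.

```python
def most_sensitive_symptom(top_diagnoses, sensitivity, known):
    """Pick the not-yet-known symptom with the highest sensitivity.

    top_diagnoses: diagnoses of the best-scoring diagnosis set
    sensitivity:   {diagnosis: {symptom: P(symptom | diagnosis)}}
    known:         set of symptoms already confirmed or denied
    Returns the chosen symptom, or None if nothing is left to ask.
    """
    best, best_p = None, 0.0
    for d in top_diagnoses:
        for m, p in sensitivity.get(d, {}).items():
            if m not in known and p > best_p:
                best, best_p = m, p
    return best
```

Returning None signals that the problem domain has been exhausted for the current best diagnosis set, in which case the loop can stop.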

In a preferred embodiment of the present application, an optional counting step may also be provided at a suitable location. It counts how many times the above loop has been executed and, once the number of cycles exceeds a certain number, for example 5, outputs one or more diagnosis sets whose result values rank highest. It is also envisaged that the number of output diagnosis sets may be predetermined, for example five.
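The interplay of the threshold comparison (step S122), the clarification steps (S124-S126) and the optional cycle counter can be sketched as a loop. This is a minimal sketch under stated assumptions: `infer` and `next_question` are hypothetical callables standing in for steps S102-S114 and S124-S126 respectively, and the defaults of 5 cycles and 3 output sets merely follow the examples given in the text.

```python
def clarify_loop(symptoms, infer, next_question, threshold, max_cycles=5, top_k=3):
    """Repeat inference and clarification until the best result is good enough.

    infer(symptoms)                 -> list of (value, diagnosis_set), best first
    next_question(symptoms, ranked) -> a new symptom, or None if nothing to ask
    Returns (top_k ranked results, number of clarification cycles performed).
    """
    symptoms = set(symptoms)
    ranked = infer(symptoms)
    cycles = 0
    # Keep clarifying while the best value does not exceed the threshold.
    while ranked and ranked[0][0] <= threshold and cycles < max_cycles:
        new_symptom = next_question(symptoms, ranked)
        if new_symptom is None:
            break
        symptoms.add(new_symptom)
        ranked = infer(symptoms)
        cycles += 1
    return ranked[:top_k], cycles
```

When the cycle limit is reached without exceeding the threshold, the sketch still returns the highest-ranked diagnosis sets, mirroring the counting step described above.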

When determining which other sensitive valid symptoms associated with a diagnosis in the diagnosis set need to be queried, only query efficiency at the granularity of individual evidence is considered; however, when the queries are translated into natural language by the process performed by the system for assisting in disease reasoning, the natural logic of human dialogue may also be taken into account.

For example: (1) elements of the same type are merged; for instance, "lumbar pain" and "lumbar numbness" should be asked together as "Does your lumbar region have any of the following symptoms: A. pain; B. numbness?"; and (2) queries follow the order of the user's chief-complaint symptoms; for instance, if the user inputs "I have low back pain and leg numbness", the questions about "low back pain" (its nature, degree, etc.) should be asked first, followed by the questions about "leg numbness".
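The merging of same-type elements in example (1) can be sketched as a grouping step. The sketch assumes that each item to be queried is given as a (body part, symptom) pair and uses a fixed English question template; the function name, the input layout and the template are hypothetical.

```python
def merge_questions(items):
    """Merge symptoms of the same body part into one multiple-choice question.

    items: iterable of (body_part, symptom) pairs, in the order to be asked
    Returns one question string per body part, preserving first-seen order.
    """
    grouped = {}
    for part, symptom in items:
        grouped.setdefault(part, []).append(symptom)
    questions = []
    for part, symptoms in grouped.items():
        # Label the options A, B, C, ... in the order they were collected.
        options = ", ".join(
            f"{chr(ord('A') + i)}. {s}" for i, s in enumerate(symptoms)
        )
        questions.append(
            f"Does your {part} have any of the following symptoms: {options}?"
        )
    return questions
```

Because the groups keep their first-seen order, passing the items in the order of the user's chief complaints also respects example (2).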

For example, a problem history abstraction can be modeled as

Wherein,

denotes the b-th query object of the a-th cycle, for example "lumbar (symptom)" or "lumbar pain (degree)". When the flow executed by the system for assisting disease inference according to the present application produces the set of evidence to be clarified in the (n+1)-th cycle, that set should be compared with the historical questions, and a query target that already appears in the history should preferably be selected for querying. The comparison order is:

That is, query objects within the same cycle are clarified in sequence, and across cycles the most recently queried objects are clarified first, so as to approximate human conversational habits.
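The ordering rule just described can be sketched in Python. The sketch assumes that the question history is stored as a list of cycles, each a list of query objects in the order asked; this layout and the function name are hypothetical, and the code reflects only the verbal rule stated above (most recent cycle first, sequential within a cycle).

```python
def order_candidates(history, candidates):
    """Order new query candidates by how recently their target was asked.

    history:    list of cycles, each a list of query objects (oldest first)
    candidates: query objects proposed for the next cycle
    Candidates never asked before keep their relative order at the end.
    """
    priority = []
    for cycle in reversed(history):   # most recent cycle first
        for target in cycle:          # sequential within one cycle
            if target not in priority:
                priority.append(target)
    rank = {t: i for i, t in enumerate(priority)}
    # sorted() is stable, so unseen candidates stay in their given order.
    return sorted(candidates, key=lambda c: rank.get(c, len(rank)))
```

For a history whose first cycle asked about "lumbar (symptom)" and "lumbar pain (degree)" and whose second cycle asked about "leg numbness (location)", a candidate targeting "leg numbness (location)" is placed ahead of one targeting "lumbar pain (degree)".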

Further, the supplementary valid symptoms asked of the user at step S120 are preferably generated directly by matching the body part input by the user against the knowledge graph in the multi-graph. For example, if the user inputs "waist", related symptoms of that body part such as "waist pain" and "waist swelling" are found by the matching, and the user is then asked directly whether the waist has any of these symptoms, so that the user can directly select the supplementary valid symptom. In addition, a relevant question may be generated directly for the user to answer; for example, if the user inputs "waist" and it is determined that there is no directly valid symptom, the user is asked directly "What is wrong with your waist?". Of course, the aforementioned selection-based and direct questioning approaches may be used in combination.
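The cold-start questioning described above can be sketched as a lookup of the body part in the knowledge graph. The sketch assumes a plain dictionary from body part to related symptoms and simple substring matching of the user input; the function name, the data layout and the fallback wording are hypothetical.

```python
def cold_start_prompt(user_text, part_symptoms):
    """Build a supplementary question from a body part mentioned by the user.

    user_text:     raw text entered by the user, e.g. "waist"
    part_symptoms: {body_part: list of related symptoms} from the knowledge graph
    Returns a dict with the matched part, selectable options and a question.
    """
    for part, symptoms in part_symptoms.items():
        if part in user_text and symptoms:
            # Offer the related symptoms for selection and also ask directly.
            return {
                "part": part,
                "options": list(symptoms),
                "question": f"What is wrong with your {part}?",
            }
    # No body part recognized: fall back to an open question (assumed wording).
    return {
        "part": None,
        "options": [],
        "question": "Which symptoms are you experiencing?",
    }
```

Returning both the selectable options and the direct question corresponds to the combined use of the two questioning approaches.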

Fig. 3 illustrates a block diagram of a system for assisting disease inference according to an embodiment of the present invention. As shown in the figure, the system 3 for assisting disease reasoning according to the present application may comprise a storage unit 302, an acquisition unit 304 and a processing unit 306.

The storage unit 302 is configured to store a multi-graph. The multi-graph represents both the Bayesian network and the knowledge graph and includes at least known information such as effective symptoms, diseases associated with the effective symptoms first, "disease-symptom" associated information, and "disease-population" associated information.

The obtaining unit 304 is arranged to obtain an initial set of evidence comprising at least a set of demographic information of the user and a set of symptoms of the user as described before. Further, in an advantageous embodiment, the obtaining unit 304 may also be configured to ask the user for other information based on feedback from other units in the system.

The processing unit 306 is configured to process the information acquired by the acquisition unit 304 as described above. Preferably, the processing unit 306 may include a plurality of sub-modules, such as an evidence expansion module 306-1, a diagnosis prior probability module 306-2, a maximum population probability module 306-3, a maximum clinical performance probability module 306-4, an objective function calculation module 306-5, an optimized result set partitioning module 306-6, a diagnosis set determination module 306-7, and a final diagnosis set determination module 306-8.

The evidence expansion module 306-1 is configured to expand the set of symptoms acquired by the acquisition unit 304 based on the multi-graph stored in the storage unit 302 as described in step S102, thereby obtaining an expanded set of symptoms and obtaining an associated set of diagnoses based on the expanded set of symptoms.

The diagnosis prior probability module 306-2 is configured to calculate a logarithm of the sum of the diagnosis prior probabilities of the diagnoses in the diagnosis set as described in step S104.

The maximum population probability module 306-3 is configured to calculate the logarithm of the sum of the maximum population probabilities of the population information in the user's population information set for the diagnoses in the diagnosis set, as described in step S106.

The maximum clinical performance probability module 306-4 is configured to calculate the logarithm of the sum of the maximum clinical performance probability of a symptom of the symptom set M to a diagnosis in the diagnosis set as described in step S108.

The objective function calculation module 306-5 is configured to calculate the optimized result value of the objective function as described in step S110, i.e., calculate the sum of the logarithm of the sum of the diagnosis prior probabilities, the logarithm of the sum of the maximum population probabilities, and the logarithm of the sum of the maximum clinical performance probabilities.

The optimized result set partitioning module 306-6 is configured to obtain all subsets of the associated diagnosis set as described in step S112 and feed these subsets back to the diagnosis prior probability module 306-2, the maximum population probability module 306-3, the maximum clinical performance probability module 306-4, respectively, to obtain corresponding optimized result values for the subset of the relevant diagnosis set via the objective function calculation module 306-5.

The diagnosis set determination module 306-7 is configured to sort, by optimized result value, the optimized results in the set comprising the optimized result values of the associated diagnosis set and of its subsets, as set forth in step S114.

The final diagnosis set determination module 306-8 is configured to output the final diagnosis set corresponding to the maximum result value or output one or more final diagnosis sets corresponding to the top-ranked result values as described in step S116.

Optionally, the system for assisting disease reasoning of the present invention may further comprise a threshold comparison module 306-9, a problem domain module 306-10, and a query clarification module 306-11. The threshold comparison module 306-9 is configured to compare the maximum result value in the optimized result set F with a predetermined threshold as described in step S122; if the maximum result value is greater than the predetermined threshold, the final diagnosis set corresponding to the maximum result value, or one or more final diagnosis sets corresponding to the top-ranked result values, are output by the final diagnosis set determination module 306-8. If the maximum result value is less than the predetermined threshold, the threshold comparison module 306-9 activates the problem domain module 306-10, which is configured to obtain all symptoms associated with each diagnosis in the associated diagnosis set based on the multi-graph as described in step S124, thereby obtaining a problem domain. The query clarification module 306-11 is then configured, as described in step S126, to query, based on the obtained problem domain, the valid symptom with the highest sensitivity for the diagnoses in the diagnosis set having the largest result value in the most recently obtained optimized result set, to add the newly obtained valid symptom to the most recently obtained symptom set to form a new symptom set, and to feed the new symptom set back to the evidence expansion module 306-1 to continue the loop.

Optionally, processing unit 306 may also include a cold start clarification module 306-12. The cold start clarification module 306-12 determines whether the symptom set acquired by the acquisition unit 304 includes valid symptoms and asks the user via the acquisition unit 304 to add new user symptoms until valid symptoms are acquired if the acquired symptom set does not include any valid symptoms.

Optionally, the processing unit 306 may include a counting module 306-13 configured to count the number of cycles and, when the number is greater than a predetermined number (e.g., five times, other numbers are possible), deactivate the problem domain module 306-10 and output by the final diagnosis set determination module 306-8 the final diagnosis set corresponding to the maximum result value or output one or more (2 or 3) final diagnosis sets corresponding to the top-ranked result values.

It is foreseen that the information such as the related evidence set and the inference result obtained by the obtaining unit 304 and the processing unit 306 may also be optionally stored in the storage unit 302 in real time or may also be optionally stored on a remote storage device (e.g., cloud) for subsequent invocation.

Although various individual functional modules of the processing unit are described above, those of ordinary skill in the art will appreciate that these modules are merely exemplary. Indeed, a single module may perform one or more of the above functions, or several modules may together perform them, and these arrangements may vary depending on the implementation needs of the invention.

Advantageously, the system for assisting in disease reasoning within the scope of the present application further comprises a human-machine interface configured to receive initial information input by the user, comprising the user's basic information and a symptom set containing the user's symptom information. The human-machine interface is configured to receive the initial information in any form available to the user (e.g., voice input, text input, image recognition). By way of example and not limitation, the human-machine interface may be embodied as a keyboard, mouse, touch screen, joystick, microphone, or any other hardware or combination thereof that can receive initial information input by a user.

Advantageously, the system for assisting in disease reasoning within the scope of the present application further comprises a diagnosis interactive interface, wherein the diagnosis interactive interface is configured to display the diagnosis sets output by the system, or to feed back to the user the further diagnostic questions that the system has determined need to be asked, so as to obtain the user's responses to the items to be clarified. Advantageously, the diagnosis interactive interface is preferably a screen, for example a liquid crystal display, an organic light-emitting diode display or the like. By way of example and not limitation, the diagnosis interactive interface may also be output hardware such as a voice announcement device, a projection device, or a combination thereof.

More advantageously, the human-machine interface and the diagnosis interactive interface in the system for assisting disease reasoning within the scope of the present application may be integrated. By way of example and not limitation, a touch screen is one instance of an integrated human-machine interface and diagnosis interactive interface. It is contemplated that other human-machine interfaces that include a screen, such as a combination of a display and a keyboard (or other physical input device), may likewise be integrated with the diagnosis interactive interface to perform both functions.

Advantageously, the system for assisting in disease reasoning within the scope of the present application further comprises a medical history interface, wherein the medical history interface is structured to be linked to the hospital medical history system such that medical history information relating to the user can be obtained from the hospital medical history system and inserted into the user's initial evidence set as needed (e.g., the user's current symptoms are related to a previous medical history, or relapse, etc.) for assisting in disease reasoning.

Further, a system for assisting in disease reasoning within the scope of the present application also includes a physician-side interactive interface configured to present to a physician the set of evidence obtained from the user and the final diagnosis sets that were output. It is envisioned that the physician-side interactive interface can obtain the set of evidence about the user and the final diagnosis sets from the storage unit 302 or from an associated remote storage of the system for assisting in disease reasoning, which enables the physician to learn the user's physical state more quickly and accurately, thereby facilitating the physician's diagnosis process.

Advantageously, the storage unit 302 of the system for assisting disease reasoning within the scope of the present application may for example comprise a memory, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic or optical disk, or other hardware capable of storing data. Further, the storage unit 302 according to the present invention may include a database, cloud storage, and the like. Further, the storage unit 302 may also store any software program for implementing the procedures executed by the system for assisting disease inference of the present application.

Fig. 4 is a general block diagram of a system for assisting disease inference according to an embodiment of the present invention. Based on the same inventive concept, the system for assisting disease inference generally includes at least the following components: a processor 401, a memory 402, a communication interface 403, and a bus 404. The processor 401, the memory 402 and the communication interface 403 communicate with one another through the bus 404; the communication interface 403 is used for information exchange between the system for assisting disease inference and other software or hardware; the processor 401 is adapted to invoke a computer program in the memory 402 which, when executed, implements the procedures performed by the system for assisting disease inference as described earlier in this application.

Based on the same inventive concept, yet another embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the procedures performed by the system for assisting disease inference as described previously in this application, and will not be described herein again.

In addition, the logic instructions in the memory may be implemented in the form of software functional units and may be stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the processes executed by the system for assisting disease inference according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

For the computer-readable storage medium provided by the embodiment of the present invention, the operation principle and the beneficial effect of the computer program stored thereon are similar to those of the disease inference system provided by the above embodiment, and the detailed description is given with reference to the above embodiment, which is not described in detail herein.

The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiments of the present invention. One of ordinary skill in the art can understand and implement it without inventive effort.

Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute various embodiments or portions of embodiments.

It will also be appreciated that various modifications may be made according to particular requirements. For example, customized hardware might also be used and/or particular elements might be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. For example, some or all of the disclosed systems for assisting in disease reasoning and the procedures performed thereby may be implemented by programming hardware (e.g., programmable logic circuitry including Field Programmable Gate Arrays (FPGAs) and/or Programmable Logic Arrays (PLAs)) in assembly language or hardware programming languages such as Verilog, VHDL or C++, using logic and algorithms in accordance with the present disclosure.

It should also be understood that the processes performed by the aforementioned system for assisting in disease reasoning may be implemented in a server-client mode. For example, a client may receive data input by a user and send the data to a server. The client may also receive data input by the user, perform a part of the processing in the flow executed by the system for assisting disease inference, and transmit the data obtained by the processing to the server. The server may receive data from the client and execute another part of the flow performed by the aforementioned system for assisting in disease inference or the flow performed by the aforementioned system for assisting in disease inference and return the execution result to the client. The client may receive the execution result of the flow executed by the system for assisting disease inference from the server, and may present it to the user through an output device, for example.

It should also be understood that the components of the system for assisting in disease reasoning can be distributed across a network. For example, some processes may be performed using one processor while other processes may be performed by another processor that is remote from the one processor. Other components of the system for assisting in disease reasoning may also be similarly distributed. In this way, the system for assisting in disease reasoning can be interpreted as a distributed computing system that performs processing at multiple locations.

Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the above-described methods, systems and apparatus are merely exemplary embodiments or examples and that the scope of the present invention is not limited by these embodiments or examples, but only by the claims as issued and their equivalents. Various elements in the embodiments or examples may be omitted or may be replaced with equivalents thereof. Further, the steps may be performed in an order different from that described in the present disclosure. Further, the various elements in the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced with equivalent elements that appear after the present disclosure.

Claims

1. A system for assisting in disease reasoning, the system comprising:

a storage unit configured to store a multiplicity of maps and to store association information between diseases;

an acquisition unit configured to acquire an initial evidence set including at least a crowd information set of a user and a symptom set of the user; and

a processing unit, comprising:

an evidence expansion module configured to expand the symptom set acquired by the acquisition unit based on the multigraph stored in the storage unit, thereby obtaining an expanded symptom set and obtaining an associated diagnosis set based on the expanded symptom set;

a diagnosis prior probability module configured to compute a logarithm of a sum of diagnosis prior probabilities for the diagnoses in the set of diagnoses, wherein the diagnosis prior probabilities represent an incidence of the diagnoses in the set of diagnoses across the population;

a maximum population probability module configured to compute a logarithm of a sum of the maximum population probabilities of the items of demographic information in the user's demographic information set with respect to the diagnoses in the set of diagnoses, wherein the demographic information includes occupational and medical history information that affects the probability of having a diagnosis in the set of diagnoses;

a maximum clinical performance probability module configured to calculate a logarithm of a sum of the maximum clinical performance probabilities of the symptoms in the symptom set with respect to the diagnoses in the set of diagnoses;

an objective function calculation module configured to calculate an optimization result value of an objective function;

an optimization result set partitioning module configured to obtain all subsets of the associated diagnosis set and to feed all of the subsets back to the diagnosis prior probability module, the maximum population probability module and the maximum clinical performance probability module, respectively, so as to obtain, via the objective function calculation module, a corresponding optimization result value for each of the subsets, thereby obtaining an optimized result set;

a diagnostic set determination module configured to order the optimized results in the optimized result set corresponding to the associated diagnostic set and all subsets thereof by result value; and

a final diagnostic set determination module configured to output a final diagnostic set corresponding to the largest result value or to output one or more final diagnostic sets corresponding to the top-ranked result values.

2. The system of claim 1, further comprising a human-machine interface and a diagnostic interactive interface, wherein the human-machine interface is configured to receive initial information input by a user, including the set of demographic information and the set of symptoms for the user, in any form the user is able to provide; and wherein the diagnostic interactive interface is configured to display one or more final diagnostic sets to the user.

3. The system of claim 1 or 2, wherein the processing unit further comprises:

a cold start clarification module configured, if the symptom set acquired by the acquisition unit does not include any valid symptom, to ask the user to add new symptoms until a valid symptom is obtained.

4. The system of claim 1 or 2, wherein the processing unit further comprises:

a threshold comparison module configured to compare the maximum result value in the optimized result set with a predetermined threshold and, when the maximum result value is determined to be greater than the predetermined threshold, to cause the final diagnostic set determination module to output a final diagnostic set corresponding to the maximum result value or one or more final diagnostic sets corresponding to the top-ranked result values.

5. The system of claim 4, wherein the processing unit further comprises:

a problem domain module configured to obtain, based on the multigraph, a problem domain comprising all symptoms associated with each diagnosis in the set of diagnoses when the threshold comparison module has determined that no result value is greater than the predetermined threshold.

6. The system of claim 5, wherein the processing unit further comprises:

a query clarification module configured to query the user, based on the obtained problem domain, for the valid symptom with the highest sensitivity among the diagnoses in the set or subset having the largest result value in the most recently obtained optimized result set, to add the newly obtained valid symptom to the most recently obtained symptom set so as to obtain a new symptom set, and to feed the new symptom set back to the evidence expansion module.

7. The system of claim 6, wherein the processing unit further comprises:

a counting module configured to count the number of query clarifications performed by the query clarification module and, when the number is greater than 5, to deactivate the threshold comparison module and output one or more final diagnostic sets corresponding to the top-ranked result values.
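
The interplay of the threshold comparison, query clarification and counting modules of claims 4 to 7 can be illustrated with a minimal Python sketch. It is only one possible reading of the claims, not the implementation of the invention. The function `diagnose_with_clarification`, the callbacks `infer` and `ask`, the parameter `top_k` and the stub data are all hypothetical names introduced for illustration; `infer` stands in for the whole evidence expansion and scoring pipeline of claim 1, and `ask` stands in for the selection of the highest-sensitivity symptom from the problem domain.

```python
def diagnose_with_clarification(symptoms, infer, ask, threshold,
                                max_rounds=5, top_k=3):
    """Loop sketched from claims 4-7 (hypothetical helper, not from the patent).

    infer(symptoms) -> list of (result_value, diagnosis_set), best first.
    ask(ranked, symptoms) -> one newly clarified symptom.
    Returns the top-ranked results and the number of clarifications made.
    """
    symptoms = list(symptoms)
    rounds = 0  # counting module: number of query clarifications so far
    while True:
        ranked = infer(symptoms)
        # Claim 7: once the count exceeds max_rounds, skip the threshold test
        # and output the top-ranked sets.
        if rounds > max_rounds:
            return ranked[:top_k], rounds
        # Claim 4: output when the best result value exceeds the threshold.
        if ranked[0][0] > threshold:
            return ranked[:top_k], rounds
        # Claims 5-6: otherwise ask for one more symptom and re-run inference.
        symptoms.append(ask(ranked, symptoms))
        rounds += 1


# Stub callbacks (assumptions): the score simply grows with the symptom count.
def stub_infer(symptoms):
    return [(float(len(symptoms)), frozenset({"diagnosis-x"}))]


def stub_ask(ranked, symptoms):
    return f"s{len(symptoms)}"


_, rounds_needed = diagnose_with_clarification(["s0"], stub_infer, stub_ask, threshold=3.0)
_, rounds_capped = diagnose_with_clarification(["s0"], stub_infer, stub_ask, threshold=100.0)
print(rounds_needed, rounds_capped)
```

In the first call the threshold is exceeded after three clarifications. In the second call it is never exceeded, so the loop stops once the count is greater than 5, that is, after the sixth clarification. The cold start handling of claim 3 is deliberately left out of the sketch.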

8. The system of claim 1 or 2, wherein the system further comprises:

a medical history interface configured to link with a hospital medical history system such that medical history information related to the user can be obtained from the hospital medical history system and inserted into the user's initial evidence set when needed.

9. The system of claim 1 or 2, wherein the system further comprises:

a physician-side interactive interface configured to present the evidence obtained from the user and the one or more output final diagnostic sets to a physician, wherein the physician-side interactive interface is capable of retrieving the stored symptom sets and the stored demographic information sets from the storage unit of the system.

10. A storage medium storing instructions that, when executed, implement at least the following:

(1) Acquiring an initial evidence set comprising at least a demographic information set of a user and a symptom set of the user;

(2) Expanding the acquired symptom set based on a multigraph, thereby obtaining an expanded symptom set and obtaining an associated diagnosis set based on the expanded symptom set;

(3) Calculating a logarithm of a sum of the diagnosis prior probabilities for the diagnoses in the set of diagnoses;

(4) Calculating a logarithm of a sum of the maximum population probabilities of the items of demographic information in the user's demographic information set with respect to the diagnoses in the set of diagnoses;

(5) Calculating a logarithm of a sum of the maximum clinical manifestation probabilities of the symptoms in the symptom set with respect to the diagnoses in the set of diagnoses;

(6) Calculating an optimization result value of an objective function;

(7) Obtaining all subsets of the associated diagnosis set and performing steps (3) to (6) on each of the subsets to obtain a corresponding optimization result for each subset, thereby obtaining an optimized result set;

(8) Sorting the optimization results in the optimization result set according to result values; and

(9) Outputting a final diagnostic set corresponding to the largest result value or outputting one or more final diagnostic sets corresponding to the top ranked result values.
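
Steps (3) to (9) can be illustrated with a minimal Python sketch. It is not the implementation of the invention and rests on several assumptions. The claims do not state how the objective function combines the three logarithmic terms, so the sketch assumes a plain sum and exposes the combiner as the parameter `combine`. It reads each "maximum" probability as the largest probability of one evidence item over the diagnoses in the candidate set, and it assumes all probabilities are positive. All function names and the toy probability tables are hypothetical.

```python
import math
from itertools import combinations


def nonempty_subsets(diagnoses):
    """Step (7): yield every non-empty subset of the associated diagnosis set."""
    items = sorted(diagnoses)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def log_sum_prior(subset, prior):
    """Step (3): log of the summed prior probabilities of the diagnoses."""
    return math.log(sum(prior[d] for d in subset))


def log_sum_max(evidence, subset, table):
    """Steps (4)/(5), assumed reading: per evidence item take the largest
    probability over the diagnoses in the subset, sum the maxima, take the log."""
    return math.log(sum(max(table[(e, d)] for d in subset) for e in evidence))


def score(subset, demographics, symptoms, prior, p_demo, p_symptom, combine=sum):
    """Step (6): the combiner is an assumption; the claims do not specify it."""
    return combine([
        log_sum_prior(subset, prior),
        log_sum_max(demographics, subset, p_demo),
        log_sum_max(symptoms, subset, p_symptom),
    ])


def rank_diagnosis_sets(diagnoses, demographics, symptoms, prior, p_demo, p_symptom):
    """Steps (7)-(8): score every subset and sort by result value, best first."""
    results = [
        (score(s, demographics, symptoms, prior, p_demo, p_symptom), s)
        for s in nonempty_subsets(diagnoses)
    ]
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


# Toy data (entirely made up for illustration).
prior = {"flu": 0.05, "cold": 0.20}
p_demo = {("adult", "flu"): 0.6, ("adult", "cold"): 0.7}
p_symptom = {
    ("fever", "flu"): 0.9, ("fever", "cold"): 0.2,
    ("cough", "flu"): 0.5, ("cough", "cold"): 0.6,
}

ranked = rank_diagnosis_sets({"flu", "cold"}, ["adult"], ["fever", "cough"],
                             prior, p_demo, p_symptom)
best_value, best_set = ranked[0]  # step (9): the set with the largest value
print(sorted(best_set), round(best_value, 4))
```

Under the assumed plain sum, adding a diagnosis can never lower any of the three terms, so the full associated diagnosis set always ranks first (here {cold, flu}, followed by {cold} and then {flu}). The sketch therefore illustrates only the data flow of steps (3) to (9); the objective function actually used is the one defined in the description. Because the number of subsets grows as 2^n - 1, exhaustive enumeration of this kind is practical only for small associated diagnosis sets.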