CN117370565A - Information retrieval method and system - Google Patents

Information retrieval method and system

Info

Publication number
CN117370565A
Authority
CN
China
Prior art keywords
information
department
similarity
doctor
illness state
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311512085.7A
Other languages
Chinese (zh)
Inventor
徐浩然
任然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Institute of Engineering
Original Assignee
Chongqing Institute of Engineering
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Institute of Engineering
Priority to CN202311512085.7A
Publication of CN117370565A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records

Abstract

The invention discloses an information retrieval method and system in the technical field of information retrieval. The key points of the technical scheme are as follows: a topological space is constructed by taking the similarity between pieces of illness state information as the distance between nodes; the topological structure model is fuzzy clustered using a subspace fuzzy clustering algorithm; the illness state information at the clustering center of each cluster is extracted; department information is acquired, the similarity between the illness state information and the department information is calculated, and the department with the largest similarity is taken as the matching object, which provides more comprehensive medical services for patients with complex illness states; all doctor information of the matched departments is then acquired, a fuzzy evaluation matrix is constructed, a comprehensive evaluation value is calculated for each doctor from each index and its corresponding weight, and the doctor with the largest comprehensive evaluation value is taken as the matching object for the patient, which provides more accurate doctor matching for users.

Description

Information retrieval method and system
Technical Field
The invention relates to the technical field of information retrieval, in particular to an information retrieval method and system.
Background
With the improvement of medical standards and the development of informatization, more and more hospitals have begun to publish their medical information. However, quickly and accurately finding the departments and doctors suited to one's own illness among massive amounts of hospital information has become an important problem for patients. It is therefore of great importance to develop a hospital information retrieval system that can automatically match suitable departments and doctors according to the illness state described by the patient.
The prior art has the following defects:
1. Current systems typically provide only a department search function based on the condition description: the user enters information such as symptoms and disease names, and the system returns a matching department. However, this matching method can find only a single department and cannot handle complex illness states that involve multiple departments. In reality, a patient's illness state is often complex and changeable and may involve several different departments. For example, a patient may have both heart disease and diabetes and need to consult both cardiology and endocrinology; existing systems do not meet this need.
2. Current hospital information retrieval systems can generally match only to the corresponding department according to the condition description and cannot further match to a specific doctor. In reality, different doctors in the same department may have different professional backgrounds, clinical experience and areas of expertise, so providing a more accurate doctor matching service would improve the efficiency and quality of medical services. In addition, for complex conditions that require expert consultation, it is also important to be able to match doctors from multiple departments simultaneously.
Disclosure of Invention
(I) Technical problem to be solved
In view of the shortcomings of the prior art, the invention provides an information retrieval method and system to solve the problems described in the background. Text mining is performed on the illness state information and the similarity between pieces of illness state information is calculated; a topological space is constructed by taking that similarity as the distance between nodes; the processed topological structure model is fuzzy clustered using a subspace fuzzy clustering algorithm, and each piece of illness state information is divided into its corresponding cluster according to the clustering result. The illness state information at the clustering center of each cluster is extracted, department information is acquired, the similarity between the illness state information and the department information is calculated from the illness state information vector and the department information vector, and the department with the largest similarity is taken as the matching object, so that multiple departments can be matched for a patient with a complex illness state. All doctor information of the matched departments is acquired, characteristic information related to the patient is extracted and used as evaluation indexes to construct a fuzzy evaluation matrix, the weight of each index is calculated from the fuzzy evaluation matrix, the comprehensive evaluation value of each doctor is calculated from each index and its corresponding weight, and the doctor with the largest comprehensive evaluation value is taken as the matching object for the patient.
(II) Technical solution
In order to achieve the above purpose, the invention is realized by the following technical scheme: an information retrieval method comprising the steps of:
collecting hospital information including department information and doctor information through official channels of each hospital;
receiving illness information input by a user, taking each illness information as a node of a topological space according to the illness information of a patient, carrying out text mining on the illness information, calculating similarity between the illness information, and constructing a topological space by taking the similarity between each illness information as a distance between different nodes;
presetting a similarity threshold, combining adjacent nodes with similarity higher than the similarity threshold into a node, selecting an initial state of each node as a clustering center, carrying out fuzzy clustering on the processed topological structure model by using a subspace fuzzy clustering algorithm, and dividing each illness state information into corresponding clusters according to a clustering result;
extracting disease information of a clustering center of each cluster, acquiring department information, calculating similarity between the disease information and the department information through a disease information vector and a department information vector, and taking a department with the maximum similarity as a matching object;
acquiring all doctor information of a matched department, extracting characteristic information related to a patient, constructing a fuzzy evaluation matrix D by taking the extracted characteristic information as evaluation indexes, calculating the weight $\omega_p$ of each index from the fuzzy evaluation matrix D, calculating the comprehensive evaluation value Cv of each doctor from the indexes and their corresponding weights, and taking the doctor with the largest comprehensive evaluation value as the matching object for the patient.
Further, the construction process of the topological space comprises the following steps:
receiving illness state information input by a user, wherein the illness state information comprises symptoms, medical history and examination results;
constructing a topological space according to disease information of patients by using a topological theory and method, and taking each disease information as a node of the topological space;
text mining is carried out on the illness state information, the similarity between the illness state information is calculated, and the similarity between each illness state information is used as the distance between different nodes.
Further, calculating the similarity between the condition information includes:
preprocessing disease information, including word segmentation, sentence segmentation and stop word removal, and extracting keywords or phrases from each piece of disease information;
each piece of illness state information is expressed as a vector composed of a plurality of keywords, each keyword is taken as a dimension of the vector, and the frequency of occurrence of the keyword in the illness state information is taken as the value of the dimension;
for any two illness state information vectors A and B, the similarity SI of the illness state information vectors A and B is calculated by using a cosine similarity formula, and the calculation formula is as follows:
$$SI = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$$
wherein $a_i$ is the i-th dimension of vector A, $b_i$ is the i-th dimension of vector B, i is the index of the dimension, and n is the number of dimensions.
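As a minimal sketch of the keyword-frequency vectors and the cosine similarity SI described above, the following Python uses only the standard library. The function names and the sample keyword lists are hypothetical, and treating an all-zero vector as having similarity 0 is an assumption that the source does not address.

```python
import math
from collections import Counter

def keyword_vector(keywords, vocabulary):
    """Count how often each vocabulary keyword occurs in one record."""
    counts = Counter(keywords)
    return [counts[word] for word in vocabulary]

def cosine_similarity(a, b):
    """Cosine similarity SI of two equal-length frequency vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: an empty record is similar to nothing
    return dot / (norm_a * norm_b)

# Hypothetical, already-segmented keyword lists for two records.
record_a = ["chest", "pain", "shortness", "breath", "pain"]
record_b = ["chest", "pain", "thirst"]
vocabulary = sorted(set(record_a) | set(record_b))
si = cosine_similarity(keyword_vector(record_a, vocabulary),
                       keyword_vector(record_b, vocabulary))
```

Both records share one vocabulary so that each vector position refers to the same keyword.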
Further, classifying each condition information includes:
presetting a similarity threshold, merging adjacent nodes with similarity higher than the similarity threshold into a node, and updating the edge and node information in the topological structure model;
according to the node information in the processed topological structure model, selecting an initial state of each node as a clustering center, or randomly selecting a part of nodes as the clustering center;
fuzzy clustering is carried out on the processed topological structure model by utilizing a subspace fuzzy clustering algorithm, and in each iteration step, the membership of the clustering center and the cluster is continuously updated according to the similarity among the nodes and the updating rule of the clustering center;
and dividing each illness state information into corresponding clusters according to the clustering result.
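The node-merging step can be sketched as follows. This is an illustration under stated assumptions rather than the patented procedure: every pair of nodes is treated as adjacent, merging is transitive (if A merges with B and B with C, all three become one node), and the function name and sample matrix are hypothetical.

```python
def merge_similar_nodes(similarity, threshold):
    """Group node indices whose pairwise similarity exceeds the threshold.

    similarity is a symmetric matrix (list of lists); each returned group
    would become one merged node of the topological structure model.
    """
    n = len(similarity)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the group representative (path halving).
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if similarity[i][j] > threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())

# Hypothetical similarity matrix for three pieces of illness state information.
similarity = [[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.2],
              [0.1, 0.2, 1.0]]
merged = merge_similar_nodes(similarity, threshold=0.8)
```

The merged groups would then serve as the nodes on which clustering centers are initialized.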
Further, the department matching process includes:
extracting disease information of a clustering center of each cluster, and extracting keywords or phrases from the disease information;
the method comprises the steps of representing illness state information as a vector composed of a plurality of keywords, taking each keyword as a dimension of the vector, and taking the occurrence frequency of the keyword in the illness state information as a value of the dimension;
acquiring all department information, representing the department information as vectors composed of the same keywords in the illness state information, taking each keyword as one dimension of the vectors, and taking the occurrence frequency of the keyword in the department information as the value of the dimension;
the similarity between the illness state information and the department information is calculated through the illness state information vector and the department information vector, and the maximum similarity M between the illness state information and the department information is obtained, wherein the calculation formula is as follows:
$$M = \max_{1 \le I \le N} Sd_I$$
wherein I is the index of a department, N is the number of departments, and $Sd_I$ is the similarity between the illness state information and the information of the I-th department;
taking the department with the maximum similarity as a matching object.
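A minimal sketch of the department matching step, assuming the illness state vector and the department vectors have already been built over the same keywords. The department names, vectors and function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def best_department(condition_vector, department_vectors):
    """Return (department name, M), where M is the largest similarity Sd."""
    scores = {name: cosine(condition_vector, vec)
              for name, vec in department_vectors.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]

# Hypothetical frequency vectors over the keywords ["chest", "pain", "thirst"].
condition = [1, 2, 0]
departments = {"cardiology": [3, 2, 0], "endocrinology": [0, 0, 4]}
matched, m_value = best_department(condition, departments)
```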
Further, the doctor matching process includes:
acquiring all doctor information of a matched department, and extracting characteristic information related to patients, wherein the characteristic information comprises clinical experience, professional field and patient reputation;
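the extracted characteristic information is used as an evaluation index to construct a fuzzy evaluation matrix D:
$$D = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1m} \\ d_{21} & d_{22} & \cdots & d_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ d_{m1} & d_{m2} & \cdots & d_{mm} \end{pmatrix}$$
wherein $d_{pq}$ represents the importance of the p-th index relative to the q-th index, p is the row index of the matrix, and q is the column index of the matrix.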
Further, the doctor matching process further includes:
calculating the weight $\omega_p$ of each index from the fuzzy evaluation matrix D, the calculation formula being as follows:
[formula not reproduced in the source text]
wherein m is the total number of evaluation indexes, k is a positive integer, 1 ≤ k ≤ m, 1 ≤ p ≤ m, and 1 ≤ q ≤ m.
Further, the doctor matching process further includes:
each doctor is evaluated through the indexes and their corresponding weights, and the comprehensive evaluation value Cv of the doctor is calculated, the calculation formula being as follows:
$$Cv = \sum_{p=1}^{m} \omega_p x_p$$
wherein m is the total number of evaluation indexes, and $x_p$ is the quantized value of the p-th index;
and taking the doctor with the largest comprehensive evaluation value as a matching object of the patient.
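The doctor matching steps can be illustrated with a short sketch. The weighting rule used here, each row sum of D divided by the sum of all entries, is an assumption chosen for illustration; the weight formula is not legible in this text and may differ. The comprehensive value is taken as the weighted sum of the quantized index values. The matrix, the doctors and their scores are hypothetical.

```python
def index_weights(D):
    """Normalised row sums of D (an assumed weighting rule, see above)."""
    row_sums = [sum(row) for row in D]
    total = sum(row_sums)
    return [s / total for s in row_sums]

def comprehensive_value(weights, x):
    """Cv: weighted sum of a doctor's quantized index values."""
    return sum(w * v for w, v in zip(weights, x))

# Hypothetical 3x3 matrix for clinical experience, professional field
# and patient reputation; d[p][q] + d[q][p] = 1 and d[p][p] = 0.5.
D = [[0.5, 0.7, 0.6],
     [0.3, 0.5, 0.4],
     [0.4, 0.6, 0.5]]
# Hypothetical quantized index values in [0, 1] for two doctors.
doctors = {"doctor_a": [0.9, 0.5, 0.6], "doctor_b": [0.6, 0.9, 0.9]}

weights = index_weights(D)
scores = {name: comprehensive_value(weights, x) for name, x in doctors.items()}
best_doctor = max(scores, key=scores.get)
```

Because the weights sum to 1, Cv stays on the same scale as the quantized index values.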
An information retrieval system, comprising: the system comprises a data acquisition module, a data storage module, a disease information retrieval module and a result display module; wherein,
the data acquisition module is used for collecting hospital information, including department information and doctor information, through official channels of various hospitals, cleaning and processing the collected information, and storing the processed data;
the data storage module is used for storing the collected department information and doctor information, and the information is stored in a relational database or a non-relational database so as to facilitate subsequent retrieval and processing;
the disease information retrieval module comprises a disease information analysis unit, a department matching unit and a doctor matching unit, wherein the disease information analysis unit performs text mining on disease information, calculates similarity between disease information, takes the similarity between each disease information as the distance between different nodes, constructs a topological space, performs fuzzy clustering on the processed topological structure model by using a subspace fuzzy clustering algorithm, and divides each disease information into corresponding clusters according to clustering results;
the department matching unit is used for extracting the illness state information of the clustering center of each cluster and acquiring department information, calculating the similarity between the illness state information and the department information through the illness state information vector and the department information vector, and taking the department with the largest similarity as a matching object;
the doctor matching unit is used for acquiring all doctor information of a matched department, extracting characteristic information related to a patient, constructing a fuzzy evaluation matrix D by taking the extracted characteristic information as evaluation indexes, calculating the weight $\omega_p$ of each index from the fuzzy evaluation matrix D, calculating the comprehensive evaluation value Cv of each doctor from the indexes and their corresponding weights, and taking the doctor with the largest comprehensive evaluation value as the matching object for the patient;
and the result display module is used for displaying the matched department information and doctor information to the user.
(III) Beneficial effects
The invention provides an information retrieval method and system, which have the following beneficial effects:
(1) The constructed topological space can be used for data visualization, presenting complex illness state information and medical resource information to the user in a more intuitive way. This helps users better understand their own illness state and the distribution of medical resources, and so choose doctors and departments more effectively. Personalized recommendation services can also be developed on the basis of the topological space: suitable doctors, departments and treatment plans for related diseases can be recommended according to the patient's illness state information and preferences, improving the patient-centeredness and accuracy of medical services.
(2) Fuzzy clustering of the processed topological structure model with a subspace fuzzy clustering algorithm divides illness state information into the corresponding clusters more accurately, according to the similarity between nodes and the edge information in the topological structure model. This improves clustering quality and provides more accurate results for subsequent recommendation and matching. Merging adjacent nodes forms larger clusters and makes the clustering results easier to interpret, so that doctors and patients can understand the similarities and differences between pieces of illness state information more clearly and thus better understand the illness and the treatment plan.
(3) Calculating the similarity between the illness state information and the department information finds the department that best matches the patient's illness state more accurately, improving the pertinence and efficiency of medical services. Because the illness state information is classified, patients with complex illness states are matched with different departments for their different conditions, which can promote cooperation and communication between departments, form multi-department joint treatment teams, and provide patients with more comprehensive and more specialized medical services.
(4) Extracting the characteristic information related to the patient and calculating the weight of each index with the fuzzy evaluation matrix D evaluates the overall competence of doctors more accurately and finds the doctor who best matches the patient's needs. This improves the pertinence and efficiency of medical services, optimizes the distribution of medical resources, and helps ensure that the patient receives the most suitable medical service.
Drawings
FIG. 1 is a flow chart of an information retrieval method according to the present invention;
FIG. 2 is a schematic diagram of an information retrieval system according to the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, the present invention provides an information retrieval method, taking hospital information retrieval as an example, comprising the following steps:
Step one: collecting hospital information, including department information and doctor information, through official channels of each hospital, cleaning and processing the collected information, and storing the processed data;
Step one comprises the following steps:
Step 101: determining the data sources, taking the official websites, WeChat official accounts and the like of hospitals as the sources for data collection;
Step 102: parsing and extracting webpage data, including department information and doctor information, by using a crawler program;
Step 103: after the relevant data are extracted, cleaning and processing the data to remove duplicate, erroneous or incomplete records, and formatting and standardizing the data for subsequent storage and use;
Step 104: storing the processed information.
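The cleaning in step 103 might look like the following sketch. The record fields (`hospital`, `department`, `doctor`) and the rule that records differing only in case or whitespace are duplicates are assumptions made for illustration; the source does not specify a schema.

```python
def clean_records(records, required_fields=("hospital", "department", "doctor")):
    """Drop incomplete and duplicate records and normalise whitespace."""
    seen = set()
    cleaned = []
    for record in records:
        # Collapse runs of whitespace and trim both ends of every value.
        normalised = {key: " ".join(str(value or "").split())
                      for key, value in record.items()}
        if any(not normalised.get(field) for field in required_fields):
            continue  # incomplete record
        key = tuple(normalised[field].lower() for field in required_fields)
        if key in seen:
            continue  # duplicate record
        seen.add(key)
        cleaned.append(normalised)
    return cleaned

# Hypothetical crawled records.
raw = [
    {"hospital": "Hospital A", "department": " Cardiology ", "doctor": "Dr. Li"},
    {"hospital": "hospital a", "department": "Cardiology", "doctor": "Dr. Li"},
    {"hospital": "Hospital A", "department": "", "doctor": "Dr. Wang"},
]
cleaned = clean_records(raw)
```

Here the second record is dropped as a duplicate of the first and the third as incomplete, leaving one standardized record.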
It should be noted that, because hospital information is updated dynamically, data collection must itself be timely: the latest hospital information should be collected and stored periodically or in real time to ensure the freshness and accuracy of the data in the retrieval system.
Combining the contents of steps 101 to 104:
Collecting data from the information disclosure channels of multiple hospitals yields more comprehensive department and doctor information and avoids omissions and bias, making the search results more representative and objective. A crawler program can automatically extract department and doctor information from hospital websites or social media platforms, which reduces manual intervention and the errors it introduces, improves the accuracy and credibility of the data, and provides a reliable basis for subsequent retrieval and processing.
Step two: receiving illness information input by a user, taking each illness information as a node of a topological space according to the illness information of a patient, carrying out text mining on the illness information, calculating similarity between the illness information, and constructing a topological space by taking the similarity between each illness information as a distance between different nodes;
Step two comprises the following steps:
Step 201: receiving illness state information input by a user, including symptoms, medical history, examination results and the like;
Step 202: constructing a topological space according to the illness state information of the patient by using topological theory and methods, and taking each piece of illness state information as a node of the topological space;
Step 203: performing text mining on the illness state information, calculating the similarity between pieces of illness state information, and taking the similarity between each pair as the distance between the corresponding nodes.
It should be noted that the similarity between pieces of illness state information is taken as the distance between different nodes to construct the topological space. In this space, the shorter the distance between two nodes, the higher their similarity. Various measures can be used to calculate the similarity, including cosine similarity, the Pearson correlation coefficient and Euclidean distance.
Calculating the similarity and degree of association between different nodes specifically comprises the following steps:
Step 2031: preprocessing the illness state information, including word segmentation, sentence segmentation and stop word removal, and extracting keywords or phrases, such as symptoms, diagnosis results and treatment methods, from each piece of illness state information;
Step 2032: expressing each piece of illness state information as a vector composed of a plurality of keywords, taking each keyword as a dimension of the vector, and taking the frequency of occurrence of the keyword in the illness state information as the value of that dimension;
Here a dimension is a component of the vector, that is, one of its features. Taking each keyword as a dimension means that each keyword is one component of the vector and represents one feature;
Step 2033: for any two illness state information vectors A and B, the similarity SI of the illness state information vectors A and B is calculated by using a cosine similarity formula, and the calculation formula is as follows:
$$SI = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$$
wherein $a_i$ is the i-th dimension of vector A, $b_i$ is the i-th dimension of vector B, i is the index of the dimension, and n is the number of dimensions.
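To illustrate how pairwise similarities could populate the topological space, the sketch below builds a symmetric similarity matrix over all nodes together with a distance matrix. Converting similarity to distance as 1 − SI is an assumption made here so that more similar nodes lie closer together; the source does not state the conversion. Names and sample vectors are hypothetical.

```python
import math
from itertools import combinations

def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def build_space(vectors):
    """Return symmetric similarity and distance matrices for all nodes."""
    n = len(vectors)
    sim = [[1.0] * n for _ in range(n)]  # each node is identical to itself
    for i, j in combinations(range(n), 2):
        sim[i][j] = sim[j][i] = cosine(vectors[i], vectors[j])
    dist = [[1.0 - s for s in row] for row in sim]  # assumed conversion
    return sim, dist

# Hypothetical keyword-frequency vectors for three nodes.
vectors = [[1, 1, 0], [1, 1, 0], [0, 0, 2]]
sim, dist = build_space(vectors)
```

The first two nodes have identical vectors and end up at distance approximately 0, while the third shares no keywords with them and lies at distance 1.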
It should be noted that, in practical applications, other factors and data features, such as the patient's personal preferences and the distribution of medical resources, also need to be considered to further improve the accuracy and efficiency of matching. Different algorithms and parameter choices can also affect the calculation result, so appropriate algorithms and parameters need to be selected for the specific situation.
Combining the contents of steps 201 to 203:
The constructed topological space can be used for data visualization, presenting complex illness state information and medical resource information to the user in a more intuitive way. This helps users better understand their own illness state and the distribution of medical resources, and so choose doctors and departments more effectively. Personalized recommendation services can also be developed on the basis of the topological space: suitable doctors, departments and treatment plans for related diseases can be recommended according to the patient's illness state information and preferences, improving the patient-centeredness and accuracy of medical services.
Step three: presetting a similarity threshold, combining adjacent nodes with similarity higher than the similarity threshold into a node, selecting an initial state of each node as a clustering center, carrying out fuzzy clustering on the processed topological structure model by using a subspace fuzzy clustering algorithm, and dividing each illness state information into corresponding clusters according to a clustering result;
Step three comprises the following steps:
Step 301: presetting a similarity threshold, merging adjacent nodes whose similarity is higher than the similarity threshold into one node, and updating the edge and node information in the topological structure model;
It should be noted that an appropriate merging strategy needs to be selected when merging nodes. A similarity-based strategy may be used, for example calculating the similarity between nodes and merging adjacent nodes with higher similarity into one node; alternatively, a density-based strategy may be used, for example using the connection relationships and density information between nodes and merging adjacent nodes with higher density into one node.
If the nodes to be merged are connected, the information of the edges between them may be merged, for example by adding the edge weights or by recalculating the edge information according to a given rule. After adjacent nodes are merged, the attribute information of the resulting node needs to be updated: for example, the attributes of the merged nodes may be averaged or weighted-averaged and the result assigned to the new node.
Step 302: according to the node information in the processed topological structure model, selecting the initial state of each node as a clustering center, or randomly selecting a subset of nodes as clustering centers;
Step 303: performing fuzzy clustering on the processed topological structure model by using a subspace fuzzy clustering algorithm, wherein in each iteration the clustering centers and the cluster memberships are updated according to the similarity between nodes and the update rule for the clustering centers;
Step 304: dividing each piece of illness state information into its corresponding cluster according to the clustering result.
It should be noted that, the processed topological structure model is subjected to fuzzy clustering by utilizing a subspace fuzzy clustering algorithm. Common Fuzzy clustering algorithms such as FCM (Fuzzy C-Means) or SFCM (Soft Fuzzy C-Means) may be employed. In the implementation process, proper parameters such as a distance measurement method, a fuzzy index, an iteration stop condition and the like are required to be selected according to actual conditions.
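As a rough illustration of the iterative update of clustering centers and memberships, the sketch below implements plain fuzzy C-means (FCM) on numeric vectors with Euclidean distance. It is not a subspace fuzzy clustering algorithm and it works on feature vectors rather than on the topological structure model directly; the fuzzy index `m`, the tolerance, the sample points and the choice of initial centers are all assumptions.

```python
import math

def fuzzy_c_means(points, centers, m=2.0, max_iter=100, tol=1e-6):
    """Plain fuzzy C-means; returns (centers, memberships u[i][j])."""
    centers = [list(c) for c in centers]
    u = []
    for _ in range(max_iter):
        # Membership update: u_ij is proportional to d_ij^(-2/(m-1)).
        u = []
        for p in points:
            d = [math.dist(p, c) for c in centers]
            if 0.0 in d:
                # The point coincides with a center: full membership there.
                row = [1.0 if x == 0.0 else 0.0 for x in d]
            else:
                row = [x ** (-2.0 / (m - 1.0)) for x in d]
            total = sum(row)
            u.append([x / total for x in row])
        # Center update: mean of the points weighted by u_ij^m.
        new_centers = []
        for j, old in enumerate(centers):
            w = [u[i][j] ** m for i in range(len(points))]
            total = sum(w)
            if total == 0.0:
                new_centers.append(old)  # no point belongs to this cluster
                continue
            new_centers.append([
                sum(wi * p[k] for wi, p in zip(w, points)) / total
                for k in range(len(old))
            ])
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break  # iteration stop condition: centers no longer move
    return centers, u

# Hypothetical 2-D node features; two nodes are used as the initial centers.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
centers, u = fuzzy_c_means(points, [points[0], points[2]])
labels = [row.index(max(row)) for row in u]
```

Each row of `u` holds one node's membership degrees across clusters; assigning a node to the cluster with the largest membership gives the hard division into clusters.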
Combining the contents of steps 301 to 304:
Fuzzy clustering of the processed topological structure model with a subspace fuzzy clustering algorithm divides illness state information into the corresponding clusters more accurately, according to the similarity between nodes and the edge information in the topological structure model. This improves clustering quality and provides more accurate results for subsequent recommendation and matching. Merging adjacent nodes forms larger clusters and makes the clustering results easier to interpret, so that doctors and patients can understand the similarities and differences between pieces of illness state information more clearly and thus better understand the illness and the treatment plan.
Step four: extracting disease information of a clustering center of each cluster, acquiring department information, calculating similarity between the disease information and the department information through a disease information vector and a department information vector, and taking a department with the maximum similarity as a matching object;
Step four comprises the following steps:
Step 401: extracting the illness state information of the clustering center of each cluster, and extracting keywords or phrases from the illness state information;
Step 402: representing the illness state information as a vector composed of a plurality of keywords, taking each keyword as a dimension of the vector, and taking the frequency of occurrence of the keyword in the illness state information as the value of that dimension;
Step 403: acquiring all department information, representing the department information as vectors composed of the same keywords as the illness state information, taking each keyword as one dimension of the vectors, and taking the frequency of occurrence of the keyword in the department information as the value of that dimension;
Step 404: calculating the similarity between the illness state information and the department information through the illness state information vector and the department information vector, and obtaining the maximum similarity M between the illness state information and the department information, wherein the calculation formula is as follows:
$$M = \max_{1 \le I \le N} Sd_I$$
wherein I is the index of a department, N is the number of departments, and $Sd_I$ is the similarity between the illness state information and the information of the I-th department;
Step 405: taking the department with the maximum similarity as the matching object.
It should be noted that the disease information of some patients may be relatively complex, so that a single department cannot fully meet their needs. In such cases, multi-department combined treatment may be considered so that the patient obtains more comprehensive and more specialized medical services; the patient's disease information is therefore classified so that it can be matched with multiple departments for joint treatment.
Combining the contents of steps 401 to 405:
By calculating the similarity between the illness state information and the department information, the department that best matches the patient's condition can be found more accurately, which improves the pertinence and efficiency of medical services. By classifying the illness state information and matching patients with complex conditions to different departments according to their different conditions, cooperation and communication among departments can be promoted and a multi-department combined treatment team can be formed, providing more comprehensive and more specialized medical services for the patient.
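Steps 401 to 405 can be sketched as follows. This is a minimal illustration under stated assumptions: keyword extraction is reduced to lower-casing and whitespace splitting, the similarity Sd is assumed to be cosine similarity (the text of step 404 does not name the measure), and the function names and department descriptions are hypothetical.

```python
import math
from collections import Counter


def keyword_vector(text, keywords):
    """Frequency of each keyword in the text, in keyword order (steps 402/403)."""
    counts = Counter(text.lower().split())
    return [counts[k] for k in keywords]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_department(condition_text, departments):
    """Return the department with maximum similarity, plus all similarities."""
    # Assumption: every distinct word of the condition text is a keyword.
    keywords = sorted(set(condition_text.lower().split()))
    condition_vec = keyword_vector(condition_text, keywords)
    scores = {
        name: cosine_similarity(condition_vec, keyword_vector(desc, keywords))
        for name, desc in departments.items()
    }
    best = max(scores, key=scores.get)  # department attaining M = max Sd
    return best, scores


# Usage with made-up department descriptions.
departments = {
    "cardiology": "chest pain palpitations heart rhythm hypertension",
    "dermatology": "skin rash itching eczema acne",
}
best, scores = match_department("chest pain and palpitations after exercise", departments)
```

Because the department vectors are built over the keywords of the illness state information, a department sharing no keyword with the condition scores 0, and the department sharing the most keywords relative to its length scores highest.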
Step five: acquiring all doctor information of a matched department, extracting characteristic information related to a patient, constructing a fuzzy evaluation matrix D by taking the extracted characteristic information as an evaluation index, calculating the weight $\omega_p$ of each index through the fuzzy evaluation matrix D, calculating a comprehensive evaluation value Cv of the doctor through the indexes and the weights corresponding to the indexes, and taking the doctor with the largest comprehensive evaluation value as a matching object of the patient.
The fifth step comprises the following steps:
step 501: acquiring all doctor information of the matched department, and extracting characteristic information related to patients, wherein the characteristic information comprises clinical experience, professional field, patient reputation and the like;
step 502: constructing a fuzzy evaluation matrix D by taking the extracted characteristic information as evaluation indexes:
$$D = (d_{pq})_{m \times m} = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1m} \\ d_{21} & d_{22} & \cdots & d_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ d_{m1} & d_{m2} & \cdots & d_{mm} \end{pmatrix}$$
wherein $d_{pq}$ denotes the importance of the p-th index relative to the q-th index, p is the row index of the matrix, q is the column index of the matrix, and m is the total number of evaluation indexes;
step 503: calculating the weight $\omega_p$ of each index through the fuzzy evaluation matrix D, wherein the calculation formula is as follows:
wherein m is the total number of evaluation indexes, k is a positive integer, 1 ≤ k ≤ m, 1 ≤ p ≤ m, and 1 ≤ q ≤ m;
step 504: evaluating each doctor through each index and the weight corresponding to the index, and calculating the comprehensive evaluation value Cv of the doctor, wherein the calculation formula is as follows:
$$Cv = \sum_{p=1}^{m} \omega_p x_p$$
wherein m is the total number of evaluation indexes, $\omega_p$ is the weight of the p-th index, and $x_p$ is the quantized value of the p-th index;
step 505: taking the doctor with the largest comprehensive evaluation value as a matching object of the patient.
It should be noted that normalizing or standardizing the doctor information, including characteristics such as clinical experience, professional field and patient reputation, so that all characteristics share the same scale is a very important step. It ensures that all characteristics are measured in the same units during comparison and decision-making and avoids bias caused by the differing scales of different characteristics; either a normalization method or a standardization method may be used.
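One common way to bring the characteristics onto the same scale is min-max normalization, sketched below. The choice of min-max scaling, the function name `min_max_normalize` and the sample values are assumptions for illustration; the text above does not prescribe a specific method.

```python
def min_max_normalize(values):
    """Rescale a list of numbers linearly so that min -> 0.0 and max -> 1.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All doctors have the same value: the characteristic carries no contrast.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Usage with made-up raw values for three doctors.
years_of_experience = [5, 20, 12]
patient_rating = [4.2, 4.8, 4.5]
normalized_experience = min_max_normalize(years_of_experience)
normalized_rating = min_max_normalize(patient_rating)
```

After rescaling, years of experience and patient ratings both lie in [0, 1], so neither dominates the weighted sum merely because of its unit.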
Combining the contents of steps 501 to 505:
By extracting the characteristic information related to the patient and calculating the weight of each index using the fuzzy evaluation matrix D, the overall competence of each doctor can be evaluated more accurately and the doctor who best matches the patient's needs can be found. This improves the pertinence and efficiency of medical services, optimizes the allocation of medical resources, and helps ensure that the patient obtains the most suitable medical service.
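Steps 502 to 505 can be sketched as follows. The weight formula of step 503 is not reproduced in this text, so the sketch assumes a simple row-sum normalization, in which the weight of index p is the sum of row p of D divided by the sum of all entries of D. That rule, the function names, the matrix entries and the doctor scores are all assumptions for illustration. The comprehensive evaluation value is taken to be the weighted sum of the quantized index values.

```python
def index_weights(D):
    """Assumed weighting rule: row sum of D divided by the sum of all entries."""
    row_sums = [sum(row) for row in D]
    total = sum(row_sums)
    return [s / total for s in row_sums]


def comprehensive_value(weights, x):
    """Weighted sum of the quantized index values x_p (step 504)."""
    return sum(w * v for w, v in zip(weights, x))


def match_doctor(D, doctors):
    """Return the doctor with the largest comprehensive value, plus all values."""
    weights = index_weights(D)
    scores = {name: comprehensive_value(weights, x) for name, x in doctors.items()}
    return max(scores, key=scores.get), scores


# Hypothetical 3 x 3 matrix over (clinical experience, professional field,
# patient reputation); entry [p][q] is the importance of index p relative to q.
D = [
    [0.5, 0.6, 0.7],
    [0.4, 0.5, 0.6],
    [0.3, 0.4, 0.5],
]
# Hypothetical doctors with index values already scaled to [0, 1].
doctors = {
    "Dr. A": [0.9, 0.8, 0.6],
    "Dr. B": [0.4, 0.9, 0.9],
}
best_doctor, doctor_scores = match_doctor(D, doctors)
```

Under this assumed rule the weights are non-negative and sum to 1, so Cv stays within the range of the normalized index values and doctors remain directly comparable.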
Referring to fig. 2, the present invention further provides an information retrieval system, including: the system comprises a data acquisition module, a data storage module, a disease information retrieval module and a result display module; wherein,
the data acquisition module is used for collecting hospital information, including department information and doctor information, through official channels of various hospitals, cleaning and processing the collected information, and storing the processed data;
the data storage module is used for storing the collected department information and doctor information, and the information is stored in a relational database or a non-relational database so as to facilitate subsequent retrieval and processing;
the disease information retrieval module comprises a disease information analysis unit, a department matching unit and a doctor matching unit, wherein the disease information analysis unit performs text mining on disease information, calculates similarity between disease information, takes the similarity between each disease information as the distance between different nodes, constructs a topological space, performs fuzzy clustering on the processed topological structure model by using a subspace fuzzy clustering algorithm, and divides each disease information into corresponding clusters according to clustering results;
the department matching unit is used for extracting the illness state information of the clustering center of each cluster and acquiring department information, calculating the similarity between the illness state information and the department information through the illness state information vector and the department information vector, and taking the department with the largest similarity as a matching object;
the doctor matching unit is used for acquiring all doctor information of a matched department, extracting characteristic information related to a patient, constructing a fuzzy evaluation matrix D by taking the extracted characteristic information as an evaluation index, calculating the weight $\omega_p$ of each index through the fuzzy evaluation matrix D, calculating a comprehensive evaluation value Cv of the doctor through each index and the weight corresponding to the index, and taking the doctor with the largest comprehensive evaluation value as a matching object of the patient;
and the result display module is used for displaying the matched department information and doctor information to the user.
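The way the four modules hand data to one another can be sketched as below. This is a structural illustration only: the class and function names, the record layouts and the idea of injecting the matching functions are assumptions, and the disease information analysis unit (clustering) is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class DataStore:
    """Data storage module: cleaned department and doctor records."""
    departments: Dict[str, str] = field(default_factory=dict)
    doctors: Dict[str, Dict[str, List[float]]] = field(default_factory=dict)


def acquire(raw_departments, raw_doctors) -> DataStore:
    """Data acquisition module: minimal cleaning (strip whitespace), then store."""
    departments = {name.strip(): desc.strip() for name, desc in raw_departments.items()}
    doctors = {dept.strip(): dict(docs) for dept, docs in raw_doctors.items()}
    return DataStore(departments, doctors)


@dataclass
class RetrievalModule:
    """Disease information retrieval module with pluggable matching units."""
    store: DataStore
    match_department: Callable[[str, Dict[str, str]], str]
    match_doctor: Callable[[Dict[str, List[float]]], str]

    def retrieve(self, condition_text: str) -> Tuple[str, str]:
        # Department matching unit first, then doctor matching unit
        # restricted to the doctors of the matched department.
        department = self.match_department(condition_text, self.store.departments)
        doctor = self.match_doctor(self.store.doctors[department])
        return department, doctor


def display(department: str, doctor: str) -> str:
    """Result display module: format the match for the user."""
    return f"Recommended department: {department}; recommended doctor: {doctor}"
```

Passing the two matching functions in as parameters keeps the retrieval module independent of how similarity and comprehensive evaluation values are actually computed.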
In the present application, all formulas involved operate on dimensionless numerical values. Each formula was obtained by software simulation over a large amount of collected data so as to approximate the real situation as closely as possible, and the formulas are set by a person skilled in the art according to the actual situation.
The above embodiments may be implemented in whole or in part by software, hardware, firmware, or any other combination. When implemented in software, the above-described embodiments may be implemented in whole or in part in the form of a computer program product. Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
The foregoing describes merely specific embodiments of the present application, but the scope of protection of the present application is not limited thereto; any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall be covered by the scope of protection of the present application.

Claims (9)

1. An information retrieval method, comprising the steps of:
collecting hospital information including department information and doctor information through official channels of each hospital;
receiving illness information input by a user, taking each illness information as a node of a topological space according to the illness information of a patient, carrying out text mining on the illness information, calculating similarity between the illness information, and constructing a topological space by taking the similarity between each illness information as a distance between different nodes;
presetting a similarity threshold, combining adjacent nodes with similarity higher than the similarity threshold into a node, selecting an initial state of each node as a clustering center, carrying out fuzzy clustering on the processed topological structure model by using a subspace fuzzy clustering algorithm, and dividing each illness state information into corresponding clusters according to a clustering result;
extracting disease information of a clustering center of each cluster, acquiring department information, calculating similarity between the disease information and the department information through a disease information vector and a department information vector, and taking a department with the maximum similarity as a matching object;
acquiring all doctor information of a matched department, extracting characteristic information related to a patient, constructing a fuzzy evaluation matrix D by taking the extracted characteristic information as an evaluation index, calculating the weight $\omega_p$ of each index through the fuzzy evaluation matrix D, calculating a comprehensive evaluation value Cv of the doctor through the indexes and the weights corresponding to the indexes, and taking the doctor with the largest comprehensive evaluation value as a matching object of the patient.
2. An information retrieval method as claimed in claim 1, wherein,
the construction process of the topological space comprises the following steps: receiving illness state information input by a user, wherein the illness state information comprises symptoms, medical history and examination results;
constructing a topological space according to disease information of patients by using a topological theory and method, and taking each disease information as a node of the topological space; text mining is carried out on the illness state information, the similarity between the illness state information is calculated, and the similarity between each illness state information is used as the distance between different nodes.
3. An information retrieval method as claimed in claim 2, wherein,
calculating the similarity between the condition information includes: preprocessing disease information, including word segmentation, sentence segmentation and stop word removal, and extracting keywords or phrases from each piece of disease information; each piece of illness state information is expressed as a vector composed of a plurality of keywords, each keyword is taken as a dimension of the vector, and the frequency of occurrence of the keyword in the illness state information is taken as the value of the dimension;
for any two illness state information vectors A and B, the similarity SI of the illness state information vectors A and B is calculated by using a cosine similarity formula, and the calculation formula is as follows:
$$SI = \frac{\sum_{i=1}^{n} a_i b_i}{\sqrt{\sum_{i=1}^{n} a_i^2}\,\sqrt{\sum_{i=1}^{n} b_i^2}}$$
wherein $a_i$ denotes the i-th dimension of vector A, $b_i$ denotes the i-th dimension of vector B, i is the index of the dimension, and n is the number of dimensions.
4. An information retrieval method as claimed in claim 3, wherein,
classifying each condition information includes: presetting a similarity threshold, combining adjacent nodes with similarity higher than the similarity threshold into a node, updating the edge and node information in the topological structure model, and selecting the initial state of each node as a clustering center or randomly selecting a part of nodes as the clustering center according to the node information in the processed topological structure model;
fuzzy clustering is carried out on the processed topological structure model by utilizing a subspace fuzzy clustering algorithm, and in each iteration step, the membership of the clustering center and the cluster is continuously updated according to the similarity among the nodes and the updating rule of the clustering center; and dividing each illness state information into corresponding clusters according to the clustering result.
5. An information retrieval method as claimed in claim 4, wherein,
the department matching process comprises the following steps: extracting disease information of a clustering center of each cluster, and extracting keywords or phrases from the disease information; the method comprises the steps of representing illness state information as a vector composed of a plurality of keywords, taking each keyword as a dimension of the vector, and taking the occurrence frequency of the keyword in the illness state information as a value of the dimension; acquiring all department information, representing the department information as vectors composed of the same keywords in the illness state information, taking each keyword as one dimension of the vectors, and taking the occurrence frequency of the keyword in the department information as the value of the dimension;
the similarity between the illness state information and the department information is calculated through the illness state information vector and the department information vector, and the maximum similarity M between the illness state information and the department information is obtained, wherein the calculation formula is as follows:
$$M = \max_{1 \le I \le N} Sd_I$$
wherein I is the index of a department, N is the number of departments, and $Sd_I$ is the similarity between the disease information and the information of department I; and the department with the largest similarity is taken as a matching object.
6. An information retrieval method as claimed in claim 1, wherein,
the doctor matching process includes: acquiring information of all doctors in a matched department, and extracting characteristic information related to patients, wherein the characteristic information comprises clinical experience and professional fields; the extracted characteristic information is used as an evaluation index to construct a fuzzy evaluation matrix D:
$$D = (d_{pq})_{m \times m} = \begin{pmatrix} d_{11} & d_{12} & \cdots & d_{1m} \\ d_{21} & d_{22} & \cdots & d_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ d_{m1} & d_{m2} & \cdots & d_{mm} \end{pmatrix}$$
wherein $d_{pq}$ denotes the importance of the p-th index relative to the q-th index, p is the row index of the matrix, q is the column index of the matrix, and m is the total number of evaluation indexes.
7. An information retrieval method as claimed in claim 6, wherein,
calculating the weight $\omega_p$ of each index through the fuzzy evaluation matrix D, wherein the calculation formula is as follows:
wherein m is the total number of evaluation indexes, k is a positive integer, 1 ≤ k ≤ m, 1 ≤ p ≤ m, and 1 ≤ q ≤ m.
8. An information retrieval method as claimed in claim 7, wherein,
through each index and the weight corresponding to the index, the doctor is evaluated, the comprehensive evaluation value Cv of the doctor is calculated, and the calculation formula is as follows:
$$Cv = \sum_{p=1}^{m} \omega_p x_p$$
wherein m is the total number of evaluation indexes, $\omega_p$ is the weight of the p-th index, and $x_p$ is the quantized value of the p-th index; and taking the doctor with the largest comprehensive evaluation value as a matching object of the patient.
9. An information retrieval system, to which the method according to any one of claims 1 to 8 is applied,
the system comprises a data acquisition module, a data storage module, a disease information retrieval module and a result display module; wherein,
the data acquisition module is used for collecting hospital information, including department information and doctor information, through official channels of various hospitals, cleaning and processing the collected information, and storing the processed data;
the data storage module is used for storing the collected department information and doctor information, and the information is stored in a relational database or a non-relational database so as to facilitate subsequent retrieval and processing;
the disease information retrieval module comprises a disease information analysis unit, a department matching unit and a doctor matching unit, wherein the disease information analysis unit performs text mining on disease information, calculates similarity between disease information, takes the similarity between each disease information as the distance between different nodes, constructs a topological space, performs fuzzy clustering on the processed topological structure model by using a subspace fuzzy clustering algorithm, and divides each disease information into corresponding clusters according to clustering results;
the department matching unit is used for extracting the illness state information of the clustering center of each cluster and acquiring department information, calculating the similarity between the illness state information and the department information through the illness state information vector and the department information vector, and taking the department with the largest similarity as a matching object;
the doctor matching unit is used for acquiring all doctor information of a matched department, extracting characteristic information related to a patient, constructing a fuzzy evaluation matrix D by taking the extracted characteristic information as an evaluation index, calculating the weight $\omega_p$ of each index through the fuzzy evaluation matrix D, calculating a comprehensive evaluation value Cv of the doctor through each index and the weight corresponding to the index, and taking the doctor with the largest comprehensive evaluation value as a matching object of the patient;
and the result display module is used for displaying the matched department information and doctor information to the user.
CN202311512085.7A 2023-11-14 2023-11-14 Information retrieval method and system Pending CN117370565A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311512085.7A CN117370565A (en) 2023-11-14 2023-11-14 Information retrieval method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311512085.7A CN117370565A (en) 2023-11-14 2023-11-14 Information retrieval method and system

Publications (1)

Publication Number Publication Date
CN117370565A true CN117370565A (en) 2024-01-09

Family

ID=89407744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311512085.7A Pending CN117370565A (en) 2023-11-14 2023-11-14 Information retrieval method and system

Country Status (1)

Country Link
CN (1) CN117370565A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117789965A (en) * 2024-02-23 2024-03-29 北京惠每云科技有限公司 Venous thrombosis prevention simulation evaluation method and system based on digital twin

Similar Documents

Publication Publication Date Title
Bashir et al. BagMOOV: A novel ensemble for heart disease prediction bootstrap aggregation with multi-objective optimized voting
Abu et al. Ensemble learning for multidimensional poverty classification
KR101903522B1 (en) The method of search for similar case of multi-dimensional health data and the apparatus of thereof
CN112204671A (en) Personalized device recommendation for active health monitoring and management
Pathan et al. Identifying stroke indicators using rough sets
CN110299209B (en) Similar medical record searching method, device and equipment and readable storage medium
Zhou et al. Modeling methodology for early warning of chronic heart failure based on real medical big data
CN116364299B (en) Disease diagnosis and treatment path clustering method and system based on heterogeneous information network
CN117253614B (en) Diabetes risk early warning method based on big data analysis
Tabrizi et al. Towards a patient satisfaction based hospital recommendation system
Khalid et al. Machine learning hybrid model for the prediction of chronic kidney disease
Wang et al. Automatic diagnosis with efficient medical case searching based on evolving graphs
CN117370565A (en) Information retrieval method and system
Kwon et al. Deep learning algorithm to predict need for critical care in pediatric emergency departments
Ullah et al. Explainable artificial intelligence approach in combating real-time surveillance of COVID19 pandemic from CT scan and X-ray images using ensemble model
CN110910991A (en) Medical automatic image processing system
CN105718726A (en) Medical auxiliary examination system knowledge acquisition and inference method based on rough set
Spanier et al. A new method for the automatic retrieval of medical cases based on the RadLex ontology
Saleem Durai et al. An intelligent knowledge mining model for kidney cancer using rough set theory
Folino et al. A recommendation engine for disease prediction
Zeng Length of stay prediction model of indoor patients based on light gradient boosting machine
Yu et al. Temporal case matching with information value maximization for predicting physiological states
Christopher et al. Knowledge-based systems and interestingness measures: Analysis with clinical datasets
Perwej et al. An Intelligent Cardiac Ailment Prediction Using Efficient ROCK Algorithm and K-Means & C4. 5 Algorithm
AU2021102593A4 (en) A Method for Detection of a Disease

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination