WO2021189983A1 - Medical case search method and apparatus based on interactive feedback, and readable storage medium

Info

Publication number
WO2021189983A1
Authority
WO
WIPO (PCT)
Prior art keywords
case
similar
cases
similarity
similar cases
Prior art date
Application number
PCT/CN2020/136406
Other languages
French (fr)
Chinese (zh)
Inventor
孔令炜
王健宗
黄章成
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021189983A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G06F18/23213 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/60 ICT specially adapted for the handling or processing of medical references relating to pathologies

Definitions

  • This application relates to the field of artificial intelligence technology, and in particular to a case search method, device, electronic device, and computer-readable storage medium based on interactive feedback.
  • The inventor is aware of the above-mentioned problems with traditional case retrieval methods, and a case search method based on interactive feedback is urgently needed.
  • This application provides a case search method, device, electronic device, and computer-readable storage medium based on interactive feedback, the main purpose of which is to quickly and effectively find, among a large number of cases, the case that best matches the patient's condition.
  • The interactive feedback-based case search method provided by this application includes:
  • according to the supervised learning result of the learning model, determining and feeding back to the user second-level similar cases that are more similar to the initial case, until a final similar case whose similarity to the initial case meets a preset requirement is determined.
  • The present application also provides a case search device based on interactive feedback. The device includes:
  • a case search module, used to search for first-level similar cases similar to the initial case from the existing case database according to preset matching rules;
  • a case ranking module, used to rank the similar cases according to their similarity to the initial case, and to feed back a preset number of similar cases to the user according to the ranking;
  • a case learning module, used to perform supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases;
  • a case feedback module, used to determine and feed back to the user second-level similar cases that are more similar to the initial case according to the supervised learning result of the learning model, until a final similar case whose similarity to the initial case meets the preset requirement is determined.
  • The present application also provides an electronic device, which includes:
  • at least one processor; and
  • a memory communicatively connected with the at least one processor; wherein
  • the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the following steps:
  • according to the supervised learning result of the learning model, determining and feeding back to the user second-level similar cases that are more similar to the initial case, until a final similar case whose similarity to the initial case meets a preset requirement is determined.
  • The present application also provides a computer-readable storage medium storing at least one instruction, and the at least one instruction is executed by a processor in an electronic device to implement the above-mentioned case search method based on interactive feedback.
  • The embodiment of this application searches the existing case database for first-level similar cases that are similar to the initial case; sorts the similar cases according to their similarity to the initial case and feeds back a preset number of them to the user according to the sorting; performs supervised learning on the similar cases through the learning model according to the user's feedback information on the similar cases; and, according to the supervised learning result of the learning model, determines and feeds back to the user second-level similar cases with a higher degree of similarity to the initial case, until a final similar case whose degree of similarity to the initial case meets the preset requirement is determined.
  • The search results are continuously updated through the user's feedback on them.
  • Cases are divided into categories by clustering, and supervised learning is performed on the search results through the learning model, so that the case that best matches the patient's condition is found quickly and effectively among a large number of cases. The search method of this application responds differently to direct or indirect feedback from the user, continuously updates the search content, realizes interaction between the search engine and the user, and improves the user's search efficiency.
  • FIG. 1 is a schematic flowchart of a case search method based on interactive feedback provided by an embodiment of this application;
  • FIG. 2 is a schematic diagram of modules of a case search device based on interactive feedback provided by an embodiment of this application;
  • FIG. 3 is a schematic diagram of the internal structure of an electronic device for implementing a case search method based on interactive feedback provided by an embodiment of the application;
  • This application provides a case search method based on interactive feedback.
  • Referring to FIG. 1, it is a schematic flowchart of the case search method based on interactive feedback provided by an embodiment of this application.
  • The method can be executed by a device, and the device can be implemented by software and/or hardware.
  • The case search method based on interactive feedback includes:
  • S1: According to preset matching rules, search for first-level similar cases similar to the initial case from the existing case database;
  • S2: Sort the similar cases according to their similarity to the initial case, and feed back a preset number of similar cases to the user according to the sorting;
  • S3: Perform supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases;
  • S4: According to the supervised learning result of the learning model, determine and feed back to the user second-level similar cases that are more similar to the initial case, until a final similar case whose similarity to the initial case meets the preset requirement is determined.
  • The case search method based on interactive feedback of this application helps doctors or clinical researchers (hereinafter referred to as users) obtain the cases they need to consult and refer to.
  • The search method continuously updates the search results based on the user's feedback on them, and finds the case that best matches the patient's condition among a large number of cases.
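The overall loop of steps S1 to S4 can be sketched roughly as follows. This is only an illustration under stated assumptions: the callables `search`, `refine`, and `ask_user`, and the parameters `top_n`, `required_similarity`, and `max_rounds`, are hypothetical names that do not appear in the application, and the stopping test is one possible reading of "reaches a preset requirement".

```python
from typing import Callable, List, Sequence, Tuple

Scored = List[Tuple[str, float]]  # (case text, similarity to the initial case)


def interactive_search(
    initial_case: str,
    search: Callable[[str], Scored],
    refine: Callable[[str, List[str], List[str]], Scored],
    ask_user: Callable[[Sequence[str]], Tuple[List[str], List[str]]],
    top_n: int = 5,
    required_similarity: float = 0.9,
    max_rounds: int = 10,
) -> List[str]:
    """Hypothetical driver for S1-S4; every callable is supplied by the caller."""
    candidates = search(initial_case)  # S1: first-level similar cases
    shown: List[str] = []
    for _ in range(max_rounds):
        # S2: sort by similarity and keep a preset number of cases
        ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
        shown = [case for case, _ in ranked[:top_n]]
        if ranked and ranked[0][1] >= required_similarity:
            break  # assumed form of the preset requirement
        # S3: collect the user's feedback on the cases just shown
        approved, rejected = ask_user(shown)
        # S4: search again using what was learned from the feedback
        candidates = refine(initial_case, approved, rejected)
    return shown
```

Passing the search, learning, and feedback steps in as functions keeps the sketch independent of any particular search engine or model.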
  • the cases are stored in text form.
  • the cases include but are not limited to pathological data such as patient information, clinical manifestations, and final diagnosis results.
  • Step S1, searching for first-level similar cases similar to the initial case from the existing case database according to the preset matching rules, includes the following steps:
  • The cases in the case database are matched with the initial case one by one, so as to determine the first-level similar cases that match the initial case.
  • The system parses the initial case to generate a search command, and searches among the massive number of cases in the case database.
  • In step S2, the similar cases are sorted according to their similarity to the initial case, and a preset number of similar cases are fed back to the user according to the sorting.
  • The retrieved similar cases are fed back to the user in order of their degree of similarity to the initial case.
  • The search engine executes the search command and returns a batch of case texts in descending order of similarity.
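As a rough illustration of matching every stored case against the initial case and returning a preset number of results in descending order of similarity, the sketch below uses bag-of-words cosine similarity. The application does not specify the matching rule, so the similarity measure, the whitespace tokenisation, and the names `cosine_similarity` and `top_similar_cases` are all assumptions made for this example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_similar_cases(
    initial_case: str, database: Dict[str, str], n: int = 3
) -> List[Tuple[str, float]]:
    """Match each stored case against the initial case and return the
    n most similar (case id, similarity) pairs in descending order."""
    query = Counter(initial_case.lower().split())
    scored = [
        (case_id, cosine_similarity(query, Counter(text.lower().split())))
        for case_id, text in database.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:n]
```

For Chinese case texts a word segmenter would be needed in place of `split()`; that detail is outside what the application describes.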
  • The system uses the k-means algorithm to cluster the case texts in the case database in advance.
  • The operation is as follows:
  • S26: Calculate the L2-norm distance between the initial case and the center vector of each category of cases, and feed back all the cases in the category with the smallest distance to the user in order of similarity.
  • The distance between the search text $x_j$ and the i-th category center $\mu_i$ is:
    $$d_{ij} = \lVert x_j - \mu_i \rVert_2$$
  • where $d_{ij}$ represents the L2-norm distance between the searched case and the center of the i-th category of cases;
  • $x_j$ represents the search case;
  • $\mu_i$ represents the i-th category center vector.
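A minimal sketch of the distance computation described here, assuming the search case and the category centers are already encoded as equal-length numeric vectors (the encoding itself is not specified in the text). The function names are illustrative only.

```python
import math
from typing import List, Sequence, Tuple


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L2-norm (Euclidean) distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def nearest_category(
    x: Sequence[float], centers: Sequence[Sequence[float]]
) -> Tuple[int, List[float]]:
    """Return the index of the closest category center and all distances."""
    distances = [l2_distance(x, mu) for mu in centers]
    best = min(range(len(centers)), key=distances.__getitem__)
    return best, distances
```

For example, `nearest_category([0, 0], [[3, 4], [1, 0]])` selects the second center, whose distance is 1 against 5 for the first.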
  • In step S3, performing supervised learning on the similar cases through the learning model according to the user's feedback information on the similar cases includes the following steps:
  • The user's feedback information on the similar cases includes: similar cases with a high degree of similarity and similar cases with a low degree of similarity, among which,
  • the learning model performs supervised learning on the similar cases with a high degree of similarity to the initial case, where the logistic regression formula is used to calculate the probability:
  • x is the feature of the participant's data;
  • y is the label of the data.
  • The loss value of the function can be calculated, and the model is considered to have converged when the loss value reaches its lowest region. That is, according to the supervised learning result of the learning model, cases that are more similar to the initial case are fed back to the user, until the case that best matches the initial case is fed back to the user.
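The logistic regression formula itself did not survive in this text, so the sketch below assumes the standard form, P(y = 1 | x) = 1 / (1 + exp(-(w·x + b))), trained by batch gradient descent on the cross-entropy loss. The weights `w`, bias `b`, learning rate, and epoch count are assumptions for illustration and are not taken from the application.

```python
import math
from typing import List, Sequence, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights: Sequence[float], bias: float, x: Sequence[float]) -> float:
    """Probability that a case with features x carries label 1 (approved)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def log_loss(weights, bias, X: List[Sequence[float]], y: Sequence[int]) -> float:
    """Mean cross-entropy loss over the labeled feedback."""
    eps = 1e-12  # keeps log() away from zero
    total = 0.0
    for xi, yi in zip(X, y):
        p = min(max(predict_proba(weights, bias, xi), eps), 1.0 - eps)
        total -= yi * math.log(p) + (1 - yi) * math.log(1.0 - p)
    return total / len(X)


def fit(X: List[Sequence[float]], y: Sequence[int],
        lr: float = 0.5, epochs: int = 200) -> Tuple[List[float], float]:
    """Batch gradient descent; returns the learned weights and bias."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(weights, bias, xi) - yi
            for j, v in enumerate(xi):
                grad_w[j] += err * v
            grad_b += err
        weights = [w - lr * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(X)
    return weights, bias
```

Here label 1 would correspond to cases the user approved and label 0 to cases the user rejected; training stops after a fixed number of epochs rather than on a loss threshold, which is a simplification.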
  • In step S4, the user's feedback information on the similar cases includes: similar cases with a high degree of similarity and similar cases with a low degree of similarity.
  • According to the modified preset matching rules and the supervised learning results of the learning model, the case database is further searched for second-level cases similar to the initial case, and these are fed back to the user.
  • Similar cases with a high degree of similarity and similar cases with a low degree of similarity are strengthened and weakened, respectively.
  • The user consults the similar cases fed back by the system, and returns to the search engine after consulting the text of a certain case.
  • The search engine obtains the feedback: if the user considers the case text just consulted to be close, the strengthening operation is performed; conversely, if the user considers the case text just consulted not useful, the weakening operation is performed.
  • In the strengthening and weakening operations, the system records the case texts approved and disapproved by the user each time, and updates two sets labeled 1 and 0, respectively.
  • The strengthening and weakening operations also include:
  • The strengthening operation extracts all words in the documents of the label-1 set and counts their occurrences, then divides each count by the number of occurrences of the word in all case texts in the database to obtain the word frequency. The three words with the highest word frequency are added to the "text contains" item of the search command.
  • The weakening operation extracts the word frequency of the documents in the label-0 set, and adds the three words with the highest word frequency to the "text does not contain" item of the background search command.
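The strengthening and weakening operations can be sketched as below, assuming whitespace-tokenised text and a search command represented as a dictionary. The keys `text_contains` and `text_does_not_contain` and the function names are hypothetical stand-ins for the "text contains" and "text does not contain" items; ties in word frequency are broken alphabetically, which the application does not specify.

```python
from collections import Counter
from typing import Dict, Iterable, List


def count_words(texts: Iterable[str]) -> Counter:
    """Count word occurrences over a collection of case texts."""
    counts: Counter = Counter()
    for text in texts:
        counts.update(text.lower().split())
    return counts


def top_terms(labeled_texts: Iterable[str], all_texts: Iterable[str], k: int = 3) -> List[str]:
    """Words ranked by (count in the labeled set) / (count in the whole database)."""
    labeled = count_words(labeled_texts)
    corpus = count_words(all_texts)
    ratio = {w: c / corpus[w] for w, c in labeled.items() if corpus[w]}
    return sorted(ratio, key=lambda w: (-ratio[w], w))[:k]


def update_search_command(
    command: Dict[str, List[str]],
    approved: List[str],   # label-1 set
    rejected: List[str],   # label-0 set
    all_texts: List[str],
) -> Dict[str, List[str]]:
    """Strengthen with terms from approved texts, weaken with terms from rejected ones."""
    command.setdefault("text_contains", []).extend(top_terms(approved, all_texts))
    command.setdefault("text_does_not_contain", []).extend(top_terms(rejected, all_texts))
    return command
```

Dividing by the database-wide count favours words that are distinctive of the labeled set over words that are common everywhere, which appears to be the intent of the word-frequency ratio.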
  • The L2-norm distance to the center of each category of texts is recalculated. If the distance to some category is found to be smaller than the distance to the category currently being searched, the search switches to the new category or merges the categories. To a certain extent, this correction prevents the search results from converging to categories that the user does not need.
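A small sketch of the category check, covering only the "switch" branch (merging categories is not shown). It assumes the updated query and the category centers are numeric vectors; `maybe_switch_category` is an illustrative name, not one from the application.

```python
import math
from typing import Sequence


def l2_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """L2-norm (Euclidean) distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def maybe_switch_category(
    query: Sequence[float],
    centers: Sequence[Sequence[float]],
    current: int,
) -> int:
    """Return the index of a strictly closer category if one exists,
    otherwise keep the category currently being searched."""
    best = min(range(len(centers)), key=lambda i: l2_distance(query, centers[i]))
    if l2_distance(query, centers[best]) < l2_distance(query, centers[current]):
        return best
    return current
```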
  • search engines can also obtain the following ways:
  • Referring to FIG. 2, it is a functional block diagram of the case search device based on interactive feedback of this application.
  • The case search device 100 based on interactive feedback described in this application can be installed in an electronic device.
  • The case search device based on interactive feedback may include: a case search module 101, a case ranking module 102, a case learning module 103, and a case feedback module 104.
  • A module described in this application, which can also be called a unit, refers to a series of computer program segments that can be executed by the processor of an electronic device, can complete fixed functions, and are stored in the memory of the electronic device.
  • The functions of each module/unit are as follows:
  • The case search module 101 is used to search for first-level similar cases similar to the initial case from the existing case database according to preset matching rules;
  • The case ranking module 102 is used to rank the similar cases according to their similarity to the initial case, and to feed back a preset number of similar cases to the user according to the ranking;
  • The case learning module 103 is configured to perform supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases;
  • The case feedback module 104 is used to determine and feed back to the user second-level similar cases that are more similar to the initial case according to the supervised learning result of the learning model, until a final similar case whose similarity to the initial case meets the preset requirement is determined.
  • Searching for first-level similar cases similar to the initial case from an existing case database according to a preset matching rule includes the following steps:
  • The cases in the case database are matched with the initial case one by one, so as to determine the first-level similar cases that match the initial case.
  • The similar cases are sorted according to their similarity to the initial case, and a preset number of similar cases are fed back to the user according to the sorting. The k-means algorithm is used to cluster the cases in the case database.
  • The cluster division includes the following steps:
  • Step 1: Serialize the cases in the case database and encode them to obtain $\{x_i\}$;
  • Step 2: Initialize the category center vectors $\{\mu_1, \mu_2, \ldots, \mu_k\}$, where k and the maximum number of iterations are selected according to the number of cases;
  • Step 3: Assign the code of each case to a category by calculating its L2-norm distance to each center vector;
  • Step 4: According to the division result, calculate the new category center vectors $\{\mu'_1, \mu'_2, \ldots, \mu'_k\}$;
  • Step 5: Iterate Steps 3 and 4 until the difference between the corresponding center vectors of two consecutive iterations is less than the threshold, or the maximum number of iterations is exceeded;
  • Step 6: Calculate the L2-norm distance between the initial case and the center vector of each category of cases, and feed back all the cases in the category with the smallest distance to the user in order of similarity.
  • $d_{ij} = \lVert x_j - \mu_i \rVert_2$ represents the L2-norm distance between the searched case and the center of the i-th category of cases;
  • $x_j$ represents the search case;
  • $\mu_i$ represents the i-th category center vector.
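Steps 2 to 6 above can be sketched in plain Python as follows, assuming Step 1 has already produced numeric code vectors. Initialising the centers from the first k codes is an assumption made to keep the example deterministic; the application does not say how the centers are initialised, and the names `k_means` and `cases_in_nearest_category` are illustrative.

```python
import math
from typing import List, Sequence, Tuple

Vector = List[float]


def l2(a: Sequence[float], b: Sequence[float]) -> float:
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def mean(points: List[Sequence[float]]) -> Vector:
    return [sum(col) / len(points) for col in zip(*points)]


def k_means(xs: List[Sequence[float]], k: int,
            tol: float = 1e-6, max_iter: int = 100) -> Tuple[List[Vector], List[int]]:
    # Step 2: initialise the centers (assumption: the first k codes)
    centers: List[Vector] = [list(x) for x in xs[:k]]
    labels = [0] * len(xs)
    for _ in range(max_iter):
        # Step 3: assign each code to the category with the nearest center
        labels = [min(range(k), key=lambda i: l2(x, centers[i])) for x in xs]
        # Step 4: recompute each center from its members
        new_centers: List[Vector] = []
        for i in range(k):
            members = [x for x, lab in zip(xs, labels) if lab == i]
            new_centers.append(mean(members) if members else centers[i])
        # Step 5: stop once no center moved by more than the threshold
        shift = max(l2(c, n) for c, n in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, labels


def cases_in_nearest_category(query: Sequence[float], xs: List[Sequence[float]],
                              centers: List[Vector], labels: List[int]) -> List[int]:
    # Step 6: pick the closest category and order its cases by distance to the query
    best = min(range(len(centers)), key=lambda i: l2(query, centers[i]))
    members = [j for j, lab in enumerate(labels) if lab == best]
    return sorted(members, key=lambda j: l2(query, xs[j]))
```

The returned indices refer to positions in `xs`, so the caller can map them back to the stored case texts.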
  • The supervised learning of the similar cases through a learning model includes the following steps:
  • The user's feedback information on the similar cases includes: similar cases with a high degree of similarity and similar cases with a low degree of similarity, among which,
  • the learning model performs supervised learning on the similar cases with a high degree of similarity to the initial case, where the logistic regression formula is used to calculate the probability:
  • x is the feature of the participant's data;
  • y is the label of the data.
  • The feedback information of the user on the similar cases includes: similar cases with a high degree of similarity and similar cases with a low degree of similarity.
  • The case database is further searched for second-level cases similar to the initial case, and these are fed back to the user.
  • The embodiment of this application searches, according to the preset matching rules, the existing case database for first-level similar cases similar to the initial case; sorts the similar cases according to their similarity to the initial case and feeds back a preset number of them to the user according to the sorting; performs supervised learning on the similar cases through a learning model according to the user's feedback information; and, according to the supervised learning result of the learning model, determines and feeds back to the user second-level similar cases with a higher degree of similarity to the initial case, until a final similar case whose similarity to the initial case meets the preset requirement is determined.
  • Referring to FIG. 3, it is a schematic structural diagram of an electronic device implementing the case search method based on interactive feedback of this application.
  • The electronic device 1 may include a processor 10, a memory 11, and a bus, and may also include a computer program stored in the memory 11 and runnable on the processor 10, such as a case search program 12 based on interactive feedback.
  • The memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, mobile hard disk, multimedia card, card-type memory (such as SD or DX memory, etc.), magnetic memory, magnetic disk, optical disc, etc.
  • In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, for example, a mobile hard disk of the electronic device 1.
  • The memory 11 may also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card equipped on the electronic device 1.
  • The memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device.
  • The memory 11 can be used not only to store application software installed in the electronic device 1 and various data, such as the code of a data audit program, but also to temporarily store data that has been output or will be output.
  • In some embodiments, the processor 10 may be composed of integrated circuits, for example, a single packaged integrated circuit, or multiple packaged integrated circuits with the same or different functions, including one or more combinations of central processing units (CPU), microprocessors, digital processing chips, graphics processors, and various control chips.
  • The processor 10 is the control unit of the electronic device. It uses various interfaces and lines to connect the components of the entire electronic device, runs or executes programs or modules stored in the memory 11 (such as a data audit program), and calls data stored in the memory 11 to execute the various functions of the electronic device 1 and process data.
  • The bus may be a peripheral component interconnect (PCI) bus or an extended industry standard architecture (EISA) bus, etc.
  • The bus can be divided into address bus, data bus, control bus, and so on.
  • The bus is configured to implement connection and communication between the memory 11 and the at least one processor 10, etc.
  • FIG. 3 only shows an electronic device with some components. Those skilled in the art can understand that the structure shown in FIG. 3 does not constitute a limitation on the electronic device 1, which may include fewer or more components than shown, a combination of certain components, or a different arrangement of components.
  • the electronic device 1 may also include a power source (such as a battery) for supplying power to various components.
  • The power source may be logically connected to the at least one processor 10 through a power management device, so that functions such as charge management, discharge management, and power consumption management are implemented through the power management device.
  • the power supply may also include any components such as one or more DC or AC power supplies, recharging devices, power failure detection circuits, power converters or inverters, and power status indicators.
  • the electronic device 1 may also include various sensors, Bluetooth modules, Wi-Fi modules, etc., which will not be repeated here.
  • the electronic device 1 may also include a network interface.
  • The network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface, a Bluetooth interface, etc.), and is usually used to establish a communication connection between the electronic device 1 and other electronic devices.
  • the electronic device 1 may also include a user interface.
  • The user interface may be a display and an input unit (such as a keyboard).
  • The user interface may also be a standard wired interface or a wireless interface.
  • The display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (organic light-emitting diode) touch device, etc.
  • the display can also be appropriately called a display screen or a display unit, which is used to display the information processed in the electronic device 1 and to display a visualized user interface.
  • The case search program 12 based on interactive feedback stored in the memory 11 of the electronic device 1 is a combination of multiple instructions. When run by the processor 10, it can implement:
  • according to the supervised learning result of the learning model, determining and feeding back to the user second-level similar cases that are more similar to the initial case, until a final similar case whose similarity to the initial case meets a preset requirement is determined.
  • If the integrated module/unit of the electronic device 1 is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a mobile hard disk, a magnetic disk, an optical disk, a computer memory, and a read-only memory (ROM).
  • A computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, a case search method based on interactive feedback is implemented.
  • The specific method is as follows:
  • according to the supervised learning result of the learning model, determining and feeding back to the user second-level similar cases that are more similar to the initial case, until a final similar case whose similarity to the initial case meets a preset requirement is determined.
  • the computer-readable storage medium may be non-volatile or volatile.
  • modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the modules can be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional modules in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit may be implemented in the form of hardware, or may be implemented in the form of hardware plus software functional modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Public Health (AREA)
  • Medical Informatics (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Primary Health Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A medical case search method and apparatus based on interactive feedback, and a computer-readable storage medium. The method comprises: according to a preset matching rule, searching an existing medical case database for first-level similar medical cases that are similar to an initial medical case (S1); sequencing the similar medical cases according to the degree of similarity between the similar medical cases and the initial medical case, and feeding a preset number of similar medical cases back to a user according to the sequence (S2); according to feedback information of the user on the similar medical cases, performing supervised learning on the similar medical cases by means of a learning model (S3); and according to a supervised learning result of the learning model, determining second-level similar medical cases, which have a higher degree of similarity to the initial medical case, and feeding same back to the user until a final similar medical case, the degree of similarity of which to the initial medical case meets a preset requirement, is determined (S4). By means of the method, a medical case that best matches a medical condition is quickly and effectively found from a massive amount of medical cases.

Description

基于交互反馈的病例搜索方法、装置及可读存储介质Case search method, device and readable storage medium based on interactive feedback
本申请要求申请号为202011118337.4,申请日为2020年10月19日,发明创造名称为“基于交互反馈的病例搜索方法、装置及可读存储介质”的专利申请的优先权。This application requires the priority of the patent application whose application number is 202011118337.4, the filing date is October 19, 2020, and the invention-creation title is "Case search method, device and readable storage medium based on interactive feedback".
Technical field
This application relates to the field of artificial intelligence technology, and in particular to a case search method, device, electronic device, and computer-readable storage medium based on interactive feedback.
Background
With the development of computer technology, retrieval has become a common means of obtaining information in daily life. In the medical field, similar case retrieval is of great significance in scientific research and clinical practice. Some diseases are atypical for various reasons: the symptoms may be complex, the clinical manifestations may resemble those of other diseases, or multiple accompanying complications may dominate the picture and mask the root cause of the disease. Diagnosing and treating these diseases is often complicated and cumbersome, so patients are not diagnosed and treated in time and ultimately miss the optimal treatment window. If, when a patient presents, the doctor can quickly find cases similar to the patient's, the doctor can make a timely and effective judgment based on the diagnosis and treatment paths and outcomes of those similar cases.
At present, using previously cured cases as a reference is an efficient approach when diagnosing and treating these diseases: the doctor searches for the cases that best fit the patient to assist diagnosis and treatment. However, current similar case retrieval still has some shortcomings:
1) Complicated conditions: searching accurately for such cases relies mainly on the doctor's experience, but experience is limited, and search efficiency and results differ from doctor to doctor;
2) Large case volume and slow search: the best-matching case cannot be found effectively among a massive number of cases;
3) Existing case searches are all one-way responses; there is no search mode that takes the user's feedback on the search results into account.
The inventors recognized the above problems in traditional case retrieval methods; a case search method based on interactive feedback is therefore urgently needed.
Summary of the invention
This application provides a case search method, device, electronic device, and computer-readable storage medium based on interactive feedback, the main purpose of which is to quickly and effectively find, among a massive number of cases, the case that best matches the condition.
To achieve the above purpose, the case search method based on interactive feedback provided by this application includes:
searching, according to preset matching rules, an existing case database for first-level similar cases that are similar to an initial case;
sorting the similar cases by their degree of similarity to the initial case, and feeding back a preset number of similar cases to the user in that order;
performing supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases;
determining, according to the supervised learning result of the learning model, second-level similar cases with a higher degree of similarity to the initial case and feeding them back to the user, until a final similar case whose degree of similarity to the initial case meets a preset requirement is determined.
To solve the above problems, this application also provides a case search device based on interactive feedback, the device including:
a case search module, configured to search, according to preset matching rules, an existing case database for first-level similar cases that are similar to an initial case;
a case ranking module, configured to sort the similar cases by their degree of similarity to the initial case, and to feed back a preset number of similar cases to the user in that order;
a case learning module, configured to perform supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases;
a case re-feedback module, configured to determine, according to the supervised learning result of the learning model, second-level similar cases with a higher degree of similarity to the initial case and feed them back to the user, until a final similar case whose degree of similarity to the initial case meets a preset requirement is determined.
To solve the above problems, this application also provides an electronic device, which includes:
at least one processor; and
a memory communicatively connected to the at least one processor; wherein
the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor so that the at least one processor can perform the following steps:
searching, according to preset matching rules, an existing case database for first-level similar cases that are similar to an initial case;
sorting the similar cases by their degree of similarity to the initial case, and feeding back a preset number of similar cases to the user in that order;
performing supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases;
determining, according to the supervised learning result of the learning model, second-level similar cases with a higher degree of similarity to the initial case and feeding them back to the user, until a final similar case whose degree of similarity to the initial case meets a preset requirement is determined.
To solve the above problems, this application also provides a computer-readable storage medium storing at least one instruction, the at least one instruction being executed by a processor in an electronic device to implement the case search method based on interactive feedback described above.
According to preset matching rules, the embodiments of this application search an existing case database for first-level similar cases that are similar to an initial case; sort the similar cases by their degree of similarity to the initial case and feed back a preset number of similar cases to the user in that order; perform supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases; and, according to the supervised learning result of the learning model, determine and feed back to the user second-level similar cases with a higher degree of similarity to the initial case, until a final similar case whose degree of similarity to the initial case meets a preset requirement is determined. In the embodiments of this application, the search results are continuously updated through feedback on the search results: cases are divided into categories by clustering, and supervised learning is performed on the search results through a learning model, so that the case that best matches the condition is found quickly and effectively among a massive number of cases. The search method of this application responds differently to direct or indirect user feedback and continuously updates the search content, realizing interaction between the engine and the user and improving the user's search efficiency.
Description of the drawings
FIG. 1 is a schematic flowchart of a case search method based on interactive feedback provided by an embodiment of this application;
FIG. 2 is a schematic diagram of the modules of a case search device based on interactive feedback provided by an embodiment of this application;
FIG. 3 is a schematic diagram of the internal structure of an electronic device implementing a case search method based on interactive feedback provided by an embodiment of this application;
The realization, functional characteristics, and advantages of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.
Detailed description
It should be understood that the specific embodiments described here are only used to explain this application and are not intended to limit it.
This application provides a case search method based on interactive feedback. FIG. 1 is a schematic flowchart of the case search method based on interactive feedback provided by an embodiment of this application. The method can be executed by a device, and the device can be implemented by software and/or hardware.
In this embodiment, the case search method based on interactive feedback includes:
S1: searching, according to preset matching rules, an existing case database for first-level similar cases that are similar to an initial case;
S2: sorting the similar cases by their degree of similarity to the initial case, and feeding back a preset number of similar cases to the user in that order;
S3: performing supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases;
S4: determining, according to the supervised learning result of the learning model, second-level similar cases with a higher degree of similarity to the initial case and feeding them back to the user, until a final similar case whose degree of similarity to the initial case meets a preset requirement is determined.
The above is the artificial-intelligence case search method based on interactive feedback of this application, which makes it easier for doctors or clinical researchers (hereinafter, users) to obtain the cases they need to study and refer to. The method continuously updates the search results based on the user's feedback on them, and finds the case that best matches the condition among a massive number of cases.
In the case database, cases are stored as text. A case includes, but is not limited to, patient information, pathological data such as clinical manifestations, and the final diagnosis result.
In step S1, searching the existing case database for first-level similar cases that are similar to the initial case according to the preset matching rules includes the following steps:
setting the preset matching rules according to the patient's clinical manifestations and diagnosis results included in the initial case;
matching the cases in the case database against the initial case one by one using the preset matching rules, so as to determine the first-level similar cases that match the initial case.
When the user initiates a search, that is, after the initial case is entered, the system parses the initial case to generate a search command and searches the massive number of cases in the case database.
In step S2, the similar cases are sorted by their degree of similarity to the initial case, and a preset number of similar cases are fed back to the user in that order. That is, the search engine executes the search command and returns a batch of case texts in descending order of similarity.
The degree of similarity can be defined as the $L_2$ norm (distance) between the encoded initial case and the encoded text of each other case.
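The ranking by $L_2$ distance can be sketched as follows. This is a minimal illustration only: the source does not specify how case texts are encoded, so the vectors, the case identifiers, and the function names `l2_distance` and `rank_by_similarity` are all assumptions made for the example.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2-norm) distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def rank_by_similarity(query_vec, case_vecs, top_n):
    """Return the ids of the top_n cases closest to the query, nearest first.

    case_vecs maps a (hypothetical) case id to its encoded vector.
    A smaller distance is treated as a higher degree of similarity.
    """
    ranked = sorted(case_vecs, key=lambda cid: l2_distance(query_vec, case_vecs[cid]))
    return ranked[:top_n]

# Toy encodings, invented purely for illustration.
cases = {"case_a": [1.0, 0.0], "case_b": [0.0, 3.0], "case_c": [1.0, 1.0]}
print(rank_by_similarity([1.0, 0.2], cases, top_n=2))  # ['case_a', 'case_c']
```

Here `top_n` plays the role of the preset number of similar cases fed back to the user.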
To improve search efficiency, before the user performs a search, the system clusters the case texts in the case database in advance using the k-means algorithm, as follows:
S21: Serialize and encode the case texts in the database to obtain $\{x_i\}$;
S22: Initialize the category center vectors $\{\mu_1, \mu_2, \ldots, \mu_k\}$, where $k$ is chosen according to the number of case texts, as far as conditions allow (such as system memory and the maximum number of iterations); for example, if the database contains 1 million case texts, $k$ is set to 1,000;
S23: Assign each text encoding to a category according to its $L_2$-norm distance to the center vectors, that is, $d_{ij} = \lVert x_j - \mu_i \rVert_2$;
S24: Compute the new category center vectors $\{\mu'_1, \mu'_2, \ldots, \mu'_k\}$;
S25: Repeat steps S23 and S24 until the difference between corresponding center values of two successive sets of center vectors is less than a threshold, or the maximum number of iterations is exceeded;
S26: Compute the $L_2$-norm distance between the initial case and the center vector of each category, and feed back all cases in the category with the smallest distance to the user in descending order of similarity.
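Steps S23 to S26 can be sketched in plain Python as below. This is a hedged illustration, not the applicants' implementation: the encodings are assumed to be numeric lists produced by S21, the initial centers are passed in by the caller (S22), and the names `kmeans`, `nearest_cluster`, `tol`, and `max_iter` are hypothetical.

```python
import math

def l2(a, b):
    """L2-norm distance between two vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def kmeans(points, centers, tol=1e-6, max_iter=100):
    """Assign points to centers and recompute centers until they stop moving."""
    centers = [list(c) for c in centers]
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # S23: assign each encoded case to its nearest center.
        clusters = [[] for _ in centers]
        for x in points:
            i = min(range(len(centers)), key=lambda i: l2(x, centers[i]))
            clusters[i].append(x)
        # S24: new center = mean of the members (an empty cluster keeps its old center).
        new_centers = [
            [sum(col) / len(members) for col in zip(*members)] if members else old
            for members, old in zip(clusters, centers)
        ]
        # S25: stop once no center has moved by more than tol.
        shift = max(l2(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, clusters

def nearest_cluster(query, centers):
    """S26: index of the category whose center is closest to the query."""
    return min(range(len(centers)), key=lambda i: l2(query, centers[i]))

points = [[0, 0], [0, 1], [10, 10], [10, 11]]
centers, clusters = kmeans(points, centers=[[0, 0], [10, 10]])
print(centers, nearest_cluster([9, 9], centers))
```

Only the cases in the returned category then need to be ranked against the query, which is what makes the pre-clustering a speed-up over scanning the whole database.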
After the user enters the search text, only the $L_2$-norm distance to the center of each text category needs to be computed. For example, the distance between the search text $x$ and the $i$-th category center $\mu_i$ is:
$d_i = \lVert x - \mu_i \rVert_2$
where $d_i$ denotes the $L_2$-norm distance between the search case and the center of the $i$-th category of cases;
$x$ denotes the encoded search case, and $\mu_i$ denotes the $i$-th category center vector.
The category with the smallest distance is found, and all cases in that category are then returned as case texts in descending order of similarity.
In step S3, performing supervised learning on the similar cases through the learning model according to the user's feedback information on the similar cases includes the following steps:
the user's feedback information on the similar cases distinguishes similar cases with a high degree of similarity from similar cases with a low degree of similarity, wherein
supervised learning is performed through the learning model on the similar cases with a high degree of similarity to the initial case, the probability being computed with the logistic regression formula:
Figure PCTCN2020136406-appb-000001
where $x$ is the feature of the participant's data and $y$ is the label of the data.
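The formula in the source is an image that did not survive extraction, so the exact form used by the applicants is not known. Assuming the standard logistic model, $P(y=1 \mid x) = 1 / (1 + e^{-(w \cdot x + b)})$, supervised learning on the feedback labels might be sketched as follows. The weights `w`, the bias `b`, the learning rate, the epoch count, and the toy data are all assumptions introduced for illustration.

```python
import math

def predict_proba(weights, bias, x):
    """Standard logistic model: P(y=1 | x) = 1 / (1 + exp(-(w.x + b)))."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def train(samples, labels, lr=0.1, epochs=500):
    """Fit weights by stochastic gradient descent on the log loss."""
    weights, bias = [0.0] * len(samples[0]), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            err = predict_proba(weights, bias, x) - y  # gradient of the log loss w.r.t. z
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias

# Toy one-dimensional features: label 1 = approved by the user, label 0 = rejected.
samples = [[-2.0], [-1.0], [1.0], [2.0]]
labels = [0, 0, 1, 1]
weights, bias = train(samples, labels)
print(predict_proba(weights, bias, [2.0]) > 0.5)
```

A candidate case whose predicted probability is high would then be ranked ahead of one whose probability is low in the next round of feedback.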
The loss of this function can be computed from the search results, and the model is considered to have converged once the loss approaches its minimum. That is, according to the supervised learning result of the learning model, cases more similar to the initial case are fed back to the user, until the case that best matches the initial case is fed back.
In step S4, the user's feedback information on the similar cases distinguishes similar cases with a high degree of similarity from similar cases with a low degree of similarity;
the word frequencies of the similar cases with a high degree of similarity to the initial case and of the similar cases with a low degree of similarity are extracted separately;
the preset matching rules are modified according to the extracted word frequencies;
according to the modified matching rules and the supervised learning result of the learning model, the case database is searched further for second-level cases similar to the initial case, which are fed back to the user.
In the embodiments of this application, a strengthening operation is applied to similar cases with a high degree of similarity and a weakening operation to similar cases with a low degree of similarity. The user reviews the similar cases fed back by the system and, after reviewing a case text, returns feedback on it to the search engine. When the search engine receives the feedback, it performs the strengthening operation if the user considers the case text just reviewed to be a close description; conversely, it performs the weakening operation if the user considers the case text just reviewed to be of no use.
For the strengthening and weakening operations, the system records the case texts that the user approves and disapproves each time, and updates two sets labeled 1 and 0 respectively.
The strengthening and weakening operations further include:
1. Extraction of effective words from the text:
The strengthening operation extracts all words in the documents of the label-1 set and counts their occurrences; dividing each count by the number of occurrences of that word in all case texts in the database gives the word frequency. The three words with the highest word frequency are added to the "text contains" item of the search command.
Similarly, the weakening operation extracts the word frequencies of the documents in the label-0 set and adds the three words with the highest word frequency to the "text does not contain" item of the background search command.
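The word-frequency step can be sketched as below. This is an illustration under stated assumptions: texts are split on whitespace (the source does not describe tokenization, and Chinese text would need a word segmenter), ties are broken alphabetically, and the dictionary keys `text_contains` and `text_not_contains` are hypothetical stand-ins for the two items of the search command.

```python
from collections import Counter

def top_terms(labelled_docs, all_docs, n=3):
    """Rank words by (count in the labelled set) / (count in the whole database)."""
    in_set = Counter(w for doc in labelled_docs for w in doc.split())
    in_db = Counter(w for doc in all_docs for w in doc.split())
    ratio = {w: c / in_db[w] for w, c in in_set.items() if in_db[w]}
    # Highest ratio first; alphabetical order breaks ties deterministically.
    return sorted(ratio, key=lambda w: (-ratio[w], w))[:n]

# Toy database, invented for illustration.
database = ["fever cough fatigue", "fever rash", "cough fever headache", "rash itching"]
approved = [database[0], database[2]]   # label-1 set
rejected = [database[3]]                # label-0 set

search_command = {
    "text_contains": top_terms(approved, database),
    "text_not_contains": top_terms(rejected, database),
}
print(search_command)
```

Note that, with the ratio defined this way, a word that appears only in the approved texts scores 1.0 and outranks a common word such as "fever" that also appears elsewhere in the database.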
2. Correcting the search:
After the search command is updated, the $L_2$-norm distances to the centers of the text categories are recomputed. If a category is found whose distance is smaller than that of the category currently being searched, the search switches to the new category or merges the categories. To some extent, this correction prevents the search results from converging to a category the user does not need.
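The switching rule can be sketched as follows. The source does not say how the updated search command is turned into a new query vector, so the sketch simply takes the re-encoded query as input; `corrected_cluster` and its parameters are hypothetical names, and only the "switch" branch is shown, not the merging of categories.

```python
import math

def l2(a, b):
    """L2-norm distance between two vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def corrected_cluster(query_vec, centers, current):
    """Return the category to search next.

    Switches to another category only if its center is strictly closer
    to the updated query than the center of the current category.
    """
    best = min(range(len(centers)), key=lambda i: l2(query_vec, centers[i]))
    if l2(query_vec, centers[best]) < l2(query_vec, centers[current]):
        return best
    return current

centers = [[0.0, 0.0], [5.0, 5.0]]
print(corrected_cluster([4.0, 4.0], centers, current=0))  # 1
```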
The search engine can also obtain the above user feedback in the following ways:
① Directly asking whether this is the pathology the user needs;
② Learning the user's usage habits from a number of such inquiries, in order to judge whether this is the pathology the user needs.
According to preset matching rules, the embodiments of this application search an existing case database for first-level similar cases that are similar to an initial case; sort the similar cases by their degree of similarity to the initial case and feed back a preset number of similar cases to the user in that order; perform supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases; and, according to the supervised learning result of the learning model, determine and feed back to the user second-level similar cases with a higher degree of similarity to the initial case, until a final similar case whose degree of similarity to the initial case meets a preset requirement is determined. In the embodiments of this application, the search results are continuously updated through feedback on the search results: cases are divided into categories by clustering, and supervised learning is performed on the search results through a learning model, so that the case that best matches the condition is found quickly and effectively among a massive number of cases. The search method of this application responds differently to direct or indirect user feedback and continuously updates the search content, realizing interaction between the engine and the user and improving the user's search efficiency.
FIG. 2 is a functional module diagram of the case search device based on interactive feedback of this application.
The case search device 100 based on interactive feedback described in this application can be installed in an electronic device. According to the functions implemented, the device may include a case search module 101, a case ranking module 102, a case learning module 103, and a case re-feedback module 104. A module in this application, which may also be called a unit, refers to a series of computer program segments that can be executed by the processor of the electronic device to perform a fixed function, and that are stored in the memory of the electronic device.
In this embodiment, the functions of the modules/units are as follows:
the case search module 101 is configured to search, according to preset matching rules, an existing case database for first-level similar cases that are similar to an initial case;
the case ranking module 102 is configured to sort the similar cases by their degree of similarity to the initial case, and to feed back a preset number of similar cases to the user in that order;
the case learning module 103 is configured to perform supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases;
the case re-feedback module 104 is configured to determine, according to the supervised learning result of the learning model, second-level similar cases with a higher degree of similarity to the initial case and feed them back to the user, until a final similar case whose degree of similarity to the initial case meets a preset requirement is determined.
Searching the existing case database for first-level similar cases that are similar to the initial case according to the preset matching rules includes the following steps:
setting the preset matching rules according to the patient's clinical manifestations and diagnosis results included in the initial case;
matching the cases in the case database against the initial case one by one using the preset matching rules, so as to determine the first-level similar cases that match the initial case.
Optionally, when the similar cases are sorted by their degree of similarity to the initial case and a preset number of similar cases are fed back to the user in that order, the cases in the case database are clustered using the k-means algorithm, which includes the following steps:
Step 1: Serialize and encode the cases in the case database to obtain $\{x_i\}$;
Step 2: Initialize the category center vectors $\{\mu_1, \mu_2, \ldots, \mu_k\}$, where $k$ is chosen according to the number of cases, within the limit set by the maximum number of iterations;
Step 3: Assign the encoding of each case to a category according to its $L_2$-norm distance to the center vectors;
Step 4: Compute the new category center vectors $\{\mu'_1, \mu'_2, \ldots, \mu'_k\}$ from the assignment result;
Step 5: Repeat Step 3 and Step 4 until the difference between corresponding center values of two successive sets of category center vectors is less than a threshold, or the maximum number of iterations is exceeded;
Step 6: Compute the $L_2$-norm distance between the initial case and the center vector of each category, and feed back all cases in the category with the smallest distance to the user in descending order of similarity.
The formula used to assign the encoding of each case according to its $L_2$-norm distance to the center vectors is:
$d_{ij} = \lVert x_j - \mu_i \rVert_2$
where $d_{ij}$ denotes the $L_2$-norm distance between the $j$-th case and the center of the $i$-th category of cases;
$x_j$ denotes the encoding of the $j$-th case, and $\mu_i$ denotes the $i$-th category center vector.
Performing supervised learning on the similar cases through the learning model according to the user's feedback information on the similar cases includes the following steps:
the user's feedback information on the similar cases distinguishes similar cases with a high degree of similarity from similar cases with a low degree of similarity, wherein
supervised learning is performed through the learning model on the similar cases with a high degree of similarity to the initial case, the probability being computed with the logistic regression formula:
Figure PCTCN2020136406-appb-000002
where $x$ is the feature of the participant's data and $y$ is the label of the data.
The user's feedback information on the similar cases distinguishes similar cases with a high degree of similarity from similar cases with a low degree of similarity:
the word frequencies of the similar cases with a high degree of similarity to the initial case and of the similar cases with a low degree of similarity are extracted separately;
the preset matching rules are modified according to the extracted word frequencies;
according to the modified matching rules and the supervised learning result of the learning model, the case database is searched further for second-level cases similar to the initial case, which are fed back to the user.
According to preset matching rules, the embodiments of this application search an existing case database for first-level similar cases that are similar to an initial case; sort the similar cases by their degree of similarity to the initial case and feed back a preset number of similar cases to the user in that order; perform supervised learning on the similar cases through a learning model according to the user's feedback information on the similar cases; and, according to the supervised learning result of the learning model, determine and feed back to the user second-level similar cases with a higher degree of similarity to the initial case, until a final similar case whose degree of similarity to the initial case meets a preset requirement is determined. In the embodiments of this application, the search results are continuously updated through feedback on the search results: cases are divided into categories by clustering, and supervised learning is performed on the search results through a learning model, so that the case that best matches the condition is found quickly and effectively among a massive number of cases. The search method of this application responds differently to direct or indirect user feedback and continuously updates the search content, realizing interaction between the engine and the user and improving the user's search efficiency.
如图3所示,是本申请实现基于交互反馈的病例搜索方法的电子设备的结构示意图。As shown in FIG. 3, it is a schematic diagram of the structure of an electronic device implementing a case search method based on interactive feedback in this application.
所述电子设备1可以包括处理器10、存储器11和总线,还可以包括存储在所述存储器11中并可在所述处理器10上运行的计算机程序,如基于交互反馈的病例搜索程序12。The electronic device 1 may include a processor 10, a memory 11, and a bus, and may also include a computer program stored in the memory 11 and running on the processor 10, such as a case search program 12 based on interactive feedback.
其中,所述存储器11至少包括一种类型的可读存储介质,所述可读存储介质包括闪存、移动硬盘、多媒体卡、卡型存储器(例如:SD或DX存储器等)、磁性存储器、磁盘、光盘等。所述存储器11在一些实施例中可以是电子设备1的内部存储单元,例如该电子设备1的移动硬盘。所述存储器11在另一些实施例中也可以是电子设备1的外部存储设备,例如电子设备1上配备的插接式移动硬盘、智能存储卡(Smart Media Card,SMC)、安全数字(Secure Digital,SD)卡、闪存卡(Flash Card)等。进一步地,所述存储器11还可以既包括电子设备1的内部存储单元也包括外部存储设备。所述存储器11不仅可以用于存储安装于电子设备1的应用软件及各类数据,例如数据稽核程序的代码等,还可以用于暂时地存储已经输出或者将要输出的数据。Wherein, the memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, mobile hard disk, multimedia card, card-type memory (such as SD or DX memory, etc.), magnetic memory, magnetic disk, CD etc. The memory 11 may be an internal storage unit of the electronic device 1 in some embodiments, for example, a mobile hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a smart media card (SMC), and a secure digital (Secure Digital) equipped on the electronic device 1. , SD) card, flash card (Flash Card), etc. Further, the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 can be used not only to store application software and various data installed in the electronic device 1, such as the code of a data audit program, etc., but also to temporarily store data that has been output or will be output.
In some embodiments the processor 10 may be composed of integrated circuits, for example a single packaged integrated circuit or multiple packaged integrated circuits with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, and combinations of various control chips. The processor 10 is the control unit of the electronic device. It connects the components of the entire electronic device through various interfaces and lines, and performs the functions of the electronic device 1 and processes data by running or executing programs or modules stored in the memory 11 (for example, a data audit program) and by calling data stored in the memory 11.
The bus may be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like. The bus can be divided into an address bus, a data bus, a control bus, and so on. The bus is configured to provide connection and communication between the memory 11 and the at least one processor 10, among other components.
FIG. 3 shows only an electronic device with certain components. Those skilled in the art will understand that the structure shown in FIG. 3 does not limit the electronic device 1, which may include fewer or more components than shown, combine certain components, or arrange the components differently.
For example, although not shown, the electronic device 1 may also include a power supply (such as a battery) that powers the components. Preferably, the power supply is logically connected to the at least one processor 10 through a power management device, so that charging management, discharging management, and power consumption management are implemented through the power management device. The power supply may also include any of one or more DC or AC power sources, a recharging device, a power failure detection circuit, a power converter or inverter, a power status indicator, and similar components. The electronic device 1 may also include various sensors, a Bluetooth module, a Wi-Fi module, and so on, which are not described further here.
Further, the electronic device 1 may also include a network interface. Optionally, the network interface may include a wired interface and/or a wireless interface (such as a Wi-Fi interface or a Bluetooth interface), and is typically used to establish a communication connection between the electronic device 1 and other electronic devices.
Optionally, the electronic device 1 may also include a user interface. The user interface may be a display or an input unit (such as a keyboard); optionally, it may also be a standard wired or wireless interface. In some embodiments the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (organic light-emitting diode) touch device, or the like. The display may also be called a display screen or display unit, and is used to display information processed in the electronic device 1 and to present a visual user interface.
It should be understood that the embodiments are for illustration only, and the scope of the patent application is not limited by this structure.
The case search program 12 based on interactive feedback stored in the memory 11 of the electronic device 1 is a combination of multiple instructions. When run on the processor 10, it can implement the following:
searching an existing case database, according to a preset matching rule, for first-level similar cases that resemble the initial case;
sorting the similar cases by their degree of similarity to the initial case, and feeding back a preset number of similar cases to the user in sorted order;
performing supervised learning on the similar cases through a learning model, according to the user's feedback information on the similar cases;
determining, according to the supervised learning result of the learning model, second-level similar cases that are more similar to the initial case and feeding them back to the user, until a final similar case whose similarity to the initial case meets a preset requirement is determined.
For the specific way in which the processor 10 implements the above instructions, refer to the description of the corresponding steps in the embodiment of FIG. 1; details are not repeated here.
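The supervised-learning step listed above can be illustrated with a small sketch. Claim 5 names logistic regression, but its formula survives only as a figure reference, so the standard sigmoid form is assumed here. The feature vectors, case names, learning rate, and epoch count are all hypothetical values chosen for illustration, not taken from the application.

```python
import math

def sigmoid(z):
    """Standard logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(weights, bias, x):
    """Probability that a case with features x would be labelled 1 (highly similar)."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)

def train_logistic(features, labels, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            err = predict_proba(weights, bias, x) - y
            for k, xk in enumerate(x):
                grad_w[k] += err * xk
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias

# Hypothetical user feedback: label 1 = highly similar, label 0 = less similar.
feedback_features = [[1.0, 0.9], [0.8, 1.0], [0.1, 0.2], [0.0, 0.3]]
feedback_labels = [1, 1, 0, 0]
weights, bias = train_logistic(feedback_features, feedback_labels)

# Hypothetical unseen candidates, re-ranked by predicted probability of label 1.
candidates = {"case_a": [0.9, 0.8], "case_b": [0.2, 0.1]}
ranked = sorted(candidates,
                key=lambda c: predict_proba(weights, bias, candidates[c]),
                reverse=True)
print(ranked)
```

In this sketch the model trained on the user's labels is what selects the next round of candidates; how the application actually encodes cases into features is not stated in this section.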
Further, if the modules/units integrated in the electronic device 1 are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. The computer-readable medium may include any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, computer memory, or read-only memory (ROM).
In an embodiment of this application, a computer-readable storage medium stores a computer program which, when executed by a processor, implements the case search method based on interactive feedback. The method is as follows:
searching an existing case database, according to a preset matching rule, for first-level similar cases that resemble the initial case;
sorting the similar cases by their degree of similarity to the initial case, and feeding back a preset number of similar cases to the user in sorted order;
performing supervised learning on the similar cases through a learning model, according to the user's feedback information on the similar cases;
determining, according to the supervised learning result of the learning model, second-level similar cases that are more similar to the initial case and feeding them back to the user, until a final similar case whose similarity to the initial case meets a preset requirement is determined.
The computer-readable storage medium may be non-volatile or volatile.
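The sorting step of this method is spelled out in claim 3 as k-means clustering followed by an L2-norm lookup of the nearest category. The following is a minimal sketch of those six steps under stated assumptions: cases are already encoded as numeric vectors, the initial centers are simply picked from the data, and the tolerance and iteration limit are hypothetical defaults. None of these specifics come from the application.

```python
import math

def l2_distance(a, b):
    """L2-norm distance between two equal-length vectors."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

def kmeans(vectors, initial_centers, tol=1e-6, max_iter=100):
    """Alternate assignment (step 3) and center update (step 4) until the
    centers move by less than tol or max_iter is reached (step 5)."""
    centers = [list(c) for c in initial_centers]
    assignments = [0] * len(vectors)
    for _ in range(max_iter):
        # Step 3: assign each case to the nearest center.
        assignments = [
            min(range(len(centers)), key=lambda i: l2_distance(v, centers[i]))
            for v in vectors
        ]
        # Step 4: recompute each center as the mean of its members.
        new_centers = []
        for i, old in enumerate(centers):
            members = [v for v, a in zip(vectors, assignments) if a == i]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(old)  # keep an empty category's center unchanged
        shift = max(l2_distance(o, n) for o, n in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers, assignments

def similar_cases(query, vectors, centers, assignments):
    """Step 6: pick the category whose center is nearest to the query and
    return the indices of its cases, most similar first."""
    nearest = min(range(len(centers)), key=lambda i: l2_distance(query, centers[i]))
    members = [j for j, a in enumerate(assignments) if a == nearest]
    return sorted(members, key=lambda j: l2_distance(query, vectors[j]))

# Hypothetical 2-D case encodings forming two obvious groups.
case_vectors = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                [9.0, 9.0], [10.0, 9.0], [9.0, 10.0]]
initial_centers = [case_vectors[0], case_vectors[3]]  # assumed initialisation
centers, assignments = kmeans(case_vectors, initial_centers)

query = [8.5, 9.0]  # hypothetical encoding of the initial case
ranked = similar_cases(query, case_vectors, centers, assignments)
print(assignments, ranked)
```

Restricting the ranking to one category means the query is compared against k centers plus the members of a single category rather than the whole database, which is presumably the motivation for clustering first.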
In the several embodiments provided in this application, it should be understood that the disclosed equipment, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative; the division into modules is only a division by logical function, and other divisions are possible in an actual implementation.
The modules described as separate components may or may not be physically separate, and the components shown as modules may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional modules in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in hardware, or in hardware combined with software functional modules.
It is obvious to those skilled in the art that this application is not limited to the details of the above exemplary embodiments, and that it can be implemented in other specific forms without departing from its spirit or essential characteristics.
Therefore, the embodiments should in every respect be regarded as exemplary and non-limiting. The scope of this application is defined by the appended claims rather than by the above description, and all changes that fall within the meaning and range of equivalents of the claims are therefore intended to be embraced by this application. Any reference signs in the claims should not be construed as limiting the claim concerned.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of this application and not to limit them. Although this application has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that the technical solutions of this application may be modified or equivalently replaced without departing from their spirit and scope.

Claims (20)

  1. A case search method based on interactive feedback, wherein the method comprises:
    searching an existing case database, according to a preset matching rule, for first-level similar cases that resemble the initial case;
    sorting the similar cases by their degree of similarity to the initial case, and feeding back a preset number of similar cases to the user in sorted order;
    performing supervised learning on the similar cases through a learning model, according to the user's feedback information on the similar cases;
    determining, according to the supervised learning result of the learning model, second-level similar cases that are more similar to the initial case and feeding them back to the user, until a final similar case whose similarity to the initial case meets a preset requirement is determined.
  2. The case search method based on interactive feedback according to claim 1, wherein the searching an existing case database, according to a preset matching rule, for first-level similar cases that resemble the initial case comprises the following steps:
    setting the preset matching rule according to the clinical manifestations and diagnosis results of the patient included in the initial case;
    matching the cases in the case database against the initial case one by one using the preset matching rule, thereby determining the first-level similar cases that match the initial case.
  3. The case search method based on interactive feedback according to claim 1, wherein the sorting the similar cases by their degree of similarity to the initial case, and feeding back a preset number of similar cases to the user in sorted order, uses the k-means algorithm to cluster the cases in the case database, and comprises the following steps:
    Step 1: serializing and then encoding the cases in the case database to obtain $\{x_i\}$;
    Step 2: initializing the category center vectors $\{\mu_1, \mu_2, \ldots, \mu_k\}$, where k is selected, within the maximum number of iterations, according to the number of cases;
    Step 3: assigning the encoding of each case to a category by computing its $L_2$ norm with respect to the center vectors;
    Step 4: computing the category center vectors $\{\mu'_1, \mu'_2, \ldots, \mu'_k\}$ from the assignment result;
    Step 5: iterating Step 3 and Step 4 until the difference between corresponding center values of two successive sets of category center vectors is less than a threshold, or the maximum number of iterations is exceeded;
    Step 6: computing the $L_2$-norm distance between the initial case and the center vector of each category of cases, and feeding back all cases in the category with the smallest distance to the user in order of similarity.
  4. The case search method based on interactive feedback according to claim 3, wherein the assigning the encoding of each case to a category by computing its $L_2$ norm with respect to the center vectors uses the formula:
    $d_{ij} = \lVert x_j - \mu_i \rVert_2$
    where $d_{ij}$ denotes the $L_2$-norm distance between the searched case and the center of each category of cases;
    $x_j$ denotes the searched case, and $\mu_i$ denotes the i-th category center vector.
  5. The case search method based on interactive feedback according to claim 1, wherein the performing supervised learning on the similar cases through a learning model, according to the user's feedback information on the similar cases, comprises the following steps:
    the user's feedback information on the similar cases is divided into similar cases with a high degree of similarity and similar cases with a low degree of similarity, wherein
    supervised learning is performed through the learning model on the similar cases with a high degree of similarity to the initial case, and the probability is calculated using the logistic regression formula:
    Figure PCTCN2020136406-appb-100001
    where x is the feature of the participant's data and y is the label of the data.
  6. The case search method based on interactive feedback according to claim 5, wherein the user's feedback information on the similar cases is divided into similar cases with a high degree of similarity and similar cases with a low degree of similarity, and:
    the word frequencies of the similar cases with a high degree of similarity to the initial case and the word frequencies of the similar cases with a low degree of similarity are extracted separately;
    the preset matching rule is modified according to the extracted word frequencies;
    the case database is further searched, according to the modified preset matching rule and the supervised learning result of the learning model, for second-level cases that resemble the initial case, and these are fed back to the user.
  7. The case search method based on interactive feedback according to claim 1, wherein
    the user performs a strengthening operation on the similar cases with a high degree of similarity, and a weakening operation on the similar cases with a low degree of similarity; wherein
    the strengthening operation corresponds to the label 1 set, and the weakening operation corresponds to the label 0 set.
  8. The case search method based on interactive feedback according to claim 7, wherein
    the strengthening operation means: extracting all words in the documents of the label 1 set, counting the occurrences of each word, dividing that count by the number of times the word occurs in all case texts in the case database to obtain a word frequency, and adding the three words with the highest word frequency to the "text contains" item of the search command.
  9. The case search method based on interactive feedback according to claim 7, wherein
    the weakening operation means: extracting all words in the documents of the label 0 set, counting the occurrences of each word, dividing that count by the number of times the word occurs in all case texts in the case database to obtain a word frequency, and adding the three words with the highest word frequency to the "text does not contain" item of the background search command.
  10. A case search apparatus based on interactive feedback, wherein the apparatus comprises:
    a case search module, configured to search an existing case database, according to a preset matching rule, for first-level similar cases that resemble the initial case;
    a case sorting module, configured to sort the similar cases by their degree of similarity to the initial case, and to feed back a preset number of similar cases to the user in sorted order;
    a case learning module, configured to perform supervised learning on the similar cases through a learning model, according to the user's feedback information on the similar cases;
    a case re-feedback module, configured to determine, according to the supervised learning result of the learning model, second-level similar cases that are more similar to the initial case and feed them back to the user, until a final similar case whose similarity to the initial case meets a preset requirement is determined.
  11. An electronic device, wherein the electronic device comprises:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the following steps:
    searching an existing case database, according to a preset matching rule, for first-level similar cases that resemble the initial case;
    sorting the similar cases by their degree of similarity to the initial case, and feeding back a preset number of similar cases to the user in sorted order;
    performing supervised learning on the similar cases through a learning model, according to the user's feedback information on the similar cases;
    determining, according to the supervised learning result of the learning model, second-level similar cases that are more similar to the initial case and feeding them back to the user, until a final similar case whose similarity to the initial case meets a preset requirement is determined.
  12. The electronic device according to claim 11, wherein
    the searching an existing case database, according to a preset matching rule, for first-level similar cases that resemble the initial case comprises the following steps:
    setting the preset matching rule according to the clinical manifestations and diagnosis results of the patient included in the initial case;
    matching the cases in the case database against the initial case one by one using the preset matching rule, thereby determining the first-level similar cases that match the initial case.
  13. The electronic device according to claim 11, wherein
    the sorting the similar cases by their degree of similarity to the initial case, and feeding back a preset number of similar cases to the user in sorted order, uses the k-means algorithm to cluster the cases in the case database, and comprises the following steps:
    Step 1: serializing and then encoding the cases in the case database to obtain $\{x_i\}$;
    Step 2: initializing the category center vectors $\{\mu_1, \mu_2, \ldots, \mu_k\}$, where k is selected, within the maximum number of iterations, according to the number of cases;
    Step 3: assigning the encoding of each case to a category by computing its $L_2$ norm with respect to the center vectors;
    Step 4: computing the category center vectors $\{\mu'_1, \mu'_2, \ldots, \mu'_k\}$ from the assignment result;
    Step 5: iterating Step 3 and Step 4 until the difference between corresponding center values of two successive sets of category center vectors is less than a threshold, or the maximum number of iterations is exceeded;
    Step 6: computing the $L_2$-norm distance between the initial case and the center vector of each category of cases, and feeding back all cases in the category with the smallest distance to the user in order of similarity.
  14. The electronic device according to claim 13, wherein
    the assigning the encoding of each case to a category by computing its $L_2$ norm with respect to the center vectors uses the formula:
    $d_{ij} = \lVert x_j - \mu_i \rVert_2$
    where $d_{ij}$ denotes the $L_2$-norm distance between the searched case and the center of each category of cases;
    $x_j$ denotes the searched case, and $\mu_i$ denotes the i-th category center vector.
  15. The electronic device according to claim 11, wherein
    the performing supervised learning on the similar cases through a learning model, according to the user's feedback information on the similar cases, comprises the following steps:
    the user's feedback information on the similar cases is divided into similar cases with a high degree of similarity and similar cases with a low degree of similarity, wherein
    supervised learning is performed through the learning model on the similar cases with a high degree of similarity to the initial case, and the probability is calculated using the logistic regression formula:
    Figure PCTCN2020136406-appb-100002
    where x is the feature of the participant's data and y is the label of the data.
  16. The electronic device according to claim 15, wherein the user's feedback information on the similar cases is divided into similar cases with a high degree of similarity and similar cases with a low degree of similarity, and:
    the word frequencies of the similar cases with a high degree of similarity to the initial case and the word frequencies of the similar cases with a low degree of similarity are extracted separately;
    the preset matching rule is modified according to the extracted word frequencies;
    the case database is further searched, according to the modified preset matching rule and the supervised learning result of the learning model, for second-level cases that resemble the initial case, and these are fed back to the user.
  17. The electronic device according to claim 11, wherein
    the user performs a strengthening operation on the similar cases with a high degree of similarity, and a weakening operation on the similar cases with a low degree of similarity; wherein
    the strengthening operation corresponds to the label 1 set, and the weakening operation corresponds to the label 0 set.
  18. The electronic device according to claim 17, wherein
    the strengthening operation means: extracting all words in the documents of the label 1 set together with their numbers of occurrences, dividing each by the number of times the word occurs in all case texts in the case database to obtain a word frequency, and adding the three words with the highest word frequency to the "text contains" item of the search command.
  19. The electronic device according to claim 17, wherein
    the weakening operation is: extracting all words in the documents of the label 0 set together with their numbers of occurrences, dividing each by the number of times the word occurs in all case texts in the case database to obtain a word frequency, and adding the top three words to the "text does not contain" item of the background search command.
  20. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the case search method based on interactive feedback according to any one of claims 1 to 9.
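The strengthening and weakening operations of claims 8 and 9 (and 18 and 19) can be sketched as a ratio of word counts. This is an illustration only: whitespace tokenization, the toy English case texts, the tie-breaking by alphabetical order, and the dictionary keys `text_contains` and `text_not_contains` are assumptions standing in for the "text contains" and "text does not contain" items, whose real format the application does not give. Chinese case text would need a word segmenter instead of `split()`.

```python
from collections import Counter

def top_terms(labelled_docs, all_docs, top_n=3):
    """For each word in the labelled set, divide its count there by its count
    across the whole case database, then return the top_n words by that ratio.
    Ties are broken alphabetically so the result is deterministic."""
    labelled_counts = Counter(w for doc in labelled_docs for w in doc.split())
    corpus_counts = Counter(w for doc in all_docs for w in doc.split())
    ratios = {
        word: count / corpus_counts[word]
        for word, count in labelled_counts.items()
        if corpus_counts[word] > 0
    }
    return sorted(ratios, key=lambda w: (-ratios[w], w))[:top_n]

# Hypothetical case database; each string stands for one case text.
case_db = [
    "fever cough fatigue",
    "fever rash itching",
    "cough wheeze fatigue",
    "rash itching swelling",
]
label_1_set = [case_db[0], case_db[2]]  # cases the user strengthened
label_0_set = [case_db[1], case_db[3]]  # cases the user weakened

# Hypothetical search command built from both operations.
search_command = {
    "text_contains": top_terms(label_1_set, case_db),
    "text_not_contains": top_terms(label_0_set, case_db),
}
print(search_command)
```

A word that appears only in strengthened cases gets a ratio of 1, while a word shared with weakened cases (here "fever") scores lower and drops out of the top three. The sketch does not handle a word landing in both lists, which the claims also leave open.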
PCT/CN2020/136406 2020-10-19 2020-12-15 Medical case search method and apparatus based on interactive feedback, and readable storage medium WO2021189983A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011118337.4 2020-10-19
CN202011118337.4A CN112259254B (en) 2020-10-19 2020-10-19 Case search method and device based on interactive feedback and readable storage medium

Publications (1)

Publication Number Publication Date
WO2021189983A1 true WO2021189983A1 (en) 2021-09-30

Family

ID=74244937

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/136406 WO2021189983A1 (en) 2020-10-19 2020-12-15 Medical case search method and apparatus based on interactive feedback, and readable storage medium

Country Status (2)

Country Link
CN (1) CN112259254B (en)
WO (1) WO2021189983A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116662489A (en) * 2023-07-27 2023-08-29 中航创世机器人(西安)有限公司 Intelligent form matching optimization method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101903883A (en) * 2007-12-20 2010-12-01 皇家飞利浦电子股份有限公司 Method and device for case-based decision support
CN101911077A (en) * 2007-12-27 2010-12-08 皇家飞利浦电子股份有限公司 Method and apparatus for refining similar case search
WO2017158472A1 (en) * 2016-03-16 2017-09-21 Koninklijke Philips N.V. Relevance feedback to improve the performance of clustering model that clusters patients with similar profiles together
CN110111887A (en) * 2019-05-15 2019-08-09 清华大学 Clinical aid decision-making method and device
CN110517785A (en) * 2019-08-28 2019-11-29 北京百度网讯科技有限公司 Lookup method, device and the equipment of similar case

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4799251B2 (en) * 2006-04-05 2011-10-26 富士フイルム株式会社 Similar case search device, similar case search method and program thereof

Also Published As

Publication number Publication date
CN112259254B (en) 2024-05-07
CN112259254A (en) 2021-01-22

Similar Documents

Publication Publication Date Title
WO2021212682A1 (en) Knowledge extraction method, apparatus, electronic device, and storage medium
US20200081899A1 (en) Automated database schema matching
Jing et al. Relevance feedback in region-based image retrieval
Zheng et al. Coupled binary embedding for large-scale image retrieval
US9323794B2 (en) Method and system for high performance pattern indexing
US8126826B2 (en) Method and system for active learning screening process with dynamic information modeling
WO2021208703A1 (en) Method and apparatus for question parsing, electronic device, and storage medium
CN111949759A (en) Method and system for retrieving medical record text similarity and computer equipment
WO2022227165A1 (en) Question and answer method and apparatus for machine reading comprehension, computer device, and storage medium
WO2022160454A1 (en) Medical literature retrieval method and apparatus, electronic device, and storage medium
WO2022222300A1 (en) Open relationship extraction method and apparatus, electronic device, and storage medium
WO2022062449A1 (en) User grouping method and apparatus, and electronic device and storage medium
WO2022222943A1 (en) Department recommendation method and apparatus, electronic device and storage medium
CN112257422A (en) Named entity normalization processing method and device, electronic equipment and storage medium
CN112735597A (en) Medical text disorder identification method driven by semi-supervised self-learning
WO2022222942A1 (en) Method and apparatus for generating question and answer record, electronic device, and storage medium
Kaur et al. A systematic literature review of automated ICD coding and classification systems using discharge summaries
US20220198815A1 (en) Systems and methods for classification of scholastic works
CN115269838B (en) Classification method for electronic medical records
WO2021189983A1 (en) Medical case search method and apparatus based on interactive feedback, and readable storage medium
CN106570196B (en) Video program searching method and device
US11650996B1 (en) Determining query intent and complexity using machine learning
Yogarajan et al. Seeing the whole patient: using multi-label medical text classification techniques to enhance predictions of medical codes
CN113157739B (en) Cross-modal retrieval method and device, electronic equipment and storage medium
CN113590845B (en) Knowledge graph-based document retrieval method and device, electronic equipment and medium

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 20926898; Country of ref document: EP; Kind code of ref document: A1)

NENP Non-entry into the national phase (Ref country code: DE)

122 EP: PCT application non-entry in European phase (Ref document number: 20926898; Country of ref document: EP; Kind code of ref document: A1)