CN107958007B - Case information retrieval method and device

Info

Publication number: CN107958007B
Application number: CN201610905237.3A
Authority: CN
Inventors: 张士玉; 郭威; 卢苓欣; 孙保辉
Original assignee: Zhejiang Greenlander Information Technology Co Ltd
Current assignee: Hangzhou Mindray Digital Technology Co.,Ltd.
Priority date: 2016-10-18
Filing date: 2016-10-18
Publication date: 2022-03-29
Anticipated expiration: 2036-10-18
Also published as: CN107958007A

Abstract

The embodiment of the invention discloses a case information retrieval method and a case information retrieval device, wherein the method comprises the following steps: acquiring user input, and generating target case information according to the user input, wherein the target case information comprises at least one attribute item; acquiring a first attribute value of each attribute item included in the target case information, traversing reference case information included in a preset case database, acquiring a second attribute value of the traversed reference case information under the attribute item corresponding to the first attribute value, calculating the similarity between the first attribute value and the second attribute value, and acquiring a similarity component of the target case information and the traversed reference case information under each attribute item; calculating the overall similarity of the target case information and the traversed reference case information according to the similarity; and screening reference case information in the case database according to the overall similarity to be output as a retrieval result. By adopting the invention, the accuracy of case information retrieval can be improved.

Description

Case information retrieval method and device

Technical Field

The invention relates to the technical field of internet and the medical field, in particular to a case information retrieval method and device.

Background

With the continued development and spread of internet technology, data processing and search methods capable of analyzing massive amounts of data are increasingly widely applied.

In the biomedical industry, many departments generate a large number of diagnosis reports and examination results every day, and much of this data is unstructured digital information, such as the large volume of examination images generated daily by imaging departments such as radiology and pathology, and the clinical diagnosis information that outpatient services provide for medical examinations. These reports are important data for case tracking, academic research, and mutual learning among doctors. In the prior art, a user can run a plain-text full-text search by entering search keywords in a database that stores medical reports such as diagnosis reports and case reports, or can search against keywords preset for each report. However, presetting keywords for each report imposes an excessive up-front workload. In addition, because medical reports such as case reports and examination results are specialized, the same concept may be expressed with different terms, and terms are correlated with one another. As a result, many of the reports returned are not what the user needs, or many desired results are not found; that is, the accuracy of the search results is insufficient.

Disclosure of Invention

Based on the above, a case information retrieval method is particularly provided for solving the technical problem of insufficient accuracy of medical report-based search in the conventional technology.

A case information retrieval method, comprising:

acquiring user input, and generating target case information according to the user input, wherein the target case information comprises at least one attribute item;

acquiring a first attribute value of each attribute item included in the target case information, traversing reference case information included in a preset case database, acquiring a second attribute value of the traversed reference case information under the attribute item corresponding to the first attribute value, calculating the similarity between the first attribute value and the second attribute value, and acquiring a similarity component of the target case information and the traversed reference case information under each attribute item;

calculating the overall similarity of the target case information and the traversed reference case information according to the similarity of the target case information and the traversed reference case information under each attribute item;

and screening the reference case information in the case database according to the overall similarity between the target case information and the traversed reference case information, and outputting the reference case information as a retrieval result.
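
The four steps above can be sketched in Python as follows. This is a minimal illustration rather than the claimed implementation: the flat `Case` dictionary, the `exact_match` placeholder, the unweighted mean used for the overall similarity, and the `threshold` used for screening are all assumptions introduced here.

```python
from typing import Callable, Dict, List, Tuple

# Flat stand-in for case information: attribute item name -> attribute value.
Case = Dict[str, str]


def exact_match(first_value: str, second_value: str) -> float:
    """Placeholder per-attribute similarity: 1.0 if equal, else 0.0."""
    return 1.0 if first_value == second_value else 0.0


def retrieve(
    target: Case,
    database: List[Case],
    similarity: Callable[[str, str], float] = exact_match,
    threshold: float = 0.5,
) -> List[Tuple[float, Case]]:
    """Score every reference case against the target and keep the best matches."""
    results: List[Tuple[float, Case]] = []
    for reference in database:  # traverse the preset case database
        # Similarity component under each attribute item of the target.
        components = {
            item: similarity(first_value, reference.get(item, ""))
            for item, first_value in target.items()
        }
        # Assumption: overall similarity is the unweighted mean of the components.
        overall = sum(components.values()) / len(components) if components else 0.0
        if overall >= threshold:  # screening step (hypothetical threshold)
            results.append((overall, reference))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results
```

Any other per-attribute similarity function with the same signature can be passed as `similarity` without changing the traversal.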

Optionally, in one embodiment, the user input comprises at least one form item;

the step of generating target case information according to the user input further comprises:

and generating target case information corresponding to the user input according to at least one form item contained in the user input and a preset hierarchical tree structure model, wherein attribute items contained in the target case information correspond to the form items contained in the user input.
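
As an illustration of this embodiment, the sketch below maps form items onto a small hierarchical model. The model contents, the `/`-separated form item paths, and the `build_case` helper are hypothetical; the source does not specify how form items are keyed.

```python
from typing import Any, Dict

# Hypothetical case model: a nested dict is an attribute item with
# sub-attribute items; None marks an attribute item that holds a value.
CASE_MODEL: Dict[str, Any] = {
    "patient information": {"name": None, "gender": None},
    "clinical diagnosis": None,
}


def build_case(
    model: Dict[str, Any], form_items: Dict[str, str], prefix: str = ""
) -> Dict[str, Any]:
    """Create case information whose attribute items mirror the model tree."""
    case: Dict[str, Any] = {}
    for name, sub_model in model.items():
        path = f"{prefix}/{name}" if prefix else name
        if sub_model is None:
            # Attribute values may be empty (None) when the form item is blank.
            case[name] = form_items.get(path)
        else:
            case[name] = build_case(sub_model, form_items, path)
    return case
```

Because the same model drives both the filling page and the retrieval page, target and reference case information end up with the same attribute items, which is what makes the per-attribute comparison possible.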

Optionally, in one embodiment, the step of calculating the overall similarity between the target case information and the traversed reference case information according to the similarity between the target case information and the traversed reference case information under each attribute item further includes:

acquiring a weight coefficient corresponding to each attribute item included in the target case information;

and weighting the similarity component under each attribute item according to the weight coefficient corresponding to the attribute item to obtain the overall similarity between the target case information and the traversed reference case information.
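
A minimal sketch of the weighting step, assuming the weighted components are summed and then normalized by the total weight (the source does not say whether the weights are normalized); the function name and the example weights are illustrative only.

```python
from typing import Dict


def overall_similarity(
    components: Dict[str, float], weights: Dict[str, float]
) -> float:
    """Weighted average of the per-attribute similarity components."""
    # Attribute items without a configured weight contribute nothing.
    total_weight = sum(weights.get(item, 0.0) for item in components)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        weights.get(item, 0.0) * value for item, value in components.items()
    )
    return weighted_sum / total_weight
```

With weights of 3 for "clinical diagnosis" and 1 for "gender", a case that matches only on the diagnosis scores 0.75, whereas an unweighted mean would give 0.5.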

Optionally, in one embodiment, the attribute items included in the target case information include at least 2 sub-attribute items corresponding to the attribute items;

the step of calculating the similarity between the first attribute value and the second attribute value further comprises:

acquiring a sub-attribute item weight coefficient corresponding to the at least 2 sub-attribute items;

calculating the sub-similarity of a first sub-attribute value of the target case information under each sub-attribute item and a second sub-attribute value of the reference case information under each sub-attribute item;

and weighting the sub-similarity according to the sub-attribute item weight coefficient to obtain the similarity of the first attribute value and the second attribute value.
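
Because sub-attribute items can themselves have sub-attribute items, this weighting can be applied recursively. The sketch below assumes nested dictionaries for attribute values, exact matching at the leaves, a default weight of 1.0 for unlisted sub-attribute items, and normalization by the total weight; none of these choices come from the source.

```python
from typing import Any, Dict


def attribute_similarity(
    first: Any, second: Any, sub_weights: Dict[str, float]
) -> float:
    """Similarity of two attribute values, recursing into sub-attribute items."""
    if not isinstance(first, dict):
        # Leaf attribute value: placeholder exact-match comparison.
        return 1.0 if first == second else 0.0
    second = second if isinstance(second, dict) else {}
    total = sum(sub_weights.get(name, 1.0) for name in first)
    if total == 0:
        return 0.0
    weighted = sum(
        sub_weights.get(name, 1.0)
        * attribute_similarity(value, second.get(name), sub_weights)
        for name, value in first.items()
    )
    return weighted / total
```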

Optionally, in one embodiment, the step of obtaining the first attribute value of each attribute item included in the target case information further includes:

performing word segmentation on the first attribute value of each attribute item included in the target case information, and taking the result of the word segmentation as the keyword segment set corresponding to the first attribute value;

acquiring the position and/or the number of times that each keyword segment included in the keyword segment set appears in the second attribute value;

and calculating a similarity component between the first attribute value and the second attribute value according to the position and/or the number of times that the keyword segments appear in the second attribute value.
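
One possible scoring scheme is sketched below. The source only states that position and/or number of occurrences are used; the regular-expression tokenizer (a stand-in for a real word segmenter), the cap of three occurrences, and the equal split between the frequency part and the position part are assumptions made for illustration.

```python
import re
from typing import List


def segment(text: str) -> List[str]:
    """Stand-in for word segmentation: lowercase, split on non-word characters."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def keyword_similarity(first_value: str, second_value: str) -> float:
    """Score the second value by where and how often the keywords appear in it."""
    keywords = set(segment(first_value))
    tokens = segment(second_value)
    if not keywords or not tokens:
        return 0.0
    score = 0.0
    for keyword in keywords:
        positions = [i for i, token in enumerate(tokens) if token == keyword]
        if not positions:
            continue
        # More occurrences score higher, capped at three (assumed cap).
        frequency_part = min(len(positions), 3) / 3
        # An earlier first occurrence scores higher.
        position_part = 1.0 - positions[0] / len(tokens)
        score += 0.5 * frequency_part + 0.5 * position_part
    return score / len(keywords)
```

Under this scheme a reference value that opens with the keyword ranks above one that mentions it only at the end.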

Optionally, in one embodiment, after the step of taking the result of the word segmentation as the keyword segment set corresponding to the first attribute value, the method further includes:

performing synonym expansion on the keyword segments contained in the keyword segment set according to a preset medical term library, merging the expanded keyword segments obtained by the expansion into the keyword segment set, and then executing the step of acquiring the position and/or the number of times that each keyword segment contained in the keyword segment set appears in the second attribute value.

Optionally, in one embodiment, before the step of performing synonym expansion on the keyword segments included in the keyword segment set according to a preset medical term library, the method further includes:

and determining an extended term library matched with the attribute item corresponding to the first attribute value as the preset medical term library.
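
The synonym expansion and the per-attribute choice of term library can be sketched together. The `TERM_LIBRARIES` contents below are illustrative entries rather than an actual medical term library, and the dictionary layout is an assumption.

```python
from typing import Dict, Set

# Hypothetical term libraries, one per attribute item:
# attribute item name -> keyword -> synonyms.
TERM_LIBRARIES: Dict[str, Dict[str, Set[str]]] = {
    "examination item": {"ct": {"computed tomography"}},
    "clinical diagnosis": {"liver cancer": {"hepatic cancer"}},
}


def expand_keywords(keywords: Set[str], attribute_item: str) -> Set[str]:
    """Merge synonyms from the library matching the attribute item into the set."""
    library = TERM_LIBRARIES.get(attribute_item, {})
    expanded = set(keywords)  # keep the original keywords
    for keyword in keywords:
        expanded |= library.get(keyword, set())
    return expanded
```

Selecting the library by attribute item keeps an abbreviation from being expanded with a meaning that only applies in a different part of the report.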

Optionally, in one embodiment, after the step of taking the result of the word segmentation as the keyword segment set corresponding to the first attribute value, the method further includes:

assigning corresponding sub-weight coefficients to the keyword segments contained in the keyword segment set;

the step of calculating a similarity component between the first attribute value and the second attribute value according to the position and/or the number of times that the keyword segments appear in the second attribute value further includes:

calculating a similarity component between the first attribute value and the second attribute value according to the position and/or the number of times that the keyword segments appear in the second attribute value and the sub-weight coefficients assigned to the keyword segments.
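
A simplified sketch of how sub-weight coefficients might enter the calculation. For brevity it only checks whether each keyword occurs at least once (by substring match) and ignores position; the function name and the normalization by total weight are assumptions.

```python
from typing import Dict


def weighted_keyword_similarity(
    sub_weights: Dict[str, float], second_value: str
) -> float:
    """Share of the total keyword weight that is found in the second value."""
    text = second_value.lower()
    total = sum(sub_weights.values())
    if total == 0:
        return 0.0
    # A keyword counts when it occurs at least once in the reference value.
    matched = sum(
        weight for keyword, weight in sub_weights.items() if keyword.lower() in text
    )
    return matched / total
```

A design choice left open by the source is how the sub-weights are assigned; one option would be to give keywords typed by the user a higher sub-weight than keywords added by synonym expansion.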

Optionally, in one embodiment, the step of calculating the similarity between the first attribute value and the second attribute value further includes:

and calculating the similarity between the first attribute value and the second attribute value according to a preset similarity calculation function, wherein the preset similarity calculation function is a preset text equivalence judgment function, a preset numerical value range comparison function, a preset semantic equivalence judgment function, a preset code equivalence judgment function or a preset semantic inclusion judgment function.
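
The choice of a preset similarity function per attribute item can be pictured as a lookup table. The sketch below covers three of the five listed function types; the semantic equivalence and semantic inclusion functions are omitted because they would need a term library. The attribute item names, the `(low, high)` range representation, and the function names are hypothetical.

```python
from typing import Any, Callable, Dict, Tuple


def text_equal(first: str, second: str) -> float:
    """Text equivalence, ignoring case and surrounding whitespace."""
    return 1.0 if first.strip().lower() == second.strip().lower() else 0.0


def in_numeric_range(first: Tuple[float, float], second: float) -> float:
    """Numeric range comparison: the query is a (low, high) range."""
    low, high = first
    return 1.0 if low <= second <= high else 0.0


def code_equal(first: str, second: str) -> float:
    """Code equivalence: codes must match exactly."""
    return 1.0 if first == second else 0.0


# Hypothetical assignment of similarity functions to attribute items.
SIMILARITY_FUNCTIONS: Dict[str, Callable[..., float]] = {
    "gender": text_equal,
    "size": in_numeric_range,
    "diagnosis code": code_equal,
}


def similarity(attribute_item: str, first: Any, second: Any) -> float:
    """Apply the similarity function configured for the attribute item."""
    return SIMILARITY_FUNCTIONS[attribute_item](first, second)
```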

Optionally, in one embodiment, the method further includes:

detecting form content filled in a case filling page, generating reference case information corresponding to the detected form content according to the form content and the preset hierarchical tree structure model, and storing the reference case information to the preset case database.

In addition, in order to solve the technical problem of insufficient accuracy of medical report-based search in the conventional technology, a case information retrieval device is particularly proposed.

A case information retrieval device comprising:

a target case information generation module, configured to acquire user input and generate target case information according to the user input, wherein the target case information comprises at least one attribute item;

the similarity calculation module is used for acquiring a first attribute value of each attribute item included in the target case information, traversing reference case information included in a preset case database, acquiring a second attribute value of the traversed reference case information under the attribute item corresponding to the first attribute value, calculating the similarity between the first attribute value and the second attribute value, and acquiring a similarity component between the target case information and the traversed reference case information under each attribute item;

the overall similarity calculation module is used for calculating the overall similarity between the target case information and the traversed reference case information according to the similarity between the target case information and the traversed reference case information under each attribute item;

and the retrieval result screening module is used for screening the reference case information in the case database according to the overall similarity between the target case information and the traversed reference case information and outputting the reference case information as a retrieval result.

Optionally, in one embodiment, the user input comprises at least one form item;

the target case information generating module is further configured to generate target case information corresponding to the user input according to at least one form item included in the user input and a preset hierarchical tree structure model, where an attribute item included in the target case information corresponds to the form item included in the user input.

Optionally, in one embodiment, the attribute items included in the target case information are classification attribute items or description attribute items, and a classification attribute item includes at least one lower-level classification attribute item and at most one description attribute item.

Optionally, in one embodiment, the overall similarity calculation module is further configured to obtain a weight coefficient corresponding to each attribute item included in the target case information, and to weight the similarity component under each attribute item according to the weight coefficient corresponding to the attribute item to obtain the overall similarity between the target case information and the traversed reference case information.

Optionally, in one embodiment, the attribute items included in the target case information include at least 2 sub-attribute items corresponding to the attribute items; the similarity calculation module is further configured to obtain a sub-attribute item weight coefficient corresponding to the at least 2 sub-attribute items; calculate the sub-similarity of a first sub-attribute value of the target case information under each sub-attribute item and a second sub-attribute value of the reference case information under each sub-attribute item; and weight the sub-similarity according to the sub-attribute item weight coefficient to obtain the similarity of the first attribute value and the second attribute value.

Optionally, in one embodiment, the similarity calculation module is further configured to perform word segmentation on the first attribute value of each attribute item included in the target case information, and take the result of the word segmentation as the keyword segment set corresponding to the first attribute value; acquire the position and/or the number of times that each keyword segment included in the keyword segment set appears in the second attribute value; and calculate a similarity component between the first attribute value and the second attribute value according to the position and/or the number of times that the keyword segments appear in the second attribute value.

Optionally, in one embodiment, the similarity calculation module is further configured to perform synonym expansion on the keyword segments included in the keyword segment set according to a preset medical term library, and merge the expanded keyword segments obtained by the expansion into the keyword segment set.

Optionally, in one embodiment, the similarity calculation module is further configured to determine an extended term library matched with the attribute item corresponding to the first attribute value as the preset medical term library.

Optionally, in one embodiment, the similarity calculation module is further configured to assign corresponding sub-weight coefficients to the keyword segments included in the keyword segment set; and calculate a similarity component between the first attribute value and the second attribute value according to the position and/or the number of times that the keyword segments appear in the second attribute value and the sub-weight coefficients assigned to the keyword segments.

Optionally, in one embodiment, the apparatus further includes a database update module, configured to detect a form content filled in a case filling page, generate, according to the form content and the preset hierarchical tree structure model, reference case information corresponding to the detected form content, and store the reference case information in the preset case database.

The embodiment of the invention has the following beneficial effects:

With the case information retrieval method and device described above, during case information retrieval, target case information is generated from the keywords to be retrieved that the user enters, according to a preset case model. When a retrieval result corresponding to the target case information is searched for in the case database, the similarity between each piece of reference case information in the case database and the target case information is calculated under each attribute item to obtain the overall similarity between the reference case information and the target case information, and the corresponding retrieval results are finally screened from the case database according to the overall similarity. That is, no preprocessing such as keyword setting is required for each piece of case information in the case database, and all data in the case database can be covered during retrieval, which improves the accuracy of the retrieval results.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Wherein:

FIG. 1 is a diagram illustrating the structure of an attribute node of a case model in one embodiment;

FIG. 2 is a flow diagram of a method for case information retrieval in one embodiment;

FIG. 3 is a schematic structural diagram of a case information retrieval apparatus in one embodiment;

FIG. 4 is a schematic structural diagram of a computer device for executing the case information retrieval method in one embodiment.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

To solve the technical problem of insufficient accuracy of medical report-based search in the conventional technology, the present embodiment proposes a case information retrieval method. The method can be implemented by a computer program running on a computer system based on the von Neumann architecture; the computer program can be a case information retrieval application of a hospital or of a medical reporting system, or a component of other medical software. The computer system may be a terminal device running the computer program, such as a smart phone, a tablet computer, or a personal computer.

It should be noted that, in the present embodiment, a database of medical reports, i.e., the preset case database, needs to be established first. Each piece of case information contained in the case database is filled in under the guidance of a preset case model. For example, in the case information filling system, each piece of case information includes a plurality of attribute items such as patient information, examination items, and clinical diagnosis, and each attribute item may further include a plurality of sub-attribute items, for example, sub-attribute items such as name, gender, birth date, contact number, and family medical history. Furthermore, the user can extend the preset case model as needed, for example by adding a specific attribute item, so that the case model applies more widely. It should also be noted that, in this embodiment, association relationships may exist between attribute items; for example, an attribute item at one level is contained in an attribute item at the adjacent level, or an association relationship exists between attribute items at two levels, such as between the clinical diagnosis and the patient complaint.

In one embodiment, FIG. 1 shows a node diagram of the attribute items of a case model corresponding to a radiology examination. The case model includes a plurality of attribute items, namely 6 attribute items such as patient information, examination application, clinical diagnosis, image findings, and image conclusions; each attribute item further includes a plurality of sub-attribute items, and a sub-attribute item may in turn include its own next-level sub-attribute items.

It should be noted that, in this embodiment, each piece of case information includes a plurality of attribute items, each attribute item may include a plurality of sub-attribute items, and a sub-attribute item may itself include a plurality of sub-attribute items; for a specific piece of case information, the attribute value corresponding to a given attribute item or sub-attribute item may be null. That is, the case information has a multi-level tree structure made up of attribute items, the sub-attribute items of those attribute items, the sub-attribute items of those sub-attribute items, and so on. In the present embodiment, the number of levels of the tree structure is not limited, and the user can further extend the tree structure as necessary.

In a system for filling in and searching case information, the attribute items of each piece of case information, and the sub-attribute items they include, are determined according to how doctors or other users work and according to the needs of case storage; that is, they are the form items that users such as doctors need to fill in on the form corresponding to the case information. In this embodiment, a form item may be an attribute item such as gender that is selected from a drop-down box; an input option subject to certain requirements or conditions, for example one where the user enters structured or semi-structured information that contains preset semantic information; or a free input box in which the user may enter any free text.

In this embodiment, the case information, including the target case information and the reference case information described below, is established based on the case model, that is, based on a preset hierarchical tree structure model. The case information, the target case information, or the reference case information includes attribute items determined according to the case model or the preset hierarchical tree structure model. Each attribute item may include a plurality of sub-attribute items corresponding to it; in other words, each node may include a plurality of child nodes. It should be noted, however, that not every attribute item includes a plurality of sub-attribute items. For example, in the case model shown in FIG. 1, the "number" attribute item at layer 5 consists of only that single attribute item, and its attribute value is necessarily a number.

Specifically, in an optional embodiment, the attribute items included in the case model are classification attribute items or description attribute items, and a classification attribute item includes at least one lower-level classification attribute item and at most one description attribute item. That is, a classification attribute item may include a plurality of next-level classification attribute items and 0 or 1 next-level description attribute item, whereas a description attribute item is a single attribute item attached to its classification attribute item and does not include any lower-level attribute item. For example, in the case model shown in FIG. 1, "radiology examination" is a classification attribute item, which includes 5 classification attribute items such as "patient information" thereunder, includes a "lesion" classification attribute item thereunder, and includes a "summary description" description attribute item. Furthermore, description attribute items are of two kinds: those that take free text as the attribute value, such as "summary description", and those that take a specific numeric value or other non-free text as the attribute value, such as "number" and "size".
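
The two kinds of attribute item can be modeled as two node types, as in the sketch below. The class names, the `description_paths` helper, and the exact placement of "lesion" and "summary description" in the example tree are assumptions, since FIG. 1 is not reproduced in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass
class DescriptionItem:
    """Description attribute item: holds a value and has no lower-level items."""
    name: str
    value: Optional[Union[str, float]] = None  # free text or a numeric value


@dataclass
class ClassificationItem:
    """Classification attribute item: child classifications, 0 or 1 description."""
    name: str
    children: List["ClassificationItem"] = field(default_factory=list)
    description: Optional[DescriptionItem] = None


def description_paths(node: ClassificationItem, prefix: str = "") -> List[str]:
    """List the tree paths of all description attribute items under a node."""
    path = f"{prefix}/{node.name}" if prefix else node.name
    paths: List[str] = []
    if node.description is not None:
        paths.append(f"{path}/{node.description.name}")
    for child in node.children:
        paths.extend(description_paths(child, path))
    return paths
```

Holding the description in a single optional field makes the "at most one description attribute item" constraint part of the data structure itself.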

Specifically, as shown in fig. 2, the case information retrieval method includes steps S102 to S108 as follows:

Step S102: acquiring user input, and generating target case information according to the user input, wherein the target case information comprises at least one attribute item.

In a specific implementation, when case information needs to be retrieved, a doctor or other user may enter the information to be retrieved in the corresponding retrieval page, which triggers the retrieval, that is, the execution of steps S102 to S108. In other words, when the terminal detects a case information retrieval instruction entered manually by the user in the retrieval page, it triggers the case information retrieval procedure according to the received instruction.

In this embodiment, the user input may be the form content filled in by the user on the case form; the user input is the target object that serves as the basis for the search and the basis for generating the target case information.

In one embodiment, the form or input box filled in by the user on the retrieval interface is the target object serving as the retrieval basis, i.e., the user input. In the present embodiment, to improve the accuracy of the search when a doctor or other user triggers it manually, the form for entering the search content may be determined according to the case model. That is, the user may search the full text of each case for a term (for example, "liver cancer") contained in the input, or may search for cases in which "liver cancer" appears in the attribute value of one specific attribute item (for example, the attribute item for the doctor's diagnosis). After the user fills in the content to be retrieved in the corresponding case form, that content is the user input, and in this embodiment the corresponding target case information is generated according to the user input and the case model.

In another embodiment, in an application scenario where a doctor fills in a case or issues an examination order, a drug order, or the like, the doctor may need to refer to the patient's past case information or to case information of similar cases. If the doctor had to switch to a search page to do so, the number of operations would inevitably increase. Thus, in the present embodiment, the execution of steps S102 to S108 is triggered automatically while the physician fills in case information, i.e., while the physician fills in form content in the case form. That is, when the terminal detects an event that needs to trigger case information retrieval, it automatically generates a case information retrieval instruction and triggers the execution of steps S102 to S108 according to that instruction. In this case, the content filled in the case form by the user is the user input, and when the retrieval is triggered, the corresponding target case information is generated according to the user input and the case model.

In a specific embodiment, a doctor needs to fill in a case for a patient, for example during a consultation, that is, to fill in the corresponding case form in the corresponding system; or, when a doctor or other user needs to search for related case information, the doctor or other user first enters the search basis in the corresponding search interface, that is, fills in the form items of the case information to be searched in the corresponding case form. Target case information is then generated from the form content filled in on the current case form; in other words, the target case information serves as the retrieval key for retrieving case information.

In summary, in the present embodiment, the case form includes not only the form filled by the doctor or other users when filling cases for patients in the case system or medical system, but also the form filled in the corresponding case retrieval interface.

It should be noted that, in this embodiment, the case form is filled in according to the case model preset in the system, that is, according to the tree structure of the attribute items included in the preset case model. The target case information generated from the form content filled in on that case form therefore also includes attribute items corresponding to the case model; that is, the target case information includes at least one attribute item.

Specifically, in one embodiment, the target object corresponding to the user input includes at least one form item; the step of generating target case information according to the user input further comprises: and generating target case information corresponding to the user input according to at least one form item contained in the user input and a preset hierarchical tree structure model, wherein attribute items contained in the target case information correspond to the form items contained in the user input.

That is, when the user input is a case form filled in by the user, the case form includes a plurality of form items; and when the user input is content entered by the user in an input box, the corresponding search keys may likewise constitute a plurality of form items. Moreover, the target case information is generated from the user input according to the preset case model, that is, according to the hierarchical tree structure model corresponding to the preset case model. A correspondence is established between each form item included in the user input and each attribute item included in the preset case model (or each node included in the preset hierarchical tree structure model). The attribute items included in the generated target case information, and the attribute value of each attribute item, are then determined according to this correspondence and the relationships between the attribute items (or nodes). The attribute values corresponding to some of the attribute items are allowed to be null.

Step S104: acquiring a first attribute value of each attribute item included in the target case information, traversing reference case information included in a preset case database, acquiring a second attribute value of the traversed reference case information under the attribute item corresponding to the first attribute value, calculating the similarity between the first attribute value and the second attribute value, and acquiring a similarity component of the target case information and the traversed reference case information under each attribute item.

In this embodiment, each attribute item included in the target case information corresponds to an attribute item name and an attribute value. All reference case information contained in the database being searched (i.e., the preset case database) also includes the attribute items of the preset case model, that is, the same attribute items as the target case information, so each attribute item of the target case information has a counterpart in the reference case information. In this step, all reference case information included in the preset case database is traversed during retrieval. For each traversed piece of reference case information, the second attribute value under the attribute item corresponding to a first attribute value of the target case information is acquired, and the similarity between the first attribute value and the second attribute value is then calculated.

In the present embodiment, the reference case information included in the preset case database is generated based on the case information previously filled by the user according to the preset case model, for example, the case information filled in the case form and stored. The method comprises the following specific steps: detecting form content filled in a case filling page, generating reference case information corresponding to the detected form content according to the form content and the preset hierarchical tree structure model, and storing the reference case information to the preset case database.

That is, when the user fills in form content concerning the patient, the case, the search, and the like on the corresponding case filling page, the corresponding reference case information is generated according to the form content and the preset case model or the preset hierarchical tree structure model, and is stored in the case database. Over time, all case information filled in within the same hospital system is stored in the case database, and each piece is reference case information.

Further, the similarity between the first attribute value and the second attribute value is a similarity component between the target case information and the reference case information under the attribute item corresponding to the first attribute value.

It should be noted that, in this embodiment, the specific method for calculating the similarity between the first attribute value and the second attribute value may be determined according to the first attribute value, the second attribute value, and the corresponding attribute item. For example, for the name attribute item under the patient information, the similarity can only take the value 1 or 0: it is 1 only when the first attribute value and the second attribute value match completely, and 0 otherwise.

Specifically, the method for calculating the similarity between the first attribute value and the second attribute value under the attribute item includes text equivalence judgment (such as name), numerical range comparison (such as lesion size of 10cm-15cm), semantic equivalence judgment (such as synonym judgment), code equivalence judgment (such as two-dimensional code), semantic inclusion judgment, and the like. That is, the step of calculating the similarity between the first attribute value and the second attribute value further includes: and calculating the similarity between the first attribute value and the second attribute value according to a preset similarity calculation function, wherein the preset similarity calculation function is a preset text equivalence judgment function, a preset numerical value range comparison function, a preset semantic equivalence judgment function, a preset code equivalence judgment function or a preset semantic inclusion judgment function.

The text equivalence judgment applies when the first attribute value and the second attribute value are both text information. The two values are compared directly as text; 1 is output if they are equal and 0 if they are not. For example, "liver" is equal to "liver", whereas a text that differs from it in any character is not equal.

The numerical range comparison applies when the value of the attribute item is a numerical value or a numerical expression, such as 10cm. It should be noted that in this embodiment the numerical value may include a unit. When the similarity between the first attribute value and the second attribute value is calculated, it is determined whether the numerical values or numerical ranges corresponding to the two attribute values are equal, whether one includes the other, or whether they partially overlap, and a corresponding similarity calculation formula is defined for each of these cases.
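As a non-limiting illustration, a numerical range comparison of this kind might be sketched as follows. The embodiment does not fix the per-case formulas, so the values used here (1.0 for equal ranges, a hypothetical constant 0.8 for inclusion, overlap length divided by union length for partial overlap, 0.0 otherwise) are assumptions, as is the simplification that both values use the same unit.

```python
import re

_NUMBER = re.compile(r"\d+(?:\.\d+)?")


def parse_range(text):
    """Parse '10cm' or '10cm-15cm' into (low, high); the unit is ignored."""
    numbers = [float(n) for n in _NUMBER.findall(text)]
    if not numbers:
        raise ValueError(f"no numeric value in {text!r}")
    return min(numbers), max(numbers)


def range_similarity(first, second):
    """Assumed per-case formulas; the source leaves them to the implementer."""
    a_lo, a_hi = parse_range(first)
    b_lo, b_hi = parse_range(second)
    if (a_lo, a_hi) == (b_lo, b_hi):
        return 1.0  # equal values or ranges
    overlap = min(a_hi, b_hi) - max(a_lo, b_lo)
    if overlap < 0:
        return 0.0  # disjoint
    a_in_b = a_lo >= b_lo and a_hi <= b_hi
    b_in_a = b_lo >= a_lo and b_hi <= a_hi
    if a_in_b or b_in_a:
        return 0.8  # hypothetical constant for the inclusion case
    union = max(a_hi, b_hi) - min(a_lo, b_lo)
    return overlap / union  # partial overlap
```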

The semantic equivalence judgment applies when the first attribute value and the second attribute value are both text information. When the two values are compared, a preset term library is used to judge whether their semantics are equivalent; for example, terms that are synonyms of one another are all judged to be equivalent, in which case the output is 1.
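As a non-limiting illustration, a semantic equivalence judgment backed by a term library might be sketched as follows. The term library contents and the function name `semantic_equal` are hypothetical; a real system would load the library from a curated medical vocabulary.

```python
# Hypothetical term library: each set groups terms regarded as synonyms.
TERM_LIBRARY = [
    {"myocardial infarction", "heart attack"},
    {"hypertension", "high blood pressure"},
]


def semantic_equal(first, second):
    """Return 1 if the two texts are identical or share a synonym group, else 0."""
    a, b = first.strip().lower(), second.strip().lower()
    if a == b:
        return 1
    return int(any(a in group and b in group for group in TERM_LIBRARY))
```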

The code equivalence judgment applies when the first attribute value and the second attribute value are both codes, such as bar codes, two-dimensional codes, or other code forms. In this case, it is directly judged whether the code corresponding to the first attribute value is equal to the code corresponding to the second attribute value, and 1 is output if the codes are equal.

In another embodiment, the code equivalence judgment may be based on a medical coding system. For example, if 101 represents the liver in a given coding system and the attribute values are described using codes from that system, it is directly judged whether the two corresponding codes are completely equal, and 1 is output if they are.

The semantic inclusion judgment means that the first attribute value and the second attribute value are text information and can be identified as semantic elements, and whether an inclusion relationship exists between the semantic element corresponding to the first attribute value and the semantic element corresponding to the second attribute value is judged through a preset semantic inclusion database.
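As a non-limiting illustration, a semantic inclusion judgment might be sketched as follows. The inclusion database is modeled here as a hypothetical mapping from a term to the term that directly contains it, and returning 1 for inclusion in either direction is an assumption; the embodiment does not specify the output value.

```python
# Hypothetical semantic inclusion database: term -> directly containing term.
PARENT = {
    "left lobe of liver": "liver",
    "liver": "abdomen",
}


def includes(broader, narrower):
    """True if `broader` is reachable from `narrower` by following PARENT links."""
    seen = set()
    current = narrower
    while current in PARENT and current not in seen:
        seen.add(current)  # guards against cycles in the database
        current = PARENT[current]
        if current == broader:
            return True
    return False


def semantic_inclusion(first, second):
    """Return 1 if either semantic element includes the other, else 0."""
    return int(includes(first, second) or includes(second, first))
```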

The judgment methods described above all treat the attribute value of an attribute item as a whole. In this embodiment, however, the attribute value of an attribute item may also be free text, such as a diagnosis conclusion or an order, which a doctor or other user can edit freely when filling in or generating it. If such attribute values were still compared directly as whole texts, the calculated similarity between the first attribute value and the second attribute value would not be accurate, because the above judgment methods are not suited to comparing free text with free text. For attribute items of this kind, the similarity between attribute values is therefore not calculated with the above judgment methods such as text equivalence judgment; another calculation method needs to be selected.

In an optional embodiment, the step of obtaining the first attribute value of each attribute item included in the target case information further includes: performing word segmentation processing on a first attribute value of each attribute item included in the target case information, and taking a result obtained by the word segmentation processing as a keyword word segmentation set corresponding to the first attribute value; the step of calculating the similarity between the first attribute value and the second attribute value further comprises: acquiring the position and/or the frequency of the occurrence of the keyword participles in the second attribute value, wherein the keyword participles are included in the keyword participle set; and calculating a similarity component between the first attribute value and the second attribute value according to the position and/or the times of the keyword segmentation in the second attribute value.

Specifically, for an attribute value in free-text form, word segmentation processing is performed on the free text, that is, the character sequence of the free text is segmented into meaningful words. Word segmentation is thus performed on the first attribute value of each attribute item included in the target case information, and the result (the sequence or set of keywords into which the first attribute value is segmented) is used as the keyword segment set corresponding to the first attribute value.

In this embodiment, the word segmentation method is not limited. For example, an existing segmentation tool, such as the Chinese segmenter from Stanford, may be used to segment the free text in batches, or any other existing segmentation algorithm may be used.

In this embodiment, after word segmentation is performed on a first attribute value in free-text form, the similarity calculation may be performed on the keyword segment set obtained by the segmentation. That is, when the similarity between the first attribute value and the second attribute value is calculated, the first attribute value is represented not by its original free text but by the keyword segment set obtained from it.

Specifically, in a second attribute value under an attribute item corresponding to a first attribute value in reference case information traversed in a preset case database, each keyword segmentation contained in the keyword segmentation set is searched, information such as the number of times of occurrence, the position and the like of each keyword segmentation in the second attribute value is counted, and a similarity component between the first attribute value and the second attribute value is calculated according to a preset similarity component calculation formula and the counted information such as the number of times of occurrence, the position and the like of each keyword segmentation in the second attribute value.
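As a non-limiting illustration, the counting of occurrences and positions might be sketched as follows. The keyword list is assumed to have been produced by a segmenter beforehand, and the component formula used here (the fraction of keywords that occur at least once) is only one hypothetical choice of the preset formula.

```python
def keyword_stats(keywords, second_value):
    """Return {keyword: [start positions]} for each keyword in second_value."""
    stats = {}
    for keyword in keywords:
        positions = []
        start = second_value.find(keyword)
        while start != -1:
            positions.append(start)
            start = second_value.find(keyword, start + 1)
        stats[keyword] = positions
    return stats


def keyword_similarity(keywords, second_value):
    """Assumed formula: share of keywords found at least once."""
    stats = keyword_stats(keywords, second_value)
    if not stats:
        return 0.0
    hits = sum(1 for positions in stats.values() if positions)
    return hits / len(stats)
```

The position lists are returned so that a different preset formula could, for example, give more weight to keywords that appear early in the second attribute value.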

It should be noted that, in this embodiment, when the similarity component between the first attribute value and the second attribute value is calculated according to the counted information of the number of times, positions, and the like of each keyword segmentation in the second attribute value, the calculation formula of the similarity component used may be determined according to the attribute item corresponding to the first attribute value, or may be determined according to the keyword segmentation set, that is, the calculation formula of the similarity component may be any preset calculation formula, and the calculation formula may be set in advance by the system or the user.

It should be noted that, because the medical industry has many semantically equivalent terms and doctors differ in their usage habits, in this embodiment the synonyms of each keyword segment obtained by segmenting a first attribute value in free-text form also need to be considered, so as to improve the accuracy of the similarity calculation.

Specifically, the step of using the result obtained by the word segmentation processing as the keyword word segmentation set corresponding to the first attribute value further includes: and performing synonym expansion on the keyword participles contained in the keyword participle set according to a preset medical term library, combining the expanded keyword participle set obtained by expansion into the keyword participle set, and executing the step of acquiring the positions and/or times of the keyword participles contained in the keyword participle set appearing in the second attribute value.

That is to say, when the keyword segments of the first attribute value are searched for in the second attribute value, the synonyms of each keyword segment are also considered. The keyword segments in the keyword segment set are looked up in the preset medical term library, and for each keyword segment that has synonyms, those synonyms are added to the expanded keyword segment set as substitutes for that keyword segment. Accordingly, when the positions and number of occurrences of a keyword segment in the second attribute value are counted, the positions and number of occurrences of its synonyms in the second attribute value are counted as well.

It should be noted that, in this embodiment, the commonly used words differ between attribute items, so a different expansion database may be set for each attribute item. That is, the preset medical term library used for synonym expansion of the keyword segments may be a medical term library set specifically for the attribute item concerned. In that case, the corresponding medical term library is first determined according to the attribute item or the attribute item name, and the keyword segments are then expanded.
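As a non-limiting illustration, synonym expansion with a term library selected per attribute item might be sketched as follows. The library contents, the attribute item names, and both function names are hypothetical.

```python
# Hypothetical term libraries, one per attribute item.
TERM_LIBRARIES = {
    "symptoms": {"fever": {"pyrexia"}},
    "diagnosis": {"heart attack": {"myocardial infarction"}},
}


def expand_keywords(attribute_item, keywords):
    """Map each keyword to itself plus its synonyms for this attribute item."""
    library = TERM_LIBRARIES.get(attribute_item, {})
    return {kw: {kw} | library.get(kw, set()) for kw in keywords}


def count_with_synonyms(attribute_item, keywords, second_value):
    """Count occurrences of each keyword, crediting synonym hits to it."""
    expanded = expand_keywords(attribute_item, keywords)
    return {
        kw: sum(second_value.count(variant) for variant in variants)
        for kw, variants in expanded.items()
    }
```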

In summary, the similarity calculation methods corresponding to the attribute items are respectively set according to all the attribute items included in the target case information or the case model, that is, different similarity calculation methods corresponding to different attribute items are set.
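As a non-limiting illustration, assigning a similarity calculation method to each attribute item might be sketched as a lookup table. The attribute item names, the fallback to text equivalence, and the treatment of empty values as similarity 0 are all assumptions.

```python
def text_equal(first, second):
    """Text equivalence judgment: 1 if the texts are identical, else 0."""
    return 1 if first == second else 0


def code_equal(first, second):
    """Code equivalence judgment: compare codes after trimming whitespace."""
    return 1 if str(first).strip() == str(second).strip() else 0


# Hypothetical assignment of calculation methods to attribute items.
SIMILARITY_FUNCTIONS = {
    "name": text_equal,
    "organ_code": code_equal,
}


def similarity_component(attribute_item, first, second):
    """Look up the method for the attribute item and apply it."""
    if first is None or second is None:
        return 0  # assumption: an empty attribute value contributes nothing
    function = SIMILARITY_FUNCTIONS.get(attribute_item, text_equal)
    return function(first, second)
```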

Step S106: and calculating the overall similarity of the target case information and the traversed reference case information according to the similarity of the target case information and the traversed reference case information under each attribute item.

In a specific embodiment, since the target case information and the reference case information included in the preset case database are both composed of a plurality of attribute items, the similarity between the target case information and the reference case information included in the preset case database can be characterized by the similarity between the plurality of attribute items, that is, the similarity between the target case information and the reference case information can be described according to the similarity component of the target case information and the traversed reference case information under each attribute item. Specifically, in the present embodiment, the overall similarity between the target case information and the reference case information is calculated according to the similarity component of the target case information and the traversed reference case information under each attribute item.

It should be noted that the overall similarity may be the average, or the sum, of the similarity components of the target case information and the traversed reference case information under each attribute item. In another embodiment, the overall similarity may instead be a weighted average or a weighted sum of those similarity components, that is, the similarity component under each attribute item is associated with a weight coefficient corresponding to that attribute item.

Specifically, the step of calculating the overall similarity between the target case information and the traversed reference case information according to the similarity between the target case information and the traversed reference case information under each attribute item further includes: acquiring a weight coefficient corresponding to each attribute item included in the target case information; and weighting the similarity component under each attribute item according to the weight coefficient corresponding to the attribute item to obtain the overall similarity between the target case information and the traversed reference case information.

Specifically, according to each attribute item included in the target case information, a weighting coefficient corresponding to the attribute item is assigned in advance, and the weighting coefficient may be determined according to specific content specifically included in the attribute item, for example, the setting of the weighting coefficient for the attribute item corresponding to the patient information may be higher than that for other attribute items. In the process of calculating the overall similarity, the similarity components of the target case information and the traversed reference case information under each attribute item are weighted according to each attribute item contained in the target case information and the corresponding weight coefficient, and the obtained weighted sum is the overall similarity between the target case information and the reference case information.
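As a non-limiting illustration, the weighted sum might be sketched as follows. The attribute item names and weight values in the usage example are hypothetical, and treating an attribute item without a weight as having weight 0 is an assumption.

```python
def overall_similarity(components, weights):
    """Weighted sum of per-attribute similarity components."""
    return sum(weights.get(item, 0.0) * value
               for item, value in components.items())


# Usage with hypothetical attribute items and weights.
components = {"patient": 1.0, "diagnosis": 0.5}
weights = {"patient": 0.6, "diagnosis": 0.4}
score = overall_similarity(components, weights)
```

If the weights sum to 1, as in this example, the weighted sum coincides with the weighted average.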

It should be noted that, in this embodiment, the weight coefficient corresponding to each attribute item may be preset by the system, that is, fixed for every case information retrieval. In another embodiment, the weight coefficient corresponding to each attribute item may be variable, for example determined according to the event that triggers the retrieval. In a retrieval triggered when the doctor fills in the patient information, the weight coefficient of the attribute item corresponding to the patient information is much higher than those of the other attribute items; in a retrieval triggered when the doctor fills in a patient profile, the weight coefficient of the attribute item corresponding to the patient profile is higher than those of the other attribute items. Other schemes for setting the weight coefficients are of course possible, and the coefficients may be set in advance by the system or by a user.

Step S108: and screening the reference case information in the case database according to the overall similarity between the target case information and the traversed reference case information, and outputting the reference case information as a retrieval result.

In the present embodiment, the output of the case information retrieval is several pieces of reference case information from the case database, selected according to the overall similarity, calculated in step S106, between each piece of reference case information and the target case information. For example, in one embodiment, the reference case information whose overall similarity is greater than a preset threshold is used as the retrieval result; in another embodiment, a preset number of pieces of reference case information with the highest overall similarity may be selected as the retrieval result.

Further, in this embodiment, before the retrieval result is output, the reference case information included in the retrieval result may be arranged in a descending order according to the overall similarity between the target case information and the traversed reference case information, and when the retrieval result is output, the reference case information in the retrieval result is displayed in a descending order form, so that the user can intuitively find the reference case information with the maximum similarity.
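As a non-limiting illustration, the screening and descending ordering might be sketched as follows. The function name `screen_results` and the representation of each result as an (identifier, overall similarity) pair are assumptions.

```python
def screen_results(scored, threshold=None, top_n=None):
    """Sort (case_id, similarity) pairs in descending order, then filter.

    threshold: keep only pairs whose similarity is greater than this value.
    top_n: keep at most this many of the highest-ranked pairs.
    """
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    if threshold is not None:
        ranked = [pair for pair in ranked if pair[1] > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return ranked
```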

It should be noted that, in the above embodiment, the retrieval results are screened according to the overall similarity between the target case information and the reference case information, that is, according to how similar the two pieces of case information are as a whole. In another embodiment, the user wishes to place emphasis on the similarity under a particular attribute item, and the corresponding reference case information is output as a retrieval result only if the similarity under that attribute item satisfies a condition.

Specifically, the attribute items that are considered by the user in an emphasized manner may be attribute items whose attribute values are free text, or attribute items in other forms.

Before the retrieval results are screened, the similarity between the attribute value of that attribute item in the target case information and the attribute value of each piece of reference case information in the case database under the same attribute item is calculated. Only when this similarity meets a preset condition is the corresponding reference case information used as candidate (alternative) case information for the finally output retrieval result. For example, in one embodiment, if the attribute value corresponding to the attribute item is not free text, only reference case information with an equal attribute value is taken as candidate case information; in another embodiment, if the attribute value is free text, the preset condition may be that only reference case information whose attribute-value similarity exceeds 90% is taken as candidate case information.

Once the candidate (alternative) case information has been determined, the database composed of all the candidate case information replaces the preset case database in the above method. That is, the similarity calculation traverses the candidate case information in that database instead of all the reference case information.
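As a non-limiting illustration, the pre-selection on an emphasized attribute item might be sketched as follows. Cases are modeled as flat dictionaries, and the function name, parameter names, and example data are hypothetical; the 0.9 default mirrors the 90% example given above.

```python
def select_candidates(target, references, key_item,
                      similarity_fn=None, threshold=0.9):
    """Keep reference cases that satisfy the condition on one attribute item.

    With similarity_fn=None the values must be equal (non-free-text case);
    otherwise the similarity must exceed the threshold (free-text case).
    """
    target_value = target.get(key_item)
    candidates = []
    for reference in references:
        reference_value = reference.get(key_item)
        if similarity_fn is None:
            keep = reference_value == target_value
        else:
            keep = similarity_fn(target_value, reference_value) > threshold
        if keep:
            candidates.append(reference)
    return candidates
```

The returned list then plays the role of the case database in the subsequent traversal.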

It should be noted that, in this embodiment, weight coefficients may be assigned not only to attribute items but also to the sub-attribute items under a given attribute item. That is, when the similarity component between a first attribute value and a second attribute value is calculated under an attribute item that includes a plurality of sub-attribute items, a sub-similarity component is first calculated, for each sub-attribute item, between the attribute value of the target case information and the attribute value of the reference case information under that sub-attribute item. The similarity component of the attribute item is then calculated from these sub-similarity components, taking into account the weight coefficient corresponding to each sub-attribute item.

Specifically, the attribute items included in the target case information include at least 2 sub-attribute items corresponding to the attribute items; the step of calculating the similarity between the first attribute value and the second attribute value further comprises: acquiring a sub-attribute item weight coefficient corresponding to the at least 2 sub-attribute items; calculating the sub-similarity of a first sub-attribute value of the target case information under each sub-attribute item and a second sub-attribute value of the reference case information under each sub-attribute item; and weighting the sub-similarity according to the sub-attribute item weight coefficient to obtain the similarity of the first attribute value and the second attribute value.

It should be noted that, in this embodiment, weight coefficients are not limited to the sub-attribute items under a particular attribute item. Whenever the similarity component at the node of an attribute item or sub-attribute item is calculated from the next-level similarity components of the next-level attribute items under that node, a corresponding weight coefficient may be set for each of those next-level attribute items.
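As a non-limiting illustration, weighting at every level of the hierarchical tree might be sketched recursively as follows. Cases are modeled as nested dictionaries, weights are keyed by the path of attribute item names from the root, and all names and values are hypothetical.

```python
def node_similarity(target, reference, weights, leaf_fn, path=()):
    """Weighted similarity of two nested cases.

    A leaf is compared with leaf_fn; an inner node is the weighted sum of
    the similarities of its next-level attribute items.
    """
    if not isinstance(target, dict):
        return leaf_fn(target, reference)
    if not isinstance(reference, dict):
        reference = {}  # assumption: a missing subtree matches nothing
    return sum(
        weights.get(path + (name,), 0.0)
        * node_similarity(child, reference.get(name), weights, leaf_fn,
                          path + (name,))
        for name, child in target.items()
    )


# Usage with hypothetical cases and weights.
target = {"patient": {"name": "Zhang", "age": "40"}, "diagnosis": "hepatitis"}
reference = {"patient": {"name": "Zhang", "age": "41"}, "diagnosis": "hepatitis"}
weights = {
    ("patient",): 0.5, ("diagnosis",): 0.5,
    ("patient", "name"): 0.75, ("patient", "age"): 0.25,
}
score = node_similarity(target, reference, weights,
                        lambda a, b: 1 if a == b else 0)
```

In this example the patient node scores 0.75 (the name matches, the age does not), so the overall score is 0.5 × 0.75 + 0.5 × 1 = 0.875.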

Further, in another embodiment, different weight coefficients may be set for the keyword segments included in the keyword segment set. That is, when the similarity component between the first attribute value and the second attribute value is calculated according to the positions and/or number of occurrences of the keyword segments in the second attribute value, different keyword segments occurring the same number of times and/or at the same positions influence the similarity component differently.

Specifically, after the result obtained by the word segmentation processing is used as the keyword word segmentation set corresponding to the first attribute value, the method further includes: distributing corresponding sub-weight coefficients for the keyword participles contained in the keyword participle set; the step of calculating a similarity component between the first attribute value and the second attribute value according to the position and/or number of occurrences of the keyword segmentation in the second attribute value further includes: and calculating a similarity component between the first attribute value and the second attribute value according to the position and/or the frequency of the keyword segmentation in the second attribute value and the sub-weight coefficient distributed to the keyword segmentation.

That is, the keyword segments in the keyword segment set obtained by segmenting a first attribute value in free-text form do not all influence the similarity equally in the current case information retrieval. For example, medically relevant keyword segments, such as those expressing symptoms or clinical manifestations, have a large influence on the similarity, whereas keyword segments such as modal particles and conjunctions have a small influence. Therefore, corresponding weight coefficients can be set for the keyword segments contained in the keyword segment set according to their semantics, and the weight coefficient of each keyword segment is taken into account when the similarity component between the first attribute value and the second attribute value is calculated.

It should be noted that, in this embodiment, the weight of a keyword segment is not constant across similarity calculations: a different weight-setting method may be used for each attribute item, so that differences in the commonly used phrases of different attribute items are also reflected in the similarity calculation for the keyword segments.
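As a non-limiting illustration, keyword-level weighting might be sketched as follows. The formula (weight of the matched keywords divided by the total weight) and the example weights are assumptions; the embodiment only requires that the sub-weight coefficients enter the calculation.

```python
def weighted_keyword_similarity(keyword_weights, second_value):
    """Share of the total keyword weight carried by keywords that occur."""
    total = sum(keyword_weights.values())
    if total == 0:
        return 0.0
    matched = sum(weight for keyword, weight in keyword_weights.items()
                  if keyword in second_value)
    return matched / total


# Usage: a symptom keyword weighs far more than a conjunction.
keyword_weights = {"fever": 0.9, "and": 0.1}
score = weighted_keyword_similarity(keyword_weights, "fever persists")
```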

Furthermore, in order to solve the technical problem of insufficient accuracy of medical report-based search in the conventional technology, in one embodiment, as shown in fig. 3, a case information retrieval apparatus is also proposed, which includes a target case information generation module 102, a similarity component calculation module 104, an overall similarity calculation module 106, and a retrieval result screening module 108, wherein:

the target case information generating module 102 is configured to obtain user input, and generate target case information according to the user input, where the target case information includes at least one attribute item;

a similarity component calculation module 104, configured to obtain a first attribute value of each attribute item included in the target case information, traverse reference case information included in a preset case database, obtain a second attribute value of the traversed reference case information under the attribute item corresponding to the first attribute value, calculate a similarity between the first attribute value and the second attribute value, and obtain a similarity component between the target case information and the traversed reference case information under each attribute item;

an overall similarity calculation module 106, configured to calculate an overall similarity between the target case information and the traversed reference case information according to the similarity between the target case information and the traversed reference case information under each attribute item;

and the retrieval result screening module 108 is used for screening the reference case information in the case database according to the overall similarity between the target case information and the traversed reference case information and outputting the reference case information as a retrieval result.

Optionally, in one embodiment, the user input comprises at least one form item; the target case information generating module 102 is further configured to generate target case information corresponding to the user input according to at least one form item included in the user input and a preset hierarchical tree structure model, where an attribute item included in the target case information corresponds to the form item included in the user input.

Optionally, in an embodiment, the attribute items included in the target case information are classification attribute items or description attribute items, and the classification attribute items include at least one classification attribute item corresponding to the classification attribute item and at most one description attribute item.

Optionally, in an embodiment, the overall similarity calculation module 106 is further configured to obtain a weight coefficient corresponding to each attribute item included in the target case information; and weighting the similarity component under each attribute item according to the weight coefficient corresponding to the attribute item to obtain the overall similarity between the target case information and the traversed reference case information.

Optionally, in an embodiment, the attribute items included in the target case information include at least 2 sub-attribute items corresponding to the attribute items; the similarity component calculating module 104 is further configured to obtain a sub-attribute item weight coefficient corresponding to the at least 2 sub-attribute items; calculating the sub-similarity of a first sub-attribute value of the target case information under each sub-attribute item and a second sub-attribute value of the reference case information under each sub-attribute item; and weighting the sub-similarity according to the sub-attribute item weight coefficient to obtain the similarity of the first attribute value and the second attribute value.

Optionally, in an embodiment, the similarity component calculation module 104 is further configured to perform word segmentation on the first attribute value of each attribute item included in the target case information and use the result as the keyword segment set corresponding to the first attribute value; to acquire the position and/or frequency at which the keyword segments included in the keyword segment set occur in the second attribute value; and to calculate the similarity component between the first attribute value and the second attribute value according to that position and/or frequency.
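A minimal sketch of a position- and frequency-based score is given below. The regular-expression tokenizer stands in for a real word segmenter, and the scoring formula (an equal blend of how early a keyword first appears and how often it appears, capped at three occurrences) is an assumption chosen for illustration; the source does not specify either.

```python
import re
from typing import List


def segment(text: str) -> List[str]:
    """Placeholder word segmentation: lowercase and split on non-word
    characters. A real system would use a proper segmenter."""
    return [token for token in re.split(r"\W+", text.lower()) if token]


def keyword_similarity(first_value: str, second_value: str) -> float:
    """Score how well the keywords of first_value are represented in
    second_value, using first position and capped frequency (assumed formula)."""
    keywords = set(segment(first_value))
    tokens = segment(second_value)
    if not keywords or not tokens:
        return 0.0
    score = 0.0
    for keyword in keywords:
        positions = [i for i, token in enumerate(tokens) if token == keyword]
        if not positions:
            continue
        position_part = 1.0 - positions[0] / len(tokens)   # earlier is better
        frequency_part = min(len(positions), 3) / 3        # capped at 3 hits
        score += 0.5 * position_part + 0.5 * frequency_part
    return score / len(keywords)
```

With this formula, a keyword that appears at the start of the second attribute value scores higher than the same keyword appearing at the end.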

Optionally, in an embodiment, the similarity component calculation module 104 is further configured to perform synonym expansion on the keyword segments included in the keyword segment set according to a preset medical term library, and to use the expanded keyword segment set obtained by the expansion as the keyword segment set.

Optionally, in an embodiment, the similarity component calculation module 104 is further configured to determine the extended term library that matches the attribute item corresponding to the first attribute value, and to use it as the preset medical term library.
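Synonym expansion with a per-attribute term library can be sketched as follows. The `TERM_LIBRARIES` dictionary and its entries are made-up sample data, and the function name is hypothetical; a real deployment would load a curated medical term library.

```python
from typing import Dict, Set

# Hypothetical term libraries keyed by attribute item name (sample data only).
TERM_LIBRARIES: Dict[str, Dict[str, Set[str]]] = {
    "diagnosis": {"mi": {"myocardial infarction", "heart attack"}},
    "symptom": {"fever": {"pyrexia"}},
}


def expand_keywords(attribute_item: str, keywords: Set[str]) -> Set[str]:
    """Return the keyword set plus the synonyms found in the term library
    that matches the given attribute item. Unknown attribute items or
    keywords are left unexpanded."""
    library = TERM_LIBRARIES.get(attribute_item, {})
    expanded = set(keywords)
    for keyword in keywords:
        expanded |= library.get(keyword, set())
    return expanded
```

Selecting the library by attribute item keeps a synonym that is valid for one field, such as a symptom, from leaking into the matching of another field.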

Optionally, in an embodiment, the similarity component calculation module 104 is further configured to assign a corresponding sub-weight coefficient to each keyword segment included in the keyword segment set, and to calculate the similarity component between the first attribute value and the second attribute value according to the position and/or frequency at which the keyword segments occur in the second attribute value and the sub-weight coefficients assigned to the keyword segments.
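The effect of sub-weight coefficients can be illustrated with a deliberately simplified score that considers only whether each keyword segment occurs at all, ignoring position. The function name and the idea of giving an expanded synonym a lower weight than the original keyword are assumptions of this sketch, not statements from the source.

```python
from typing import Dict, Iterable


def weighted_keyword_score(keyword_weights: Dict[str, float],
                           second_value_tokens: Iterable[str]) -> float:
    """Fraction of the total keyword weight that is matched in the second
    attribute value. Presence-only scoring is a simplification."""
    total = sum(keyword_weights.values())
    if total == 0:
        return 0.0
    present = set(second_value_tokens)
    matched = sum(weight for keyword, weight in keyword_weights.items()
                  if keyword in present)
    return matched / total
```

For instance, with weights `{"fever": 1.0, "pyrexia": 0.5}`, a reference value containing only `pyrexia` scores one third, while one containing `fever` scores two thirds.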

Optionally, in an embodiment, the similarity component calculation module 104 is further configured to calculate the similarity between the first attribute value and the second attribute value according to a preset similarity calculation function, where the preset similarity calculation function is a preset text equivalence determination function, a preset numerical range comparison function, a preset semantic equivalence determination function, a preset code equivalence determination function, or a preset semantic inclusion determination function.
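Choosing a similarity function per attribute item can be sketched as a lookup table of functions. Only three of the five listed function types are shown; the semantic ones would need a term library and are omitted. All names, the case-insensitive comparisons, and the tolerance-based reading of "numerical range comparison" are assumptions made for this example.

```python
from typing import Callable, Dict


def text_equal(first: str, second: str) -> float:
    """Text equivalence: 1.0 if equal ignoring case and outer whitespace."""
    return 1.0 if first.strip().lower() == second.strip().lower() else 0.0


def numeric_in_range(first: str, second: str, tolerance: float = 5.0) -> float:
    """Numerical range comparison, read here as 'within an assumed tolerance'."""
    return 1.0 if abs(float(first) - float(second)) <= tolerance else 0.0


def code_equal(first: str, second: str) -> float:
    """Code equivalence, e.g. for coded diagnoses, ignoring letter case."""
    return 1.0 if first.strip().upper() == second.strip().upper() else 0.0


# Hypothetical registry mapping a function kind to its implementation.
SIMILARITY_FUNCTIONS: Dict[str, Callable[[str, str], float]] = {
    "text": text_equal,
    "numeric_range": numeric_in_range,
    "code": code_equal,
}


def attribute_similarity(kind: str, first: str, second: str) -> float:
    """Dispatch to the similarity function preset for this attribute item."""
    return SIMILARITY_FUNCTIONS[kind](first, second)
```

A registry like this lets each attribute item in the case model name the comparison that suits its data type without changing the traversal logic.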

Optionally, in an embodiment, as shown in fig. 3, the apparatus further includes a database updating module 110, configured to detect form content filled in on a case filling page, generate reference case information corresponding to the detected form content according to the form content and the preset hierarchical tree structure model, and store the reference case information in the preset case database.

The embodiment of the invention has the following beneficial effects:

With the case information retrieval method and device described above, target case information is generated, according to a preset case model, from the keywords that the user enters for retrieval. When the case database is searched for results matching the target case information, the similarity between each piece of reference case information and the target case information is calculated under each attribute item node, and these per-attribute similarities are combined into an overall similarity. The retrieval results are then screened from the case database according to that overall similarity. In other words, no preprocessing such as keyword setting is required for the case information in the case database, and the retrieval covers all data in the case database, which improves the accuracy of the retrieval result.

In one embodiment, fig. 4 illustrates a terminal of a computer system based on the von Neumann architecture that runs the above case information retrieval method. The computer system can be a terminal device such as a smart phone, a tablet computer, a palm computer, a notebook computer, or a personal computer. Specifically, the computer system may include an external input interface 1001, a processor 1002, a memory 1003, and an output interface 1004 connected through a system bus. The external input interface 1001 may optionally include at least a network interface 10012. The memory 1003 may include an external memory 10032 (e.g., a hard disk, an optical disk, or a floppy disk) and an internal memory 10034. The output interface 1004 may include at least a display 10042 or a similar device.

In this embodiment, the method is executed by a computer program. The program files are stored in the external memory 10032 of the computer system based on the von Neumann architecture, loaded into the internal memory 10034 at run time, compiled into machine code, and passed to the processor 1002 for execution, so that the logical target case information generation module 102, similarity component calculation module 104, overall similarity calculation module 106, retrieval result screening module 108, and database updating module 110 are formed in the computer system. While the case information retrieval method is executing, input parameters are received through the external input interface 1001, transferred to the memory 1003 for caching, and then input to the processor 1002 for processing; the resulting data is either cached in the memory 1003 for subsequent processing or passed to the output interface 1004 for output.

The above disclosure describes only preferred embodiments of the present invention and should not be understood as limiting the scope of the appended claims.

Claims

1. A case information retrieval method, comprising:

screening reference case information in the case database according to the overall similarity between the target case information and the traversed reference case information, and outputting the reference case information as a retrieval result;

the user input comprises at least one form item, and the user input is content filled in a case form according to a case model preset in the system;

generating target case information corresponding to the user input according to the at least one form item included in the user input and a preset hierarchical tree structure model, wherein the attribute items included in the target case information correspond to the form items included in the user input, and wherein the generating comprises: determining the attribute items contained in the generated target case information and the attribute value corresponding to each attribute item according to each node contained in the preset hierarchical tree structure model, each form item contained in the user input, and the correspondence between the nodes and the form items.

2. The case information retrieval method according to claim 1, wherein the step of calculating the overall similarity of the target case information and the traversed reference case information according to the similarity component of the target case information and the traversed reference case information under each attribute item further comprises:

3. The case information retrieval method according to claim 1 or 2, wherein the attribute items included in the target case information include at least 2 sub-attribute items corresponding to the attribute items;

4. The case information retrieval method according to claim 1, wherein, after the step of obtaining the first attribute value of each attribute item included in the target case information, the method further comprises:

5. The case information retrieval method according to claim 4, wherein the step of using the result of the word segmentation processing as the keyword segment set corresponding to the first attribute value further includes:

6. The case information retrieval method according to claim 5, wherein the step of performing synonym expansion on the keyword segments included in the keyword segment set according to a preset medical term library further comprises:

7. The case information retrieval method according to any one of claims 4 to 6, wherein, after the step of using the result of the word segmentation processing as the keyword segment set corresponding to the first attribute value, the method further comprises:

8. The case information retrieval method according to claim 1, wherein the step of calculating the similarity between the first attribute value and the second attribute value further comprises:

9. The case information retrieval method according to claim 1, characterized in that the method further comprises:

10. A case information retrieval device, comprising:

the retrieval result screening module is used for screening the reference case information in the case database according to the overall similarity between the target case information and the traversed reference case information and outputting the reference case information as a retrieval result;

the target case information generating module is further configured to generate target case information corresponding to the user input according to the at least one form item included in the user input and a preset hierarchical tree structure model, wherein the attribute items included in the target case information correspond to the form items included in the user input, and wherein the generating comprises: the target case information generating module determining the attribute items contained in the generated target case information and the attribute value corresponding to each attribute item according to each node contained in the preset hierarchical tree structure model, each form item contained in the user input, and the correspondence between the nodes and the form items.

11. The case information retrieval device according to claim 10, wherein each attribute item included in the target case information is either a classification attribute item or a description attribute item, and a classification attribute item includes at least one classification attribute item corresponding to it and at most one description attribute item.

12. The case information retrieval device according to claim 10, wherein the overall similarity calculation module is further configured to acquire a weight coefficient corresponding to each attribute item included in the target case information, and to weight the similarity component under each attribute item according to the weight coefficient corresponding to that attribute item, thereby obtaining the overall similarity between the target case information and the traversed reference case information.

13. The case information retrieval device according to claim 10 or 12, wherein the attribute items included in the target case information include at least 2 sub-attribute items corresponding to the attribute items;

14. The case information retrieval device according to claim 10, wherein the similarity component calculation module is further configured to perform word segmentation on the first attribute value of each attribute item included in the target case information and use the result as the keyword segment set corresponding to the first attribute value; to acquire the position and/or frequency at which the keyword segments included in the keyword segment set occur in the second attribute value; and to calculate the similarity component between the first attribute value and the second attribute value according to that position and/or frequency.

15. The case information retrieval device according to claim 14, wherein the similarity component calculation module is further configured to perform synonym expansion on the keyword segments included in the keyword segment set according to a preset medical term library, and to use the expanded keyword segment set obtained by the expansion as the keyword segment set.

16. The case information retrieval device according to claim 15, wherein the similarity component calculation module is further configured to determine the extended term library that matches the attribute item corresponding to the first attribute value, and to use it as the preset medical term library.

17. The case information retrieval device according to any one of claims 14 to 16, wherein the similarity component calculation module is further configured to assign a corresponding sub-weight coefficient to each keyword segment included in the keyword segment set, and to calculate the similarity component between the first attribute value and the second attribute value according to the position and/or frequency at which the keyword segments occur in the second attribute value and the sub-weight coefficients assigned to the keyword segments.

18. The case information retrieval device according to claim 10, further comprising a database updating module configured to detect form content filled in on a case filling page, generate reference case information corresponding to the detected form content according to the form content and the preset hierarchical tree structure model, and store the reference case information in the preset case database.