CN117009541A - Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base - Google Patents
- Publication number
- CN117009541A (application number CN202310670816.4A)
- Authority
- CN
- China
- Prior art keywords
- knowledge base
- target
- vector
- clinical medical
- language model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H40/00—ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Databases & Information Systems (AREA)
- Mathematical Physics (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Epidemiology (AREA)
- Human Computer Interaction (AREA)
- General Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Primary Health Care (AREA)
- Public Health (AREA)
- Biomedical Technology (AREA)
- General Business, Economics & Management (AREA)
- Business, Economics & Management (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The application discloses a method, a device, equipment and a medium for constructing and applying a clinical medical examination knowledge base, relating to the technical field of computers. The method comprises: converting acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base; performing a configuration operation of a conversational retrieval mode for the target clinical medical examination knowledge base and, after the configuration is completed, performing vector conversion on a received natural language sentence to be answered based on the preset large language model to obtain a corresponding question vector; calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base to determine the target vector most relevant to the question vector; and generating a corresponding target natural language reply sentence based on the preset large language model and the target vector, so as to answer the natural language sentence. In this way, retrieval efficiency and, in turn, retrieval convenience can be effectively improved.
Description
Technical Field
The application relates to the technical field of computers, in particular to a method, a device, equipment and a medium for constructing and applying a clinical medicine inspection knowledge base.
Background
Currently, conventional tag-based or keyword-based retrieval modes help users find relevant information quickly to some extent, but they have certain limitations. For example, the configured tags or keywords may not cover all possible query needs, and a user may have to enter exactly the right keyword or select the correct tag to find the relevant test item information. In addition, tag-based or keyword-based retrieval usually cannot provide a conversational retrieval service for multiple user groups, so it cannot meet the personalized knowledge base needs of users in different roles, such as medical students, interns, attending physicians and senior experts, and retrieval efficiency is generally low.
Disclosure of Invention
In view of the above, the present application aims to provide a method, a device, equipment and a medium for constructing and applying a clinical medical examination knowledge base, which can effectively improve retrieval efficiency and, in turn, retrieval convenience. The specific scheme is as follows:
In a first aspect, the present application provides a method for constructing and applying a clinical medical examination knowledge base, including:
converting acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base;
performing a configuration operation of a conversational retrieval mode for the target clinical medical examination knowledge base and, after the configuration is completed, performing vector conversion on a received natural language sentence to be answered based on the preset large language model to obtain a corresponding question vector;
determining the target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base;
and generating a corresponding target natural language reply sentence based on the preset large language model and the target vector, so as to answer the natural language sentence to be answered.
Optionally, converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base includes:
converting the acquired clinical medical data into corresponding dense vectors based on the preset large language model by using an embedding technique, so as to obtain the target clinical medical examination knowledge base.
Optionally, converting the acquired clinical medical data into corresponding dense vectors based on the preset large language model by using the embedding technique includes:
converting the acquired clinical medical test item data and test knowledge text into corresponding dense vectors based on the preset large language model by using the embedding technique.
Optionally, the method for constructing and applying the clinical medical examination knowledge base includes:
converting the acquired clinical medical test item data into corresponding dense vectors of a preset dimension based on the preset large language model by using the embedding technique.
Optionally, the method for constructing and applying the clinical medical examination knowledge base further includes:
obtaining, through analysis and summarization using the preset large language model and a model fine-tuning technique, a fixed academic knowledge base answer mode of the target clinical medical examination knowledge base, so as to provide a knowledge retrieval service to external systems based on the fixed academic knowledge base answer mode through a preset external service data interface.
Optionally, determining the target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base includes:
calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base based on a preset formula, and determining corresponding semantic similarity information from the calculation results;
and determining, by analyzing the semantic similarity information, a plurality of target vectors most relevant to the question vector.
In a second aspect, the present application provides a device for constructing and applying a clinical medical examination knowledge base, including:
a knowledge base construction module, configured to convert acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base;
a question vector conversion module, configured to perform a configuration operation of a conversational retrieval mode for the target clinical medical examination knowledge base and, after the configuration is completed, perform vector conversion on a received natural language sentence to be answered based on the preset large language model to obtain a corresponding question vector;
a similarity calculation module, configured to determine the target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base;
and a reply sentence generation module, configured to generate a corresponding target natural language reply sentence based on the preset large language model and the target vector, so as to answer the natural language sentence to be answered.
Optionally, the knowledge base construction module includes:
a dense vector conversion unit, configured to convert the acquired clinical medical data into corresponding dense vectors based on the preset large language model by using an embedding technique, so as to obtain the target clinical medical examination knowledge base.
In a third aspect, the present application provides an electronic device, comprising:
a memory for storing a computer program;
and a processor for executing the computer program to implement the steps of the aforementioned method for constructing and applying a clinical medical examination knowledge base.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the aforementioned method for constructing and applying a clinical medical examination knowledge base.
Thus, in the present application, the acquired clinical medical data is first converted into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base. A configuration operation of a conversational retrieval mode is then performed for the target clinical medical examination knowledge base and, after the configuration is completed, a received natural language sentence to be answered is converted into a corresponding question vector based on the preset large language model. Next, the target vector most relevant to the question vector is determined by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base. Finally, a corresponding target natural language reply sentence is generated based on the preset large language model and the target vector, so as to answer the natural language sentence. Because the application constructs the target clinical medical examination knowledge base with the preset large language model and performs the configuration operation of the conversational retrieval mode, users can retrieve information through dialogue, which effectively improves retrieval efficiency and, in turn, retrieval convenience.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for constructing and applying a clinical medical examination knowledge base provided by the application;
FIG. 2 is a schematic flow chart of a method for constructing and applying a knowledge base for clinical medical examination provided by the application;
FIG. 3 is a schematic diagram of a Python code segment for vector conversion according to the present application;
FIG. 4 is a schematic diagram of a Python code segment for similarity calculation according to the present application;
FIG. 5 is a schematic diagram of a Python code segment for model tuning provided by the present application;
FIG. 6 is a schematic diagram of a Python code segment retrieved by an external system according to the present application;
FIG. 7 is a flowchart of a method for constructing and applying a knowledge base for clinical medical examination according to the present application;
FIG. 8 is a schematic diagram of a construction and application apparatus of a clinical medical examination knowledge base according to the present application;
FIG. 9 is a block diagram of an electronic device according to the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Currently, conventional tag-based or keyword-based retrieval modes help users find relevant information quickly to some extent, but they have certain limitations. For example, the configured tags or keywords may not cover all possible query needs, and a user may have to enter exactly the right keyword or select the correct tag to find the relevant test item information. In addition, tag-based or keyword-based retrieval usually cannot provide a conversational retrieval service for multiple user groups, so it cannot meet the personalized knowledge base needs of users in different roles, such as medical students, interns, attending physicians and senior experts, and retrieval efficiency is generally low. Therefore, the application provides a scheme for constructing and applying a clinical medical examination knowledge base that can effectively improve retrieval efficiency and, in turn, retrieval convenience.
Referring to FIG. 1, the embodiment of the application discloses a method for constructing and applying a clinical medical examination knowledge base, which comprises the following steps:
Step S11: converting acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base.
Specifically, as shown in FIG. 2, the acquired clinical medical data is first converted into corresponding dense vectors based on a preset large language model by using an embedding technique, so as to obtain the target clinical medical examination knowledge base. Further, converting the acquired clinical medical data into corresponding dense vectors based on the preset large language model by using the embedding technique includes: converting the acquired clinical medical test item data and test knowledge text into corresponding dense vectors based on the preset large language model by using the embedding technique. The clinical medical test item data includes, but is not limited to, routine blood test item data, which in turn includes, but is not limited to, red blood cell count, white blood cell count and platelet count. The preset large language model includes, but is not limited to, GPT-4 (Generative Pre-trained Transformer 4), a language model released by OpenAI and used by the chatbot ChatGPT. OpenAI is an artificial intelligence research company, and ChatGPT (Chat Generative Pre-trained Transformer) is a chatbot model released by OpenAI.
It is to be understood that, when vector conversion is performed on the clinical medical test item data, the resulting dense vector should have a preset dimension set in advance according to actual requirements. For example, when vector conversion is performed on the routine blood test item data, the white blood cell count item it contains may be converted into a dense vector of the preset dimension; the converted vector captures semantic information about the test item for subsequent retrieval and comparison. Specifically, this may be implemented with a Python code segment as shown in FIG. 3. An example fragment of the resulting vector data is: array([0.01798287, -0.03457677, 0.0128045, ..., 0.00358112, -0.02577634, 0.01090625]).
It will be appreciated that, when vector conversion is performed on the test knowledge text, each smaller segment of the text is converted into a dense vector by an embedding model (including, but not limited to, a Dense Retriever model or a sentence-transformers model). For example, the text segment "the normal range of the white blood cell count is 4 to 10 × 10^9/L" is converted into a vector.
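The source does not say how the test knowledge text is divided into "smaller portions". The following is a minimal sketch under the assumption that the text is split at sentence boundaries and grouped into segments of limited length; the function name `split_into_segments`, the `max_chars` limit and the sample text are hypothetical and only illustrate the idea.

```python
import re
from typing import List


def split_into_segments(text: str, max_chars: int = 80) -> List[str]:
    """Group whole sentences into segments of at most max_chars characters.

    A single sentence longer than max_chars is kept whole rather than cut.
    """
    # Split after sentence-ending punctuation that is followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    segments: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            # Adding this sentence would exceed the limit: close the segment.
            segments.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        segments.append(current)
    return segments


# Hypothetical sample text assembled from statements in the description.
knowledge_text = (
    "The routine blood test includes the red blood cell count. "
    "It also includes the white blood cell count and the platelet count. "
    "The normal range of the white blood cell count is 4 to 10 x 10^9/L."
)
segments = split_into_segments(knowledge_text)
for segment in segments:
    print(segment)
```

Each resulting segment would then be passed to the embedding model separately, so that a retrieved vector maps back to one short, self-contained piece of text.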
In this embodiment, after the vector conversion is completed, the resulting dense vectors are stored so that they can be compared during retrieval; this completes the index database and yields the target clinical medical examination knowledge base. In a specific embodiment, the dense vectors may be stored in a key-value database, where the key is the name of a clinical medical test item and the value is the corresponding dense vector.
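The key-value layout described above can be sketched as follows. This is not the code of FIG. 3, which is not reproduced in the text. The function `embed_text` is a toy stand-in (a hashed bag of words) for the embedding interface of the preset large language model, so it does not capture semantics; `PRESET_DIMENSION = 64` and the in-memory dictionary used as the key-value store are likewise assumptions made for illustration.

```python
import hashlib
import math
from typing import Dict, List

# Hypothetical value: the source only states that a preset dimension exists.
PRESET_DIMENSION = 64


def embed_text(text: str, dim: int = PRESET_DIMENSION) -> List[float]:
    """Toy stand-in for an embedding model: hashed bag of words, L2-normalised.

    A real system would call the embedding interface of the preset model here.
    """
    vector = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        index = int.from_bytes(digest[:4], "big") % dim
        vector[index] += 1.0
    norm = math.sqrt(sum(v * v for v in vector))
    return [v / norm for v in vector] if norm else vector


# Key = name of the clinical medical test item, Value = its dense vector.
knowledge_base: Dict[str, List[float]] = {}
for item_name in ("red blood cell count", "white blood cell count", "platelet count"):
    knowledge_base[item_name] = embed_text(item_name)

for key, value in knowledge_base.items():
    print(key, len(value))
```

Because every stored vector has the same preset dimension, any of them can later be compared directly with a question vector produced by the same embedding function.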
Step S12: performing a configuration operation of a conversational retrieval mode for the target clinical medical examination knowledge base and, after the configuration is completed, performing vector conversion on a received natural language sentence to be answered based on the preset large language model to obtain a corresponding question vector.
In this embodiment, after the target clinical medical examination knowledge base is obtained, the configuration operation of the conversational retrieval mode is performed on it, which replaces the earlier tag-based and keyword-based retrieval modes with a conversational one. Once the configuration is completed, a user can find relevant test items simply by asking in natural language, without entering exact keywords or selecting tags. This lowers the threshold for using the knowledge base and meets the personalized needs of users in different roles, such as medical students, interns, attending physicians and senior experts.
It can be understood that, after the configuration is completed, when a natural language sentence to be answered is received through a preset interface, it is converted into a corresponding question vector based on the preset large language model. For example, a user may ask: "Is a white blood cell count of 11 × 10^9/L normal?", and this question is then converted into a vector based on the preset large language model and the embedding technique.
Step S13: determining the target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base.
In this embodiment, determining the target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base may specifically include: calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base based on a preset formula, and determining corresponding semantic similarity information from the calculation results; and determining, by analyzing the semantic similarity information, a plurality of target vectors most relevant to the question vector. The preset formula is as follows:
cosine_similarity = dot(A, B) / (norm(A) * norm(B))
Here A and B are the two vectors, dot(A, B) is their dot product, and norm(A) and norm(B) are the norms of A and B. The formula gives the cosine of the angle between the two vectors, which is used as their similarity. The specific calculation may be implemented with a Python code segment as shown in FIG. 4. Finally, a plurality of target vectors most relevant to the question vector are selected based on the calculation results.
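As an illustration of the preset formula and the selection of several most relevant vectors, the following sketch uses only the Python standard library. It is not the code of FIG. 4. The three-dimensional vectors, the function name `top_k` and the choice of k = 2 are hypothetical; the source does not state how many target vectors are selected.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """cosine_similarity = dot(A, B) / (norm(A) * norm(B))."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(
    question_vector: Sequence[float],
    knowledge_base: Dict[str, List[float]],
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k knowledge base keys with the highest cosine similarity."""
    scored = [
        (key, cosine_similarity(question_vector, vector))
        for key, vector in knowledge_base.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors for illustration only; real dense vectors have many more dimensions.
knowledge_base = {
    "white blood cell count": [0.9, 0.1, 0.0],
    "red blood cell count": [0.1, 0.9, 0.0],
    "platelet count": [0.0, 0.1, 0.9],
}
question_vector = [0.8, 0.2, 0.1]
results = top_k(question_vector, knowledge_base, k=2)
for key, score in results:
    print(f"{key}: {score:.4f}")
```

This exhaustive comparison matches the description, which computes the similarity against each dense vector in the knowledge base; for a large knowledge base an approximate nearest-neighbour index could be substituted, but the source does not mention one.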
Step S14: generating a corresponding target natural language reply sentence based on the preset large language model and the target vector, so as to answer the natural language sentence to be answered.
In this embodiment, after the plurality of target vectors is obtained, the preset large language model and the target vectors are used to generate a corresponding target natural language reply sentence, which answers the natural language sentence. For example, the generated reply may be: "A white blood cell count of 11 × 10^9/L is slightly above the normal range. If you feel unwell, please seek medical attention promptly."
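The source does not specify how the target vectors are passed to the large language model. One common arrangement, assumed here, is to map each target vector back to the text segment it was computed from and to place those segments in the prompt. In the sketch below, `build_prompt`, `generate_reply`, the prompt wording and `stub_llm` are all hypothetical; `stub_llm` is a placeholder and not a real model.

```python
from typing import Callable, List


def build_prompt(question: str, retrieved_segments: List[str]) -> str:
    """Combine the retrieved knowledge segments and the question into one prompt."""
    context = "\n".join(f"- {segment}" for segment in retrieved_segments)
    return (
        "Answer the question using only the knowledge below.\n"
        f"Knowledge:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def generate_reply(
    question: str,
    retrieved_segments: List[str],
    llm: Callable[[str], str],
) -> str:
    """Pass the assembled prompt to any callable that maps a prompt to a reply."""
    return llm(build_prompt(question, retrieved_segments))


def stub_llm(prompt: str) -> str:
    # Placeholder standing in for the preset large language model.
    return f"[stub reply for a prompt of {len(prompt.splitlines())} lines]"


question = "Is a white blood cell count of 11 x 10^9/L normal?"
retrieved = ["The normal range of the white blood cell count is 4 to 10 x 10^9/L."]
prompt = build_prompt(question, retrieved)
reply = generate_reply(question, retrieved, stub_llm)
print(prompt)
print(reply)
```

Keeping the model behind a plain callable means the same prompt assembly can be reused whichever preset large language model is actually deployed.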
It should be further understood that, in this embodiment, a fixed academic knowledge base answer mode of the target clinical medical examination knowledge base can be obtained through analysis and summarization using the preset large language model and a model fine-tuning technique, so that a knowledge retrieval service is provided to external systems based on this answer mode through a preset external service data interface. Fine-tuning requires a large amount of annotated data consisting of questions and their corresponding answers. Its goal is to enable the model to generate professional answers that meet industry standards. For example, given a large number of medical question-answer pairs (question plus correct answer), model fine-tuning may be performed with a Python code segment as shown in FIG. 5.
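The fine-tuning code of FIG. 5 is not reproduced in the text, and the training call itself depends on the chosen model and toolchain. The following sketch therefore covers only the data preparation step that the paragraph implies: turning annotated question-answer pairs into one JSON record per line. The field names `prompt` and `completion` and the function `to_jsonl` are assumptions; the schema a given fine-tuning service expects may differ.

```python
import json
from typing import Dict, Iterable, List


def to_jsonl(pairs: Iterable[Dict[str, str]]) -> str:
    """Serialise question-answer pairs as JSON Lines, skipping incomplete pairs."""
    lines: List[str] = []
    for pair in pairs:
        question = pair.get("question", "").strip()
        answer = pair.get("answer", "").strip()
        if not question or not answer:
            continue  # an annotated pair needs both a question and an answer
        record = {"prompt": question, "completion": answer}
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)


# One pair taken from the example in the description, plus one incomplete pair.
annotated_pairs = [
    {
        "question": "Is a white blood cell count of 11 x 10^9/L normal?",
        "answer": (
            "A white blood cell count of 11 x 10^9/L is slightly above the "
            "normal range. If you feel unwell, please seek medical attention promptly."
        ),
    },
    {"question": "An unanswered question", "answer": ""},
]
training_data = to_jsonl(annotated_pairs)
print(training_data)
```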
It will be appreciated that, in connection with the Python code segment shown in FIG. 6, the preset external service data interface may include, but is not limited to, the following functions: receiving a test knowledge query request from an external system, for example, "Is a white blood cell count of 11 × 10^9/L normal?"; querying the target clinical medical examination knowledge base to find the most relevant test knowledge; generating a professional answer with the preset large language model, for example, "A white blood cell count of 11 × 10^9/L is slightly above the normal range. If you feel unwell, please seek medical attention promptly."; and returning the generated answer to the corresponding external system. This makes it convenient to interface with external systems and enables rapid data exchange and sharing.
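The four functions of the external service data interface (receive, query, generate, return) can be sketched as a single request handler. This is not the code of FIG. 6. The JSON request and response shapes, the function `handle_query` and the two stub callables are hypothetical; in a real deployment the handler would sit behind whatever transport the external system uses, and the stubs would be replaced by the knowledge base query and the preset large language model.

```python
import json
from typing import Callable, List


def handle_query(
    request_body: str,
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> str:
    """Receive a query, look up relevant knowledge, generate and return an answer."""
    try:
        question = json.loads(request_body)["question"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return json.dumps({"error": "request must be JSON with a 'question' field"})
    knowledge = retrieve(question)           # query the knowledge base
    answer = generate(question, knowledge)   # produce the reply sentence
    return json.dumps(
        {"question": question, "knowledge": knowledge, "answer": answer},
        ensure_ascii=False,
    )


def stub_retrieve(question: str) -> List[str]:
    # Placeholder for the cosine similarity search over the knowledge base.
    return ["The normal range of the white blood cell count is 4 to 10 x 10^9/L."]


def stub_generate(question: str, knowledge: List[str]) -> str:
    # Placeholder for the preset large language model.
    return f"[stub answer based on {len(knowledge)} knowledge segment(s)]"


request = json.dumps({"question": "Is a white blood cell count of 11 x 10^9/L normal?"})
response = handle_query(request, stub_retrieve, stub_generate)
print(response)
```

Passing the retrieval and generation steps in as callables keeps the interface independent of how the knowledge base and the model are implemented.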
Thus, in the embodiment of the application, the acquired clinical medical data is first converted into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base. A configuration operation of a conversational retrieval mode is then performed for the target clinical medical examination knowledge base and, after the configuration is completed, a received natural language sentence to be answered is converted into a corresponding question vector based on the preset large language model. Next, the target vector most relevant to the question vector is determined by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base. Finally, a corresponding target natural language reply sentence is generated based on the preset large language model and the target vector, so as to answer the natural language sentence. Because the application constructs the target clinical medical examination knowledge base with the preset large language model and performs the configuration operation of the conversational retrieval mode, users can retrieve information through dialogue, which effectively improves retrieval efficiency and, in turn, retrieval convenience.
Referring to FIG. 7, the embodiment of the application discloses a method for constructing and applying a clinical medical examination knowledge base, which comprises the following steps:
Step S21: converting acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base.
Step S22: performing a configuration operation of a conversational retrieval mode for the target clinical medical examination knowledge base and, after the configuration is completed, performing vector conversion on a received natural language sentence to be answered based on the preset large language model to obtain a corresponding question vector.
Step S23: determining the target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base.
Step S24: generating a corresponding target natural language reply sentence based on the preset large language model and the target vector, so as to answer the natural language sentence to be answered.
For the specific process of steps S21 to S24, reference may be made to the corresponding content disclosed in the foregoing embodiment, which is not repeated here.
Therefore, in the embodiment of the application, when a natural language sentence to be answered is received, the corresponding question vector is obtained, and the target natural language reply sentence is determined by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base. In this way, retrieval accuracy can be ensured.
Referring to fig. 8, the embodiment of the application also correspondingly discloses a device for constructing and applying the clinical medical examination knowledge base, which comprises:
the knowledge base construction module 11 is used for converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model so as to obtain a target clinical medical examination knowledge base;
the question vector conversion module 12 is configured to perform a configuration operation for a dialogue-based retrieval mode on the target clinical medical examination knowledge base and, after the configuration is completed, perform vector conversion on the received natural language sentence to be replied to, based on the preset large language model, so as to obtain a corresponding question vector;
a similarity calculation module 13, configured to determine the target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each of the dense vectors in the target clinical medical examination knowledge base;
the reply sentence generating module 14 is configured to generate a corresponding target natural language reply sentence based on the preset large language model and the target vector, so as to reply to the natural language sentence to be replied to. For the more specific working process of each module, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
In summary, in the present application, the collected clinical medical data is first converted into corresponding dense vectors based on a preset large language model, so as to obtain a target clinical medical examination knowledge base. A configuration operation for a dialogue-based retrieval mode is then performed on the target clinical medical examination knowledge base and, after the configuration is completed, vector conversion is performed on the received natural language sentence to be replied to, based on the preset large language model, so as to obtain a corresponding question vector. Next, the target vector most relevant to the question vector is determined by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base. Finally, a corresponding target natural language reply sentence is generated based on the preset large language model and the target vector, so as to reply to the natural language sentence to be replied to. Because the application constructs the target clinical medical examination knowledge base by means of the preset large language model and performs the configuration operation for the dialogue-based retrieval mode, a user can retrieve knowledge through dialogue, which can effectively improve both the efficiency and the convenience of retrieval.
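The four stages summarized above can be sketched end to end as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: `embed` is a toy bag-of-words stand-in for the embedding produced by the preset large language model, `generate_reply` is a placeholder for the model's generation step, and `VOCAB`, the function names and the sample texts are all hypothetical.

```python
import math

# Hypothetical fixed vocabulary; a real system would use the LLM's embedding instead.
VOCAB = ("white", "blood", "cell", "count", "glucose", "sugar", "leukocytes", "platelet")


def embed(text):
    """Toy stand-in for the LLM embedding: count occurrences of each VOCAB word."""
    tokens = text.lower().split()
    return [float(tokens.count(word)) for word in VOCAB]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_knowledge_base(texts):
    """Stage 1: convert each knowledge text into a dense vector."""
    return [(text, embed(text)) for text in texts]


def retrieve(knowledge_base, question):
    """Stages 2 and 3: embed the question, then pick the most similar entry."""
    question_vec = embed(question)
    best_text, _ = max(knowledge_base, key=lambda entry: cosine(question_vec, entry[1]))
    return best_text


def generate_reply(question, context):
    """Stage 4 placeholder: a real system would prompt the LLM with the context."""
    return f"Question: {question}\nRetrieved knowledge: {context}"


kb = build_knowledge_base([
    "white blood cell count measures leukocytes in a blood sample",
    "blood glucose test measures sugar concentration in blood",
])
question = "what does the white blood cell count measure"
best = retrieve(kb, question)
reply = generate_reply(question, best)
print(reply)
```

In this sketch the knowledge base is a plain list scanned linearly, which mirrors the description of comparing the question vector with each dense vector; a larger deployment might use a vector index instead, but that is outside what the source states.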
In some specific embodiments, the knowledge base construction module 11 may specifically include:
the dense vector conversion sub-module is used for converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model and by utilizing an embedding technology so as to obtain a target clinical medical examination knowledge base.
In some specific embodiments, the dense vector conversion submodule may specifically include:
and the dense vector conversion unit is used for converting the acquired clinical medical examination item data and the examination knowledge text into corresponding dense vectors based on the preset large language model and by utilizing an embedding technology.
In some embodiments, the construction and application apparatus of the clinical medical examination knowledge base may specifically include:
the project data conversion unit is used for converting the acquired clinical medical examination project data into corresponding dense vectors meeting preset dimensions based on the preset large language model and by utilizing an embedding technology.
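One way to read "dense vectors meeting preset dimensions" is that every vector entering the knowledge base is checked against a configured dimension before it is stored. The sketch below illustrates that reading only; `PRESET_DIM`, the function name, and the additional L2 normalization step are assumptions made here and are not stated in the source.

```python
import math

PRESET_DIM = 4  # hypothetical preset dimension; real embedding sizes are much larger


def to_preset_dimension(vec, dim=PRESET_DIM):
    """Validate the vector length and return an L2-normalized copy.

    Normalization is an assumption of this sketch: with unit-length vectors,
    cosine similarity reduces to a plain dot product.
    """
    if len(vec) != dim:
        raise ValueError(f"expected {dim} components, got {len(vec)}")
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return [x / norm for x in vec]


unit = to_preset_dimension([3.0, 0.0, 4.0, 0.0])
print(unit)
```

Rejecting mismatched vectors at insertion time keeps every later similarity calculation well defined, since cosine similarity requires both operands to have the same number of components.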
In some specific embodiments, the construction and application apparatus of the clinical medical examination knowledge base may specifically further include:
and the external service unit is used for analyzing and summarizing by utilizing the preset large language model and the model fine tuning technology to obtain a fixed academic knowledge base answer mode of the target clinical medical inspection knowledge base so as to provide knowledge retrieval service for an external system based on the fixed academic knowledge base answer mode and through a preset external service data interface.
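The source does not specify what the fixed answer mode or the external service data interface looks like. Purely as an illustration, the sketch below assumes the fixed answer mode is a stable record with the same fields for every query, serialized as JSON for an external system; the class name, field names and sample values are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class KnowledgeAnswer:
    """Hypothetical fixed answer record returned to external systems."""
    question: str
    answer: str
    source_entries: List[str]
    similarity: float


def answer_to_json(result):
    """Serialize the record with sorted keys so the layout is stable across calls."""
    return json.dumps(asdict(result), ensure_ascii=False, sort_keys=True)


payload = answer_to_json(KnowledgeAnswer(
    question="What does a white blood cell count measure?",
    answer="It measures the number of leukocytes in a blood sample.",
    source_entries=["white blood cell count"],
    similarity=0.88,
))
print(payload)
```

Keeping the field set fixed means an external caller can parse every response the same way, regardless of which knowledge entry was retrieved.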
In some specific embodiments, the similarity calculation module 13 may specifically include:
the cosine similarity calculation unit is used for calculating, based on a preset formula, the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base, so as to determine corresponding semantic similarity information from the calculation results;
and the target vector determining unit is used for determining a plurality of target vectors most relevant to the question vector by analyzing the semantic similarity information.
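Selecting a plurality of target vectors, rather than a single best match, can be sketched as a top-k ranking over the similarity scores. The following is a minimal illustration; the function names, the dictionary layout of the knowledge base and the default `k` are assumptions made here, and the source does not state how many target vectors are kept.

```python
import heapq
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_targets(question_vec, knowledge_base, k=3):
    """Return the k entries most similar to the question as (entry_id, score) pairs.

    knowledge_base maps a hypothetical entry id to its dense vector.
    """
    scored = (
        (cosine_similarity(question_vec, vec), entry_id)
        for entry_id, vec in knowledge_base.items()
    )
    best = heapq.nlargest(k, scored)  # sorted from highest to lowest score
    return [(entry_id, score) for score, entry_id in best]


example_kb = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
ranked = top_k_targets([1.0, 0.0], example_kb, k=2)
print(ranked)
```

Returning the scores alongside the entry ids lets a later stage apply a relevance threshold before the retrieved entries are passed to the language model, should one be desired.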
Further, the embodiment of the present application further discloses an electronic device, and fig. 9 is a block diagram of an electronic device 20 according to an exemplary embodiment, where the content of the figure is not to be considered as any limitation on the scope of use of the present application.
Fig. 9 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input output interface 25, and a communication bus 26. Wherein the memory 22 is used for storing a computer program, and the computer program is loaded and executed by the processor 21 to implement relevant steps in the construction and application methods of the clinical medical examination knowledge base disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in the present embodiment may be specifically an electronic computer.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol to be followed is any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring external input data or outputting external output data, and the specific interface type thereof may be selected according to the specific application requirement, which is not limited herein.
The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon may include an operating system 221, a computer program 222, and the like, and the storage may be temporary storage or permanent storage.
The operating system 221 is used for managing and controlling the hardware devices on the electronic device 20 and the computer program 222, and may be Windows Server, NetWare, Unix, Linux, or the like. In addition to the computer program that can be used to perform the method of constructing and applying the clinical medical examination knowledge base performed by the electronic device 20 as disclosed in any of the previous embodiments, the computer program 222 may further include computer programs that can be used to perform other specific tasks.
Further, the application also discloses a computer readable storage medium for storing a computer program; the computer program, when executed by the processor, realizes the construction and application methods of the clinical medical examination knowledge base disclosed in the foregoing. For specific steps of the method, reference may be made to the corresponding contents disclosed in the foregoing embodiments, and no further description is given here.
In this specification, the embodiments are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those of skill would further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it is further noted that relational terms such as first and second are used solely to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing describes the application in detail so that its principles and embodiments may be better understood. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.
Claims (10)
1. The construction and application method of the clinical medicine inspection knowledge base is characterized by comprising the following steps:
converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base;
performing a configuration operation for a dialogue-based retrieval mode on the target clinical medical examination knowledge base and, after the configuration is completed, performing vector conversion on a received natural language sentence to be replied to, based on the preset large language model, so as to obtain a corresponding question vector;
determining a target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base;
and generating a corresponding target natural language reply sentence based on the preset large language model and the target vector so as to reply to the natural language sentence to be replied.
2. The method for constructing and applying a clinical medical examination knowledge base according to claim 1, wherein the converting the collected clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base comprises:
converting the collected clinical medical data into corresponding dense vectors based on the preset large language model and by utilizing an embedding technology, so as to obtain the target clinical medical examination knowledge base.
3. The method for constructing and applying a clinical medical examination knowledge base according to claim 2, wherein the step of converting the collected clinical medical data into corresponding dense vectors based on a predetermined large language model and using an embedding technique comprises:
and converting the collected clinical medical test item data and test knowledge text into corresponding dense vectors based on the preset large language model by utilizing an embedding technology.
4. The method for constructing and applying a clinical medical examination knowledge base according to claim 3, wherein the step of converting the collected clinical medical examination item data into corresponding dense vectors based on a predetermined large language model and using an embedding technique comprises:
and converting the acquired clinical medical examination item data into corresponding dense vectors meeting preset dimensions based on the preset large language model by utilizing an embedding technology.
5. The method for constructing and applying a clinical medical examination knowledge base according to claim 1, further comprising:
and analyzing and summarizing by using the preset large language model and the model fine tuning technology to obtain a fixed academic knowledge base answer mode of the target clinical medical examination knowledge base, so as to provide knowledge retrieval service for an external system based on the fixed academic knowledge base answer mode and through a preset external service data interface.
6. The method for constructing and applying a clinical medical examination knowledge base according to any one of claims 1 to 5, wherein the determining a target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each of the dense vectors in the target clinical medical examination knowledge base comprises:
calculating, based on a preset formula, the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base, so as to determine corresponding semantic similarity information from the calculation results;
and determining a plurality of target vectors most relevant to the question vector by analyzing the semantic similarity information.
7. A device for constructing and applying a clinical medical examination knowledge base, comprising:
the knowledge base construction module is used for converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model so as to obtain a target clinical medical examination knowledge base;
the question vector conversion module is used for performing a configuration operation for a dialogue-based retrieval mode on the target clinical medical examination knowledge base and, after the configuration is completed, performing vector conversion on a received natural language sentence to be replied to, based on the preset large language model, so as to obtain a corresponding question vector;
the similarity calculation module is used for determining a target vector most relevant to the question vector by calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base;
and the reply sentence generating module is used for generating a corresponding target natural language reply sentence based on the preset large language model and the target vector so as to reply to the natural language sentence to be replied.
8. The apparatus for constructing and applying a clinical medical examination knowledge base according to claim 7, wherein the knowledge base construction module comprises:
the dense vector conversion sub-module is used for converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model and by utilizing an embedding technology so as to obtain a target clinical medical examination knowledge base.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the method of constructing and applying a clinical medical examination knowledge base according to any one of claims 1 to 6.
10. A computer readable storage medium for storing a computer program which, when executed by a processor, implements the method of constructing and applying a clinical medical examination knowledge base according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310670816.4A CN117009541A (en) | 2023-06-07 | 2023-06-07 | Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310670816.4A CN117009541A (en) | 2023-06-07 | 2023-06-07 | Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117009541A (en) | 2023-11-07 |
Family
ID=88568057
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310670816.4A Pending CN117009541A (en) | 2023-06-07 | 2023-06-07 | Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117009541A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117667979A (en) * | 2023-12-08 | 2024-03-08 | 暨南大学 | Data mining method, device, equipment and medium based on large language model |
2023-06-07: CN application CN202310670816.4A (published as CN117009541A), status: active, Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117667979A (en) * | 2023-12-08 | 2024-03-08 | 暨南大学 | Data mining method, device, equipment and medium based on large language model |
CN117667979B (en) * | 2023-12-08 | 2024-07-05 | 暨南大学 | Data mining method, device, equipment and medium based on large language model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111090987B (en) | Method and apparatus for outputting information | |
CN112069302B (en) | Training method of conversation intention recognition model, conversation intention recognition method and device | |
CN111125309A (en) | Natural language processing method and device, computing equipment and storage medium | |
JP2020537777A (en) | Methods and devices for identifying the user's intent of speech | |
US20150379087A1 (en) | Apparatus and method for replying to query | |
US11461613B2 (en) | Method and apparatus for multi-document question answering | |
CN113268609A (en) | Dialog content recommendation method, device, equipment and medium based on knowledge graph | |
CN111611452B (en) | Method, system, equipment and storage medium for identifying ambiguity of search text | |
US11880664B2 (en) | Identifying and transforming text difficult to understand by user | |
CN112328808A (en) | Knowledge graph-based question and answer method and device, electronic equipment and storage medium | |
CN117009541A (en) | Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base | |
CN113705191A (en) | Method, device and equipment for generating sample statement and storage medium | |
CN112148859A (en) | Question-answer knowledge base management method, device, terminal equipment and storage medium | |
CN113821527A (en) | Hash code generation method and device, computer equipment and storage medium | |
CN116595026A (en) | Information inquiry method | |
CN116467417A (en) | Method, device, equipment and storage medium for generating answers to questions | |
CN117972038A (en) | Intelligent question-answering method, device and computer readable medium | |
CN117932022A (en) | Intelligent question-answering method and device, electronic equipment and storage medium | |
CN115409042B (en) | Method and device for robot question answering based on thought guide graph | |
US11409950B2 (en) | Annotating documents for processing by cognitive systems | |
CN117171328A (en) | Text question-answering processing method and device, electronic equipment and storage medium | |
CN116186220A (en) | Information retrieval method, question and answer processing method, information retrieval device and system | |
CN118093796B (en) | Multi-round dialogue method, device, equipment and storage medium | |
CN110990528A (en) | Question answering method and device and electronic equipment | |
CN116775947B (en) | Graph data semantic retrieval method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||