CN117009541A - Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base - Google Patents

Info

Publication number
CN117009541A
Authority
CN
China
Prior art keywords
knowledge base
target
vector
clinical medical
language model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310670816.4A
Other languages
Chinese (zh)
Inventor
余豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Clp Tongshang Digital Technology Shanghai Co ltd
Original Assignee
Clp Tongshang Digital Technology Shanghai Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Clp Tongshang Digital Technology Shanghai Co ltd
Priority to CN202310670816.4A
Publication of CN117009541A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H40/00 ICT specially adapted for the management or administration of healthcare resources or facilities; ICT specially adapted for the management or operation of medical equipment or devices
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Epidemiology (AREA)
  • Human Computer Interaction (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • General Business, Economics & Management (AREA)
  • Business, Economics & Management (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The application discloses a method, a device, equipment and a medium for constructing and applying a clinical medicine inspection knowledge base, and relates to the technical field of computers. The method comprises the following steps: converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base; performing a configuration operation of a dialogue retrieval mode for the target clinical medicine inspection knowledge base and, after the configuration is completed, performing vector conversion on a received natural language sentence to be replied based on the preset large language model to obtain a corresponding problem vector; calculating the cosine similarity between the problem vector and each dense vector in the target clinical medicine inspection knowledge base to determine a target vector most relevant to the problem vector; and generating a corresponding target natural language reply sentence based on the preset large language model and the target vector so as to reply to the natural language sentence to be replied. In this way, the retrieval efficiency, and in turn the retrieval convenience, can be effectively improved.

Description

Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base
Technical Field
The application relates to the technical field of computers, in particular to a method, a device, equipment and a medium for constructing and applying a clinical medicine inspection knowledge base.
Background
Currently, conventional tag-type or keyword-type search modes help users find relevant information quickly to some extent, but they have certain limitations. For example, the configured tags or keywords may not cover all possible query requirements, and the user may need to enter exactly the right keyword or select the correct tag to find the relevant test item information. In addition, the tag-type or keyword-type search mode often cannot provide a conversational search service for multiple user groups, so it cannot meet the personalized requirements that users in different roles, such as medical students, internists, attending physicians and senior citizens, place on the knowledge base, and search efficiency is generally low.
Disclosure of Invention
In view of the above, the present application aims to provide a method, a device, an apparatus and a medium for constructing and applying a clinical medicine inspection knowledge base, which can effectively improve the retrieval efficiency and further improve the retrieval convenience. The specific scheme is as follows:
in a first aspect, the present application provides a method for constructing and applying a clinical medical examination knowledge base, including:
converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base;
performing configuration operation of a dialogue retrieval mode aiming at the target clinical medicine inspection knowledge base, and performing vector conversion on the received natural language sentences to be replied on the basis of the preset large language model after the configuration is completed so as to obtain corresponding problem vectors;
the cosine similarity between the problem vector and each dense vector in the target clinical medicine inspection knowledge base is calculated respectively, and a target vector most relevant to the problem vector is determined;
and generating a corresponding target natural language reply sentence based on the preset large language model and the target vector so as to reply to the natural language sentence to be replied.
Optionally, the converting of the acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base includes:
converting the acquired clinical medical data into corresponding dense vectors based on the preset large language model and by utilizing an embedding technology, so as to obtain the target clinical medical examination knowledge base.
Optionally, the method for converting the collected clinical medical data into the corresponding dense vector based on the preset large language model and by using the embedding technology includes:
and converting the collected clinical medical test item data and test knowledge text into corresponding dense vectors based on the preset large language model by utilizing an embedding technology.
Optionally, the construction and application method of the clinical medical examination knowledge base comprises the following steps:
and converting the acquired clinical medical examination item data into corresponding dense vectors meeting preset dimensions based on the preset large language model by utilizing an embedding technology.
Optionally, the construction and application method of the clinical medical examination knowledge base further comprises:
and analyzing and summarizing by using the preset large language model and the model fine tuning technology to obtain a fixed academic knowledge base answer mode of the target clinical medical examination knowledge base, so as to provide knowledge retrieval service for an external system based on the fixed academic knowledge base answer mode and through a preset external service data interface.
Optionally, the determining the target vector most relevant to the problem vector by calculating cosine similarity between the problem vector and each dense vector in the target clinical medicine inspection knowledge base includes:
the cosine similarity between the problem vector and each dense vector in the target clinical medicine inspection knowledge base is calculated based on a preset formula, so that corresponding semantic similarity information is determined according to an obtained calculation result;
and determining a plurality of target vectors most relevant to the problem vector by analyzing the semantic similarity information.
In a second aspect, the present application provides a device for constructing and applying a clinical medical examination knowledge base, including:
the knowledge base construction module is used for converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model so as to obtain a target clinical medical examination knowledge base;
the problem vector conversion module is used for executing the configuration operation of a dialogue type retrieval mode aiming at the target clinical medicine inspection knowledge base, and carrying out vector conversion on the received natural language sentences to be replied on the basis of the preset large language model after the configuration is completed so as to obtain corresponding problem vectors;
the similarity calculation module is used for respectively calculating cosine similarity between the problem vector and each dense vector in the target clinical medicine inspection knowledge base to determine a target vector most relevant to the problem vector;
and the reply sentence generating module is used for generating a corresponding target natural language reply sentence based on the preset large language model and the target vector so as to reply to the natural language sentence to be replied.
Optionally, the knowledge base construction module includes:
the dense vector conversion unit is used for converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model and by utilizing an embedding technology so as to obtain a target clinical medical examination knowledge base.
In a third aspect, the present application provides an electronic device, comprising:
a memory for storing a computer program;
and the processor is used for executing the computer program to realize the steps of the construction and application method of the clinical medical examination knowledge base.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the aforementioned method for constructing and applying a clinical medical verification knowledge base.
Therefore, in the application, firstly, the collected clinical medical data is converted into corresponding dense vectors based on a preset large language model so as to obtain a target clinical medical examination knowledge base. And then, performing configuration operation of a dialogue retrieval mode aiming at the target clinical medicine inspection knowledge base, and performing vector conversion on the received natural language sentences to be replied on the basis of the preset large language model after the configuration is completed so as to obtain corresponding problem vectors. And then, determining a target vector most relevant to the problem vector by respectively calculating cosine similarity between the problem vector and each dense vector in the target clinical medicine inspection knowledge base. And then generating corresponding target natural language reply sentences based on the preset large language model and the target vector so as to reply to the natural language sentences to be replied. The application constructs the target clinical medicine inspection knowledge base by presetting the large language model and executes the configuration operation of the dialogue type retrieval mode, so that a user can retrieve in a dialogue mode, the retrieval efficiency can be effectively improved, and the retrieval convenience is further improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present application, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for constructing and applying a clinical medical examination knowledge base provided by the application;
FIG. 2 is a schematic flow chart of a method for constructing and applying a knowledge base for clinical medical examination provided by the application;
FIG. 3 is a schematic diagram of a Python code segment for vector conversion according to the present application;
FIG. 4 is a schematic diagram of a Python code segment for similarity calculation according to the present application;
FIG. 5 is a schematic diagram of a Python code segment for model tuning provided by the present application;
FIG. 6 is a schematic diagram of a Python code segment retrieved by an external system according to the present application;
FIG. 7 is a flowchart of a method for constructing and applying a knowledge base for clinical medical examination according to the present application;
FIG. 8 is a schematic diagram of a construction and application apparatus of a clinical medical examination knowledge base according to the present application;
FIG. 9 is a block diagram of an electronic device according to the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Currently, conventional tag-type or keyword-type search modes help users find relevant information quickly to some extent, but they have certain limitations. For example, the configured tags or keywords may not cover all possible query requirements, and the user may need to enter exactly the right keyword or select the correct tag to find the relevant test item information. In addition, the tag-type or keyword-type search mode often cannot provide a conversational search service for multiple user groups, so it cannot meet the personalized requirements that users in different roles, such as medical students, internists, attending physicians and senior citizens, place on the knowledge base, and search efficiency is generally low. Therefore, the application provides a construction and application scheme of a clinical medicine inspection knowledge base, which can effectively improve the retrieval efficiency and further improve the retrieval convenience.
Referring to fig. 1, the embodiment of the application discloses a construction and application method of a clinical medical examination knowledge base, which comprises the following steps:
Step S11: converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model so as to obtain a target clinical medical examination knowledge base.
Specifically, in the present application, as shown in FIG. 2, the collected clinical medical data is first converted into corresponding dense vectors based on a preset large language model and by using an embedding technique, so as to obtain a knowledge base for clinical medical examination. Further, converting the collected clinical medical data into corresponding dense vectors based on the preset large language model and by using the embedding technique includes: converting the collected clinical medical test item data and test knowledge text into corresponding dense vectors based on the preset large language model and by using the embedding technique. The clinical medical test item data includes, but is not limited to, routine blood test item data, which in turn includes, but is not limited to, red blood cell count, white blood cell count and platelet count. The preset large language model includes, but is not limited to, GPT-4 (Generative Pre-trained Transformer 4), a language model released by OpenAI and used by the chatbot ChatGPT, where OpenAI is an artificial intelligence research company and ChatGPT (Chat Generative Pre-trained Transformer) is a chatbot model released by OpenAI.
It is to be understood that, when vector conversion is performed on the clinical medical test item data, the resulting dense vector should have a preset dimension set in advance according to actual requirements. For example, when vector conversion is performed on the routine blood test item data, the white blood cell count it contains may be converted into a dense vector of the preset dimension, and the converted vector captures semantic information of the test item for subsequent retrieval and comparison. Specifically, this may be implemented using a Python code segment as shown in FIG. 3. An example fragment of the resulting vector data is: array([0.01798287, -0.03457677, 0.0128045, ..., 0.00358112, -0.02577634, 0.01090625]).
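As an illustration of converting a test item name into a dense vector of a preset dimension, the following is a minimal sketch only. It does not call a real large language model: `embed_text` is a hypothetical stand-in that hashes character trigrams into a fixed number of buckets, and the dimension of 8 is an arbitrary assumption. In practice the embedding interface of the chosen model would take its place.

```python
import hashlib
import math


def embed_text(text: str, dim: int = 8) -> list[float]:
    """Hypothetical stand-in for an embedding model.

    Hashes character trigrams into `dim` buckets and L2-normalizes the
    result, so every input maps to a vector of the preset dimension.
    """
    vec = [0.0] * dim
    padded = f"  {text.lower()}  "
    for i in range(len(padded) - 2):
        gram = padded[i:i + 3]
        digest = hashlib.md5(gram.encode("utf-8")).digest()
        bucket = digest[0] % dim                      # which dimension to update
        sign = 1.0 if digest[1] % 2 == 0 else -1.0    # signed hashing
        vec[bucket] += sign
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm else vec


# Convert one routine blood test item into a vector of the preset dimension.
vector = embed_text("white blood cell count", dim=8)
print(len(vector))
```

The stand-in is deterministic, which makes it convenient for testing the surrounding pipeline, but it does not capture semantics the way a trained embedding model would.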
It will be appreciated that, when vector conversion is performed on the verification knowledge text, the text is divided into smaller portions and each portion is converted into a dense vector by an embedding model (including, but not limited to, a Dense Retriever model or a sentence-transformers model). For example, a text segment such as "the normal range of white blood cell count is 4 to 10 × 10^9/L" is converted into a vector.
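The division of the knowledge text into smaller portions can be sketched as follows. This is an assumption about one possible approach, not the method fixed by the application: the function name `split_into_chunks`, the sentence-boundary rule and the `max_chars` limit are all hypothetical choices made for illustration.

```python
import re


def split_into_chunks(text: str, max_chars: int = 80) -> list[str]:
    """Split text into sentences, then pack sentences into chunks.

    A chunk is closed once adding the next sentence would exceed
    max_chars. A single sentence longer than max_chars becomes its
    own chunk rather than being cut in the middle.
    """
    # Split after ., ! or ? when followed by whitespace (assumed rule).
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


knowledge_text = (
    "The normal white blood cell count ranges from 4 to 10 x 10^9/L. "
    "Red blood cell count and platelet count are also routine blood test items."
)
chunks = split_into_chunks(knowledge_text, max_chars=80)
for chunk in chunks:
    print(chunk)
```

Each resulting chunk would then be passed to the embedding model separately, so that a query can match the specific passage that answers it.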
In this embodiment, after the vector conversion is completed, the obtained dense vectors are stored so as to be compared during searching, so that the establishment of the index database is completed, and the target clinical medicine inspection knowledge base is obtained. In a specific embodiment, the obtained dense vector may be stored in a Key-Value database, where Key is the name of a clinical medical test item and Value is the corresponding dense vector.
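The Key-Value storage described above can be sketched with an in-memory dictionary serialized to JSON. The choice of a dictionary and JSON is an assumption made for illustration, since the text does not name a specific Key-Value database, and the vector values below are short placeholders rather than real embeddings.

```python
import json

# Key: name of a clinical medical test item; Value: its dense vector.
knowledge_base: dict[str, list[float]] = {}


def store_vector(item_name: str, vector: list[float]) -> None:
    """Insert or overwrite the dense vector stored for a test item."""
    knowledge_base[item_name] = list(vector)


# Placeholder 3-dimensional vectors (illustrative values only).
store_vector("white blood cell count", [0.1, 0.2, 0.3])
store_vector("red blood cell count", [0.3, 0.1, 0.2])
store_vector("platelet count", [0.2, 0.3, 0.1])

# Persist the index and load it back, as a retrieval service would at startup.
serialized = json.dumps(knowledge_base)
restored: dict[str, list[float]] = json.loads(serialized)
print(sorted(restored))
```

A production deployment would more likely use a dedicated Key-Value or vector store, but the Key and Value roles stay the same.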
Step S12: performing the configuration operation of a dialogue-type retrieval mode for the target clinical medicine inspection knowledge base and, after the configuration is completed, performing vector conversion on the received natural language sentence to be replied based on the preset large language model so as to obtain a corresponding problem vector.
In this embodiment, after the target clinical medical inspection knowledge base is obtained, the configuration operation of the dialogue-type retrieval mode is performed on it, replacing the previous tag-type and keyword-type search modes with a conversational search mode. After the configuration is completed, a user can find related inspection items simply by asking in natural language, without having to enter keywords accurately or select tags. This lowers the threshold for using the knowledge base and can satisfy the personalized requirements of users in different roles, such as medical students, internists, attending doctors and senior citizens.
It can be understood that, after the configuration is completed, when a natural language sentence to be replied is received through a preset interface, vector conversion needs to be performed on it based on the preset large language model so as to obtain a corresponding problem vector. For example, for a user question such as "Is a white blood cell count of 11 × 10^9/L normal?", vector conversion is performed on the question based on the preset large language model and the embedding technique.
Step S13: determining a target vector most relevant to the problem vector by calculating the cosine similarity between the problem vector and each dense vector in the target clinical medical examination knowledge base.
In this embodiment, the determining, by calculating cosine similarity between the problem vector and each dense vector in the target clinical medical inspection knowledge base, the target vector most relevant to the problem vector may specifically include: the cosine similarity between the problem vector and each dense vector in the target clinical medicine inspection knowledge base is calculated based on a preset formula, so that corresponding semantic similarity information is determined according to an obtained calculation result; and determining a plurality of target vectors most relevant to the problem vector by analyzing the semantic similarity information. Wherein the preset formula is as follows.
cosine_similarity=dot(A,B)/(norm(A)*norm(B));
Here A and B denote two vectors, dot(A, B) denotes the dot product of A and B, and norm(A) and norm(B) denote the norms of A and B respectively. The formula gives the cosine of the angle between the two vectors, which serves as their similarity. The specific calculation may be implemented with a Python code segment as shown in FIG. 4. Finally, a plurality of target vectors most relevant to the problem vector are selected based on the calculation result.
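The similarity calculation and the selection of the most relevant vectors can be sketched as follows. This is a minimal illustration under stated assumptions: the helper name `top_k`, the value k = 2 and the 3-dimensional toy vectors are hypothetical, and the code is written with the standard library only rather than reproducing the figure.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """dot(A, B) / (norm(A) * norm(B)); returns 0.0 if either norm is zero."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(question_vector: list[float],
          knowledge_base: dict[str, list[float]],
          k: int = 2) -> list[tuple[str, float]]:
    """Score every stored vector against the question and keep the k best."""
    scored = [(name, cosine_similarity(question_vector, vec))
              for name, vec in knowledge_base.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy knowledge base with illustrative vectors (not real embeddings).
toy_knowledge_base = {
    "white blood cell count": [1.0, 0.0, 0.0],
    "red blood cell count": [0.0, 1.0, 0.0],
    "platelet count": [0.0, 0.0, 1.0],
}
question_vector = [0.9, 0.1, 0.0]  # stands for the embedded user question
results = top_k(question_vector, toy_knowledge_base, k=2)
for name, score in results:
    print(name, round(score, 4))
```

Comparing against every stored vector is linear in the size of the knowledge base, which matches the description in the text; larger collections would typically use an approximate nearest-neighbour index instead.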
Step S14: generating a corresponding target natural language reply sentence based on the preset large language model and the target vector so as to reply to the natural language sentence to be replied.
In this embodiment, after the plurality of target vectors are obtained, a corresponding target natural language reply sentence is generated by using the preset large language model and the plurality of target vectors, so as to reply to the natural language sentence to be replied. For example, the generated target natural language reply sentence may be: "A white blood cell count of 11 × 10^9/L is slightly higher than the normal range. If you feel unwell, please seek medical attention promptly."
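One way to picture the generation step is to map each target vector back to the text it was computed from and pass that text to the model together with the question. The sketch below assumes such a mapping exists. `build_prompt`, `answer_question` and the prompt wording are hypothetical, and `echo_model` is a stand-in for a call to the preset large language model.

```python
from typing import Callable


def build_prompt(question: str, passages: list[str]) -> str:
    """Combine retrieved knowledge passages and the user question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the reference knowledge below.\n"
        f"Reference knowledge:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def answer_question(question: str,
                    passages: list[str],
                    generate: Callable[[str], str]) -> str:
    """Build the prompt and delegate text generation to the supplied model."""
    return generate(build_prompt(question, passages))


def echo_model(prompt: str) -> str:
    """Stand-in for the large language model: reports the prompt size only."""
    return f"[model output for a prompt of {len(prompt)} characters]"


question = "Is a white blood cell count of 11 x 10^9/L normal?"
passages = ["The normal white blood cell count ranges from 4 to 10 x 10^9/L."]
prompt = build_prompt(question, passages)
reply = answer_question(question, passages, echo_model)
print(prompt)
print(reply)
```

Passing the model in as a callable keeps the retrieval logic independent of any particular model provider.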
It should be further understood that in this embodiment, the fixed academic knowledge base answer mode of the target clinical medical inspection knowledge base can be obtained by performing analysis and summarization by using the preset large language model and the model fine tuning technology, so as to provide knowledge retrieval service for an external system based on the fixed academic knowledge base answer mode and through a preset external service data interface. The process of fine tuning requires a large amount of annotation data, including questions and corresponding answers. The goal of the fine tuning is to enable the model to generate professional answers that meet industry standards. For example, if there are currently a large number of medical question-answer pairs (question + correct answer), model fine tuning may be performed by a Python code segment as shown in fig. 5.
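Preparing the annotated question-answer pairs for fine tuning can be sketched as below. The JSON Lines layout and the field names `prompt` and `completion` are assumptions for illustration; the actual format depends on the fine-tuning tooling used with the chosen model. The single example pair is taken from the question and answer quoted in this description.

```python
import json


def to_jsonl(pairs: list[tuple[str, str]]) -> str:
    """Serialize (question, answer) pairs as one JSON object per line."""
    lines = [
        json.dumps({"prompt": question, "completion": answer}, ensure_ascii=False)
        for question, answer in pairs
    ]
    return "\n".join(lines)


qa_pairs = [
    (
        "Is a white blood cell count of 11 x 10^9/L normal?",
        "A white blood cell count of 11 x 10^9/L is slightly higher than the "
        "normal range. If you feel unwell, please seek medical attention promptly.",
    ),
]
training_data = to_jsonl(qa_pairs)
print(training_data)
```

The resulting text can be written to a file and handed to whichever fine-tuning workflow is in use.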
It will be appreciated that, in connection with the Python code segment shown in FIG. 6, the corresponding preset external service data interface may include, but is not limited to, the following functions: receiving a verification knowledge query request from an external system, e.g., "Is a white blood cell count of 11 × 10^9/L normal?"; invoking the target clinical medicine inspection knowledge base to perform the query and find the most relevant inspection knowledge; generating a professional answer by using the preset large language model, e.g., "A white blood cell count of 11 × 10^9/L is slightly higher than the normal range. If you feel unwell, please seek medical attention promptly."; and returning the generated answer to the corresponding external system. This makes it convenient to interface with external systems and allows data to be exchanged and shared quickly.
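The four functions of the external service data interface (receive the request, query the knowledge base, generate the answer, return it) can be sketched as a single handler. The JSON request shape, the `question` and `answer` field names and the helper names are assumptions for illustration, and the retrieval and generation functions are stand-ins rather than real model calls.

```python
import json
from typing import Callable


def handle_query(request_body: str,
                 retrieve: Callable[[str], list[str]],
                 generate: Callable[[str, list[str]], str]) -> str:
    """Process one knowledge query from an external system and return JSON."""
    request = json.loads(request_body)              # 1. receive the request
    question = str(request.get("question", "")).strip()
    if not question:
        return json.dumps({"error": "missing question"})
    passages = retrieve(question)                   # 2. query the knowledge base
    answer = generate(question, passages)           # 3. generate the answer
    return json.dumps({"question": question, "answer": answer},
                      ensure_ascii=False)           # 4. return it to the caller


def fake_retrieve(question: str) -> list[str]:
    """Stand-in for the similarity search over the knowledge base."""
    return ["The normal white blood cell count ranges from 4 to 10 x 10^9/L."]


def fake_generate(question: str, passages: list[str]) -> str:
    """Stand-in for the large language model."""
    return f"[answer based on {len(passages)} retrieved passage(s)]"


body = json.dumps({"question": "Is a white blood cell count of 11 x 10^9/L normal?"})
response = json.loads(handle_query(body, fake_retrieve, fake_generate))
print(response)
```

The handler is transport-agnostic, so it could sit behind an HTTP endpoint or a message queue without changes.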
Therefore, in the embodiment of the application, the acquired clinical medical data is firstly converted into the corresponding dense vector based on the preset large language model so as to obtain the target clinical medical examination knowledge base. And then, performing configuration operation of a dialogue retrieval mode aiming at the target clinical medicine inspection knowledge base, and performing vector conversion on the received natural language sentences to be replied on the basis of the preset large language model after the configuration is completed so as to obtain corresponding problem vectors. And then, determining a target vector most relevant to the problem vector by respectively calculating cosine similarity between the problem vector and each dense vector in the target clinical medicine inspection knowledge base. And then generating corresponding target natural language reply sentences based on the preset large language model and the target vector so as to reply to the natural language sentences to be replied. The application constructs the target clinical medicine inspection knowledge base by presetting the large language model and executes the configuration operation of the dialogue type retrieval mode, so that a user can retrieve in a dialogue mode, the retrieval efficiency can be effectively improved, and the retrieval convenience is further improved.
Referring to fig. 7, the embodiment of the application discloses a method for constructing and applying a clinical medical examination knowledge base, which comprises the following steps:
Step S21: converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base.
Step S22: performing the configuration operation of a dialogue-type retrieval mode for the target clinical medicine inspection knowledge base and, after the configuration is completed, performing vector conversion on the received natural language sentence to be replied based on the preset large language model so as to obtain a corresponding problem vector.
Step S23: determining a target vector most relevant to the problem vector by calculating the cosine similarity between the problem vector and each dense vector in the target clinical medical examination knowledge base.
Step S24: generating a corresponding target natural language reply sentence based on the preset large language model and the target vector so as to reply to the natural language sentence to be replied.
For the specific process from step S21 to step S24, reference may be made to the corresponding content disclosed in the foregoing embodiment, and no further description is given here.
Therefore, in the embodiment of the application, when a natural language sentence to be replied is received, the corresponding problem vector is obtained, and the corresponding target natural language reply sentence is determined by calculating the cosine similarity between the problem vector and each dense vector in the target clinical medicine inspection knowledge base. In this way, the accuracy of the search can be ensured.
Referring to fig. 8, the embodiment of the application also correspondingly discloses a device for constructing and applying the clinical medical examination knowledge base, which comprises:
the knowledge base construction module 11 is used for converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model so as to obtain a target clinical medical examination knowledge base;
the problem vector conversion module 12 is configured to perform a configuration operation of a dialogue retrieval mode with respect to the target clinical medical verification knowledge base, and perform vector conversion on the received natural language sentence to be replied based on the preset large language model after the configuration is completed, so as to obtain a corresponding problem vector;
a similarity calculation module 13, configured to determine a target vector most relevant to the problem vector by calculating cosine similarity between the problem vector and each of the dense vectors in the target clinical medicine inspection knowledge base;
the reply sentence generating module 14 is configured to generate a corresponding target natural language reply sentence based on the preset large language model and the target vector, so as to reply to the natural language sentence to be replied. The more specific working process of each module may refer to the corresponding content disclosed in the foregoing embodiment, and will not be described herein.
Thus, in the application, the acquired clinical medical data is first converted into corresponding dense vectors based on a preset large language model, so as to obtain a target clinical medical examination knowledge base. Next, a configuration operation of a dialogue retrieval mode is performed for the target clinical medical examination knowledge base, and after the configuration is completed, vector conversion is performed on a received natural language sentence to be replied to based on the preset large language model, so as to obtain a corresponding question vector. A target vector most relevant to the question vector is then determined by separately calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base. Finally, a corresponding target natural language reply sentence is generated based on the preset large language model and the target vector, so as to reply to the natural language sentence to be replied to. Because the application constructs the target clinical medical examination knowledge base by means of the preset large language model and performs the configuration operation of the dialogue retrieval mode, a user can retrieve information through dialogue, which can effectively improve retrieval efficiency and further improve retrieval convenience.
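The four-stage flow of modules 11 to 14 can be sketched roughly as follows. This is only an illustration under stated assumptions: `toy_embed` is a hypothetical token-hashing stand-in for the embedding produced by the preset large language model, and `reply` uses a fixed template where a real system would call the model. None of these names come from the application.

```python
import hashlib
import math
from typing import Callable, List, Tuple

Vector = List[float]


def toy_embed(text: str, dim: int = 64) -> Vector:
    """Hypothetical stand-in for the preset large language model's embedding.

    Each whitespace-separated token is hashed into one of `dim` buckets, so
    the result is a fixed-length count vector, not a learned embedding.
    """
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    return vec


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


class KnowledgeBase:
    """Stores each source text next to its dense vector (role of module 11)."""

    def __init__(self, embed: Callable[[str], Vector] = toy_embed) -> None:
        self.embed = embed
        self.entries: List[Tuple[str, Vector]] = []

    def add(self, text: str) -> None:
        self.entries.append((text, self.embed(text)))

    def most_relevant(self, question: str) -> str:
        """Embed the question (module 12) and pick the closest entry (module 13)."""
        if not self.entries:
            raise ValueError("knowledge base is empty")
        q = self.embed(question)
        best_text, _ = max(self.entries, key=lambda entry: cosine(q, entry[1]))
        return best_text


def reply(kb: KnowledgeBase, question: str) -> str:
    """Placeholder for module 14: a real system would pass the retrieved text
    and the question to the language model; a fixed template is used here."""
    return f"Retrieved knowledge: {kb.most_relevant(question)}"
```

Passing the embedding function in as a parameter keeps the retrieval logic independent of whichever model actually produces the dense vectors.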
In some specific embodiments, the knowledge base construction module 11 may specifically include:
a dense vector conversion sub-module, configured to convert the acquired clinical medical data into corresponding dense vectors based on the preset large language model and using an embedding technique, so as to obtain the target clinical medical examination knowledge base.
In some specific embodiments, the dense vector conversion sub-module may specifically include:
a dense vector conversion unit, configured to convert the acquired clinical medical examination item data and examination knowledge text into corresponding dense vectors based on the preset large language model and using an embedding technique.
In some embodiments, the construction and application apparatus of the clinical medical examination knowledge base may specifically include:
an item data conversion unit, configured to convert the acquired clinical medical examination item data into corresponding dense vectors meeting a preset dimension based on the preset large language model and using an embedding technique.
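One possible reading of "meeting preset dimensions" is that every stored vector must have the same fixed length, since cosine similarity is only defined between vectors of equal dimension. The sketch below shows a simple guard of that kind; `PRESET_DIM` and `check_dimension` are hypothetical names, and the value 4 is chosen only for illustration because the application does not state a dimension.

```python
from typing import List, Sequence

PRESET_DIM = 4  # hypothetical value; the application does not state one


def check_dimension(vec: Sequence[float], dim: int = PRESET_DIM) -> List[float]:
    """Return `vec` as a list, or raise if it does not have `dim` components."""
    if len(vec) != dim:
        raise ValueError(f"expected {dim} components, got {len(vec)}")
    return list(vec)
```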
In some specific embodiments, the construction and application apparatus of the clinical medical examination knowledge base may specifically further include:
an external service unit, configured to analyze and summarize, using the preset large language model and a model fine-tuning technique, a fixed academic knowledge base answer mode for the target clinical medical examination knowledge base, so as to provide a knowledge retrieval service to external systems based on the fixed academic knowledge base answer mode and through a preset external service data interface.
In some specific embodiments, the similarity calculation module 13 may specifically include:
a cosine similarity calculation unit, configured to separately calculate, based on a preset formula, the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base, so as to determine corresponding semantic similarity information from the calculation results;
a target vector determining unit, configured to determine a plurality of target vectors most relevant to the question vector by analyzing the semantic similarity information.
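Selecting "a plurality of target vectors" from the similarity scores can be illustrated as a top-k selection. The sketch below is an assumption about how such a unit might work rather than the application's own procedure; `top_k_targets` and the default `k=3` are hypothetical.

```python
import heapq
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norms if norms else 0.0


def top_k_targets(
    question_vec: Sequence[float],
    dense_vectors: Sequence[Sequence[float]],
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Return (index, score) pairs for the k most similar vectors, best first."""
    scores = [cosine_similarity(question_vec, v) for v in dense_vectors]
    return heapq.nlargest(k, enumerate(scores), key=lambda pair: pair[1])
```

For a question vector [1, 0] and stored vectors [1, 0], [0, 1], [1, 1] and [-1, 0], calling `top_k_targets` with `k=2` selects indices 0 and 2, in that order.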
Further, an embodiment of the present application also discloses an electronic device. Fig. 9 is a block diagram of an electronic device 20 according to an exemplary embodiment, and the content of the figure is not to be regarded as any limitation on the scope of use of the present application.
Fig. 9 is a schematic structural diagram of an electronic device 20 according to an embodiment of the present application. The electronic device 20 may specifically include: at least one processor 21, at least one memory 22, a power supply 23, a communication interface 24, an input/output interface 25, and a communication bus 26. The memory 22 is used for storing a computer program, which is loaded and executed by the processor 21 to implement the relevant steps of the method for constructing and applying a clinical medical examination knowledge base disclosed in any of the foregoing embodiments. In addition, the electronic device 20 in this embodiment may specifically be an electronic computer.
In this embodiment, the power supply 23 is configured to provide an operating voltage for each hardware device on the electronic device 20; the communication interface 24 can create a data transmission channel between the electronic device 20 and an external device, and the communication protocol it follows may be any communication protocol applicable to the technical solution of the present application, which is not specifically limited herein; the input/output interface 25 is used for acquiring data input from outside or outputting data to the outside, and its specific interface type may be selected according to the specific application requirements, which is not limited herein.
The memory 22 may be a carrier for storing resources, such as a read-only memory, a random access memory, a magnetic disk, or an optical disk, and the resources stored thereon may include an operating system 221, a computer program 222, and the like, and the storage may be temporary storage or permanent storage.
The operating system 221 is used for managing and controlling the hardware devices on the electronic device 20 and the computer program 222, and may be Windows Server, NetWare, Unix, Linux, etc. In addition to the computer program that can be used to perform the method for constructing and applying a clinical medical examination knowledge base executed by the electronic device 20 as disclosed in any of the foregoing embodiments, the computer program 222 may further include computer programs that can be used to perform other specific tasks.
Further, the application also discloses a computer readable storage medium for storing a computer program; the computer program, when executed by a processor, implements the method for constructing and applying a clinical medical examination knowledge base disclosed above. For the specific steps of the method, reference may be made to the corresponding content disclosed in the foregoing embodiments, which is not repeated here.
In this specification, the embodiments are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the parts that are the same or similar, the embodiments may be referred to one another. Since the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
Those of skill in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, the various illustrative elements and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and the design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The application has been described in detail above so that its principles and embodiments may be better understood. Meanwhile, those skilled in the art may, in accordance with the ideas of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the present application.

Claims (10)

1. A method for constructing and applying a clinical medical examination knowledge base, characterized by comprising the following steps:
converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base;
performing a configuration operation of a dialogue retrieval mode for the target clinical medical examination knowledge base, and after the configuration is completed, performing vector conversion on a received natural language sentence to be replied to based on the preset large language model to obtain a corresponding question vector;
determining a target vector most relevant to the question vector by separately calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base;
and generating a corresponding target natural language reply sentence based on the preset large language model and the target vector, so as to reply to the natural language sentence to be replied to.
2. The method according to claim 1, wherein the step of converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model to obtain a target clinical medical examination knowledge base comprises:
converting the acquired clinical medical data into corresponding dense vectors based on the preset large language model and using an embedding technique, so as to obtain the target clinical medical examination knowledge base.
3. The method for constructing and applying a clinical medical examination knowledge base according to claim 2, wherein the step of converting the acquired clinical medical data into corresponding dense vectors based on the preset large language model and using an embedding technique comprises:
converting the acquired clinical medical examination item data and examination knowledge text into corresponding dense vectors based on the preset large language model and using an embedding technique.
4. The method for constructing and applying a clinical medical examination knowledge base according to claim 3, wherein the step of converting the acquired clinical medical examination item data into corresponding dense vectors based on the preset large language model and using an embedding technique comprises:
converting the acquired clinical medical examination item data into corresponding dense vectors meeting a preset dimension based on the preset large language model and using an embedding technique.
5. The method for constructing and applying a clinical medical examination knowledge base according to claim 1, further comprising:
analyzing and summarizing, using the preset large language model and a model fine-tuning technique, a fixed academic knowledge base answer mode for the target clinical medical examination knowledge base, so as to provide a knowledge retrieval service to an external system based on the fixed academic knowledge base answer mode and through a preset external service data interface.
6. The method for constructing and applying a clinical medical examination knowledge base according to any one of claims 1 to 5, wherein the step of determining a target vector most relevant to the question vector by separately calculating the cosine similarity between the question vector and each of the dense vectors in the target clinical medical examination knowledge base comprises:
separately calculating, based on a preset formula, the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base, so as to determine corresponding semantic similarity information from the calculation results;
and determining a plurality of target vectors most relevant to the question vector by analyzing the semantic similarity information.
7. A device for constructing and applying a clinical medical examination knowledge base, comprising:
a knowledge base construction module, configured to convert the acquired clinical medical data into corresponding dense vectors based on a preset large language model, so as to obtain a target clinical medical examination knowledge base;
a question vector conversion module, configured to perform a configuration operation of a dialogue retrieval mode for the target clinical medical examination knowledge base and, after the configuration is completed, to perform vector conversion on a received natural language sentence to be replied to based on the preset large language model, so as to obtain a corresponding question vector;
a similarity calculation module, configured to determine a target vector most relevant to the question vector by separately calculating the cosine similarity between the question vector and each dense vector in the target clinical medical examination knowledge base;
and a reply sentence generating module, configured to generate a corresponding target natural language reply sentence based on the preset large language model and the target vector, so as to reply to the natural language sentence to be replied to.
8. The apparatus for constructing and applying a clinical medical examination knowledge base according to claim 7, wherein the knowledge base construction module comprises:
the dense vector conversion sub-module is used for converting the acquired clinical medical data into corresponding dense vectors based on a preset large language model and by utilizing an embedding technology so as to obtain a target clinical medical examination knowledge base.
9. An electronic device, comprising:
a memory for storing a computer program;
a processor for executing the computer program to implement the method for constructing and applying a clinical medical examination knowledge base according to any one of claims 1 to 6.
10. A computer readable storage medium for storing a computer program which, when executed by a processor, implements the method for constructing and applying a clinical medical examination knowledge base according to any one of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310670816.4A CN117009541A (en) 2023-06-07 2023-06-07 Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310670816.4A CN117009541A (en) 2023-06-07 2023-06-07 Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base

Publications (1)

Publication Number Publication Date
CN117009541A (en) 2023-11-07

Family

ID=88568057

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310670816.4A Pending CN117009541A (en) 2023-06-07 2023-06-07 Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base

Country Status (1)

Country Link
CN (1) CN117009541A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117667979A (en) * 2023-12-08 2024-03-08 暨南大学 Data mining method, device, equipment and medium based on large language model
CN117667979B (en) * 2023-12-08 2024-07-05 暨南大学 Data mining method, device, equipment and medium based on large language model

Similar Documents

Publication Publication Date Title
CN111090987B (en) Method and apparatus for outputting information
CN112069302B (en) Training method of conversation intention recognition model, conversation intention recognition method and device
CN111125309A (en) Natural language processing method and device, computing equipment and storage medium
JP2020537777A (en) Methods and devices for identifying the user's intent of speech
US20150379087A1 (en) Apparatus and method for replying to query
US11461613B2 (en) Method and apparatus for multi-document question answering
CN113268609A (en) Dialog content recommendation method, device, equipment and medium based on knowledge graph
CN111611452B (en) Method, system, equipment and storage medium for identifying ambiguity of search text
US11880664B2 (en) Identifying and transforming text difficult to understand by user
CN112328808A (en) Knowledge graph-based question and answer method and device, electronic equipment and storage medium
CN117009541A (en) Method, device, equipment and medium for constructing and applying clinical medicine inspection knowledge base
CN113705191A (en) Method, device and equipment for generating sample statement and storage medium
CN112148859A (en) Question-answer knowledge base management method, device, terminal equipment and storage medium
CN113821527A (en) Hash code generation method and device, computer equipment and storage medium
CN116595026A (en) Information inquiry method
CN116467417A (en) Method, device, equipment and storage medium for generating answers to questions
CN117972038A (en) Intelligent question-answering method, device and computer readable medium
CN117932022A (en) Intelligent question-answering method and device, electronic equipment and storage medium
CN115409042B (en) Method and device for robot question answering based on thought guide graph
US11409950B2 (en) Annotating documents for processing by cognitive systems
CN117171328A (en) Text question-answering processing method and device, electronic equipment and storage medium
CN116186220A (en) Information retrieval method, question and answer processing method, information retrieval device and system
CN118093796B (en) Multi-round dialogue method, device, equipment and storage medium
CN110990528A (en) Question answering method and device and electronic equipment
CN116775947B (en) Graph data semantic retrieval method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination