CN116595131A - Method and system for medical question answering by using large language model


Info

Publication number
CN116595131A
CN116595131A (application CN202310295002.7A)
Authority
CN
China
Prior art keywords
language model
medical
content
large language
responses
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310295002.7A
Other languages
Chinese (zh)
Inventor
何世柱
赵军
刘康
翁诣轩
夏飞
朱敏郡
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN202310295002.7A
Publication of CN116595131A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 80/00 ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A 90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A 90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Public Health (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a method and a system for medical question answering using a large language model, wherein the method comprises the following steps: acquiring medical dialogue history content of a user; inputting the medical dialogue history content and a first prompt instruction into a large language model, and decoding to obtain a plurality of first responses based on diversified sampling of the large language model; inputting the medical dialogue history content and a plurality of second prompt instructions into the large language model to obtain a plurality of second responses respectively corresponding to each second prompt instruction; and inputting the medical dialogue history content, the first responses and the second responses into the large language model, generating reply content of the medical dialogue and sending the reply content to the user. In this way, the large language model can exploit holistic thinking, improving both the depth and the breadth of its reasoning, so that more accurate reply content is generated and the user experience is improved.

Description

Method and system for medical question answering by using large language model
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a method and a system for medical question answering by using a large language model.
Background
Medical research based on artificial intelligence technology is growing rapidly. The medical conversational question answering (CQA) task aims to improve the efficiency of healthcare by providing patients with a range of professional medical services. A CQA system can improve the patient's experience during clinical treatment by responding quickly to the patient's needs and providing relevant medical information.
In recent years, using neural networks and language models to solve medical problems has become a viable approach. However, general language models tend to perform poorly in the medical field because they are trained on only small medical datasets and lack the ability to handle rare symptoms and diseases. Recently, large language models (Large Language Models, LLM) have demonstrated that this problem can be alleviated through in-context learning when a small number of few-shot examples are provided. The chain-of-thought (CoT) prompting method solves complex logical problems by decomposing reasoning into multiple thought steps, an approach that can be transferred to complex medical reasoning.
Medical CQA systems require strong medical reasoning capabilities and the ability to take personal patient factors into account, that is, to think both broadly and deeply. At present, even when CoT is used to decompose complex problems, such systems still lack the holistic thinking needed to cover all relevant information, such as the patient's health condition, medical history and current state, which degrades the accuracy of the question-answering replies and the user experience.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention provides a method and a system for medical question answering by using a large language model.
In a first aspect, the present invention provides a method for medical question answering using a large language model, comprising:
acquiring medical dialogue history content of a user;
inputting the medical dialogue history content and a first prompt instruction into a large language model, and decoding to obtain a plurality of first responses based on diversified sampling of the large language model;
inputting the medical dialogue history content and a plurality of second prompt instructions into the large language model to obtain a plurality of second responses respectively corresponding to each second prompt instruction;
inputting the medical dialogue history content, the first responses and the second responses into the large language model, generating reply content of medical dialogue and sending the reply content to the user;
the large language model is trained based on Chinese and English universal text data sets.
Optionally, the content of the first prompt instruction includes:
"Doctor:" or "The doctor may want to:".
Optionally, the content of the second prompting instruction is determined based on the record item of the medical record.
Optionally, the content of the second prompting instruction includes at least one of the following:
what the chief complaint is;
what the past history is;
what the auxiliary examination is;
what the diagnosis is;
what the suggestion is.
In a second aspect, the present invention also provides a system for medical question answering using a large language model, comprising:
the acquisition module is used for acquiring medical dialogue history contents of the user;
the first input module is used for inputting the medical dialogue history content and the first prompt instruction into a large-scale language model, and decoding a plurality of first responses based on diversified sampling of the large-scale language model;
the second input module is used for inputting the medical dialogue history content and a plurality of second prompt instructions into the large language model to obtain a plurality of second responses respectively corresponding to each second prompt instruction;
the reply module is used for inputting the medical dialogue history content, the first responses and the second responses into the large language model, generating reply content of medical dialogue and sending the reply content to the user;
the large language model is trained based on Chinese and English universal text data sets.
Optionally, the content of the first prompt instruction includes:
"Doctor:" or "The doctor may want to:".
Optionally, the content of the second prompting instruction is determined based on the record item of the medical record.
Optionally, the content of the second prompting instruction includes at least one of the following:
what the chief complaint is;
what the past history is;
what the auxiliary examination is;
what the diagnosis is;
what the suggestion is.
In a third aspect, the present invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of using a large language model for medical question-answering as described in the first aspect when executing the program.
In a fourth aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a method for medical question answering using a large language model as described in the first aspect above.
According to the method and the system for medical question answering using a large language model provided by the invention, the medical dialogue history content and a first prompt instruction are input into the large language model, and a plurality of first responses are obtained by decoding based on diversified sampling of the large language model; the medical dialogue history content and a plurality of second prompt instructions are input into the large language model to obtain a plurality of second responses respectively corresponding to each second prompt instruction; and the medical dialogue history content, the first responses and the second responses are then input into the large language model to generate the reply content of the medical dialogue. In this way, the large language model can exploit holistic thinking to improve the depth and breadth of its reasoning, thereby generating more accurate reply content and improving the user experience.
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a method for medical question answering using a large language model provided by the invention;
FIG. 2 is an exemplary diagram of medical question answering through holistic thinking using a large language model provided by the present invention;
FIG. 3 is a diagram illustrating the difference between direct response generation and holistic thinking provided by the present invention;
FIG. 4 is a graph comparing the experimental results of holistic thinking with those of the few-shot method on a plurality of datasets provided by the present invention;
FIG. 5 is a diagram showing the impact of different prompt texts on holistic thinking performance provided by the present invention;
FIG. 6 is a schematic diagram of the results of the HoT ablation experiments provided by the present invention;
FIG. 7 is a schematic diagram of a system for medical question answering using a large language model according to the present invention;
fig. 8 is a schematic structural diagram of an electronic device provided by the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
FIG. 1 is a flow chart of a method for medical question answering using a large language model according to the present invention, as shown in FIG. 1, the method comprises the following steps:
step 100, acquiring medical dialogue history contents of a user.
And step 101, inputting medical dialogue historical contents and first prompt instructions into a large-scale language model, and decoding a plurality of first responses based on diversified sampling of the large-scale language model.
Step 102, inputting the medical dialogue history content and the plurality of second prompt instructions into a large language model to obtain a plurality of second responses respectively corresponding to each second prompt instruction.
And step 103, inputting the medical dialogue history content, the first responses and the second responses into a large language model, generating reply content of the medical dialogue and transmitting the reply content to the user.
The large language model is trained based on Chinese and English universal text data sets.
Specifically, the execution subject of the method may be a system, an apparatus, a device, etc. for medical question answering, and for convenience of description, the system is hereinafter referred to as a medical question answering system.
The large language model used in the invention may be an existing large language model, such as GPT-3, InstructGPT or GLM.
The large language model used in the invention is trained on Chinese and English general text datasets. Such datasets comprise various types of corpora in fields such as Chinese and English news, novels, articles, dialogues, chats, comments and reviews; an existing Chinese and English general text dataset can be used, or one can be constructed as needed, and the invention is not specifically limited in this respect.
When a user asks a medical question on an interactive platform (including but not limited to an application (APP), applet, web page, software client, etc.), the medical question answering system can obtain the user's medical dialogue history content, including the user's question content, the doctor's answer content, and so on.
After obtaining the user's medical dialogue history content, the medical question answering system can input the medical dialogue history content and a first prompt instruction into the large language model. The first prompt instruction is used to trigger the diffused thinking of the large language model, so that, based on the user's medical dialogue history content, the large language model generates a plurality of different medical answers, namely the first responses, through diversified sampling and decoding; the number of first responses to generate can be set as required.
Optionally, the content of the first prompt instruction may include: "Doctor:" or "The doctor may want to:". Of course, the content of the first prompt instruction may be another similarly simple prompt, which is not specifically limited herein.
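By way of illustration only, the diffused-thinking step described above could be sketched in Python roughly as follows. The helper llm_generate stands for any call into the chosen large language model (GPT-3, InstructGPT, GLM, etc.) that accepts a prompt and a sampling temperature; it is a hypothetical placeholder, not an interface defined by the present disclosure, and the parameter values are illustrative.

```python
from typing import Callable, List

def diffused_thinking(dialogue_history: str,
                      llm_generate: Callable[[str, float], str],
                      first_prompt: str = "Doctor:",
                      num_samples: int = 3,
                      temperature: float = 0.9) -> List[str]:
    """Decode several diverse candidate replies (the first responses).

    The same prompt is decoded num_samples times with a non-zero
    temperature so that diversified sampling yields different drafts.
    """
    prompt = f"{dialogue_history}\n{first_prompt}"
    return [llm_generate(prompt, temperature) for _ in range(num_samples)]
```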
After acquiring the medical dialogue history content of the user, the medical question-answering system can also input the medical dialogue history content of the user and a plurality of second prompt instructions into the large language model, wherein each second prompt instruction can be used for generating a corresponding response content, namely a second response. The second response may be understood as a result of focused thinking by the large language model.
Optionally, the content of the second prompt instruction may be determined based on the record items of a medical record so as to generate the patient's medical record information. For example, prompt instructions of the form "What is the ...?" can be used to generate an answer for each medical record item.
Optionally, the record items of the medical record may include the Chief Complaint, the Past History, the Auxiliary Examination, the Diagnosis, the Suggestion, and the like.
Optionally, the content of the second prompt instruction may include at least one of the following (a minimal code sketch of this step is given after the list):
(1) What is the Chief Complaint.
(2) What is the Past History.
(3) What is the Auxiliary Examination.
(4) What is the Diagnosis.
(5) What is the Suggestion.
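A corresponding sketch of the focused-thinking step is given below; the five record items mirror the list above, and llm_generate is again a hypothetical placeholder for the large language model call, with an illustrative temperature of 0 for more deterministic answers.

```python
from typing import Callable, Dict

RECORD_ITEMS = [
    "Chief Complaint",
    "Past History",
    "Auxiliary Examination",
    "Diagnosis",
    "Suggestion",
]

def focused_thinking(dialogue_history: str,
                     llm_generate: Callable[[str, float], str]) -> Dict[str, str]:
    """Generate one short answer (a second response) per medical record item."""
    second_responses = {}
    for item in RECORD_ITEMS:
        prompt = f"{dialogue_history}\nWhat is the {item}?"
        second_responses[item] = llm_generate(prompt, 0.0)
    return second_responses
```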
After the first responses and the second responses are generated, the medical question answering system can input the user's medical dialogue history content, the first responses and the second responses into the large language model to obtain the final reply content, which is then returned to the user.
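The final aggregation step could then be sketched as follows; the exact prompt layout is an assumption made for illustration and is not the wording fixed by the disclosure.

```python
from typing import Callable, Dict, List

def generate_reply(dialogue_history: str,
                   first_responses: List[str],
                   second_responses: Dict[str, str],
                   llm_generate: Callable[[str, float], str]) -> str:
    """Fuse the diffused and focused thinking results into the final reply."""
    candidates = "\n".join(f"Candidate reply {i + 1}: {r}"
                           for i, r in enumerate(first_responses))
    record = "\n".join(f"{item}: {answer}"
                       for item, answer in second_responses.items())
    prompt = (f"{dialogue_history}\n"
              f"Possible doctor replies:\n{candidates}\n"
              f"Medical record summary:\n{record}\n"
              f"Doctor:")
    return llm_generate(prompt, 0.0)
```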
It should be noted that the order of execution of the steps 101 and 102 may not be limited, for example: step 101 may be performed before step 102 is performed, or step 102 may be performed before step 101 is performed.
The invention uses the large language model to carry out medical question and answer, and the whole process can be carried out without any extra supervision data, and only the medical dialogue history content and the related prompt instructions are required to be input.
According to the method for medical question answering using a large language model provided by the invention, the medical dialogue history content and a first prompt instruction are input into the large language model, and a plurality of first responses are obtained by decoding based on diversified sampling of the large language model; the medical dialogue history content and a plurality of second prompt instructions are input into the large language model to obtain a plurality of second responses respectively corresponding to each second prompt instruction; and the medical dialogue history content, the first responses and the second responses are then input into the large language model to generate the reply content of the medical dialogue. In this way, the large language model can exploit holistic thinking to improve the depth and breadth of its reasoning, thereby generating more accurate reply content and improving the user experience.
The present invention will be described in further detail with reference to the following examples.
In the real world, holistic thinking enables doctors to consider the various factors of a patient's condition during consultation and to form a broad and thorough understanding. Without such holistic thinking, incorrect medical advice may be generated and harm the patient. Based on this, the present invention proposes medical question answering through holistic thinking (Holistically Thought, hereinafter HoT) using a large language model.
FIG. 2 is an exemplary diagram of medical question answering through holistic thinking using a large language model according to the present invention. As shown in FIG. 2, the whole process can be divided into three steps: diffused thinking (Diffused Thinking), focused thinking (Focused Thinking), and reply generation (Response Generation). First, diversified decoding is used to obtain different ideas; then, a medical summary is generated through fixed prompts over the medical record (Medical Record); finally, a reply is generated using the medical dialogue history (Medical Dialogue History) and the results of the two kinds of thinking.
In diffused thinking, the goal is to generate a diversity of responses that fit the medical dialogue. FIG. 3 is a diagram illustrating the difference between direct response generation and holistic thinking provided by the present invention. As shown in FIG. 3, the original direct response (i.e., direct generation by the large language model) may lack careful consideration, provide no detailed guidance, and sometimes be factually wrong. The diverse responses produced by diffused thinking can be regarded as candidate responses carrying different ideas. Specifically, a fixed prompt "Doctor:" can be used to make the LLM generate |D| different pieces of content, where the value of |D| can be set as required. The LLM produces different responses under diversified decoding; as shown in FIG. 2, the fixed prompt is used to generate three different responses in this diffused-thinking step. Each result may contain a potential solution that better responds to the medical CQA situation.
The goal of focused thinking is to aggregate key information before generating the dialogue response. This process is similar to medical dialogue summary generation. Conventional medical dialogue summary generation produces the summary content directly, and key information may be missing from the result. The difference in focused thinking is that, for a typical medical record, manually designed prompts to the LLM are used to generate the result for each record item separately; this is easier than generating a complete summary, and the result generated for each medical record item is shorter and more specific. As shown in FIG. 2, given a typical medical record, prompts of the form "What is the ...?" require the LLM to generate an answer for each item. The generated record contains key information about the medical dialogue history, which can serve as a chain of thought for the subsequent reply generation.
Reply generation integrates the results of diffused and focused thinking, which are used to prompt the LLM to obtain the final response. The results of diffused and focused thinking cover two distinct thinking directions, which helps the LLM learn and integrate key information when generating the response. Although the whole holistic thinking process can be carried out under a zero-shot setting, experiments show that it can also be initiated by prompting with only a few examples containing a chain of thought, and it requires neither a large training dataset nor modification of the language model's weights.
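Putting the three steps together, a usage sketch of the whole HoT process might look as follows, assuming the hypothetical diffused_thinking, focused_thinking and generate_reply helpers sketched earlier are in scope.

```python
def holistically_thought_reply(dialogue_history: str, llm_generate) -> str:
    """End-to-end HoT sketch: diffused thinking, focused thinking, reply generation."""
    first_responses = diffused_thinking(dialogue_history, llm_generate)
    second_responses = focused_thinking(dialogue_history, llm_generate)
    return generate_reply(dialogue_history, first_responses, second_responses, llm_generate)
```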
To illustrate the effectiveness of the present invention, experiments were conducted on three different datasets, namely the MedDialog, COVID and CMDD datasets, as shown in Table 1.
Table 1 Comparison of the different datasets
Overall performance comparison: Tables 2 and 3 show the experimental results on multiple datasets. The compared methods include direct generation by the large language model (Direct), CoT prompting (CoT), and a series of fine-tuning-based methods (Fine-tuning); HoT in the tables is the method provided by the invention. The effectiveness of the method is demonstrated on three large language models, namely GPT-3, InstructGPT and GLM; as can be seen from the tables, the method clearly outperforms the baseline methods.
Table 2 Experimental results on the MedDialog and COVID datasets
Table 3 Experimental results on the CMDD dataset
In addition, experiments have led to a series of conclusions:
1. With holistic thinking, zero-shot LLM reasoning can achieve performance approaching or even exceeding that of few-shot reasoning.
In-context learning is a widely used technique for prompting large language models. It uses a small amount of data (called prompts or demonstrations) to improve model performance without fine-tuning its parameters. FIG. 4 is a graph comparing the experimental results of holistic thinking with those of the few-shot method on a plurality of datasets. As shown in FIG. 4, holistic thinking (Holistically Thought) on MedDialog and COVID exceeds the performance of few-shot prompting on three evaluation metrics (BLEU-2, METEOR, NIST-2) for three different LLMs. This suggests that holistic thinking is competitive even compared with manually constructed few-shot prompts, indicating its importance in data-scarce scenarios such as COVID-19. In addition, holistic thinking is more flexible and adapts more readily to new tasks.
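For reference only, the BLEU-2 and NIST-2 metrics mentioned above can be computed with NLTK roughly as sketched below; the reference and hypothesis strings are invented placeholders, not data from the reported experiments.

```python
from nltk.translate.bleu_score import sentence_bleu
from nltk.translate.nist_score import sentence_nist

reference = "please rest at home and monitor your temperature daily".split()
hypothesis = "please monitor your temperature daily and rest at home".split()

# BLEU-2: uniform weights over unigram and bigram precision.
bleu2 = sentence_bleu([reference], hypothesis, weights=(0.5, 0.5))
# NIST-2: information-weighted n-gram precision up to bigrams.
nist2 = sentence_nist([reference], hypothesis, n=2)
print(f"BLEU-2 = {bleu2:.3f}, NIST-2 = {nist2:.3f}")
```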
2. Simple prompts can effectively elicit the holistic thinking ability of the LLM. FIG. 5 is a graph showing the influence of different prompt texts on holistic thinking performance. As shown in FIG. 5, by studying the medical answer generation ability under various prompt templates, it can be found that simple templates such as "#1 Doctor:" or "#2 The doctor may want to:" effectively elicit the holistic thinking ability of the LLM, while the domain prompts #3 to #5 and the advanced text instruction prompts #6 to #8 with specific descriptions perform little better than the simple templates.
3. Both focused and diffused thinking are important to holistic thinking. FIG. 6 is a schematic diagram of the results of the HoT ablation experiments provided by the present invention. As shown in FIG. 6, on the English datasets (MedDialog and COVID), focused thinking alone and diffused thinking alone each have a positive effect compared with direct generation; however, the performance gain is significantly lower than that of the combined HoT. In general, both focused and diffused thinking contribute to holistic thinking. In addition, the experiments show that by exploiting holistic thinking, the LLM can improve the depth and breadth of its reasoning, thereby generating more accurate replies.
The system for performing medical questions and answers using a large language model provided by the invention is described below, and the system for performing medical questions and answers using a large language model described below and the method for performing medical questions and answers using a large language model described above can be referred to correspondingly with each other.
Fig. 7 is a schematic structural diagram of a system for performing medical questions and answers using a large language model according to the present invention, as shown in fig. 7, the system includes:
an acquisition module 700 for acquiring medical dialogue history contents of a user;
the first input module 710 is configured to input the medical dialogue history content and the first prompt instruction into a large language model, and decode a plurality of first responses based on diversified samples of the large language model;
a second input module 720, configured to input the medical dialogue history content and the plurality of second prompt instructions into the large language model, to obtain a plurality of second responses respectively corresponding to each of the second prompt instructions;
a reply module 730, configured to input the medical dialogue history content, the plurality of first responses, and the plurality of second responses into the large language model, generate reply content of the medical dialogue, and send the reply content to the user;
the large language model is trained based on Chinese and English universal text data sets.
Optionally, the content of the first prompt instruction includes:
"Doctor:" or "The doctor may want to:".
Optionally, the content of the second prompting instruction is determined based on the record item of the medical record.
Optionally, the content of the second prompt instruction includes at least one of the following:
what the chief complaint is;
what the past history is;
what the auxiliary examination is;
what the diagnosis is;
what the suggestion is.
It should be noted that, the system provided by the present invention can implement all the method steps implemented by the method embodiment and achieve the same technical effects, and the parts and beneficial effects that are the same as those of the method embodiment in the present embodiment are not described in detail herein.
Fig. 8 is a schematic structural diagram of an electronic device according to the present invention, as shown in fig. 8, the electronic device may include: processor 810, communication interface (Communications Interface) 820, memory 830, and communication bus 840, wherein processor 810, communication interface 820, memory 830 accomplish communication with each other through communication bus 840. Processor 810 may invoke logic instructions in memory 830 to perform any of the methods for medical question answering using a large language model provided in the various embodiments described above.
Furthermore, the logic instructions in the memory 830 described above may be implemented in the form of software functional units and may be stored in a computer-readable storage medium when sold or used as a stand-alone product. Based on this understanding, the technical solution of the present invention, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that, the electronic device provided by the present invention can implement all the method steps implemented by the method embodiments and achieve the same technical effects, and the details and beneficial effects of the same parts and advantages as those of the method embodiments in the present embodiment are not described in detail.
In another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform any of the methods of using a large language model for medical question-answering provided in the above embodiments.
It should be noted that, the non-transitory computer readable storage medium provided by the present invention can implement all the method steps implemented by the method embodiments and achieve the same technical effects, and detailed descriptions of the same parts and beneficial effects as those of the method embodiments in this embodiment are omitted.
The apparatus embodiments described above are merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement this without inventive effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for medical question answering using a large language model, comprising:
acquiring medical dialogue history content of a user;
inputting the medical dialogue history content and a first prompt instruction into a large language model, and decoding to obtain a plurality of first responses based on diversified sampling of the large language model;
inputting the medical dialogue history content and a plurality of second prompt instructions into the large language model to obtain a plurality of second responses respectively corresponding to each second prompt instruction;
inputting the medical dialogue history content, the first responses and the second responses into the large language model, generating reply content of medical dialogue and sending the reply content to the user;
the large language model is trained based on Chinese and English universal text data sets.
2. The method for medical question answering using a large language model according to claim 1, wherein the content of the first prompt instruction includes:
"doctor: "or" the doctor may want to: ".
3. The method for medical question-answering using the large language model according to claim 1, wherein the content of the second prompt instruction is determined based on the recorded item of medical record.
4. The method of claim 3, wherein the content of the second prompt instruction comprises at least one of:
what the chief complaint is;
what the past history is;
what the auxiliary examination is;
what the diagnosis is;
what the suggestion is.
5. A system for medical questions and answers using a large language model, comprising:
the acquisition module is used for acquiring medical dialogue history contents of the user;
the first input module is used for inputting the medical dialogue history content and the first prompt instruction into a large-scale language model, and decoding a plurality of first responses based on diversified sampling of the large-scale language model;
the second input module is used for inputting the medical dialogue history content and a plurality of second prompt instructions into the large language model to obtain a plurality of second responses respectively corresponding to each second prompt instruction;
the reply module is used for inputting the medical dialogue history content, the first responses and the second responses into the large language model, generating reply content of medical dialogue and sending the reply content to the user;
the large language model is trained based on Chinese and English universal text data sets.
6. The system for medical question-answering using a large language model according to claim 5, wherein the content of the first prompt instruction includes:
"doctor: "or" the doctor may want to: ".
7. The system for medical question-answering using a large language model according to claim 5, wherein the content of the second prompt instruction is determined based on the recorded item of medical records.
8. The system for medical question-answering using a large language model according to claim 7, wherein the content of the second prompt instruction includes at least one of:
what the chief complaint is;
what the past history is;
what the auxiliary examination is;
what the diagnosis is;
what the suggestion is.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the method of using a large language model for medical question-answering as claimed in any one of claims 1 to 4.
10. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor, implements a method of medical question answering using a large language model as claimed in any one of claims 1 to 4.
CN202310295002.7A 2023-03-23 2023-03-23 Method and system for medical question answering by using large language model Pending CN116595131A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310295002.7A CN116595131A (en) 2023-03-23 2023-03-23 Method and system for medical question answering by using large language model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310295002.7A CN116595131A (en) 2023-03-23 2023-03-23 Method and system for medical question answering by using large language model

Publications (1)

Publication Number Publication Date
CN116595131A (en) 2023-08-15

Family

ID=87588743

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310295002.7A Pending CN116595131A (en) 2023-03-23 2023-03-23 Method and system for medical question answering by using large language model

Country Status (1)

Country Link
CN (1) CN116595131A (en)

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116737912B (en) * 2023-08-15 2023-10-20 北京中关村科金技术有限公司 Complex problem processing method, device, equipment and medium
CN116737912A (en) * 2023-08-15 2023-09-12 北京中关村科金技术有限公司 Complex problem processing method, device, equipment and medium
CN117574919B (en) * 2023-08-24 2024-05-17 华东师范大学 Stream question-answering template generation method based on large language model instruction fine tuning
CN117574919A (en) * 2023-08-24 2024-02-20 华东师范大学 Stream question-answering template generation method based on large language model instruction fine tuning
CN116913450B (en) * 2023-09-07 2023-12-19 北京左医科技有限公司 Method and device for generating medical records in real time
CN116913450A (en) * 2023-09-07 2023-10-20 北京左医科技有限公司 Method and device for generating medical records in real time
CN116895372B (en) * 2023-09-11 2024-02-09 之江实验室 Intelligent first-aid grading system based on large-scale language model and meta-learning
CN116895372A (en) * 2023-09-11 2023-10-17 之江实验室 Intelligent first-aid grading system based on large-scale language model and meta-learning
CN117217315A (en) * 2023-09-22 2023-12-12 深圳智现未来工业软件有限公司 Method and device for generating high-quality question-answer data by using large language model
CN117216220A (en) * 2023-09-25 2023-12-12 福建实达集团股份有限公司 Use method and device of large language model
CN117216220B (en) * 2023-09-25 2024-06-07 福建实达集团股份有限公司 Use method and device of large language model
CN117076649B (en) * 2023-10-13 2024-01-26 卓世科技(海南)有限公司 Emergency information query method and device based on large model thinking chain
CN117076649A (en) * 2023-10-13 2023-11-17 卓世科技(海南)有限公司 Emergency information query method and device based on large model thinking chain


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination