CN116168796A - Medical image report structured generation method based on visual question and answer

Publication number
CN116168796A
Authority
CN
China
Prior art keywords
state
information
node
tree
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310198891.5A
Other languages
Chinese (zh)
Other versions
CN116168796B (en)
Inventor
周子杰
余宙
俞俊
朱耕蔚
高梓豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Dianzi University
Original Assignee
Hangzhou Dianzi University
Application filed by Hangzhou Dianzi University
Priority to CN202310198891.5A (CN116168796B)
Publication of CN116168796A
Priority to ZA2023/07473A (ZA202307473B)
Application granted
Publication of CN116168796B
Legal status: Active

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/258 Data format conversion from or to a database
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Medical Informatics (AREA)
  • General Health & Medical Sciences (AREA)
  • Epidemiology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Public Health (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Primary Health Care (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention provides a structured medical image report generation method based on visual question answering (VQA). The method comprises four steps: 1. design and modification of the VQA model; 2. design of the "problem-state tree"; 3. automatic information extraction; 4. structured information integration. Built on visual question answering technology and aimed at generating medical image diagnosis reports, the invention strengthens the interaction between questions and the model by designing a series of data structures centred on the "problem-state tree", together with the conversion algorithms between them. The invention reduces, to a certain extent, the randomness of question organization in visual question answering and helps the VQA model obtain more useful information from medical images. A model built with this technique is highly extensible, can cover a broader range of tasks at lower training cost, and in practical use can flexibly generate logically complete, information-rich medical image diagnosis reports for different application scenarios.

Description

Medical image report structured generation method based on visual question and answer
Technical Field
The invention belongs to the technical field of computer vision and relates to a structured medical image report generation method that, based on visual question answering technology, guides a model to mine image information through tree-structured question design and controllable decision logic, and integrates the resulting data automatically.
Background
In the era of big data, the number of images humans acquire, from nature or otherwise, is growing explosively, and the information they contain far exceeds what humans can process unaided. To mine the needed information from this huge volume of images, computer vision technology has developed and is now one of the more mature application areas of artificial intelligence. In the medical field in particular, many clinical procedures depend on imaging. As medical imaging technologies such as X-ray and MRI mature and spread, reliable modern diagnosis has become inseparable from images, and the amount of information worth mining from them keeps growing. Therefore, to ensure the reliability of medical diagnosis and to reduce the burden on doctors, integrating computer vision with medical diagnosis has become an inevitable trend.
Several computer vision techniques already have preliminary applications in medical diagnosis. Conventional medical diagnosis models use machine learning: through supervised learning on a large number of similar images and their labels, they learn to identify a specific disease type from an image. Although fairly accurate, this approach is costly to train and requires building a large-scale data set. Moreover, the model returns only a final judgment, so the features it mined from the medical image cannot be shown to the user, which hinders further diagnosis by the doctor. Visual description is an emerging technique in medical diagnosis that generates an objective summary description of the whole medical image. However, a medical image carries a great deal of information and a single image may contain several lesions; because reports generated by visual description are random to some degree, important information in the image may be insufficiently mined, creating hidden risks for subsequent treatment.
Compared with the above techniques, visual question answering (VQA) is more flexible and targeted. On the one hand, because the training data of a VQA model covers multiple imaging modalities, images of multiple organs and questions on many different aspects, a single model can diagnose different conditions in practice, avoiding the time cost of training many conventional machine learning models. On the other hand, because natural-language questions can be designed by people, the system can search for and extract key information in the image according to the user's needs, and its output is more reliable than that of visual description. Although VQA technology is developing rapidly, limited model accuracy and the randomness of questions mean that mature VQA systems can so far be used only to extract information from conventional images, for example to assist blind people, and are difficult to apply in medical image processing, a field that is highly specialized and has a high degree of information integration.
Disclosure of Invention
To better promote the practical deployment of computer vision in the medical field, the invention provides a structured medical image report generation method based on visual question answering. Built on visual question answering technology and aimed at generating medical image diagnosis reports, the invention strengthens the interaction between questions and the model by designing a series of data structures centred on the "problem-state tree", together with the conversion algorithms between them. Compared with traditional medical decision models, the method reduces, to a certain extent, the randomness of question organization in visual question answering and helps the VQA model obtain more useful information from medical images. A model built with this technique is highly extensible, can cover a broader range of tasks at lower training cost, and in practical use can flexibly generate logically complete, information-rich medical image diagnosis reports for different application scenarios.
The structured medical image report generation method based on visual question answering comprises a VQA model design and modification module, a "problem-state tree" design module, an automatic information extraction module and a structured information integration module.
The VQA model design and modification module is implemented as follows:
First, a medical image VQA model with strong feature extraction and reasoning ability and good generalization must be constructed; concretely, the model should achieve good test results on several current medical VQA data sets. Given any medical image $V$ and question text $Q$ as input, the model outputs a single definite text as the answer $A$; for the VQA model $M(\cdot)$ this is written $A = M(V, Q)$.
A mainstream VQA model basically outputs, as the final answer, the answer with the highest probability in its whole candidate answer space. In real applications the VQA model is limited in accuracy and may well misjudge, returning an answer outside what the user expects, so the invention modifies the conventional VQA model. The modified VQA model $N(\cdot)$ takes as input a given medical image $V$, a given question text $Q$ and a preset candidate answer set $A_p = \{A_1, A_2, \ldots, A_c\}$, and selects the most suitable answer from that candidate set, written $A = N(V, Q, A_p)$. Specifically, the probability that the conventional VQA model assigns to each candidate answer is determined, and the candidate with the largest probability is output as $A$. If the selected answer is not the globally optimal answer given by the model, the decision is flagged in the "problem-state tree" as a warning. Once a suitable model has been selected, reproduced and modified, it serves as a black-box component for the rest of the process; that is, unless otherwise stated, its parameters are not changed.
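The candidate-restricted selection described above can be illustrated with a minimal sketch. It assumes the base model exposes a probability for each answer in its answer space as a plain dictionary; the function name `select_constrained_answer` and the example answers are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple


def select_constrained_answer(
    answer_probs: Dict[str, float], candidates: List[str]
) -> Tuple[str, bool]:
    """Pick the most probable answer inside the candidate set A_p.

    Returns (answer, warn). warn is True when the model's globally most
    probable answer lies outside the candidate set, i.e. the case the
    text says should be flagged in the tree.
    """
    if not candidates:
        raise ValueError("candidate set A_p must not be empty")
    # Candidates missing from the model's answer space count as probability 0.
    best_candidate = max(candidates, key=lambda a: answer_probs.get(a, 0.0))
    global_best = max(answer_probs, key=answer_probs.get)
    return best_candidate, global_best not in candidates


# Hypothetical probabilities produced by a base model M for one (V, Q) pair.
probs = {"yes": 0.15, "no": 0.25, "left lung": 0.60}
answer, warn = select_constrained_answer(probs, ["yes", "no"])
print(answer, warn)
```

In this example the globally best answer ("left lung") is not a valid reply to a yes/no question, so the restricted answer is returned together with a warning flag.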
The 'problem-state tree' design module
The module comprises a pathological information demand analysis unit, a pathological state generation and combination unit, a pathological problem design unit, a tree structure optimization unit and an auxiliary information insertion unit.
Pathological information demand analysis unit
This unit first analyses which information is necessary to generate a medical image diagnosis report. A complete report should derive diagnostic results from observed phenomena; the phenomena can be divided into different levels, corresponding to information at different levels. The invention adopts a bottom-up analysis: starting from the disease type to be determined, it identifies which information is needed and which more basic information that information depends on, so that the information forms a progressive global-to-local relationship.
The invention denotes a piece of image information as $\sigma$ and, following the bottom-up method, collects the information required for different diseases into a set $I = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$, in which $\sigma_1$ to $\sigma_n$ are arranged in order from global to local. Every piece of information must have a corresponding representation in the answer space of the VQA model $N(\cdot)$; that is, when $N(\cdot)$ judges an answer to be the most likely, this means $N(\cdot)$ has obtained the corresponding information from image $V$. In particular, $\sigma_1$ represents global information about the entire image.
Following this approach, the information $I_w$ that doctors need to make a decision on the condition can be analysed, as can the information $I_h$ that images of the different modalities can provide. Their intersection $I_{all} = I_w \cap I_h$ is the main pathological information that the final "problem-state tree" needs to contain.
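As a small worked illustration of the intersection of needed and available information, the two information sets can be modelled as Python sets. The item names below are invented for the example and carry no clinical meaning.

```python
# Hypothetical information a doctor needs for a decision (I_w).
needed_by_doctor = {"lung opacity", "lesion location", "lesion size", "patient history"}

# Hypothetical information a chest X-ray can provide (I_h).
available_from_image = {"lung opacity", "lesion location", "lesion size", "rib fracture"}

# I_all: information that is both needed and obtainable from the image.
i_all = needed_by_doctor & available_from_image
print(sorted(i_all))
```

Items needed but not visible in the image (here "patient history") and items visible but not needed (here "rib fracture") both drop out of the set used to build the tree.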
Pathological state generation and combination unit
This unit integrates the pathological information $I_{all}$ obtained in the previous step into pathological states. Any pathological state is a superposition of several pieces of pathological information, and the more finely a state is divided, the more information it requires. From the pathological information obtained by the pathological information demand analysis unit, a state is formed by adding an increasing subsequence of that information:

$$s_k = \sigma_{a_1} + \sigma_{a_2} + \cdots + \sigma_{a_m}, \qquad 1 \le a_i \le n,\quad 1 \le m \le n,\quad a_1 < a_2 < \cdots < a_m,\quad k = 1, 2, \ldots$$

(Addition between information and states in the invention is not strict mathematical addition; it denotes the integration of information, while subtraction denotes its division.) This yields a pathological state set $S = \{s_1, s_2, \ldots, s_z\}$, where in particular $s_1 = \sigma_1$ represents the initial state. Finer pathological states are usually built on more macroscopic ones, so inclusion relationships between states can be found in $S$. Based on these inclusion relationships, the states can be organized into a tree: with $s_1$ as the root node, the remaining states $s_{i'} \in S$ are divided into mutually disjoint finite sets $T_1, T_2, \ldots, T_l, \ldots, T_g$; the nodes of each $T_l$ other than its root node can again be divided into mutually disjoint finite sets, and so on. In particular, for a subtree $T_{m'}$ contained in $T_{n'}$, if $s_{m'} \in T_{m'}$, $s_{n'} \in T_{n'}$ and $s_{n'}$ is the root node of $T_{n'}$, then $s_{m'} - s_{n'} > 0$; that is, a node always contains more information than its ancestor nodes. In this way a tree $T_{ST}$ can be constructed from the pathological state set $S$, a process written $T_{ST} = Tree(S)$, where $S$ is the set of pathological states; $T_{ST}$ is called the "state tree". To accommodate later adjustments of the tree structure, all leaf nodes are designated feature state nodes and all branch nodes are designated flag state nodes.
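One way to picture the construction of a state tree from inclusion relationships is the sketch below. It is only an illustration under stated assumptions: each state is modelled as a frozenset of information items, and the parent of a state is taken to be its largest proper subset among the other states. The function name `build_state_tree` and the example states are hypothetical.

```python
from typing import Dict, FrozenSet, List

State = FrozenSet[str]


def build_state_tree(states: List[State]) -> Dict[State, List[State]]:
    """Map every state to its child states using set inclusion.

    A state's parent is the largest state that is a proper subset of it,
    so every node holds strictly more information than its ancestors.
    If several candidates have the same size, the first one listed wins.
    """
    children: Dict[State, List[State]] = {s: [] for s in states}
    for s in states:
        ancestors = [p for p in states if p < s]  # proper subsets only
        if ancestors:
            parent = max(ancestors, key=len)
            children[parent].append(s)
    return children


# Hypothetical states; the root holds only the global information.
root = frozenset({"chest X-ray"})
opacity = root | {"opacity present"}
left = opacity | {"left lung"}
normal = root | {"no abnormality"}
states = [root, opacity, left, normal]

tree = build_state_tree(states)
# Leaves are feature state nodes; branch nodes are flag state nodes.
feature_nodes = [s for s in states if not tree[s]]
flag_nodes = [s for s in states if tree[s]]
```

Here the root and the "opacity present" state become flag state nodes, while "left lung" and "no abnormality" are leaves and therefore feature state nodes.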
Pathological problem design unit
Starting from the state tree $T_{ST}$, this unit designs questions and inserts them into $T_{ST}$ to obtain the infrastructure $T_{QS}$ of the "problem-state tree". In the invention, the state tree presents a process of continually refining states from top to bottom. Each transition from a parent state node to a child state node takes a question as its medium, and the answers to a question can lead to several different state transitions. Consider a parent state node with several child state nodes. The set of its child state nodes is partitioned into blocks according to their characteristics, and for each block a question $q_a$ with candidate answer set $A_{pa}$ is generated such that, for $A_a = N(V, q_a, A_{pa})$, there is always a child state node in the block that is reached from the parent state by adding the information $A_a$. To further improve the accuracy of the VQA model, $q_a$ should as far as possible be generated from the information of that block of child state nodes, with suitable additional constraining words, so that the VQA model can find the correct result more easily when searching the answer space. Repeating these steps for all non-leaf nodes (i.e. parent state nodes) of the state tree $T_{ST}$ produces a question set $Q_u$, a process written $Q_u = Question(T_{ST})$. Inserting each problem node between its corresponding state nodes forms the infrastructure $T_{QS}$ of the "problem-state tree" data structure defined in the invention: in the case above, $q_a$ becomes a child node of the parent state node, and the corresponding child state nodes become child nodes of $q_a$. The infrastructure of the "problem-state tree" can be expressed as $T_{QS} = \{Q_u, S\}$, where $Q_u$ is the question set and $S$ is the pathological state set. In $T_{QS}$, the root node and the leaf nodes are all state nodes, the child nodes of a state node are necessarily problem nodes, and the child nodes of a problem node are necessarily state nodes.
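The alternating layout of state nodes and problem nodes can be sketched with two small classes. This is an assumed representation for illustration, not the patent's data format: `StateNode`, `QuestionNode`, `insert_question` and the example question text are all hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class StateNode:
    name: str
    # Children of a state node are always problem (question) nodes.
    questions: List["QuestionNode"] = field(default_factory=list)


@dataclass
class QuestionNode:
    text: str
    # Children of a problem node are always state nodes; each candidate
    # answer in A_pa leads to exactly one child state.
    transitions: Dict[str, StateNode] = field(default_factory=dict)


def insert_question(
    parent: StateNode, text: str, transitions: Dict[str, StateNode]
) -> QuestionNode:
    """Insert a problem node between a parent state and a block of child states."""
    question = QuestionNode(text, dict(transitions))
    parent.questions.append(question)
    return question


root = StateNode("chest X-ray")
opacity = StateNode("opacity present")
clear = StateNode("no opacity")
q = insert_question(
    root, "Is there any opacity in the lungs?", {"yes": opacity, "no": clear}
)
candidate_answers = list(q.transitions)  # plays the role of A_pa
```

Keeping the candidate answers as the keys of `transitions` means the candidate set passed to the restricted VQA model and the possible state transitions can never drift apart.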
Tree-type structure optimizing unit
To maximize the self-checking capability of a model built with this technique, the "problem-state tree" infrastructure $T_{QS}$ can be processed further. For each sub-infrastructure $T_{QSx}$ of the "problem-state tree", $T_{QSx} \subseteq T_{QS}$, and all state nodes contained in $T_{QSx}$ are built on its root node $s_x$; that is, these state nodes all contain the information provided by $s_x$. The invention therefore regards all nodes contained in $T_{QSx}$ as carrying the same label, which in actual use can represent a type of disease that can be judged qualitatively. This is called a label set $L$, initially defined as $L = T_{QS}$, and the root node of $T_{QS}$ is called the label node of the label set $L$. Because inclusion relationships exist among the infrastructures of the "problem-state tree", they also exist among label sets: with the initial definitions $L_x = T_{QSx}$ and $L_y = T_{QSy}$, in the initial state $L_y \subseteq L_x$ and $T_{QSy} \subseteq T_{QSx}$ are each a necessary and sufficient condition for the other. For a sub-label set $L_y$, suppose its corresponding $T_{QSy}$ directly divides $T_{QSx}$, i.e. the root node $s_y$ of $T_{QSy}$ lies directly below the root node $s_x$ of $T_{QSx}$. Then, among all child problem nodes of $s_x$, those whose child nodes are only feature state nodes can be selected. These problem nodes are removed from $T_{QSx}$ together with their child feature state nodes but retained in $T_{QSy}$, attached to $s_y$ and placed immediately to the left of the parent problem node. By executing this Label(·) operation, the invention realizes a process of first checking the symptoms and then making the decision, which strengthens the logic of the generated text and helps the doctor judge whether the diagnosis given by the model is reasonable. Note that the label set $L_x$ finally obtained is no longer identical to $T_{QSx}$, so the result is written $L = Label(T_{QS})$. The invention distinguishes the two because the infrastructure $T_{QS}$ describes the tree structure actually obtained, whereas the label set $L$ describes the logical meaning of that tree structure; the former is more useful for the subsequent checking and text generation, and the latter helps designers of the "problem-state tree" understand the design.
Auxiliary information inserting unit
After the tree structure has been optimized, auxiliary nodes can be inserted into the label set $L$ as required. Auxiliary nodes are of two kinds. The first further mines the information contained in existing feature state nodes and is called a feature state expansion node. The second carries treatment advice, re-examination advice and the like for the condition to which the current label belongs and is called a label supplement node. A feature state expansion node becomes a child node of the corresponding feature state node and may contain several problem nodes and several feature state nodes, which further refine features not fully mined before. A label supplement node is a child node of the corresponding label node and must be placed leftmost, so that once the model has made the corresponding diagnosis, the generated report can give suitable advice on specific findings and support the doctor's further diagnosis and treatment. Because a feature state expansion node is essentially a combination of problem nodes and feature state nodes, its problem nodes can be added to the question set $Q_u$ and its feature state nodes to the pathological state set $S$. Label supplement nodes, by contrast, introduce medical information not previously present, so they are collected in a label supplement node set $E$, and the information contained in each label supplement node $t$ is denoted $\epsilon$. In summary, the "problem-state tree" finally constructed by the invention can be expressed as $T_{QSE} = \{Q_u, S, E\}$.
Information automatic extraction module
After the "problem-state tree" $T_{QSE} = \{Q_u, S, E\}$ has been constructed, it can be searched to generate an information tree $T_{IN}$ automatically. If the "problem-state tree" is regarded as a general template, the "information tree" is a specific instance of that template. The information tree consists of information nodes, and the information $\sigma_b$ that a node contains is used directly to refer to the node. Define $T_{IN} = \{I_s, I_E\}$, where $I_s$ is the set of information taken from state nodes and $I_E = \{\epsilon_1, \epsilon_2, \ldots, \epsilon_h\}$ is the set of information taken from label supplement nodes; these sets satisfy a further condition whose formulas are not reproduced in this text. The construction principle is as follows. Given an image and a "problem-state tree" $T_{QSE} = \{Q_u, S, E\}$, suppose the system is currently in state $s_t \in S$, $q_v$ is a child problem node of $s_t$ that has not yet been traversed, and $s_v \in S$ is a child state node of $q_v$ with $\sigma_v = s_v - s_t$. If $s_v = s_t + A_v$, where $A_v = N(V, q_v, A_{pv})$, i.e. $\sigma_v = A_v$, then the information $\sigma_v$ contained in node $s_v$ is copied into the information tree as a child node of the information node $\sigma_t$ corresponding to $s_t$, the system state changes to $s_v$, and so on. If a label supplement node $t$ is encountered during the traversal, its information $\epsilon$ is copied directly into the information tree as a child node of the most recently generated node. If the current state has no untraversed child problem node, the system returns to the previous state. When information nodes are inserted into the information tree, sibling nodes are arranged from left to right in insertion order.
The process of generating the information tree resembles a preorder traversal of a tree, but differs from the standard algorithm: following the overall running logic of the program, only state nodes and label supplement nodes are retained to generate the information tree, while problem nodes are responsible for the logical guidance of the program. Once a problem node determines that the program state transfers to the state indicated by one answer, the other states are automatically ignored. This design takes the characteristics of different diseases into account and thus avoids too much irrelevant information in the generated text.
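The guided traversal can be sketched as a short recursive function. This is a simplified illustration under several assumptions: the "problem-state tree" is stored as nested dictionaries, the restricted VQA model is replaced by a callable `ask(question, candidates)` that stands in for $N(V, q, A_p)$ with the image held fixed, and all names and example sentences are hypothetical.

```python
from typing import Callable, List


def build_info_tree(state: dict, ask: Callable[[str, List[str]], str]) -> dict:
    """Turn a problem-state tree into an information tree for one image.

    Only state information and label supplement information are kept;
    problem nodes merely decide which child state is visited.
    """
    node = {"info": state["info"], "children": []}
    # Label supplement information is copied directly and placed leftmost.
    for extra in state.get("supplement", []):
        node["children"].append({"info": extra, "children": []})
    for question in state.get("questions", []):
        candidates = list(question["answers"])
        answer = ask(question["text"], candidates)
        # Follow only the branch selected by the answer; ignore the others.
        node["children"].append(build_info_tree(question["answers"][answer], ask))
    return node


# Hypothetical problem-state tree with one label supplement entry.
tree = {
    "info": "Chest X-ray.",
    "questions": [{
        "text": "Is there opacity in the lungs?",
        "answers": {
            "yes": {
                "info": "Opacity is present.",
                "supplement": ["Follow-up imaging is suggested."],
                "questions": [{
                    "text": "Which lung is affected?",
                    "answers": {
                        "left": {"info": "It is in the left lung."},
                        "right": {"info": "It is in the right lung."},
                    },
                }],
            },
            "no": {"info": "No opacity is seen."},
        },
    }],
}

# Scripted answers stand in for the output of the VQA model.
scripted = {"Is there opacity in the lungs?": "yes", "Which lung is affected?": "left"}
info_tree = build_info_tree(tree, lambda q, cands: scripted[q])
```

Because the unselected branch ("No opacity is seen.") is never visited, it leaves no trace in the resulting information tree.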
Structured information integration module
After the information tree $T_{IN}$ is obtained, because it has the basic characteristics of the "tree" data structure, a preorder traversal can be applied to $T_{IN}$ directly, and all the extracted information is concatenated into text, completing the generation of the medical image diagnosis report.
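A preorder traversal that concatenates node text might look like the following sketch. The dictionary layout of the information tree and the example sentences are assumptions made for illustration only.

```python
def preorder_report(node: dict) -> str:
    """Visit the node first, then its children from left to right."""
    parts = [node["info"]]
    for child in node.get("children", []):
        parts.append(preorder_report(child))
    return " ".join(parts)


# Hypothetical information tree produced by the extraction step.
info_tree = {
    "info": "Chest X-ray.",
    "children": [{
        "info": "Opacity is present.",
        "children": [{"info": "It is in the left lung.", "children": []}],
    }],
}

report = preorder_report(info_tree)
print(report)
```

Since parents are emitted before their children, general findings always precede the details that refine them in the generated report.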
The advantages and beneficial effects of the invention are:
1. The invention provides a structured medical image report generation technique that helps bring VQA technology into medical use.
2. The invention designs a "problem-state tree" in which states are generated from the information itself and questions are then generated from the states. This fully exploits the association between questions and states, reduces the blindness and randomness in how visual question answering questions are organized, and helps doctors obtain more accurate and practical information.
3. The invention designs an algorithm that generates an information tree from the "problem-state tree". With little human intervention, it extracts the various kinds of information in a medical image fairly completely and stores and represents them in structured form, which facilitates subsequent comprehensive processing of the information.
4. The "problem-state tree" designed by the invention is simple and flexible in organization, can be changed according to specific requirements and is easy to construct, so a complete medical image diagnosis system can be built in a short time and the training time cost of the artificial intelligence algorithm in application is reduced.
5. The "problem-state tree" designed by the invention has inherent logical rules. First, the rules are reflected in the text generated when traversing the whole tree, so the text is more logical and matches human reading habits; a doctor can also judge from the logic reflected in the text whether the model's decision is correct, which improves the reliability of the model. Second, relationships such as inclusion and causality in these rules are compatible with knowledge graphs, so in implementation the "problem-state tree" can be generated directly by converting a medical knowledge graph, which makes the system convenient to build.
6. The "problem-state tree" designed by the invention is highly extensible: many auxiliary nodes containing diagnosis and treatment suggestions can be added, making the final text report richer and more helpful to doctors' decisions.
Drawings
FIG. 1 is a schematic diagram of the "problem-state tree" infrastructure of the present invention;
FIG. 2 is a schematic diagram of a process for optimizing a "problem-state tree" structure in accordance with the present invention;
FIG. 3 is a schematic diagram of a final "problem-state tree" structure of the present invention;
FIG. 4 is a schematic diagram of an "information tree" generated by the present invention and a diagnostic report generated;
FIG. 5 is a schematic diagram of the execution logic of the VQA model as modified by the present invention.
Detailed Description
The invention is further illustrated by the following detailed description:
The VQA model design and modification in step 1 is implemented as follows:
A currently mature medical image VQA model can be selected as the base model $M(\cdot)$. Its output stage is then changed so that, instead of outputting the result with the highest probability overall, it outputs the result with the highest probability among the candidate options, giving the VQA model $N(\cdot)$. To this end, the model should be extended so that it accepts an image, questions and candidate sets at the same time, where one image corresponds to many questions and each question corresponds to one candidate set. In addition, if the result with the highest overall probability is not in the candidate set, the final result for that question should be flagged to alert the doctor that the model may not have obtained ideal information on that question. The modified structure is shown in FIG. 5.
The specific implementation of the "problem-state tree" design in step 2 is as follows:
The design of the "problem-state tree" should be entrusted to doctors with a professional medical background. After the pathological information demand analysis is completed, in theory the information in the intersection $I_{all}$ is arranged logically from global to local, increasing subsequences are taken and added to obtain the state set $S$, and a "state tree" is constructed from the hierarchical relationships among the states.
In practice, to simplify construction of the "state tree", note that the logical structure of the "problem-state tree" resembles a knowledge graph, so the tree can be generated directly from a medical knowledge graph. Disease entities in the knowledge graph have inclusion and subdivision relationships and can be used directly as flag state nodes of the state tree, while the attributes of entities in the knowledge graph correspond to feature state nodes in the "problem-state tree". With this goal, guided by the $I_w$ obtained from the pathological information demand analysis, trunk nodes can be selected from the medical knowledge graph as root nodes of sub-state trees. Depth-first traversal of the graph from these nodes yields several spanning trees, which are then combined, according to the $I_h$ obtained from the pathological information demand analysis, into a complete state tree, so that a prototype "state tree" can be built quickly. On this basis, the nodes generated from the entities and attributes selected from the knowledge graph can be pruned manually to obtain the required "state tree".
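The depth-first extraction of a spanning tree from a knowledge graph can be sketched as follows. The graph is assumed to be a plain adjacency dictionary of entity names; the function name `dfs_spanning_tree` and the example entities are hypothetical and do not come from any particular medical knowledge graph.

```python
from typing import Dict, List


def dfs_spanning_tree(graph: Dict[str, List[str]], root: str) -> Dict[str, List[str]]:
    """Depth-first traversal that records each node under its first-reached parent."""
    tree: Dict[str, List[str]] = {root: []}
    stack = [root]
    while stack:
        node = stack[-1]
        # Take the next neighbour that is not yet part of the tree.
        nxt = next((n for n in graph.get(node, []) if n not in tree), None)
        if nxt is None:
            stack.pop()  # all neighbours visited, backtrack
        else:
            tree[node].append(nxt)
            tree[nxt] = []
            stack.append(nxt)
    return tree


# Hypothetical fragment of a knowledge graph; "nodule" is reachable twice.
graph = {
    "lung disease": ["pneumonia", "nodule"],
    "pneumonia": ["lobar pneumonia", "nodule"],
    "lobar pneumonia": [],
    "nodule": [],
}
state_tree_prototype = dfs_spanning_tree(graph, "lung disease")
```

An entity reachable along several paths appears only once in the spanning tree, which is one reason the text still calls for manual pruning and review of the prototype.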
After the "state tree" is obtained, questions are designed according to the given pathological problem generation method and inserted into the "state tree", which yields the "problem-state tree" infrastructure. Fig. 1 is a schematic diagram of the "problem-state tree" infrastructure, taking an X-ray picture as an example.
On this basis, the "problem-state tree" infrastructure is optimized and completed in turn. The label supplement nodes can be obtained entirely from the medical knowledge graph, so in actual operation these auxiliary nodes can be retained during pruning. Taking the "problem-state tree" infrastructure of fig. 1 as an example, fig. 2 shows a schematic diagram of optimizing the "problem-state tree".
After the above steps are completed, the final "problem-state tree" is obtained; the result of the optimization shown in fig. 2 is shown in fig. 3.
The information automatic extraction part in the step 3 is specifically implemented as follows:
Step 3 implements the conversion from the "problem-state tree" to the "information tree". Since that process is described in detail in step 3, the emphasis here is on how information is stored and represented in the "information tree". Because the "information tree" is converted from the "problem-state tree" and the diagnostic report is generated from it in step 4, the three must use a unified form of expression. To facilitate report generation in step 4, each information node should store a descriptive prefix or sentence concerning the nature or quantity of what is seen in the image. An example of an "information tree" is shown in the upper half of fig. 4; it is the result obtained by applying step 3 to the "problem-state tree" in fig. 3 and can be used as a reference. To achieve this result in actual operation, the state of each state node should be analyzed while the "problem-state tree" is being designed and the descriptive suffixes or sentences preset, so that the traversal can directly extract information expressed as text and the diagnosis report can finally be generated automatically.
The structural information integration part in the step 4 is specifically implemented as follows:
Starting from the initial node of the information tree, the whole tree is traversed with a preorder traversal algorithm, and the information of each information node is extracted as one clause of the diagnosis conclusion. Keywords are preset in key places; for example, the description "X-ray image" is preset in the corresponding information node, so that the generated report reads more smoothly. The lower half of fig. 4 shows the report generated from the "information tree" in the upper half. For reference, the final generated diagnosis report is "this is an X-ray image, the front chest is photographed, there is a lung lobe solid change, there is a pathological abnormality on the upper right lung, there is a thickening phenomenon of lung texture, it is diagnosed that there is pneumonia, it is recommended that bed rest, a lot of drinking water, there is a bronchus airway sign, it is diagnosed that pneumococcal pneumonia, it is recommended that penicillin medication is used, there is no pleural effusion, and it is not legionella pneumonia; there are fractures, which suggest bed rest, immobilization of joints".
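The report assembly step can be sketched as a plain preorder traversal that joins the preset text of every information node. The class `InfoNode`, the function names and the shortened example tree are assumptions for illustration; the separator and final punctuation are a simplification of the report shown in fig. 4.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class InfoNode:
    text: str  # preset descriptive clause stored in the information node
    children: List["InfoNode"] = field(default_factory=list)


def preorder(node: InfoNode) -> List[str]:
    """Visit the node first, then its children from left to right."""
    clauses = [node.text]
    for child in node.children:
        clauses.extend(preorder(child))
    return clauses


def generate_report(root: InfoNode) -> str:
    """Concatenate all clauses in preorder into one report sentence."""
    return ", ".join(preorder(root)) + "."


# Shortened, illustrative information tree.
root = InfoNode("this is an X-ray image", [
    InfoNode("the front chest is photographed", [
        InfoNode("pneumonia is diagnosed", [InfoNode("bed rest is recommended")]),
        InfoNode("there is no pleural effusion"),
    ]),
])
report = generate_report(root)
```

Because a recommendation node is stored as the leftmost child of its diagnosis node, preorder places the advice directly after the diagnosis it belongs to.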
The above embodiments describe the implementation of the present invention in further detail, but the present invention is not limited to these examples; changes, modifications, additions or substitutions made by those skilled in the art within the technical scope of the present invention also fall within the scope of protection of the present invention.

Claims (7)

1. The medical image report structuring generation method based on visual question and answer is characterized by comprising a VQA model design and transformation module, a 'question-state tree' design module, an information automation extraction module and a structuring information integration module;
the VQA model design and transformation module is specifically realized as follows:
firstly, a VQA model needs to be constructed; for any given medical image V and question text Q taken as input, the model outputs a single determined text as the answer A; for a VQA model $M(\cdot)$, $A = M(V, Q)$ is obtained;
the traditional VQA model is specially adapted: the VQA model $N(\cdot)$ takes a given medical image V, a given question text Q and a preset candidate answer set $A_p = \{A_1, A_2, \ldots, A_c\}$ as input, and then selects the most appropriate answer in the given candidate answer set, which is denoted $A = N(V, Q, A_p)$; the selection is determined by the probability value that the traditional VQA model assigns to each answer, the candidate with the highest probability being selected as the output A; if the finally selected answer is not the globally optimal answer given by the model, the judgment is marked in the "problem-state tree" as a warning;
the problem-state tree design module comprises five parts, namely a pathological information demand analysis unit, a pathological state generation and combination unit, a pathological problem design unit, a tree structure optimization unit and an auxiliary information insertion unit;
the information automatic extraction module is used for: after the construction of the problem-state tree is completed, searching is carried out in the problem-state tree, and an information tree is automatically generated;
the structured information integration module is used for: after the information tree is obtained, since the information tree conforms to the basic characteristics of the tree data structure, a preorder traversal can be performed directly on the information tree $T_{IN}$, and all extracted information is concatenated into text, finally completing the generation of the medical image diagnosis report.
2. The visual question-answering based medical image report structured generation method according to claim 1, wherein the pathological information demand analysis unit is specifically implemented as follows:
the image information is denoted $\sigma$, and the set of information required for different diseases is constructed according to a bottom-up method as $I = \{\sigma_1, \sigma_2, \ldots, \sigma_n\}$, where the information represented by $\sigma_1$ to $\sigma_n$ is arranged in order from global to local; every piece of information has a corresponding representation in the answer space of the VQA model $N(\cdot)$, i.e. when $N(\cdot)$ determines that an answer has the highest probability, $N(\cdot)$ is taken to have acquired the corresponding information from the image V; in particular, $\sigma_1$ should represent the global information of the entire image;
following this idea, the information $I_w$ that doctors need in order to make a decision on the condition is analyzed first; on this basis, the information $I_h$ that images of the different modalities can provide is analyzed; the intersection $I_{all} = I_w \cap I_h$ is then taken, and $I_{all}$ is the main pathological information that the final "problem-state tree" needs to contain.
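As a small worked example of the intersection $I_{all} = I_w \cap I_h$, the following sketch uses hypothetical information items; the item names and the global-to-local ordering list are assumptions chosen only to make the set operation concrete.

```python
# Information a doctor needs to decide on the condition (I_w, illustrative).
needed_by_doctor = {
    "modality", "body part", "lung texture",
    "pleural effusion", "blood oxygen level",
}
# Information a chest X-ray can provide (I_h, illustrative).
available_from_xray = {
    "modality", "body part", "lobar consolidation",
    "lung texture", "pleural effusion",
}

# I_all = I_w ∩ I_h: only information that is both needed and obtainable.
i_all = needed_by_doctor & available_from_xray

# Assumed global-to-local ranking, used to order sigma_1 ... sigma_n.
GLOBAL_TO_LOCAL = [
    "modality", "body part", "lobar consolidation",
    "lung texture", "pleural effusion", "blood oxygen level",
]
i_all_ordered = [item for item in GLOBAL_TO_LOCAL if item in i_all]
```

With these inputs, "blood oxygen level" is dropped because the image cannot provide it, and "lobar consolidation" is dropped because it is not in the assumed demand set.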
3. The visual question-answering based medical image report structured generation method according to claim 2, wherein the pathological state generation and combination unit is specifically implemented as follows:
according to the pathological information $I_{all}$ obtained by the pathological information demand analysis unit, the pathological information $I_{all}$ is integrated into pathological states; any pathological state is a superposition of several pieces of pathological information, and the more finely divided a pathological state is, the more pathological information it requires; the pathological states are calculated from the information obtained by the pathological information demand analysis unit as follows:
[formula image FDA0004108291420000021]
where $1 \le a_i \le n$, $1 \le m \le n$, [formula image FDA0004108291420000022], $k = 1, 2, \ldots$; this gives the pathological state set $S = \{s_1, s_2, \ldots, s_z\}$, in which the pathological state $s_1 = \sigma_1$ represents the initial state; a subdivided pathological state builds on a more macroscopic pathological state, so inclusion relations can be found among the states in the pathological state set S; the different states are then organized into a tree structure according to these inclusion relations, i.e. with $s_1$ as the root node, the remaining states $s_{i'} \in S$ are divided into several mutually disjoint finite sets $T_1, T_2, \ldots, T_l, \ldots, T_g$, and for each $T_l$ the nodes other than its root node can in turn be divided into several mutually disjoint finite sets [formula image FDA0004108291420000023], and so on; in particular, for [formula image FDA0004108291420000024], if there are $s_{m'} \in T_{m'}$ and $s_{n'} \in T_{n'}$, and $s_{n'}$ is the root node of $T_{n'}$, then $s_{m'} - s_{n'} > 0$ should hold, i.e. a node always contains more information than its ancestor nodes; following this idea, the tree $T_{ST}$ is constructed from the pathological state set S, and this process is denoted $T_{ST} = \mathrm{Tree}(S)$, where S is the pathological state set and $T_{ST}$ is referred to as the "state tree"; to accommodate the later adjustment of the tree structure, all leaf nodes are designated feature state nodes and all branch nodes are designated flag state nodes.
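One way to picture $T_{ST} = \mathrm{Tree}(S)$ is to model each state as the set of information items it superposes, so that the inclusion relation becomes the proper-subset relation. This modelling choice, the function names and the example states are assumptions of the sketch; the claim itself defines states through the formula images above.

```python
from typing import Dict, FrozenSet, List

State = FrozenSet[str]


def build_state_tree(states: List[State]) -> Dict[State, List[State]]:
    """Attach every state to its closest ancestor.

    The closest ancestor is taken to be the largest state that is a proper
    subset of the current one. For a state set that really is tree-shaped
    this choice is unique.
    """
    children: Dict[State, List[State]] = {s: [] for s in states}
    for s in states:
        ancestors = [t for t in states if t < s]  # proper subsets only
        if ancestors:
            parent = max(ancestors, key=len)
            children[parent].append(s)
    return children


def node_kind(children: Dict[State, List[State]], s: State) -> str:
    """Leaves are feature state nodes, branch nodes are flag state nodes."""
    return "flag" if children[s] else "feature"


# Illustrative states; s1 plays the role of the initial state sigma_1.
s1 = frozenset({"x-ray"})
s2 = frozenset({"x-ray", "chest"})
s3 = frozenset({"x-ray", "chest", "consolidation"})
s4 = frozenset({"x-ray", "chest", "fracture"})
tree = build_state_tree([s1, s2, s3, s4])
```

Every child strictly contains its parent here, which mirrors the requirement that a node always carries more information than its ancestors.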
4. The visual question-answering based medical image report structured generation method according to claim 3, wherein the pathological problem design unit is specifically implemented as follows:
starting from the state tree $T_{ST}$, questions are designed and inserted into the state tree $T_{ST}$, finally yielding the infrastructure $T_{QS}$ of the "problem-state tree"; from top to bottom the state tree $T_{ST}$ presents a process of continuously refining the state, each state transition from a parent node to a child node needs a question as its medium, and the answers to each question lead to several different possible state transitions; for a parent state node [formula image FDA0004108291420000025] that has several sub-state nodes [formula image FDA0004108291420000026], the set of sub-state nodes of the parent state node [formula image FDA0004108291420000027] is divided into blocks according to their characteristics, and for [formula image FDA0004108291420000031] a question $q_a$ is generated, ensuring that for $A_a = N(V, q_a, A_{pa})$ there exists [formula image FDA0004108291420000032] such that [formula image FDA0004108291420000033] always holds;
to further improve the accuracy of the VQA model, $q_a$ should as far as possible be formulated according to the information of the parent state node [formula image FDA0004108291420000034], with constraining qualifiers added, so that the VQA model can find the correct result more easily when searching the answer space; the above steps are repeated for all non-leaf nodes of the state tree $T_{ST}$ to generate the question set [formula image FDA0004108291420000035], and this process is denoted $Q_u = \mathrm{Question}(T_{ST})$;
inserting the problem nodes between the corresponding state nodes forms the infrastructure $T_{QS}$ of the defined data structure "problem-state tree"; the infrastructure of the "problem-state tree" is specifically represented as $T_{QS} = \{Q_u, S\}$, where $Q_u$ is the question set and S is the pathological state set; it can be seen that the root node and the leaf nodes of $T_{QS}$ are all state nodes, the child nodes of a state node are necessarily problem nodes, and the child nodes of a problem node are necessarily state nodes.
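The structural property stated at the end of this claim (state root, state leaves, strictly alternating levels) can be checked mechanically. The sketch below is only an illustration of that invariant; the `Node` class, the `kind` strings and the example trees are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str   # "state" or "question" (problem node)
    label: str
    children: List["Node"] = field(default_factory=list)


def is_valid_problem_state_tree(root: Node) -> bool:
    """Root and leaves must be state nodes and levels must alternate."""
    if root.kind != "state":
        return False

    def check(node: Node) -> bool:
        if not node.children:
            return node.kind == "state"  # a question may never be a leaf
        expected = "question" if node.kind == "state" else "state"
        return all(c.kind == expected and check(c) for c in node.children)

    return check(root)


valid = Node("state", "x-ray image", [
    Node("question", "Is there lobar consolidation?", [
        Node("state", "consolidation present"),
        Node("state", "no consolidation"),
    ]),
])
# Invalid: a state node placed directly under another state node.
invalid = Node("state", "x-ray image", [Node("state", "consolidation present")])
```

A check of this kind could be run after every manual edit of the tree, since later steps rely on the alternation when they walk from state to question to state.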
5. The visual question-answering based medical image report structuring generation method according to claim 4, wherein the tree structure optimizing unit is specifically implemented as follows:
in order to give the tree good self-checking capability, the "problem-state tree" infrastructure $T_{QS}$ is processed further; for each sub-"problem-state tree" infrastructure [formula image FDA0004108291420000036] there is [formula image FDA0004108291420000037] [formula image FDA0004108291420000038], and all state nodes contained in [formula image FDA0004108291420000039] are built on the basis of its root node [formula image FDA00041082914200000310], i.e. these state nodes all contain the information provided by the root node [formula image FDA00041082914200000311]; therefore all nodes contained in [formula image FDA00041082914200000312] are regarded as carrying the same label, which in actual use represents a certain type of disease that can be judged qualitatively and is referred to here as a label set L, initially defined as $L = T_{QS}$, with the root node of $T_{QS}$ recorded as the label node of the label set L; from the inclusion relations within the "problem-state tree" infrastructure $T_{QS}$ itself it follows that inclusion relations also exist among label sets; with the initial definitions $L_x = T_{QSx}$ and $L_y = T_{QSy}$, in the initial state [formula image FDA00041082914200000313] and [formula image FDA00041082914200000314] are necessary and sufficient conditions for each other; for a sub-label set $L_y$, assume that its corresponding $T_{QSy}$ directly contains $T_{QSx}$, i.e. the root node $s_x$ of $T_{QSx}$ is a sub-state node of the root node $s_y$ of $T_{QSy}$; then those sub-problem nodes of $s_x$ that contain only sub-feature state nodes are selected and removed from $T_{QSx}$ together with their sub-feature state nodes, but they remain in $T_{QSy}$ as sub-nodes of $s_y$ and must be placed immediately to the left of the parent problem node of $s_x$; this operation is denoted $\mathrm{Label}(\cdot)$, and executing it realizes the process of first checking the symptoms and then making the judgment, which strengthens the logic of the generated text and helps the doctor judge whether the diagnosis result given by the model is reasonable; it should be noted that the resulting label set $L_x$ is no longer identical to $T_{QSx}$, so the process can be described as $L = \mathrm{Label}(T_{QS})$.
6. The visual question-answering based medical image report structuring generation method according to claim 5, wherein the auxiliary information inserting unit is specifically implemented as follows:
after the optimization of the tree structure is completed, several auxiliary nodes are inserted into the label set L according to specific requirements; auxiliary nodes are divided into two types: one type further mines the information contained in the existing feature state nodes and is called feature state expansion nodes; the other type provides treatment advice and review advice for the disease to which the current label belongs and is called label supplement nodes; a feature state expansion node should become a sub-node of the corresponding feature state node and may include several problem nodes and several feature state nodes, which further refine features that have not been fully mined previously; a label supplement node is a sub-node of the corresponding label node and needs to be positioned at the leftmost side, so that after the model makes the corresponding diagnosis, suitable advice on specific phenomena can be given in the generated diagnosis report, assisting the doctor's further diagnosis and treatment work; a feature state expansion node is essentially a combination of problem nodes and feature state nodes, so its problem nodes can be added to the question set $Q_u$ and its feature state nodes can be added to the pathological state set S; a label supplement node introduces medical information that was not present before, so these nodes are recorded as the label supplement node set E, and the information contained in each label supplement node t is denoted $\epsilon$; the finally constructed "problem-state tree" can thus be denoted $T_{QSE} = \{Q_u, S, E\}$.
7. The visual question-answering based medical image report structured generation method according to claim 6, wherein the information automatic extraction module is specifically implemented as follows:
after the construction of the "problem-state tree" $T_{QSE} = \{Q_u, S, E\}$ is completed, a search can be carried out in the "problem-state tree" $T_{QSE}$ and the information tree $T_{IN}$ is generated automatically; the information tree is composed of several information nodes, and the information $\sigma_b$ contained in a node is used directly to refer to that information node; define $T_{IN} = \{I_s, I_E\}$, [formula image FDA0004108291420000041], $I_E = \{\epsilon_1, \epsilon_2, \ldots, \epsilon_h\}$, satisfying that for [formula image FDA0004108291420000042], [formula image FDA0004108291420000043], such that [formula image FDA0004108291420000044] holds; the construction principle is as follows: for a given picture and a given "problem-state tree" $T_{QSE} = \{Q_u, S, E\}$, assume that the system is currently in state $s_t \in S$, that $q_v$ is a sub-problem node of $s_t$ that has not yet been traversed, and that $s_v \in S$ is a sub-state node of $q_v$ with $\sigma_v = s_v - s_t$; if at the same time $s_v = s_t + A_v$, where $A_v = N(V, q_v, A_{pv})$, i.e. $\sigma_v = A_v$ is satisfied, then the information $\sigma_v$ contained in the node $s_v$ is copied into the information tree as a sub-node of the information node $\sigma_t$ corresponding to $s_t$, the system state is then changed to $s_v$, and so on; if a label supplement node t is encountered during the traversal, its information $\epsilon$ is copied directly into the information tree as a sub-node of the most recently generated node; if the current state of the system has no sub-problem nodes that have not yet been traversed, the system returns to the previous state; note that when information nodes are inserted into the information tree, multiple sub-nodes should be arranged from left to right in their order of insertion.
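The traversal described in this claim can be sketched as a recursive walk in which a stub function stands in for the VQA model $N(\cdot)$. All class names, the `stub_vqa` lookup table and the example tree are assumptions; in particular, the sketch stores one preset sentence per state and keeps label supplement information in a per-state list that is emitted first, which reflects the leftmost placement of label supplement nodes.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Question:
    text: str
    branches: Dict[str, "State"]  # candidate answer -> sub-state node


@dataclass
class State:
    info: str                                  # preset sentence for sigma
    supplements: List[str] = field(default_factory=list)   # epsilon texts
    questions: List[Question] = field(default_factory=list)


@dataclass
class Info:
    text: str
    children: List["Info"] = field(default_factory=list)


def extract(state: State, vqa: Callable[[str, List[str]], str]) -> Info:
    """Build the information subtree for `state`, depth first."""
    node = Info(state.info)
    for eps in state.supplements:          # leftmost: label supplement info
        node.children.append(Info(eps))
    for q in state.questions:              # then each untraversed question
        answer = vqa(q.text, list(q.branches))
        node.children.append(extract(q.branches[answer], vqa))
    return node                            # returning = back to previous state


ANSWERS = {"Is there lobar consolidation?": "yes",
           "Is there pleural effusion?": "no"}


def stub_vqa(question: str, candidates: List[str]) -> str:
    """Hypothetical stand-in for N(V, q, A_p) on one fixed image."""
    return ANSWERS[question]


root = State("this is an X-ray image", questions=[
    Question("Is there lobar consolidation?", {
        "yes": State("pneumonia is diagnosed",
                     supplements=["bed rest is recommended"],
                     questions=[Question("Is there pleural effusion?", {
                         "yes": State("there is pleural effusion"),
                         "no": State("there is no pleural effusion")})]),
        "no": State("no consolidation is seen"),
    }),
])
info_tree = extract(root, stub_vqa)
```

Children are appended in visiting order, so the left-to-right arrangement of sub-nodes follows the insertion order required above.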

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202310198891.5A CN116168796B (en) 2023-03-03 2023-03-03 Medical image report structured generation method based on visual question and answer
ZA2023/07473A ZA202307473B (en) 2023-03-03 2023-07-27 Method for structured generation of medical imaging reports based on visual question answering

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310198891.5A CN116168796B (en) 2023-03-03 2023-03-03 Medical image report structured generation method based on visual question and answer

Publications (2)

Publication Number Publication Date
CN116168796A true CN116168796A (en) 2023-05-26
CN116168796B CN116168796B (en) 2023-11-10

Family

ID=86418200

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310198891.5A Active CN116168796B (en) 2023-03-03 2023-03-03 Medical image report structured generation method based on visual question and answer

Country Status (2)

Country Link
CN (1) CN116168796B (en)
ZA (1) ZA202307473B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107463786A (en) * 2017-08-17 2017-12-12 王卫鹏 Medical image Knowledge Base based on structured report template
CN111008293A (en) * 2018-10-06 2020-04-14 上海交通大学 Visual question-answering method based on structured semantic representation
CN111326236A (en) * 2020-03-25 2020-06-23 朱利锋 Medical image automatic processing system
CN112309528A (en) * 2020-10-27 2021-02-02 上海交通大学 Medical image report generation method based on visual question-answering method
CN113792177A (en) * 2021-08-05 2021-12-14 杭州电子科技大学 Scene character visual question-answering method based on knowledge-guided deep attention network

Also Published As

Publication number Publication date
ZA202307473B (en) 2024-03-27
CN116168796B (en) 2023-11-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant