CN114882968A - Medical image report generation method and system - Google Patents

Medical image report generation method and system

Info

Publication number
CN114882968A
Authority
CN
China
Prior art keywords
classification
medical image
result
report
image data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210516010.5A
Other languages
Chinese (zh)
Inventor
李阳
刘超然
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai United Imaging Healthcare Co Ltd
Original Assignee
Shanghai United Imaging Healthcare Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai United Imaging Healthcare Co Ltd filed Critical Shanghai United Imaging Healthcare Co Ltd
Priority to CN202210516010.5A
Publication of CN114882968A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 15/00 ICT specially adapted for medical reports, e.g. generation or transmission thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 7/0012 Biomedical image inspection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/764 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/70 Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/82 Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H 30/00 ICT specially adapted for the handling or processing of medical images

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computing Systems (AREA)
  • Public Health (AREA)
  • Databases & Information Systems (AREA)
  • Radiology & Medical Imaging (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Quality & Reliability (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The invention relates to the field of medical imaging and provides a medical image report generation method and system. The method comprises the following steps: acquiring medical image data of a target object; inputting the medical image data into a classification positioning model to obtain a classification positioning result; determining a report generation decision based on the classification positioning result; and acquiring a target report based on the report generation decision. By processing the medical image data with the classification positioning model, the invention helps doctors rapidly determine the type and position of a lesion; based on the classification positioning result and the medical image data, a doctor can decide whether to perform further scans or to stop scanning and begin treatment. Lesions are thus identified quickly without delaying the patient's treatment window, meeting the time and effectiveness requirements of different diseases or cases.

Description

Medical image report generation method and system
Technical Field
The present disclosure relates to the field of medical imaging, and in particular, to a method and a system for generating a medical image report.
Background
Medical imaging is an important technology in modern medicine: it interacts with a target non-invasively through a physical medium to obtain images of the target's internal tissues and organs, assisting doctors in diagnosing and treating diseases. How to obtain medical image examination results quickly, so that a doctor can decide whether to perform additional scans or determine a treatment strategy, is a problem worth addressing.
Therefore, there is a need for a medical image report generation method and system.
Disclosure of Invention
One embodiment of the present specification provides a medical image report generation method. The method comprises the following steps: acquiring medical image data of a target object; inputting medical image data into a classification positioning model to obtain a classification positioning result; determining a report generation decision based on the classified positioning result; and obtaining a target report based on the report generation decision.
In some embodiments, determining a report generation decision based on the classified positioning result comprises: displaying medical image data and a classification positioning result; receiving decision information input by a user; based on the decision information, a report generation decision is determined.
In some embodiments, the decision information includes increasing scanning of the target object or ending scanning of the target object.
In some embodiments, the medical image data is head scan data, or the medical image data is head scan data and physiological signal data.
In some embodiments, the classification localization model is obtained by: obtaining a plurality of training samples; inputting a plurality of training samples into a classification positioning model to obtain a processing result; and adjusting parameters of the classification positioning model.
In some embodiments, inputting a plurality of training samples into the classification positioning model, and obtaining a processing result, includes: encoding input sample data to acquire a plurality of image characteristics; performing attention pooling on the plurality of image features to obtain a lesion classification result; and decoding the plurality of image characteristics to obtain a lesion positioning result.
In some embodiments, performing attention pooling on the plurality of image features to obtain a lesion classification result includes: performing convolution on a plurality of image features to obtain a single-channel feature map; performing pooling operation on the single-channel feature map to obtain a plurality of feature values; acquiring a plurality of characteristic weight values based on the plurality of characteristic values; carrying out weighted summation based on a plurality of characteristic weight values and a plurality of image characteristics to obtain classification characteristics; and acquiring a lesion classification result based on the classification characteristics.
In some embodiments, pooling the single-channel feature map to obtain a plurality of feature values includes: averagely dividing the single-channel feature map into a plurality of areas; and performing average pooling on each of the plurality of regions to obtain a plurality of characteristic values.
In some embodiments, the method further comprises: performing compression excitation processing based on a plurality of image characteristics to obtain channel weight; scaling the classification features based on the channel weight to obtain a scaling result; and obtaining a lesion classification result based on the scaling result.
One embodiment of the present specification provides a medical image report generation system. The system comprises: a medical image data acquisition module, which can be used to acquire medical image data of a target object; a classification positioning result acquisition module, which can be used to input the medical image data into the classification positioning model to obtain a classification positioning result; a report generation decision acquisition module, which can be used to determine a report generation decision based on the classification positioning result; and a target report acquisition module, which can be used to obtain the target report based on the report generation decision.
Another aspect of the present description provides a medical image report generating device comprising at least one storage medium for storing computer instructions and at least one processor; the at least one processor is configured to execute the computer instructions to implement a medical image report generation method as described above.
Another aspect of the present specification provides a computer-readable storage medium storing computer instructions, and when the computer instructions in the storage medium are read by a computer, the computer executes the medical image report generation method.
Drawings
The present description will be further explained by way of exemplary embodiments, which will be described in detail by way of the accompanying drawings. These embodiments are not intended to be limiting, and in these embodiments like numerals refer to like structures, wherein:
FIG. 1 is an exemplary diagram of an application scenario of a medical image report generation system according to some embodiments of the present description;
FIG. 2 is an exemplary flow diagram of a medical image report generation method according to some embodiments of the present description;
FIG. 3 is an exemplary flow diagram of a method of determining report generation decisions, shown in some embodiments herein;
FIG. 4 is an exemplary flow diagram of a method of obtaining a classification-based localization model, according to some embodiments of the present description;
FIG. 5 is an exemplary flow diagram illustrating a method of obtaining processing results in accordance with some embodiments of the present description;
FIG. 6 is an exemplary diagram of a classification location model according to some embodiments of the present description.
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings used in the description of the embodiments will be briefly described below. It is obvious that the drawings in the following description are only examples or embodiments of the present description, and that for a person skilled in the art, the present description can also be applied to other similar scenarios on the basis of these drawings without inventive effort. Unless otherwise apparent from the context, or otherwise indicated, like reference numbers in the figures refer to the same structure or operation.
It should be understood that "system", "apparatus", "unit" and/or "module" as used herein is a method for distinguishing different components, elements, parts, portions or assemblies at different levels. However, other words may be substituted by other expressions if they accomplish the same purpose.
As used in this specification and the appended claims, the singular forms "a," "an," and "the" may also include the plural unless the context clearly dictates otherwise. In general, the terms "comprises" and "comprising" merely indicate that the explicitly identified steps and elements are included; these steps and elements do not form an exclusive list, and a method or apparatus may also include other steps or elements.
Flow charts are used in this description to illustrate operations performed by a system according to embodiments of the present description. It should be understood that these operations are not necessarily performed in the exact order shown. Rather, the steps may be processed in reverse order or simultaneously. Other operations may also be added to these processes, or one or more steps may be removed from them.
Medical imaging technology is currently widely used in clinical diagnosis to help doctors identify diseased parts of patients. In the related art, after a diseased part is scanned, a detailed report must be produced from the scanning result. For example, after a brain scan, a detailed report covering findings such as cerebral ischemia, white matter lesions, space-occupying lesions, skull fracture, and hemorrhage is required, which takes a long time. This delay is detrimental to diagnosis and treatment, especially in emergency scenarios where urgency is paramount; improving the efficiency of the diagnostic process and saving time for diagnosing and treating patients has therefore become a problem that needs to be solved.
The development of deep learning technology has greatly improved image processing in terms of both time and effectiveness. However, the related art often generates image reports of a single type (for example, only one of cerebral ischemia, white matter lesion, space-occupying lesion, etc.) and a single form (for a single category of disease), and obtaining detailed medical image examination results still requires substantial processing time. For example, for cerebral hemorrhage on CT, a detailed examination report needs to include the bleeding position, bleeding volume, edema condition, midline shift, and so on, which requires multiple deep learning networks; this not only increases processing time but also consumes considerable resources to train the multiple networks. In addition, in brain emergency care, different conditions may require different scanning or treatment strategies after the CT scan: some diseases or cases do not need a detailed report and treatment must begin immediately once the lesion type and location are obtained from the current scan, while for others the next scanning or treatment strategy can be determined from a detailed report.
Therefore, some embodiments of the present disclosure provide a medical report generation method and system in which medical image data is processed by a classification positioning model to obtain a classification positioning result for a lesion, shortening the time before a patient's treatment can continue; whether to perform subsequent scans can then be determined from the classification positioning result, so that lesions are identified rapidly while the time requirements of different diseases or cases are met.
It should be noted that the above examples are for illustrative purposes only, and the technical solutions disclosed in the embodiments of the present disclosure are also applicable to other target parts or tissues, such as the chest, the lung, etc., and the present disclosure is not limited thereto. The technical solutions disclosed in the present specification are explained below by the detailed description of the drawings.
Fig. 1 is an exemplary schematic diagram of an application scenario of a medical image report generation system according to some embodiments of the present description.
In some embodiments, the medical image report generation system 100 may be configured to acquire medical image data and process the medical image data through a classification and localization model to obtain a classification and localization result. The classified localization result may include a lesion type and a lesion location in the medical image data. The medical image report generation system 100 may determine a report generation decision based on the classified localization result. The medical image report generation system 100 may obtain a target report based on the report generation decision.
As shown in fig. 1, in some embodiments, medical image report generating system 100 may include imaging device 110, network 120, terminal 130, processing device 140, and storage device 150.
The imaging device 110 may be used to scan a target object to obtain scan data and image. In some embodiments, the imaging device 110 may include a single modality scanner and/or a multi-modality scanner. The single modality scanner may include an ultrasound scanner, an X-ray scanner, a Computed Tomography (CT) scanner, a Magnetic Resonance Imaging (MRI) scanner, an ultrasound tester, a Positron Emission Tomography (PET) scanner, an Optical Coherence Tomography (OCT) scanner, an Ultrasound (US) scanner, an intravascular ultrasound (IVUS) scanner, a near infrared spectroscopy (NIRS) scanner, a Far Infrared (FIR) scanner, or the like, or any combination thereof. The multi-modality scanner may include, for example, an X-ray imaging-magnetic resonance imaging (X-ray-MRI) scanner, a positron emission tomography-X-ray imaging (PET-X-ray) scanner, a single photon emission computed tomography-magnetic resonance imaging (SPECT-MRI) scanner, a positron emission tomography-computed tomography (PET-CT) scanner, a digital subtraction angiography-magnetic resonance imaging (DSA-MRI) scanner, or the like.
In some embodiments, the imaging device 110 may include a gantry 111, a detector 112, a scan region 113, and a scan bed 114. A target object may be placed on the scanning couch 114 to receive a scan. The gantry 111 may be used to support a detector 112. The detector 112 may be used to detect the radiation beam. The scanning region 113 is an imaging region scanned by the radiation beam.
In some embodiments, the detector 112 may include one or more detector cells. In some embodiments, the detector unit may comprise a single row of detectors and/or a plurality of rows of detectors. In some embodiments, the detector unit may include a scintillation detector (e.g., a cesium iodide detector), other detectors, and the like. In some embodiments, the gantry 111 may rotate, for example, in a CT imaging apparatus, the gantry 111 may rotate clockwise or counterclockwise about a gantry rotation axis. In some embodiments, the imaging device 110 may further include a radiation scanning source, which may rotate with the gantry 111. The radiation scanning source may emit a beam of radiation (e.g., X-rays) toward the object of interest, which is attenuated by the object of interest and detected by the detector 112 to generate an image signal. In some embodiments, the scanning bed 114 may be movably disposed in front of the machine and parallel to the ground, the scanning bed 114 may be moved to enter and exit the scanning region 113, and the scanning bed 114 may also be moved in a vertical direction to adjust the distance between the target object on the scanning bed and the detector 112 (or the scanning center) when entering the scanning region 113, so as to scan the target object within the scanning range.
Processing device 140 may process data and/or information obtained from imaging device 110, terminal 130, and/or storage device 150. For example, the processing device 140 may process the signals detected by the imaging device 110 to obtain medical image data. For another example, the processing device 140 may process the medical image data using the classified localization model to obtain a classified localization result. In some embodiments, the processing device 140 may be a single server or a group of servers. The server group may be centralized or distributed. In some embodiments, the processing device 140 may be local or remote. For example, processing device 140 may access information and/or data from imaging device 110, terminal 130, and/or storage device 150 via network 120. As another example, processing device 140 may be directly connected to imaging device 110, terminal 130, and/or storage device 150 to access information and/or data. In some embodiments, the processing device 140 may be implemented on a cloud platform. For example, the cloud platform may include one or a combination of private cloud, public cloud, hybrid cloud, community cloud, distributed cloud, cross-cloud, multi-cloud, and the like.
The terminal 130 may include a mobile device 131, a tablet computer 132, a notebook computer 133, and the like, or any combination thereof. In some embodiments, the terminal 130 may interact with other components in the medical image report generating system 100 through the network 120. For example, the terminal 130 may send one or more control instructions to the imaging device 110 via the network 120 to control the imaging device 110 to scan a target object according to the instructions. For another example, the terminal 130 may also receive the classified positioning result determined by the processing device 140 through the network 120, and display the classified positioning result for analysis and confirmation by the operator, for example, display the classified positioning result to a doctor through the terminal 130 for viewing.
In some embodiments, the terminal 130 may be part of the processing device 140. In some embodiments, the terminal 130 may be integrated with the processing device 140 as an operating console for the imaging device 110. For example, a user/operator (e.g., a doctor or nurse) of the medical image report generating system 100 may control the operation of the imaging device 110 through the console, such as scanning a target object, controlling the movement of the scanning bed 114, and the like.
Storage device 150 may store data (e.g., scan data for a target object), instructions, and/or any other information. In some embodiments, storage device 150 may store data obtained from imaging device 110, terminal 130, and/or processing device 140. For example, the storage device 150 may store treatment plans obtained from the imaging device 110, medical image data of the target object, and the like. In some embodiments, storage device 150 may store data and/or instructions that processing device 140 may execute or use to perform the example methods described herein. In some embodiments, the storage device 150 may include one or a combination of mass storage, removable storage, volatile read-write memory, read-only memory (ROM), and the like. In some embodiments, the storage device 150 may be implemented by a cloud platform as described herein.
In some embodiments, the storage device 150 may be connected to the network 120 to enable communication with one or more components (e.g., the processing device 140, the terminal 130, etc.) in the medical image report generation system 100. One or more components in the medical image report generation system 100 may read data or instructions in the storage device 150 via the network 120. In some embodiments, the storage device 150 may be part of the processing device 140 or may be separate and directly or indirectly coupled to the processing device 140.
Network 120 may include any suitable network capable of facilitating the exchange of information and/or data for medical image report generating system 100. In some embodiments, one or more components of the medical image report generating system 100 (e.g., the imaging device 110, the terminal 130, the processing device 140, the storage device 150, etc.) may exchange information and/or data with one or more components of the medical image report generating system 100 via the network 120. For example, processing device 140 may obtain image data from imaging device 110 via network 120. In some embodiments, network 120 may include one or more network access points. For example, network 120 may include wired and/or wireless network access points, such as base stations and/or internet exchange points, through which one or more components of medical image report generating system 100 may connect to network 120 to exchange data and/or information.
It should be noted that the above description of medical image report generating system 100 is for illustrative purposes only, and is not intended to limit the scope of the present description. Various modifications and adaptations may occur to those skilled in the art in light of this disclosure. However, such changes and modifications do not depart from the scope of the present specification. For example, the imaging device 110, the processing device 140, and the terminal 130 may share one storage device 150, or may have respective storage devices.
In some embodiments, the medical image report generation system 100 may include a medical image data acquisition module, a classified localization results acquisition module, a report generation decision acquisition module, and a target report acquisition module.
In some embodiments, the medical image data acquisition module may be for acquiring medical image data of a target object. For a detailed description, reference may be made to the flow chart section associated with this specification, e.g., detailed description of step 210.
In some embodiments, the classified positioning result obtaining module may be configured to input the medical image data into the classified positioning model, and obtain the classified positioning result. For a detailed description, reference may be made to the flow chart section associated with this specification, e.g., the detailed description of step 220.
In some embodiments, the report generation decision acquisition module may be configured to determine a report generation decision based on the classified positioning result. For a detailed description, reference may be made to the flow chart section associated with this specification, e.g., the detailed description of step 230.
In some embodiments, the target report acquisition module may be configured to acquire the target report based on the report generation decision. For a detailed description, reference may be made to the flow chart section associated with this specification, e.g., the detailed description of step 240.
It should be understood that the system and its modules shown in FIG. 1 may be implemented in a variety of ways. For example, in some embodiments, the system and its modules may be implemented in hardware, software, or a combination of software and hardware. Wherein the hardware portion may be implemented using dedicated logic; the software portions may be stored in a memory for execution by a suitable instruction execution system, such as a microprocessor or specially designed hardware. Those skilled in the art will appreciate that the methods and systems described above may be implemented using computer executable instructions and/or embodied in processor control code, such code being provided, for example, on a carrier medium such as a diskette, CD-or DVD-ROM, a programmable memory such as read-only memory (firmware), or a data carrier such as an optical or electronic signal carrier. The system and its modules in this specification may be implemented not only by hardware circuits such as very large scale integrated circuits or gate arrays, semiconductors such as logic chips, transistors, or programmable hardware devices such as field programmable gate arrays, programmable logic devices, etc., but also by software executed by various types of processors, for example, or by a combination of the above hardware circuits and software (e.g., firmware).
It should be noted that the above descriptions of the medical image report generation system and the modules thereof are only for convenience of description, and should not limit the scope of the present disclosure to the illustrated embodiments. It will be appreciated by those skilled in the art that, given the teachings of the system, any combination of modules or sub-system may be configured to interface with other modules without departing from such teachings. In some embodiments, the medical image data acquisition module, the classification and localization result acquisition module, the report generation decision acquisition module and the target report acquisition module disclosed in fig. 1 may be different modules in one system, or may be a module that implements the functions of two or more modules. For example, each module may share one memory module, and each module may have its own memory module. Such variations are within the scope of the present disclosure.
Fig. 2 is an exemplary flow diagram of a medical image report generation method according to some embodiments of the present description. In some embodiments, flow 200 may be performed by a processing device (e.g., processing device 140). For example, the process 200 may be stored in a storage device (e.g., an onboard storage unit of a processing device or an external storage device) in the form of a program or instructions that, when executed, may implement the process 200. The flow 200 may include the following operations.
Step 210, acquiring medical image data of the target object. In some embodiments, step 210 may be performed by a medical image data acquisition module.
In some embodiments, the target object may include a patient or other medical subject (e.g., a laboratory white mouse or other animal), or the like. In some embodiments, the target object may be a part of the body, such as the head, chest, abdomen, etc., or any combination thereof. In some embodiments, the target object may include a specific organ, such as a heart, esophagus, trachea, bronchi, stomach, gall bladder, small intestine, colon, bladder, ureter, uterus, fallopian tube, and the like.
The medical image data may refer to image data obtained after scanning and imaging of a target object by a medical imaging apparatus. For example, the medical image data may be a CT image, a PET image, a nuclear magnetic resonance image, an ultrasound image, or the like.
In some embodiments, the medical imagery data may include images obtained by various modes of scanning of the medical scanning device. For example, for the CT image, it may be obtained by various scanning modes such as tomographic plain scan, helical scan, and the like by the CT scanning apparatus.
In some embodiments, the medical image data may be head scan data of the target object, and the acquisition modality may include CT plain scanning, MRI scanning, PET scanning, and the like. CT plain scanning can quickly obtain scan data of the target object, reducing scanning time and helping preserve the patient's window for treatment.
In some embodiments, the medical image data may also include physiological signal data of the target object, for example, signal data detected from inside the target object's body that reflect changes in electrical activity, pressure, tension, and blood flow of the living body.
In some embodiments, the medical image data may be a two-dimensional image or a three-dimensional image.
In some embodiments, the processing device 140 may obtain the medical image data by reading from an imaging device, a database, a storage device, and invoking a data interface, among other means.
In some embodiments, the processing device may obtain the medical image data by controlling the imaging device to scan and image the target object.
Step 220, inputting the medical image data into the classification positioning model to obtain a classification positioning result. In some embodiments, step 220 may be performed by the classified positioning result acquisition module.
The classification positioning result may be an output result of the classification positioning model after processing the input medical image data. In some embodiments, the classification localization result may include a type of lesion and a location of the lesion present in the medical image data. The lesion type represents the type of lesion that may be present in the target object or its site/tissue, for example, in the case of a brain, the lesion type may include a skull fracture, a cerebral hemorrhage, a cerebral ischemia, a white matter lesion, a space occupying lesion, and the like. The lesion location represents a location of the lesion in the medical image data, e.g., the lesion location is at the upper right, lower right, middle, etc. of the medical image data.
In some embodiments, the lesion type may be represented in a variety of ways. For example, different numbers or letters may denote different lesion types, such as 1 for skull fracture, 2 for cerebral hemorrhage, 3 for cerebral ischemia, etc.; as another example, abbreviations formed from the initial letters of the lesion names may be used, such as lggz for skull fracture, ncx for cerebral hemorrhage, nqx for cerebral ischemia, and the like.
In some embodiments, the lesion location may be represented in the form of a box or thermodynamic diagram. For example, the lesion position in the medical image data is boxed or the region with the highest thermal value in the thermodynamic diagram corresponds to the lesion position.
The input of the classification positioning model is medical image data, and the output is a classification positioning result. In some embodiments, the types of the classified positioning model may include, but are not limited to, a Neural Network (NN) model, a Convolutional Neural Network (CNN) model, a Recurrent Neural Network (RNN) model, a ResNet model, and various combination models, such as a CNN-transformer model, and the like, which is not limited in this specification. For an exemplary training mode of the classification and localization model, refer to fig. 4 and description thereof, and for an exemplary structural schematic of the classification and localization model, refer to fig. 6 and description thereof, which are not repeated herein.
In some embodiments, the processing device may input the medical image data into the classification positioning model, and output the classification positioning result from the classification positioning model.
A report generation decision is determined based on the classified positioning result, step 230. In some embodiments, step 230 may be performed by a report generation decision acquisition module.
A report generation decision may refer to instructions and/or information indicating how subsequent reports are to be generated. In some embodiments, the subsequent operations may include adding scans of the target object, stopping scanning and beginning treatment, changing the scan mode, and so forth. For example, suppose the current medical image data is obtained by a scout (plain) scan of the brain, which has the advantage of high scanning speed; the report generation decision may then indicate continuing to scan the brain of the target object with the CT device, scanning the brain with MRI instead, or, if the current brain lesion needs immediate treatment, stopping the scan and instructing the radiotherapy device to treat the target object.
In some embodiments, the processing device may determine the report generation decision based on information entered by the physician based on the classified localization results and their corresponding medical image data. For example, the processing device may present the classified localization results and the medical image data to a user (e.g., a doctor) for viewing, receive decision information input by the user, and determine a report generation decision based on the decision information. For more description of the determination manner of the report generation decision, refer to fig. 3 and the description thereof, which are not repeated herein.
Step 240, based on the report generation decision, a target report is obtained. In some embodiments, step 240 may be performed by a target report acquisition module.
The target report may refer to the detailed report that needs to be presented after the target object is scanned. Still taking a brain scan as an example: for a skull fracture, the target report may give the fracture location; for cerebral hemorrhage, it may give the bleeding volume, bleeding position, cerebral edema and midline shift; for cerebral ischemia, depending on the scanning scheme, it may give the ASPECT score and/or high-density feature position and/or low-density area and/or hemorrhagic transformation position and/or CT angiography findings and/or responsible-vessel localization and/or perfusion result and/or collateral circulation condition and/or ischemic penumbra analysis, etc.; for white matter lesions, depending on the scanning scheme, it may give the lesion position and/or perfusion condition and/or lesion grade; for a space-occupying lesion, depending on the scanning scheme, it may give the lesion position and/or the type of tumor or other lesion and/or the perfusion condition.
In some embodiments, according to the report generation decision, when the target object needs further scanning, for example additional scans for cerebral ischemia, white matter lesions, or space-occupying lesions, the processing device may combine the various image data and process them automatically to obtain the target report. The automated processing may include automatic post-processing; for the specific approach, reference may be made to existing automatic report generation methods, which are not described here again.
In some embodiments of the present description, medical image data is processed by the classification positioning model, so the type and position of a lesion can be determined quickly and provided to the user; based on the classification positioning result and the medical image data, the user can decide whether to perform further scans or to stop scanning and treat the patient, shortening the time before treatment can continue. Meanwhile, a more detailed target report can be obtained based on the report generation decision, the current medical image data and/or subsequently acquired scan data. Lesions can thus be identified quickly without delaying the patient's treatment, and the time requirements of different diseases or cases can be met.
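Purely for illustration, the following minimal Python sketch (not part of the claimed embodiments) shows how the four steps 210-240 could fit together, assuming a hypothetical trained classification positioning model that returns a classification result and a localization result, and hypothetical callables for user interaction, extra scanning and report writing:

import torch

def generate_report(model, image, ask_user_decision, run_extra_scan, write_report):
    """Illustrative sketch of steps 210-240; all callables are hypothetical stand-ins."""
    # Step 220: run the classification positioning model on the acquired image (step 210).
    with torch.no_grad():
        lesion_class, lesion_location = model(image.unsqueeze(0))
    # Step 230: show the results to the user and obtain decision information.
    decision = ask_user_decision(image, lesion_class, lesion_location)  # e.g. "more_scans" / "end_scan"
    # Step 240: obtain the target report according to the report generation decision.
    images = [image]
    if decision == "more_scans":
        images.append(run_extra_scan())  # additional scan requested by the doctor
    return write_report(images, lesion_class, lesion_location)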
FIG. 3 is an exemplary flow diagram illustrating a method of determining report generation decisions in accordance with some embodiments of the present description. In some embodiments, flow 300 may be performed by a processing device (e.g., processing device 140). For example, the process 300 may be stored in a storage device (e.g., an onboard storage unit of a processing device or an external storage device) in the form of a program or instructions that, when executed, may implement the process 300. In some embodiments, the flow 300 may be performed by a report generation decision acquisition module. The flow 300 may include the following operations.
And step 310, displaying the medical image data and the classification positioning result.
In some embodiments, the processing device may present the classified positioning result and the medical image data through a display screen of a user terminal (e.g., terminal 130). For example, the processing device may transmit the classification positioning result and the medical image data to the user terminal and display them on a screen of the user terminal.
Step 320, receiving decision information input by a user.
The decision information may refer to content and/or instructions input by the user indicating a subsequent operation.
In some embodiments, the decision information may include increasing the scanning of the target object or ending the scanning of the target object. For example, the decision information may be to continue scanning with a CT plain scan; as another example, to switch to a CT helical scan; as yet another example, to end the scan of the target object. After the scan is ended, the doctor can immediately begin treating the patient, avoiding delay of the treatment window, while the processing device continues the subsequent flow of automatically generating the target report based on the current medical image data to obtain the corresponding target report.
In some embodiments, the user may enter the decision information in the form of touch, gesture operation, or the like. In some embodiments, the user may enter the decision information through an external device such as a keyboard, mouse, microphone, or the like. In some embodiments, the user may enter the decision information in a gesture or expression, among other ways. For example, the processing device may acquire the gesture and expression of the user through an image acquisition device (e.g., a camera) or a sound acquisition device to acquire the decision information input by the user.
Based on the decision information, a report generation decision is determined, step 330.
In some embodiments, the processing device may determine the report generation decision based on the content and/or instructions in the decision information that indicate subsequent operations. For example, when the decision information is to continue scanning, the processing device may determine that the report generation decision is to generate the target report based on the current medical image data together with the newly scanned image data; as another example, when the decision information is to end scanning, the processing device may determine that the report generation decision is to generate the target report based on the current medical image data only.
In some embodiments of the present disclosure, displaying the classification positioning result to the user and determining the report generation decision from the decision information entered by the user makes it possible to meet the time and effectiveness requirements of different diseases or cases. For example, in a head emergency, the scanning or treatment strategy after a plain CT scan may differ for different diseases: some diseases or cases do not need a detailed report and only require the lesion type and location so that the treatment window can be seized, while others require additional scans followed by a more detailed report.
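Purely for illustration, the decision flow of steps 310-330 can be sketched as follows; the decision strings and the ReportDecision structure are assumptions made for this example, not part of the embodiments:

from dataclasses import dataclass
from typing import Optional

@dataclass
class ReportDecision:
    use_current_data: bool       # generate the report from the current medical image data
    extra_scan_mode: Optional[str]  # e.g. "CT_plain", "CT_helical", "MRI", or None to end scanning

def determine_report_decision(decision_info: str) -> ReportDecision:
    # Steps 320/330: turn the decision information entered by the user into a report generation decision.
    if decision_info == "end_scan":
        return ReportDecision(use_current_data=True, extra_scan_mode=None)
    # Otherwise the user asked for more scanning in some mode; the report will later
    # combine the current data with the newly scanned data.
    return ReportDecision(use_current_data=True, extra_scan_mode=decision_info)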
FIG. 4 is an exemplary flow diagram illustrating a method of obtaining a classification localization model according to some embodiments of the present description. In some embodiments, flow 400 may be performed by a processing device (e.g., processing device 140). For example, the process 400 may be stored in a storage device (e.g., an onboard storage unit of a processing device or an external storage device) in the form of a program or instructions that, when executed, may implement the process 400. The flow 400 may include the following operations.
At step 410, a plurality of training samples are obtained.
Training samples may refer to data that can be used to train the classification positioning model. In some embodiments, each training sample may include sample data and corresponding label data. The sample data may be sample medical image data from historical data, and the label data is the corresponding sample classification positioning result from the historical data, which comprises the type and position of the lesion. For the sample classification positioning result in the label data, reference may be made to the related description of fig. 2, which is not repeated here.
In some embodiments, the sample data may include sample medical image data with a disease and sample medical image data without a disease (e.g., medical image data obtained by scanning a normal target object). The mode of training the model by mixing the sample medical image data with the disease and the sample medical image data without the disease can increase the learning capacity of the model and is beneficial to improving the training effect of the model.
In some embodiments, the processing device may obtain the plurality of training samples by reading from the imaging device, the database, the storage device, and invoking a data interface.
Step 420, inputting a plurality of training samples into the classification positioning model to obtain a processing result.
The processing result may be a prediction result output after the classification positioning model processes the input training sample. In some embodiments, the prediction results may include a predicted lesion classification result and a predicted lesion localization result. The predicted lesion classification result may refer to classification of a lesion that may exist in the sample data, such as skull fracture, cerebral hemorrhage, cerebral ischemia, and the like, and the predicted lesion localization result may refer to a location area of the predicted lesion in the sample medical image data. In some embodiments, the predicted lesion localization result may be in the form of a box or a thermodynamic diagram, which is described in detail in step 220 and is not described herein again.
In some embodiments, the processing device may input sample data in the training samples to the classification positioning model, and obtain a processing result. For more description of the obtaining processing result, reference may be made to fig. 5 and the description thereof, which are not described herein again.
Step 430, adjusting parameters of the classification positioning model to reduce the difference between the processing result and the tag data.
Model parameters may refer to configuration variables within the model whose values are estimated from the training data.
In some embodiments, the processing device may construct a loss function based on the processing results of the classification positioning model and the labels, and adjust the model parameters based on the value of the loss function. For example, the processing device may continuously adjust the model parameters with minimization of the loss function value as the optimization target. Model training ends when the loss function of the classification positioning model satisfies a preset condition, yielding the trained classification positioning model; the preset condition may be convergence of the loss function. In some embodiments, the trained classification positioning model may also be obtained when the number of training iterations reaches a preset number, for example 10,000 or 20,000.
Illustratively, the loss function may be expressed as follows.
Loss = Loss_C + λ1 · Loss_R
where Loss denotes the total loss function, Loss_C denotes a first loss function term related to the lesion classification result, Loss_R denotes a second loss function term related to the lesion localization result, and λ1 is the weight of the second loss function term, whose value may be set empirically, for example to 1 or 0.8.
In some embodiments, when the label of the lesion localization result is a bounding box, the second loss function term may be a box-coordinate regression loss; for its specific form, reference may be made to the SSD or YOLO family of networks, which is not described here again. When the label of the lesion localization result is a thermodynamic diagram (heatmap), the predicted heatmap may have size W × H × S × C1, where W, H and S are the length, width and number of layers of the heatmap, respectively, and C1 is the number of localization target classes. Illustratively, there may be 7 target classes, namely background, fracture site, bleeding site, ischemic site, aortic vascular occlusion site, white matter lesion site and space-occupying lesion site. In some embodiments, the heatmap label may be a Gaussian field centered on the lesion.
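Purely for illustration, the following sketch builds such a Gaussian heatmap label in a W × H × S × C1 volume; the shape conventions, the sigma value and the background handling are assumptions made for this example:

import numpy as np

def gaussian_heatmap_label(shape_whs, lesion_center, lesion_class, num_classes=7, sigma=4.0):
    """Build a W x H x S x C1 heatmap label with a Gaussian field at the lesion center.

    shape_whs: (W, H, S) of the heatmap; lesion_center: (x, y, z) voxel coordinates;
    lesion_class: class index in [1, num_classes - 1] (0 is background in this sketch).
    """
    W, H, S = shape_whs
    label = np.zeros((W, H, S, num_classes), dtype=np.float32)
    xs, ys, zs = np.meshgrid(np.arange(W), np.arange(H), np.arange(S), indexing="ij")
    cx, cy, cz = lesion_center
    dist2 = (xs - cx) ** 2 + (ys - cy) ** 2 + (zs - cz) ** 2
    label[..., lesion_class] = np.exp(-dist2 / (2.0 * sigma ** 2))  # Gaussian field centered on the lesion
    label[..., 0] = 1.0 - label.max(axis=-1)                        # crude background channel
    return label

# Example: a 64 x 64 x 16 heatmap with a bleeding-site (class 2) lesion centered at (20, 30, 8)
lbl = gaussian_heatmap_label((64, 64, 16), (20, 30, 8), lesion_class=2)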
In some embodiments, the first loss function term and the second loss function term of the loss function may be various common types of loss functions, such as an absolute value loss function, a log-log loss function, a square loss function, a cross-entropy loss function, and the like, which is not limited in this embodiment.
In some embodiments, the loss function type of the first loss function term may be the same as or different from the loss function type of the second loss function term.
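Purely for illustration, a single training step with the combined loss Loss = Loss_C + λ1 · Loss_R might look as follows, assuming the model returns classification logits and a predicted heatmap, and using binary cross-entropy and mean squared error as example choices for the two terms (the embodiments do not mandate these choices):

import torch
import torch.nn.functional as F

def train_step(model, optimizer, images, class_labels, heatmap_labels, lambda_1=1.0):
    """One optimization step with Loss = Loss_C + lambda_1 * Loss_R (illustrative)."""
    model.train()
    optimizer.zero_grad()
    class_logits, heatmap_pred = model(images)                # processing result for the training samples
    loss_c = F.binary_cross_entropy_with_logits(class_logits, class_labels)  # lesion classification term
    loss_r = F.mse_loss(heatmap_pred, heatmap_labels)         # lesion localization (heatmap) term
    loss = loss_c + lambda_1 * loss_r
    loss.backward()
    optimizer.step()                                          # adjust the classification positioning model parameters
    return loss.item()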
FIG. 5 is an exemplary flow diagram of a method of obtaining processing results, shown in some embodiments herein. In some embodiments, flow 500 may be performed by a processing device (e.g., processing device 140). For example, the process 500 may be stored in a storage device (e.g., an onboard memory unit of a processing device or an external storage device) in the form of a program or instructions that, when executed, may implement the process 500. Flow 500 may include the following operations.
Step 510, encoding the input sample data to obtain a plurality of image features.
Image features may refer to abstract expressions of feature information in sample data. For example, an abstract representation of feature information in sample medical image data. The feature information may include information of gray scale, texture, edge, etc. of the image. In some embodiments, the image features may be represented in the form of a feature matrix or vector representation.
Encoding may refer to producing an abstract expression of the feature information in the sample data. In some embodiments, the processing device may encode the sample data by means of a machine learning model and/or an encoder, among other approaches. For example, the processing device may encode the sample data using a convolutional neural network (e.g., VGG, ResNet) or another neural network (e.g., a CNN combined with a Transformer). Illustratively, the input sample data is brain plain-scan medical image data with original size W × H × S × N, where W, H and S are the length, width and number of slices of the image, respectively (the number of slices is 1 when the sample data is a 2D image and 2 or more when it is a 3D image), and N is the batch size in the training phase. The image feature size after encoding is (W/t) × (H/t) × S × C × N, where t is the down-sampling ratio and C is the number of image features output by the encoding, which may also be referred to as the number of feature channels. It should be noted that this embodiment does not limit the specific form of encoding; the above example is for illustration only.
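Purely for illustration, a small 3D convolutional encoder with a down-sampling ratio of t = 4 in the length and width directions and C = 64 feature channels could be sketched as follows; the layer sizes and the choice of t are assumptions made for this example:

import torch
import torch.nn as nn

class SimpleEncoder(nn.Module):
    """Encode N x 1 x S x H x W input volumes into N x C x S x H/t x W/t features (t = 4 here)."""
    def __init__(self, channels=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, 32, kernel_size=3, stride=(1, 2, 2), padding=1),        # halve H and W, keep slices S
            nn.ReLU(inplace=True),
            nn.Conv3d(32, channels, kernel_size=3, stride=(1, 2, 2), padding=1),  # halve H and W again
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.net(x)

# Example: a batch of 2 plain-scan volumes with 16 slices of 256 x 256 pixels
feats = SimpleEncoder()(torch.randn(2, 1, 16, 256, 256))
print(feats.shape)  # torch.Size([2, 64, 16, 64, 64])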
Step 520, performing attention pooling on the plurality of image features to obtain a lesion classification result.
In some embodiments, attention pooling may be understood as a process of weighting multiple image features. Pooling can be understood as down-sampling, which can reduce the data size of image features; attention may be understood as the degree of attention to various regions of the image feature. The contribution of the features corresponding to the lesion region can be enhanced by attention pooling, so that the location of the lesion region in the medical image data can be more prominent. For example, different regions in the plurality of image features may be given different weights by attention pooling, and the lesion region may get a higher weight, making the importance of the feature of the location more prominent.
For example, the process of the processing device performing attention pooling on a plurality of image features may be as shown in the following embodiments.
The processing device may convolve the plurality of image features to obtain a single-channel feature map. A single-channel feature map may refer to a feature map (image feature) whose number of channels is 1. In some embodiments, the processing device may obtain the single-channel feature map by convolving the plurality of image features with a convolution kernel. For example, following the above example, assume the plurality of image features obtained after encoding have size (W/t) × (H/t) × S × C × N, where t is the down-sampling ratio, C is the number of image features (also referred to as feature maps) output by the encoding, which may also be called the number of feature channels, W, H and S are the length, width and number of layers of the image, respectively, and N is the batch size in the training phase. A convolution kernel of size 1 × 1 × C × 1 may be used, and convolving the plurality of image features with this kernel yields a single-channel feature map of size (W/t) × (H/t) × S × 1 × N.
The processing device can perform a pooling operation on the single-channel feature map to obtain a plurality of feature values. Pooling may refer to numerical processing of the elements within certain regions of the single-channel feature map. In some embodiments, pooling operations may include average pooling, maximum pooling, and the like. Taking average pooling as an example, the processing device may divide the single-channel feature map evenly into a plurality of regions and average-pool each region to obtain a plurality of feature values. For example, a single-channel feature map in a batch may be divided into n × m × s regions; assuming n = 3, m = 3 and s = 3, 27 regions are obtained after equal division, and average pooling each of the 27 regions yields 27 × N feature values (if the map cannot be divided evenly, a sub-pixel approach may be used). Each feature value is the result of average pooling of the elements in its region.
The processing device may obtain a plurality of feature weight values based on the plurality of feature values. A feature weight value may be a normalized weight obtained by processing a feature value through a fully connected network and an activation function. For example, the processing device may pass the 27 × N feature values through one or more fully connected layers and a softmax activation function to obtain 27 × N feature weight values.
The processing device may perform a weighted summation based on the plurality of feature weight values and the plurality of image features to obtain a classification feature. In some embodiments, the processing device may first obtain a weight feature map corresponding to the plurality of image features based on the plurality of feature weight values. For example, the processing device may fill the 27 × N feature weight values back into a map with the same division and the same size as the single-channel feature map (e.g., W × H × S × 1 × N), so that every position in a region carries that region's weight value, thereby obtaining the weight feature map. The weight feature map is then multiplied point-to-point with each channel of the image features output by the encoder, and the products are summed to obtain the classification feature. The classification feature may be represented by a vector; for example, through the above process, the above example finally yields a classification feature vector of size C × N. A sketch of this step is shown below.
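One possible realization of the filling and weighted-summation step is sketched here; nearest-neighbour upsampling of the n × m × s weight grid is an assumed way of filling each region with its weight value.

import torch
import torch.nn.functional as F

N, C, S, H, W = 2, 64, 16, 32, 32
n, m, s = 3, 3, 3
features = torch.randn(N, C, S, H, W)                          # encoder output
feature_weight_values = torch.softmax(torch.randn(N, n * m * s), dim=1)

# Fill the 27 weights back onto the feature grid: every position in a region
# receives that region's weight value.
weight_grid = feature_weight_values.view(N, 1, n, m, s)
weight_map = F.interpolate(weight_grid, size=(S, H, W), mode="nearest")    # N x 1 x S x H x W

# Point-to-point multiplication with every channel, then summation over space,
# gives one classification feature per channel, i.e. a C x N feature vector.
classification_feature = (features * weight_map).sum(dim=(2, 3, 4))        # shape: N x C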
The processing device may obtain a lesion classification result based on the classification feature. In some embodiments, the processing device may input the classification feature into a multi-layer fully connected network followed by an activation function to obtain the lesion classification result. For example, the processing device may input the C × N classification features into the multi-layer fully connected network to obtain 5 × N feature values, and input the output of the multi-layer fully connected network into a sigmoid activation function to obtain 5 × N classification probabilities. A classification probability may indicate the probability that a type of lesion exists in the medical image data; for example, the 5 classification probabilities may respectively correspond to skull fracture, cerebral hemorrhage, cerebral ischemia, white matter lesion, and space-occupying lesion, and when a classification probability is greater than a set threshold, for example 0.5, 0.6, 0.8, or 0.9, the corresponding type of lesion may be considered to exist in the medical image data. It should be understood that the above examples are for illustrative purposes only and do not limit the specific manner of performing attention pooling and obtaining the lesion classification result; for example, the number of classification probabilities may be 6, 7, or 8, and the division of the single-channel feature map is not limited to 3 × 3 × 3.
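The classification head itself might be sketched as follows, assuming the 5 lesion types and the 0.5 threshold mentioned in the example above; the hidden width of the fully connected network is an arbitrary choice for illustration.

import torch
import torch.nn as nn

C, num_lesion_types = 64, 5
classification_feature = torch.randn(2, C)          # C x N classification features

# Multi-layer fully connected network followed by a sigmoid activation.
classifier = nn.Sequential(
    nn.Linear(C, 32),
    nn.ReLU(),
    nn.Linear(32, num_lesion_types),
)
classification_probabilities = torch.sigmoid(classifier(classification_feature))   # 5 x N probabilities

# A lesion type is considered present when its probability exceeds the set threshold.
lesion_present = classification_probabilities > 0.5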
In some embodiments, the processing device may further perform a compression excitation process on the plurality of image features, and obtain a lesion classification result based on a result of the compression excitation process and the classification features. Illustratively, the specific process thereof may be as shown in the following embodiments.
The processing device may perform compression excitation processing based on the plurality of image features to obtain channel weights. Compression excitation processing may refer to a compression (squeeze) operation followed by an excitation operation on the image features. Compression may be a global average pooling of the image features; for example, a W × H × C image feature is compressed into a 1 × 1 × C vector after global average pooling. Excitation may refer to converting the compressed vector into a non-linearly transformed vector by passing it sequentially through fully connected layers and an activation function; for example, a 1 × 1 × C vector passed through the fully connected layers and the activation function outputs another vector of size 1 × 1 × C, and this vector is the channel weights. A channel weight may represent the weight corresponding to each feature map (feature channel) among the plurality of image features.
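A compact squeeze-and-excitation sketch is given below; the reduction ratio of 16 in the bottleneck is a common but assumed choice, and the sigmoid is one possible activation for producing normalized channel weights.

import torch
import torch.nn as nn

N, C, S, H, W = 2, 64, 16, 32, 32
features = torch.randn(N, C, S, H, W)

# Squeeze: global average pooling reduces each feature channel to a single value.
squeezed = features.mean(dim=(2, 3, 4))             # N x C

# Excitation: fully connected layers and an activation turn the squeezed vector
# into one weight per feature channel.
reduction = 16
excitation = nn.Sequential(
    nn.Linear(C, C // reduction),
    nn.ReLU(),
    nn.Linear(C // reduction, C),
    nn.Sigmoid(),
)
channel_weights = excitation(squeezed)              # C x N channel weights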
The processing device may scale the classification features based on the channel weights to obtain a scaling result. Scaling may refer to multiplying the channel weights correspondingly (point-to-point) with the classification features. For example, after a 1 × 1 × C vector is obtained through the excitation processing, the vector may be point-multiplied with the classification feature to obtain the scaling result. For example, after compression excitation processing is performed on the image features of size W × H × S × C × N, channel weights of size C × N can be obtained, and the C × N channel weights can be point-multiplied with the C × N classification features. Compression excitation processing gives different features distinguishing power from one another, so the resulting lesion classification result can be more accurate.
The processing device may obtain a lesion classification result based on the scaling result. In some embodiments, the processing device may obtain the lesion classification result based on the scaling result in the same manner as the lesion classification result based on the classification feature, and further description may refer to the above description, which is not repeated herein.
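The scaling step then reduces to an element-wise product, after which the scaled feature can be fed to the same classification head as before; the following is only an illustrative sketch of that combination.

import torch

C = 64
classification_feature = torch.randn(2, C)          # from attention pooling
channel_weights = torch.rand(2, C)                  # from the compression excitation processing

# Scaling: point multiplication of the C x N channel weights with the
# C x N classification features; the result is passed to the same
# fully connected network and sigmoid used for the unscaled features.
scaling_result = classification_feature * channel_weights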
Step 530, decoding the plurality of image features to obtain a lesion location result.
In some embodiments, the processing device may perform decoding processing on the plurality of image features through a decoder, and obtain the lesion localization result based on the result of the decoding processing. In some embodiments, the decoder may be designed based on a convolutional neural network, and skip connections between the decoder and the encoder may be used to fuse low-level features so as to obtain the lesion localization result.
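By way of a hypothetical sketch only, one decoder stage with a skip connection to the encoder could be written as below; the upsampling mode, channel counts, and single-convolution design are assumptions rather than the claimed decoder.

import torch
import torch.nn as nn
import torch.nn.functional as F

class DecoderBlock(nn.Module):
    # One decoder stage: upsample, fuse the encoder feature via a skip connection, convolve.
    def __init__(self, in_channels, skip_channels, out_channels):
        super().__init__()
        self.conv = nn.Conv3d(in_channels + skip_channels, out_channels, kernel_size=3, padding=1)

    def forward(self, x, skip):
        x = F.interpolate(x, size=skip.shape[2:], mode="trilinear", align_corners=False)
        x = torch.cat([x, skip], dim=1)              # low-level feature fusion
        return F.relu(self.conv(x))

# Example: fuse a deep feature with a shallower encoder feature.
deep = torch.randn(1, 64, 8, 16, 16)
shallow = torch.randn(1, 32, 16, 32, 32)
block = DecoderBlock(64, 32, 16)
decoded = block(deep, shallow)                       # shape: 1 x 16 x 16 x 32 x 32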
It should be noted that the above description of the respective flows is only for illustration and description and does not limit the scope of application of the present specification. Various modifications and changes to the flows described herein will be apparent to those skilled in the art in light of this disclosure. Nevertheless, such modifications and changes remain within the scope of the present specification. For example, the flow steps described herein may be changed, such as by adding a pre-processing step or a storage step.
FIG. 6 is an exemplary diagram of a classification location model according to some embodiments of the present description.
As shown in fig. 6, input data (e.g., medical image data or training samples) is input into the classification and positioning model, and the input data is encoded by the encoder to obtain a plurality of image features. After the image features are decoded by the decoder, a lesion localization result 610 can be obtained; after the attention pooling process, a lesion classification result 620 can be obtained.
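The overall data flow of fig. 6 can be summarized by the following schematic class, where the encoder, decoder, attention pooling, and classifier are placeholders for whichever concrete networks an implementation chooses; it is a structural sketch, not the claimed model.

import torch.nn as nn

class ClassificationPositioningModel(nn.Module):
    # Schematic of fig. 6: one encoder feeds both the decoder branch and the
    # attention pooling / classification branch.
    def __init__(self, encoder, decoder, attention_pooling, classifier):
        super().__init__()
        self.encoder = encoder
        self.decoder = decoder
        self.attention_pooling = attention_pooling
        self.classifier = classifier

    def forward(self, image):
        features = self.encoder(image)                                     # plurality of image features
        lesion_localization = self.decoder(features)                       # localization result 610
        classification_feature = self.attention_pooling(features)
        lesion_classification = self.classifier(classification_feature)    # classification result 620
        return lesion_classification, lesion_localization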
The attention pooling process is shown as 630. In the attention pooling process, the image features are first processed by a 1 × 1 convolution kernel, then passed sequentially through grid average pooling, fully connected (FC) layers, and an activation function to obtain the feature weight values; a weight feature map is then obtained based on the feature weight values, and finally the weight feature map is multiplied point-to-point with each channel of the plurality of image features output by the encoder and summed to obtain the classification feature.
The dashed lines in fig. 6 indicate optional operations that may be omitted in some embodiments. Optionally, in some embodiments, the grid weights may be deeply supervised: a weight gold standard may be constructed for them and a grid weight loss function may be built (for example, a grid weight loss term may be added to the loss function of the model), which can improve the training speed of the model. The weight gold standard is obtained by equally dividing the original medical image data into the same n × m × s regions: if the lesion annotation is a bounding box, the weight gold standard of a region is the area ratio occupied by the box in that region (for example, if the area of the whole image is taken as 1 and the box covers an area of 0.5 within a region, the ratio is 0.5 divided by 1); if the annotation is a heatmap, the heatmap is averaged within each region and the results are normalized. A sketch of one such construction is given below.
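One way the weight gold standard could be computed for a bounding-box annotation is sketched here; treating the box as a binary mask and normalizing the per-region coverage is an interpretation of the area-ratio description above, not a prescribed formula.

import numpy as np

def grid_weight_gold_standard(box_mask, n=3, m=3, s=3):
    # box_mask: binary volume where 1 marks voxels inside the annotated lesion box.
    # Returns n x m x s gold-standard weights: per-region coverage of the box,
    # normalized so the weights sum to 1 (all zeros if there is no lesion).
    D, H, W = box_mask.shape
    weights = np.zeros((n, m, s), dtype=np.float64)
    for i in range(n):
        for j in range(m):
            for k in range(s):
                region = box_mask[i * D // n:(i + 1) * D // n,
                                  j * H // m:(j + 1) * H // m,
                                  k * W // s:(k + 1) * W // s]
                weights[i, j, k] = region.mean() if region.size else 0.0
    total = weights.sum()
    return weights / total if total > 0 else weights

mask = np.zeros((30, 30, 30))
mask[0:10, 0:10, 0:10] = 1.0
gold_standard = grid_weight_gold_standard(mask)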
Optionally, a compression excitation (squeeze-and-excitation, SE) process may be added when obtaining the lesion classification result: the channel weights 640 may be obtained by performing compression excitation processing on the image features, and the channel weights 640 are point-multiplied with the classification features, so that different features gain distinguishing power from one another.
For a detailed description of each part illustrated in fig. 6, reference may be made to a flowchart part of this specification, for example, fig. 2 to 5 and descriptions thereof, which are not described again here.
Having thus described the basic concept, it will be apparent to those skilled in the art that the foregoing detailed disclosure is to be considered illustrative only and not limiting of the present invention. Various modifications, improvements, and adaptations of the present specification may occur to those skilled in the art, although they are not explicitly described herein. Such modifications, improvements, and adaptations are suggested in this specification and thus fall within the spirit and scope of the exemplary embodiments of this specification.
Also, this specification uses specific words to describe its embodiments. Reference to "one embodiment," "an embodiment," and/or "some embodiments" means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the specification. Therefore, it is emphasized that two or more references to "an embodiment," "one embodiment," or "an alternative embodiment" in various places throughout this specification do not necessarily all refer to the same embodiment. Furthermore, particular features, structures, or characteristics of one or more embodiments of the specification may be combined as appropriate.
Additionally, the order in which the elements and sequences of the process are recited in the specification, the use of alphanumeric characters, or other designations, is not intended to limit the order in which the processes and methods of the specification occur, unless otherwise specified in the claims. While various presently contemplated embodiments of the invention have been discussed in the foregoing disclosure by way of example, it is to be understood that such detail is solely for that purpose and that the appended claims are not limited to the disclosed embodiments, but, on the contrary, are intended to cover all modifications and equivalent arrangements that are within the spirit and scope of the embodiments herein. For example, although the system components described above may be implemented by hardware devices, they may also be implemented by software-only solutions, such as installing the described system on an existing server or mobile device.
Similarly, it should be noted that, in the foregoing description of the embodiments of the present specification, various features are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding in the understanding of one or more of the embodiments. This method of disclosure, however, is not to be interpreted as implying that the claimed subject matter requires more features than are expressly recited in each claim. Indeed, an embodiment may have fewer than all of the features of a single embodiment disclosed above.
Some embodiments use numbers to describe quantities of components and attributes. It should be understood that such numbers used in the description of the embodiments are, in some instances, qualified by the modifier "about," "approximately," or "substantially." Unless otherwise indicated, "about," "approximately," or "substantially" indicates that the stated number allows a variation of ±20%. Accordingly, in some embodiments, the numerical parameters used in the specification and claims are approximations that may vary depending on the desired properties of individual embodiments. In some embodiments, numerical parameters should take into account the specified significant digits and employ a general digit-preserving approach. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of this specification are approximations, in specific examples such numerical values are set forth as precisely as practicable.
For each patent, patent application, patent application publication, and other material, such as articles, books, specifications, publications, and documents, cited in this specification, the entire contents thereof are hereby incorporated by reference into this specification, except for any application history document that is inconsistent with or conflicts with the contents of this specification, and except for any document that limits the broadest scope of the claims of this specification (whether currently or later appended to this specification). It should be noted that if there is any inconsistency or conflict between the description, definition, and/or use of a term in the materials accompanying this specification and the contents of this specification, the description, definition, and/or use of the term in this specification shall prevail.
Finally, it should be understood that the embodiments described herein are merely illustrative of the principles of the embodiments of the present disclosure. Other variations are also possible within the scope of the present description. Thus, by way of example, and not limitation, alternative configurations of the embodiments of the specification can be considered consistent with the teachings of the specification. Accordingly, the embodiments of the present description are not limited to only those embodiments explicitly described and depicted herein.

Claims (10)

1. A medical image report generation method, comprising:
acquiring medical image data of a target object;
inputting the medical image data into a classification positioning model to obtain a classification positioning result;
determining a report generation decision based on the classified positioning result;
and acquiring a target report based on the report generation decision.
2. The generation method of claim 1, said determining a report generation decision based on said classified positioning result, comprising:
displaying the medical image data and the classification positioning result;
receiving decision information input by a user;
determining the report generation decision based on the decision information.
3. The generation method according to claim 2, wherein the decision information comprises adding a scan of the target object or ending the scan of the target object.
4. The generation method of claim 1, the classification localization model being obtained by:
obtaining a plurality of training samples;
inputting the training samples into a classification positioning model to obtain a processing result;
and adjusting parameters of the classification positioning model.
5. The generation method of claim 4, wherein the inputting the plurality of training samples into a classification positioning model to obtain a processing result comprises:
encoding input sample data to acquire a plurality of image characteristics;
performing attention pooling on the image features to obtain a lesion classification result;
and decoding the image characteristics to obtain a lesion positioning result.
6. The generation method according to claim 5, wherein the performing attention-pooling on the plurality of image features to obtain a lesion classification result includes:
performing convolution on the plurality of image features to obtain a single-channel feature map;
performing pooling operation on the single-channel feature map to obtain a plurality of feature values;
acquiring a plurality of characteristic weight values based on the plurality of characteristic values;
carrying out weighted summation based on the plurality of feature weight values and the plurality of image features to obtain classification features;
and acquiring a lesion classification result based on the classification characteristic.
7. The generation method of claim 6, wherein the performing a pooling operation on the single-channel feature map to obtain a plurality of feature values comprises:
averagely dividing the single-channel feature map into a plurality of regions;
and performing average pooling on each of the plurality of regions to obtain a plurality of characteristic values.
8. The generation method of claim 6, the method further comprising:
performing compression excitation processing based on the plurality of image characteristics to obtain channel weight;
scaling the classification features based on the channel weights to obtain scaling results;
and obtaining a lesion classification result based on the scaling result.
9. A medical image report generation system, comprising:
a medical image data acquisition module, used for acquiring medical image data of a target object;
a classification positioning result acquisition module, used for inputting the medical image data into a classification positioning model to obtain a classification positioning result;
a report generation decision acquisition module, used for determining a report generation decision based on the classification positioning result;
and a target report acquisition module, used for acquiring a target report based on the report generation decision.
10. A computer-readable storage medium storing computer instructions, wherein when the computer instructions in the storage medium are read by a computer, the computer executes the medical image report generation method according to any one of claims 1 to 8.
CN202210516010.5A 2022-05-12 2022-05-12 Medical image report generation method and system Pending CN114882968A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210516010.5A CN114882968A (en) 2022-05-12 2022-05-12 Medical image report generation method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210516010.5A CN114882968A (en) 2022-05-12 2022-05-12 Medical image report generation method and system

Publications (1)

Publication Number Publication Date
CN114882968A true CN114882968A (en) 2022-08-09

Family

ID=82675587

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210516010.5A Pending CN114882968A (en) 2022-05-12 2022-05-12 Medical image report generation method and system

Country Status (1)

Country Link
CN (1) CN114882968A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424700A (en) * 2022-10-28 2022-12-02 江苏容正医药科技有限公司 Low-temperature plasma system for treating common skin diseases and implementation method thereof

Similar Documents

Publication Publication Date Title
US10984905B2 (en) Artificial intelligence for physiological quantification in medical imaging
US11398029B2 (en) Systems and methods for medical acquisition processing and machine learning for anatomical assessment
JP6567179B2 (en) Pseudo CT generation from MR data using feature regression model
CN110223352B (en) Medical image scanning automatic positioning method based on deep learning
US11534136B2 (en) Three-dimensional segmentation from two-dimensional intracardiac echocardiography imaging
CN109074639B (en) Image registration system and method in medical imaging system
JP2018535732A (en) Pseudo CT generation from MR data using tissue parameter estimation
CN113689342B (en) Image quality optimization method and system
US20150141814A1 (en) Apparatus and method for processing a medical image of a body lumen
US20230086070A1 (en) Image processing methods and systems
CN113384287A (en) Reduced interaction CT scan
CN110881992A (en) Detection and quantification of traumatic bleeding using dual energy computed tomography
US20230290480A1 (en) Systems and methods for clinical target contouring in radiotherapy
CN114882968A (en) Medical image report generation method and system
Lin et al. High-throughput 3dra segmentation of brain vasculature and aneurysms using deep learning
CN113361689A (en) Training method of super-resolution reconstruction network model and scanning image processing method
CN116864051A (en) Dose prediction method, system, device and storage medium
CN108877922A (en) Lesion degree judges system and method
CN114373029A (en) Motion correction method and system for PET image
EP3759685B1 (en) System and method for an accelerated clinical workflow
CN116249480A (en) Medical imaging system and method
Mahmood et al. Rapid segmentation of thoracic organs using u-net architecture
CN111275762B (en) System and method for patient positioning
CN113436236B (en) Image processing method and system
CN112529919B (en) System and method for generating bullseye chart generation of a subject's heart

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination