WO2021027571A1 - Artificial intelligence-based medical image processing method, medical device and storage medium - Google Patents

Artificial intelligence-based medical image processing method, medical device and storage medium

Info

Publication number
WO2021027571A1
WO2021027571A1 PCT/CN2020/105461 CN2020105461W WO2021027571A1 WO 2021027571 A1 WO2021027571 A1 WO 2021027571A1 CN 2020105461 W CN2020105461 W CN 2020105461W WO 2021027571 A1 WO2021027571 A1 WO 2021027571A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
neural network
feature
processing
segmentation
Prior art date
Application number
PCT/CN2020/105461
Other languages
English (en)
French (fr)
Inventor
张富博
魏东
马锴
郑冶枫
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Publication of WO2021027571A1 publication Critical patent/WO2021027571A1/zh
Priority to US17/503,160 priority Critical patent/US11941807B2/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/25Determination of region of interest [ROI] or a volume of interest [VOI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/774Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/048Activation functions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10081Computed x-ray tomography [CT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10072Tomographic images
    • G06T2207/10088Magnetic resonance imaging [MRI]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30004Biomedical image processing
    • G06T2207/30101Blood vessel; Artery; Vein; Vascular

Definitions

  • the present disclosure relates to the field of intelligent medical treatment, and in particular to a medical image processing method, medical equipment and storage medium based on artificial intelligence.
  • Artificial intelligence is a theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.
  • Artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence.
  • Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that machines have the capabilities of perception, reasoning, and decision-making.
  • Artificial intelligence technology can be widely used in traditional medical fields. For example, neural networks can be used to process medical images acquired by medical equipment to perform feature detection faster and more accurately.
  • However, traditional artificial intelligence-based medical image processing methods involve only two-dimensional images and do not make full use of the three-dimensional spatial characteristics of disease-related features, which reduces the accuracy of detection results.
  • the present disclosure provides a medical image processing method based on artificial intelligence, which is used to perform feature detection based on medical prior knowledge and improve the accuracy of the detection result.
  • A medical image processing method based on artificial intelligence is provided, including: processing the medical image to generate an encoded intermediate image representing structural features of the medical image; segmenting the encoded intermediate image according to a first feature to generate a segmented intermediate image; processing the encoded intermediate image and the segmented intermediate image based on an attention mechanism to generate an attention-enhanced detection intermediate input image; and processing the detection intermediate input image to detect a second feature, so as to determine whether the second feature is included in the image content where the first feature is located.
  • An artificial intelligence-based medical device is also provided, including: an image acquisition device configured to acquire medical images; a processor; and a memory, wherein computer-readable code is stored in the memory, and the computer-readable code, when executed by the processor, performs the artificial intelligence-based medical image processing method described above.
  • A computer-readable storage medium having instructions stored thereon is also provided; when the instructions are executed by a processor, the processor performs the artificial intelligence-based medical image processing method described above.
  • With the above arrangement, feature detection can be performed based on the medical prior knowledge that the first feature contains the second feature to be detected. The coding neural network processes the medical image to generate the encoded intermediate image; the segmentation neural network performs the first-feature segmentation, and the detection neural network performs the second-feature detection. During processing, the segmentation neural network and the detection neural network share the encoded intermediate image output by the coding neural network, and the segmented intermediate image output by the segmentation neural network is introduced into the processing of the detection neural network, so that the detection neural network focuses more on the first feature, thereby improving the accuracy of the second-feature detection results.
  • Fig. 1 shows a flowchart of a medical image processing method based on artificial intelligence according to an embodiment of the present disclosure
  • FIG. 2 shows a schematic diagram of the overall flow of a multitasking processing method according to an embodiment of the present disclosure
  • Figure 3 shows a schematic diagram of the overall structure of a multitasking network according to an embodiment of the present disclosure
  • Figure 4 shows a schematic structural diagram of an attention network according to an embodiment of the present disclosure
  • Fig. 5 shows a schematic block diagram of a medical image processing device based on artificial intelligence according to an embodiment of the present disclosure
  • Fig. 6 shows a schematic block diagram of a medical device based on artificial intelligence according to an embodiment of the present disclosure
  • FIG. 7 shows a schematic diagram of the architecture of an exemplary computing device according to an embodiment of the present disclosure
  • Fig. 8 shows a schematic diagram of a computer storage medium according to an embodiment of the present disclosure.
  • the present disclosure provides a medical image processing method based on artificial intelligence, which is used to perform medical image processing by using a multi-task processing network including a coding neural network, a segmentation neural network, and a detection neural network to improve the accuracy of feature detection.
  • Fig. 1 shows a flowchart of a medical image processing method based on artificial intelligence according to an embodiment of the present disclosure.
  • the method in FIG. 1 may be executed by one or more computing devices, such as PCs, servers, server clusters, cloud computing network devices, and so on.
  • In step S101, the medical image is processed to generate an encoded intermediate image.
  • a coding neural network may be used to perform the processing performed in step S101.
  • the coding neural network is a three-dimensional convolutional neural network, that is, the input image of the coding neural network is a three-dimensional image.
  • The coding neural network may include one or more convolutional networks, pooling layers, residual networks, and other structures, which are used to encode the input medical image so as to extract feature images and output one or more encoded intermediate images.
  • The encoded intermediate image in each embodiment may refer to an image of features extracted from the medical image by analyzing it with a preset encoding operation.
  • the encoded intermediate image generated by the encoding neural network based on the three-dimensional image is also a three-dimensional image.
  • The medical image may be a computed tomography angiography (Computed Tomography Angiography, CTA) image.
  • a CT device may be used to obtain an intracranial angiography image as the medical image.
  • the intracranial angiography images obtained by the CT equipment include images in which the intracranial blood vessels are located at different depths, forming a three-dimensional image.
  • The size of the intracranial angiography image can be expressed as 512*512*256, where 512*512 means that the image has 512*512 pixels on a two-dimensional plane, and the image includes a total of 256 layers, corresponding to 256 depth positions.
  • the medical image may also be a magnetic resonance angiography (Magnetic Resonance Angiography, MRA) image.
  • CTA images have the advantages of lower price and faster imaging speed.
  • CTA images are used as the primary means of preliminary screening for intracranial aneurysms in China.
  • the obtained CTA image can also be preprocessed before being input to the coding neural network.
  • For example, the spatial resolution can be unified to $0.5 \times 0.5 \times 0.5\,\mathrm{mm}^3$ through interpolation, and then window truncation is performed, which can be expressed as:

$$I_w = \min\left(\max\left(I, I_{\min}\right), I_{\max}\right)$$

  • where $I_w$ represents the intensity after window truncation processing, $I$ represents the intensity before window truncation processing, and $I_{\min}$ and $I_{\max}$ are the lower and upper bounds of the intensity window.
  • the above processing steps of window truncation are used to adjust the contrast of the obtained CTA image according to the imaging characteristics of the blood vessel tomography image, so as to highlight the blood vessel characteristics.
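  • As an illustration of the preprocessing above, the following sketch resamples a CTA volume to 0.5 mm isotropic spacing and applies window truncation. The HU window bounds (0, 800) and the rescaling to [0, 1] are assumptions for illustration; the text does not specify them.

```python
# Illustrative preprocessing sketch; window bounds are assumed, not from the patent.
import numpy as np
from scipy.ndimage import zoom

def preprocess_cta(volume: np.ndarray, spacing: tuple,
                   target_spacing: float = 0.5,
                   window: tuple = (0.0, 800.0)) -> np.ndarray:
    """Resample a CTA volume to isotropic spacing, then window-truncate."""
    # Interpolate so that each voxel measures target_spacing mm on every axis.
    factors = [s / target_spacing for s in spacing]
    volume = zoom(volume, factors, order=1)
    # Window truncation: clip intensities to the window to highlight vessels.
    lo, hi = window
    volume = np.clip(volume, lo, hi)
    # Rescale to [0, 1] (an assumed normalization step).
    return (volume - lo) / (hi - lo)
```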
  • In step S102, the encoded intermediate image is processed to segment the first feature and generate a segmented intermediate image.
  • The segmented intermediate image may refer to an image in which the regions determined to contain the first feature are marked.
  • This step is to segment the coded intermediate image according to the first feature, thereby generating a segmented intermediate image including the first feature.
  • A segmentation neural network may be used to perform the processing performed in step S102.
  • the segmentation neural network is a three-dimensional convolutional neural network, that is, it can process the input three-dimensional image.
  • the first feature may be a feature of a body organ, such as a brain feature, a heart feature, an artery feature, etc.
  • In step S103, the encoded intermediate image and the segmented intermediate image are processed based on the attention mechanism to generate a detection intermediate input image.
  • The detection intermediate input image refers to an attention-enhanced image generated from the encoded intermediate image and the segmented intermediate image.
  • an attention network can be used to execute the processing performed in step S103.
  • In step S104, the detection intermediate input image is processed to detect the second feature included in the first feature.
  • This step is to perform the second feature detection on the detected intermediate input image, so as to determine whether the second feature is included in the image area where the first feature is located.
  • A detection neural network can be used to perform the processing performed in step S104.
  • the detection neural network may output the detection result of the second feature, where the detection result includes: a prediction frame parameter of the second feature and a prediction probability of the second feature included in the prediction frame.
  • the prediction frame of the second feature refers to the area where the second feature is located in the image.
  • the detection neural network is a three-dimensional convolutional neural network, that is, the input three-dimensional image can be processed.
  • the second feature may be at least one of an aneurysm feature, an arterial vessel wall calcification feature, and an arterial vessel occlusion feature.
  • The overall network structure including the coding neural network, the segmentation neural network, and the detection neural network may be referred to as a multi-task processing network, where the multiple tasks include the segmentation task of segmenting the first feature, performed by the segmentation neural network, and the detection task of detecting the second feature, performed by the detection neural network.
  • Both the segmentation neural network and the detection neural network perform processing based on the feature image output by the coding neural network, that is, the encoded intermediate image. Since the first feature includes the second feature, the segmentation task and the detection task are related.
  • For example, in the case where the first feature is an artery feature and the second feature is an aneurysm feature: an aneurysm is formed when blood flow in an artery strikes a weak part of the vessel wall over a long period; an aneurysm is therefore an abnormal bulge on an artery, and it can only appear on an artery.
  • the aneurysm feature is included in the arterial vascular feature.
  • the above-mentioned segmentation task and the detection task are related, and the processing of the segmentation task helps to improve the accuracy of the detection task.
  • FIG. 2 shows a schematic diagram of the overall flow of a multitasking processing method according to an embodiment of the present disclosure.
  • The method in FIG. 2 may be executed by one or more computing devices, such as PCs, servers, server clusters, cloud computing network devices, and so on.
  • FIG. 2 uses the multi-task method to detect intracranial aneurysm as a specific embodiment.
  • First, the input CTA image can be obtained; for example, an intracranial angiography image of the patient, which includes the artery features and the aneurysm features, can be obtained by the CT equipment.
  • The input CTA image (for example, of size 512*512*256) can be input to the multi-task processing network as a whole, or it can be divided into multiple sub-images that are input to the multi-task processing network separately, reducing the size of the image that needs to be processed at a time and thereby reducing the amount of calculation and increasing the processing rate; this is not limited here.
  • two sets of task processing results can be output, including the artery segmentation result output by the segmentation neural network and the aneurysm detection result output by the detection neural network.
  • In the case of dividing the CTA image into multiple sub-images for processing, for one CTA image the segmentation neural network will output multiple artery segmentation results based on the multiple sub-images.
  • The position parameters of each sub-image within the CTA image may be used to stitch the multiple artery segmentation results into an artery segmentation result corresponding to the entire CTA image, as in the sketch below.
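  • The following sketch illustrates this sub-image workflow: a volume is split into patches, each patch is segmented, and the results are stitched back by position. Non-overlapping patches and a volume whose dimensions are multiples of the patch size are simplifying assumptions, and `model` is a placeholder callable.

```python
# Patch-wise segmentation and stitching sketch.
import numpy as np

def segment_by_patches(volume: np.ndarray, model, patch: int = 128) -> np.ndarray:
    """Run a segmentation model patch-by-patch and stitch results by position."""
    out = np.zeros_like(volume, dtype=np.float32)
    for x in range(0, volume.shape[0], patch):
        for y in range(0, volume.shape[1], patch):
            for z in range(0, volume.shape[2], patch):
                sub = volume[x:x+patch, y:y+patch, z:z+patch]
                # (x, y, z) is the position parameter used to place the
                # sub-image's segmentation result back into the full volume.
                out[x:x+patch, y:y+patch, z:z+patch] = model(sub)
    return out
```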
  • the aneurysm detection result includes, for example, the aneurysm prediction frame parameter and the prediction probability that the aneurysm is contained in the prediction frame.
  • the detection neural network can output the aneurysm prediction frame parameter corresponding to the pixel and the prediction probability of the aneurysm contained in the prediction frame.
  • The frame parameters may include the center-point position coordinates of the prediction frame (i.e., the position coordinates of the pixel in the input image) and the size (such as the side length) of the prediction frame.
  • the non-maximum suppression (NMS) method can be used for processing to obtain the final aneurysm candidate frame.
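  • A minimal sketch of non-maximum suppression for cubic candidate frames follows; frames are (probability, cx, cy, cz, side) tuples, and the 0.1 IoU threshold is an illustrative assumption.

```python
# 3D non-maximum suppression sketch for cubic prediction frames.
def iou_cube(a, b):
    """IoU of two cubes given as (cx, cy, cz, side)."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i] - a[3] / 2, b[i] - b[3] / 2)
        hi = min(a[i] + a[3] / 2, b[i] + b[3] / 2)
        inter *= max(0.0, hi - lo)
    union = a[3] ** 3 + b[3] ** 3 - inter
    return inter / union

def nms(boxes, thresh: float = 0.1):
    """Keep the highest-probability frame, drop overlapping ones, repeat."""
    kept = []
    for box in sorted(boxes, key=lambda b: b[0], reverse=True):
        if all(iou_cube(box[1:], k[1:]) < thresh for k in kept):
            kept.append(box)
    return kept
```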
  • The medical image processing method may further include: displaying a candidate frame on the image containing the first feature, the candidate frame being the prediction frame of the second feature detected by the detection neural network.
  • For example, the predicted aneurysm candidate frame can be visually displayed in the segmented artery feature image, so as to quickly and intuitively display the aneurysms detected in the CTA image.
  • FIG. 2 only uses the artery feature and the aneurysm feature as specific examples of the first feature and the second feature, which does not constitute a limitation to the method of the present disclosure.
  • the method can also be applied to handle other types of features.
  • For example, the second feature may also be the arterial vessel wall calcification feature or the arterial vessel occlusion feature as described above.
  • the first feature may be features such as veins, bones, etc., which will not be listed here.
  • FIG. 3 shows a schematic diagram of the overall structure of a multi-task processing network according to an embodiment of the present disclosure.
  • the medical image processing method according to the present disclosure will be described in detail below in conjunction with FIG. 3.
  • The coding neural network (Encoder) includes M processing layers
  • the segmentation neural network (SegDecoder) includes M processing layers
  • M is a positive integer.
  • The coding neural network and the segmentation neural network include the same number of processing layers, so that the size of the image output by the segmentation neural network can be the same as the size of the input image of the coding neural network.
  • the processing layer includes at least one of a convolutional network, a transposed convolutional network, and a pooling layer.
  • the specific network structure may be the same or different, and they may be arranged according to actual application needs.
  • The structure shown in Figure 3 is only an example structure; processing structures can be added or removed according to actual application requirements.
  • The processing of the medical image using a coding neural network includes: using the first processing layer of the coding neural network to process the medical image and output a first encoded intermediate image; and using the m1-th processing layer of the coding neural network to process the (m1-1)-th encoded intermediate image output by the (m1-1)-th processing layer of the coding neural network and output the m1-th encoded intermediate image, where m1 is a positive integer greater than 1 and less than or equal to M.
  • Each processing layer in the coded neural network may be composed of a pooling layer and a residual block (ResBlock).
  • the pooling layer is used to reduce the image size
  • the residual block in each processing layer can be composed of one or more convolutional networks, normalization functions, and activation functions. It should be noted that the specific structure in each residual block can be the same or different, and there is no limitation here.
  • the first processing layer may also include a three-dimensional convolutional network block, denoted as ConvBlock_V1.
  • Each processing layer is used to extract features and output feature images, that is, encoded intermediate images; see the sketch after this paragraph.
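  • A minimal sketch of one such encoder processing layer follows, with a pooling step and a residual block. The channel count, kernel sizes, and the use of BatchNorm/ReLU are illustrative assumptions; the exact configuration of Fig. 3 is not reproduced in this text.

```python
# Encoder processing layer sketch: pooling to reduce size, then a residual block.
import torch.nn as nn

class ResBlock3d(nn.Module):
    """Residual block: two 3D convolutions with a skip connection."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),   # normalization function
            nn.ReLU(inplace=True),      # activation function
            nn.Conv3d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm3d(channels),
        )
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.body(x) + x)   # residual connection

class EncoderLayer(nn.Module):
    """One processing layer: pooling halves the image size, ResBlock extracts features."""
    def __init__(self, channels: int):
        super().__init__()
        self.pool = nn.MaxPool3d(kernel_size=2)  # reduces the image size
        self.res = ResBlock3d(channels)

    def forward(self, x):
        return self.res(self.pool(x))
```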
  • Processing the encoded intermediate image using a segmentation neural network and performing image segmentation according to the first feature to generate the segmented intermediate image includes: using the first processing layer of the segmentation neural network to process the M-th encoded intermediate image output by the M-th processing layer of the coding neural network and output the first segmented intermediate image; and using the m2-th processing layer of the segmentation neural network to process the (m2-1)-th segmented intermediate image output by the (m2-1)-th processing layer of the segmentation neural network together with the (M-m2+1)-th encoded intermediate image output by the (M-m2+1)-th processing layer of the coding neural network, outputting the m2-th segmented intermediate image, where m2 is a positive integer greater than 1 and less than or equal to M.
  • For example, the first processing layer may include a transposed convolutional network (TConvBlock_S4), which receives the fourth encoded intermediate image output by the fourth processing layer of the coding neural network and outputs the first segmented intermediate image.
  • the transposed convolutional network can be used to process the input feature image and increase the image size.
  • The first segmented intermediate image and the third encoded intermediate image have the same image size.
  • For two images of size a*b*c, the concatenated image can be expressed as a*b*2c; that is, the two images are concatenated into one image by increasing the number of channels, which differs from adding the corresponding values of two images together.
  • The second processing layer may process the concatenated image and output a second segmented intermediate image.
  • The processing of the subsequent layers is similar to that of the second processing layer, and will not be repeated here; a sketch of one such decoder layer follows.
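  • The following sketch shows one segmentation-decoder processing layer under these assumptions: a transposed 3D convolution upsamples the previous segmented intermediate image, which is then channel-concatenated with the same-sized encoded intermediate image. Channel counts are illustrative.

```python
# Segmentation decoder layer sketch: transposed convolution + channel concatenation.
import torch
import torch.nn as nn

class SegDecoderLayer(nn.Module):
    def __init__(self, in_ch: int, skip_ch: int, out_ch: int):
        super().__init__()
        # Transposed convolution increases the image size (doubles it here).
        self.up = nn.ConvTranspose3d(in_ch, skip_ch, kernel_size=2, stride=2)
        self.conv = nn.Sequential(
            nn.Conv3d(2 * skip_ch, out_ch, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )

    def forward(self, prev_seg, encoded):
        x = self.up(prev_seg)
        # Channel concatenation: (a, b, c) with (a, b, c) gives channels 2c.
        x = torch.cat([x, encoded], dim=1)
        return self.conv(x)
```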
  • $q_i$ represents the probability that the i-th pixel in the medical image is an artery.
  • the output result of the segmentation neural network is the probability that each pixel in the medical image is an artery, which is used as the segmentation result.
  • The pixels whose $q_i$ is greater than the preset segmentation threshold may be determined as arteries.
  • the preset threshold for segmentation is not specifically limited, and may be dynamically set based on actual applications.
  • the deep supervision (DSV) mechanism can also be applied to the segmentation neural network to supervise the accuracy of the intermediate processing results of the segmentation neural network.
  • the specific implementation method will be described below in conjunction with training processing.
  • the detection neural network includes N processing layers, where N is a positive integer.
  • Using the detection neural network to process the encoded intermediate image and the segmented intermediate image includes: using the first processing layer of the detection neural network to process the M-th encoded intermediate image output by the M-th processing layer of the coding neural network and output the first detection intermediate image.
  • Using the attention network to process the encoded intermediate image and the segmented intermediate image to generate the detection intermediate input image may include: using the attention network to process the (n-1)-th detection intermediate image output by the (n-1)-th processing layer of the detection neural network, the m1-th encoded intermediate image output by the m1-th processing layer of the coding neural network, and the m2-th segmented intermediate image output by the m2-th processing layer of the segmentation neural network, and outputting the n-th detection intermediate input image. The n-th processing layer of the detection neural network then processes the n-th detection intermediate input image to output the n-th detection intermediate image, where the m1-th encoded intermediate image and the m2-th segmented intermediate image have the same image size as the (n-1)-th detection intermediate image, n is a positive integer, and n is greater than 1 and less than or equal to N.
  • For example, the first processing layer may include a transposed convolutional network (TConvBlock_D2), which receives the fourth encoded intermediate image output by the fourth processing layer of the coding neural network and outputs the first detection intermediate image.
  • the transposed convolutional network can be used to process the input feature image and increase the image size.
  • ConvBlock_V3 can adopt the Region Proposal Network (RPN) network structure in R-CNN to output the detection results of aneurysm features, that is, the prediction frame parameters and probabilities, which will be described below.
  • a coordinate tensor mechanism can also be introduced into the detection neural network, for example, as shown in FIG. 3.
  • For pixels at different position coordinates, the corresponding probability of containing the second feature (that is, the aneurysm feature) is not the same.
  • the probability of a pixel located at the edge of the image is lower than the probability of a pixel located at the center of the image.
  • the coordinate tensor mechanism introduces the above-mentioned difference in spatial probability caused by the position coordinates into the processing of the detection neural network, such as by setting the weight of the position coordinates, so as to further improve the accuracy of the detection result.
  • Outputting the n-th detection intermediate input image using the attention network includes: performing channel concatenation of the m1-th encoded intermediate image and the (n-1)-th detection intermediate image to obtain a concatenated image; adding the concatenated image and the m2-th segmented intermediate image to obtain an added image; processing the added image with an activation function to obtain an attention feature image; multiplying the attention feature image and the concatenated image to obtain an attention-enhanced image; and adding the attention-enhanced image and the concatenated image to obtain the n-th detection intermediate input image.
  • Fig. 4 shows a schematic structural diagram of an attention network according to an embodiment of the present disclosure.
  • The input of the attention network includes the detection intermediate image output by the detection neural network (denoted as Figure D), the encoded intermediate image output by the coding neural network (denoted as Figure E), and the segmented intermediate image output by the segmentation neural network (denoted as Figure S).
  • First, Figure E and Figure D are concatenated along the channel dimension to obtain a concatenated image (denoted as Figure C).
  • The processing of the concatenated image is as described above, and the description is not repeated here.
  • Both Figure C and Figure S can be processed through a 1×1×1 convolutional network (denoted as ConvBlock_1) to reduce the dimensionality of the image, that is, to reduce the size of the image; the dimension-reduced images are denoted as Figure C′ and Figure S′.
  • The convolutional network that reduces the image size can be set according to the specific image size; for example, when the sizes of Figure C and Figure S are already small, this dimensionality-reduction step can be omitted.
  • Figure C′ and Figure S′ are added (denoted as Add) to obtain the added image, and an activation function (denoted as ReLU) is used to process the added image to obtain the attention feature image, that is, Figure A.
  • Another 1×1×1 convolutional network (denoted as ConvBlock_2) can be set in the attention network to further reduce the dimensionality of the image, for example to one dimension, that is, to a single-channel image.
  • Another activation function (denoted as Sigmoid) can also be included to normalize the values of Figure A to between 0 and 1.
  • Next, Figure A and Figure C are multiplied to obtain an attention-enhanced image, that is, Figure B.
  • The multiplication can be expressed as $B = A \otimes C$, where $\otimes$ represents multiplying Figure A with each channel of Figure C.
  • Finally, Figure B and Figure C can be added to obtain the detection intermediate input image, which is expressed as $F = B + C$; a sketch of the whole attention network follows.
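  • A minimal PyTorch sketch of this attention network under the stated assumptions follows; the intermediate channel count `mid_ch` is illustrative.

```python
# Attention network sketch following Fig. 4: reduce, add, ReLU, sigmoid,
# multiply, residual add.
import torch
import torch.nn as nn

class AttentionGate3d(nn.Module):
    def __init__(self, c_ch: int, s_ch: int, mid_ch: int = 16):
        super().__init__()
        self.conv_c = nn.Conv3d(c_ch, mid_ch, kernel_size=1)  # ConvBlock_1 on C
        self.conv_s = nn.Conv3d(s_ch, mid_ch, kernel_size=1)  # ConvBlock_1 on S
        self.relu = nn.ReLU(inplace=True)
        self.conv_a = nn.Conv3d(mid_ch, 1, kernel_size=1)     # ConvBlock_2 -> 1 channel
        self.sigmoid = nn.Sigmoid()

    def forward(self, c: torch.Tensor, s: torch.Tensor) -> torch.Tensor:
        # Add the reduced images and apply ReLU to obtain attention feature A.
        a = self.relu(self.conv_c(c) + self.conv_s(s))
        # Normalize A to (0, 1) as a single-channel attention map.
        a = self.sigmoid(self.conv_a(a))
        b = a * c          # B = A (x) C, broadcast over C's channels
        return b + c       # F = B + C: the detection intermediate input image
```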
  • Through the attention network shown in Figure 4, the segmented intermediate image output by a processing layer of the segmentation neural network is introduced into the processing of the detection neural network; the feature map that the segmentation neural network uses to segment artery features serves as attention to enhance the arterial vessel features in the detection neural network, which can be called an attention mechanism.
  • the attention mechanism is inspired by the human visual process. When reading, humans only pay attention to the salient part of the entire visual area and ignore the interference of other parts. Adding an attention mechanism to the detection neural network can make it possible to focus on the characteristics of arteries while performing aneurysm detection tasks, that is, to enhance the saliency of arteries in the feature map of the detection task.
  • The detection neural network including the above attention network is equivalent to introducing the medical prior knowledge that aneurysms appear on arteries into the processing of the neural network, making the detection task focus more on the artery features and reducing attention to non-arterial features such as noise, thereby improving the accuracy of detection results.
  • The size of the aneurysm to be detected may vary from 2 to 70 mm. For this reason, multiple prediction frames of different sizes can be preset to detect aneurysms of different sizes; for example, prediction frames with sizes corresponding to 2.5 mm, 5 mm, 10 mm, 15 mm, and 30 mm can be set.
  • The detection neural network can output a detection result containing $32^3$ data points. Each data point corresponds to a pixel position in the input image and outputs prediction data for 5 anchors, where each anchor contains the 5 parameters $(p_i, t_x, t_y, t_z, t_b)$: $p_i$ represents the probability that the pixel location contains an aneurysm, and $t_x, t_y, t_z, t_b$ represent the prediction frame parameters. Specifically, $t_x, t_y, t_z$ represent the relative position parameters of the pixel location in the input image, and $t_b$ represents the relative size parameter of the prediction frame, where the relative size is related to the side length of the prediction frame.
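  • As an illustration, the sketch below decodes one anchor's outputs into an absolute cubic prediction frame. The offset-times-anchor-size and exponential-size decoding is the usual RPN-style parameterization and is an assumption here; the text only states that the parameters are relative.

```python
# Anchor decoding sketch (assumed RPN-style parameterization).
import math

def decode_anchor(center, anchor_side, t):
    """center: (x, y, z) of the data point; t: (t_x, t_y, t_z, t_b)."""
    cx = center[0] + t[0] * anchor_side  # relative offsets scaled by anchor size
    cy = center[1] + t[1] * anchor_side
    cz = center[2] + t[2] * anchor_side
    side = anchor_side * math.exp(t[3])  # relative size -> absolute side length
    return (cx, cy, cz, side)
```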
  • the multi-task processing network shown in Figure 3 only detects aneurysm features.
  • In other embodiments, the detection task can be extended to the simultaneous detection of a variety of cerebrovascular diseases, such as arterial wall calcification; this is not limited here.
  • the prior knowledge between related tasks can be used to improve the prediction accuracy of the prediction task.
  • Multitasking methods have been widely used in computer vision and natural language processing. Using multitasking can bring better processing effects than single-tasking.
  • the multitasking method includes two forms: hard parameter sharing and soft parameter sharing. In the form of hard parameter sharing, different tasks will share part of the same network, but each task will have its own branch network to independently generate output. Soft parameter sharing means that each task has its own independent complete network, but there will be connections and interactions between the networks for corresponding constraints or selective sharing of intermediate feature images.
  • Soft parameter sharing can avoid performance degradation caused by forced sharing when the correlation between tasks is not strong, but each task has an independent and complete network, which greatly increases the amount of model parameters and calculations.
  • In hard parameter sharing, part of the same network is shared between different tasks to reduce the redundancy of the network; this requires a strong correlation between the multiple tasks.
  • In the present disclosure, the multiple tasks include a detection task performed by the detection neural network and a segmentation task performed by the segmentation neural network. Using the correlation between the segmentation task and the detection task, the detection neural network and the segmentation neural network share the structure of the coding neural network and generate the detection output result and the segmentation output result separately.
  • the multi-task processing network with the above hard parameter sharing enables the enhancement of the blood vessel features extracted by the coding neural network while reducing the overall complexity of the network, thereby improving the detection accuracy of the detection neural network.
  • the medical image processing method may further include a training step, that is, optimizing the parameters of the multi-task processing network.
  • The training step includes: training the segmentation neural network and the coding neural network according to the Dice loss function and the cross-entropy loss function; and training the detection neural network and the coding neural network according to the classification loss function and the regression loss function.
  • Training the segmentation neural network and the coding neural network according to the Dice loss function and the cross-entropy loss function includes: calculating the Dice loss value according to the Dice loss function, based on the real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network; calculating the cross-entropy loss value according to the cross-entropy loss function, based on the real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network; and training based on the Dice loss value and the cross-entropy loss value according to a preset threshold, where the Dice loss function $L_{dice}$ and the cross-entropy loss function $L_{ce}$ can be denoted as:

$$L_{dice} = 1 - \frac{2\sum_i s_i q_i}{\sum_i s_i + \sum_i q_i}$$

$$L_{ce} = -\frac{1}{V}\sum_i \left[s_i \log q_i + (1 - s_i)\log(1 - q_i)\right]$$

  • where $s_i$ represents the real segmentation label of the i-th pixel in the medical image, $q_i$ represents the predicted segmentation label of the i-th pixel output by the segmentation neural network, $V$ represents the total number of pixels included in the medical image, $\sum_i$ sums the per-pixel results over the training image, and $\log$ is the natural logarithm.
  • The real segmentation label $S \in \mathbb{R}^{W \times H \times D}$ of the training image can be used to verify the accuracy of the segmentation output $A \in \mathbb{R}^{W \times H \times D}$ produced by the segmentation neural network.
  • Training based on the Dice loss value and the cross-entropy loss value according to a preset threshold includes: training based on the Dice loss value in the case where the cross-entropy loss value is less than the preset threshold; and training based on the cross-entropy loss value in the case where the cross-entropy loss value is not less than the preset threshold.
  • Specifically, the two values $L_{dice}$ and $L_{ce}$ can be calculated according to the above formulas. When $L_{ce}$ is less than the preset threshold $g$, the $L_{dice}$ value is used to train the network; otherwise the $L_{ce}$ value is used, specifically expressed as the following formula:

$$L_{seg} = \begin{cases} L_{dice}, & L_{ce} < g \\ L_{ce}, & L_{ce} \ge g \end{cases}$$
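  • A minimal sketch of this threshold-switched segmentation loss follows; the smoothing constant and the default threshold value are illustrative assumptions.

```python
# Threshold-switched segmentation loss sketch: Dice when CE is already low.
import torch
import torch.nn.functional as F

def dice_loss(q: torch.Tensor, s: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """q: predicted per-pixel artery probabilities; s: real labels as floats in {0, 1}."""
    inter = (q * s).sum()
    return 1.0 - (2.0 * inter + eps) / (q.sum() + s.sum() + eps)

def seg_loss(q: torch.Tensor, s: torch.Tensor, g: float = 0.3) -> torch.Tensor:
    ce = F.binary_cross_entropy(q, s)   # cross-entropy loss value
    # Train on the Dice value once cross-entropy has fallen below threshold g.
    return dice_loss(q, s) if ce.item() < g else ce
```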
  • the segmentation neural network includes M processing layers
  • Training the segmentation neural network and the coding neural network according to the Dice loss function and the cross-entropy loss function further includes: calculating an intermediate Dice loss value according to the Dice loss function, based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network; calculating an intermediate cross-entropy loss value according to the cross-entropy loss function, based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network; and training based on the intermediate Dice loss value and the intermediate cross-entropy loss value according to the preset threshold, where m and M are positive integers, and m is greater than 1 and less than M.
  • The structure of the segmentation neural network including M processing layers is shown in Fig. 3, where the intermediate segmentation results generated by the second processing layer and the third processing layer can be expressed as $A_2$ and $A_3$.
  • For the intermediate segmentation results $A_2$ and $A_3$, loss values can be calculated with the same loss function $L_{seg}$ as above by comparing them with the real segmentation labels; the calculated loss values are then used to train the network together, expressed, for example, as:

$$L_{seg}^{total} = L_{seg}(A) + L_{seg}(A_2) + L_{seg}(A_3)$$
  • The above method of calculating loss values based on intermediate segmentation results and using them to train the segmentation neural network can be called the deep supervision mechanism mentioned above; it strengthens the supervision of intermediate processing during training and helps improve the training effect of neural networks with many layers.
  • Training the detection neural network and the coding neural network according to the classification loss function and the regression loss function includes: using the coding neural network, the segmentation neural network, and the detection neural network to process training samples and obtain detection results, where the detection results include prediction frame parameters and prediction probabilities, and the prediction frame parameters include the center-point position coordinates and the size of the prediction frame; calculating the classification loss value based on the prediction probabilities according to the classification loss function; calculating the regression loss value based on the prediction frame parameters and the real frame parameters of the second feature according to the regression loss function; and training based on the classification loss value and the regression loss value.
  • the training samples used in the above training process may be the preprocessed CTA images as described above, or training samples that are more conducive to training may be obtained by sampling based on the CTA images.
  • Training the detection neural network and the coding neural network according to the classification loss function and the regression loss function may further include: sampling the medical image to obtain at least one training sample; calculating the area ratio of the bounding box of the at least one training sample to the bounding box of the second feature; and determining the training samples whose area ratio is greater than a first threshold as positive training samples and the training samples whose area ratio is less than a second threshold as negative training samples, where the positive training samples are used for training the classification loss and the regression loss, and the negative training samples are used for training the classification loss.
  • An Intersection over Union (IoU) function may be used to calculate the area ratio; the IoU of two bounding boxes is the ratio of the area of their intersection to the area of their union, $\mathrm{IoU}(B_1, B_2) = |B_1 \cap B_2| \,/\, |B_1 \cup B_2|$.
  • For example, a training sample with an area ratio greater than 0.5 can be determined as a positive training sample, and a training sample with an area ratio less than 0.02 can be determined as a negative training sample.
  • The classification loss and the regression loss can then be determined based on the positive and negative training samples: the positive training samples are used for training both the classification loss and the regression loss, while the negative training samples are used only for training the classification loss.
  • The following sampling strategy can be adopted. For CTA images including ground truth, where the ground truth corresponds to the bounding box of the second feature described above, sampling can be performed within a certain pixel offset around the center point of each real candidate box of the CTA image to obtain training sample images of different sizes, so as to ensure that each real candidate box in the CTA image is included in a training sample during training.
  • The CTA image can also be randomly sampled so that the sampled training samples generally do not include the above-mentioned real candidate boxes.
  • The area ratio can be calculated according to the intersection-over-union function as described above, so as to divide the obtained training samples into positive and negative training samples.
  • The positive training sample set $S_{pos}$ and the negative sample set $S_{neg}$ can be obtained in the above manner.
  • the number of negative training samples obtained may be far more than the number of positive training samples.
  • For this reason, a subset of the negative sample set can be selected as training samples. For example, a part of the negative training samples that are difficult to distinguish can be selected from the negative sample set $S_{neg}$ to form a hard negative training sample set, denoted as $S_{hard}$, where $S_{hard} \subseteq S_{neg}$; see the sketch below.
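  • The sketch below illustrates the sampling split and the hard-negative selection. The IoU thresholds 0.5 and 0.02 come from the text; taking the highest-scoring negatives as the "difficult to distinguish" ones is an assumption.

```python
# Positive/negative split and hard-negative mining sketch.
def split_samples(samples, gt_box, iou_fn, pos_thresh=0.5, neg_thresh=0.02):
    """samples: list of (score, box); returns (S_pos, S_neg) by IoU with gt_box."""
    s_pos = [s for s in samples if iou_fn(s[1], gt_box) > pos_thresh]
    s_neg = [s for s in samples if iou_fn(s[1], gt_box) < neg_thresh]
    return s_pos, s_neg

def hard_negatives(s_neg, k):
    """S_hard: the k negatives the detector scores highest, i.e. the hardest ones."""
    return sorted(s_neg, key=lambda s: s[0], reverse=True)[:k]
```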
  • the classification loss function and regression loss function used to train the detection neural network and the coding neural network will be described in detail below.
  • the classification loss function is used to indicate the accuracy of the detection result on the predicted probability value.
  • The classification loss function can be expressed as:

$$L_{cls} = -\frac{1}{\left|S_{pos}\right|}\sum_{i \in S_{pos}} \log p_i \;-\; \frac{1}{\left|S_{hard}\right|}\sum_{i \in S_{hard}} \log\left(1 - p_i\right)$$

  • where $|\cdot|$ indicates the number of training samples in the corresponding set.
  • The regression loss function is used to indicate the accuracy of the detection result on the predicted frame parameter values.
  • The regression loss function (the smooth L1 loss function) can be expressed as:

$$L_{reg} = \sum_{k \in \{x, y, z, b\}} \mathrm{smooth}_{L1}\left(t_k - \hat{t}_k\right), \qquad \mathrm{smooth}_{L1}(x) = \begin{cases} 0.5x^2, & |x| < 1 \\ |x| - 0.5, & \text{otherwise} \end{cases}$$

  • where $\hat{t}_k$ represents the real frame parameters; the total detection loss combines the two terms as $L_{det} = L_{cls} + \lambda L_{reg}$, where $\lambda$ is a weighting constant.
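  • A compact sketch of these detection losses follows; the mean reductions and the default weighting constant are assumptions.

```python
# Detection loss sketch: log-loss over positives and hard negatives,
# smooth L1 over the frame parameters of positives.
import torch
import torch.nn.functional as F

def cls_loss(p_pos: torch.Tensor, p_hard: torch.Tensor) -> torch.Tensor:
    """Positives should score 1, hard negatives 0."""
    return -(torch.log(p_pos).mean() + torch.log(1 - p_hard).mean())

def reg_loss(t_pred: torch.Tensor, t_true: torch.Tensor) -> torch.Tensor:
    """Smooth L1 over the (t_x, t_y, t_z, t_b) parameters of positive samples."""
    return F.smooth_l1_loss(t_pred, t_true)

def det_loss(p_pos, p_hard, t_pred, t_true, lam: float = 1.0):
    return cls_loss(p_pos, p_hard) + lam * reg_loss(t_pred, t_true)
```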
  • a stochastic gradient descent (SGD) method can also be used to train the multi-task processing network.
  • For example, its parameters can be set as follows: the momentum can be 0.9 and the weight decay can be 1e-4, with training running for 200 epochs.
  • The initial learning rate is 1e-2, and after 100 epochs the learning rate drops to 0.1 times its original value.
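  • The stated schedule maps directly onto a standard PyTorch optimizer setup; `model` and the loop body are placeholders in this sketch.

```python
# SGD schedule sketch: lr 1e-2, momentum 0.9, weight decay 1e-4,
# 200 epochs, lr x0.1 after epoch 100.
import torch

def make_optimizer(model: torch.nn.Module):
    opt = torch.optim.SGD(model.parameters(), lr=1e-2,
                          momentum=0.9, weight_decay=1e-4)
    # MultiStepLR multiplies the lr by gamma at the listed epoch milestones.
    sched = torch.optim.lr_scheduler.MultiStepLR(opt, milestones=[100], gamma=0.1)
    return opt, sched

# opt, sched = make_optimizer(model)
# for epoch in range(200):
#     ...train one epoch...
#     sched.step()
```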
  • In summary, the structure of a multi-task processing network is adopted: the input medical image is processed by the coding neural network to generate an encoded intermediate image; the segmentation neural network processes the encoded intermediate image to segment the first feature and generate a segmented intermediate image; and the detection neural network processes the encoded intermediate image and the segmented intermediate image to detect the second feature included in the first feature and output the detection result of the second feature. Because the second feature is included in the first feature, the segmentation task and the detection task are related.
  • the task-related detection neural network and segmentation neural network share the processing results of the coding neural network in the form of hard parameter sharing, and enhance the first feature processed in the detection task, thereby improving the accuracy of the detection result.
  • an attention mechanism is also introduced into the detection neural network to strengthen the attention of the detection neural network to the first feature, thereby further improving the accuracy of the detection result.
  • the present disclosure also provides a medical image processing device based on artificial intelligence.
  • Fig. 5 shows a schematic block diagram of a medical image processing device based on artificial intelligence according to an embodiment of the present disclosure.
  • the device 1000 may include an encoding neural network unit 1010, a segmentation neural network unit 1020, an attention network unit 1030, and a detection neural network unit 1040.
  • the encoding neural network unit 1010 is configured to process the medical image to generate an encoded intermediate image.
  • the segmentation neural network unit 1020 is configured to process the encoded intermediate image, segment the encoded intermediate image according to a first feature, and generate a segmented intermediate image including the first feature.
  • the attention network unit 1030 is configured to process the encoded intermediate image and the segmented intermediate image to generate a detection intermediate input image.
  • the detection neural network unit 1040 is configured to process the detection intermediate input image, and detect whether the second feature is included in the image content where the first feature is located.
  • the coding neural network unit 1010 includes M processing layers, and the segmentation neural network unit 1020 includes M processing layers, where M is a positive integer, and the processing layers include convolutional networks, transposed convolutional networks, and At least one of the pooling layers.
  • the first processing layer of the coding neural network unit 1010 processes the medical image and outputs a first coding intermediate image.
  • The m1-th processing layer of the coding neural network unit 1010 processes the (m1-1)-th encoded intermediate image output by the (m1-1)-th processing layer of the coding neural network unit 1010 and outputs the m1-th encoded intermediate image, where m1 is a positive integer greater than 1 and less than or equal to M.
  • The first processing layer of the segmentation neural network unit 1020 processes the M-th encoded intermediate image output by the M-th processing layer of the coding neural network unit 1010 and outputs the first segmented intermediate image. The m2-th processing layer of the segmentation neural network unit 1020 processes the (m2-1)-th segmented intermediate image output by the (m2-1)-th processing layer of the segmentation neural network unit 1020 together with the (M-m2+1)-th encoded intermediate image output by the (M-m2+1)-th processing layer of the coding neural network unit 1010, and outputs the m2-th segmented intermediate image, where m2 is a positive integer greater than 1 and less than or equal to M.
  • the M-th segmented intermediate image output by the M-th processing layer of the segmentation neural network unit 1020 is processed to generate the segmentation result of the first feature.
  • The detection neural network unit 1040 includes N processing layers, where N is a positive integer. The first processing layer of the detection neural network unit 1040 processes the M-th encoded intermediate image output by the M-th processing layer of the coding neural network unit 1010 and outputs the first detection intermediate image.
  • The attention network unit 1030 processes the (n-1)-th detection intermediate image output by the (n-1)-th processing layer of the detection neural network unit 1040, the m1-th encoded intermediate image output by the m1-th processing layer of the coding neural network unit 1010, and the m2-th segmented intermediate image output by the m2-th processing layer of the segmentation neural network unit 1020, and outputs the n-th detection intermediate input image. The n-th processing layer of the detection neural network unit 1040 processes the n-th detection intermediate input image and outputs the n-th detection intermediate image, where the m1-th encoded intermediate image and the m2-th segmented intermediate image have the same image size as the (n-1)-th detection intermediate image, n is a positive integer, and n is greater than 1 and less than or equal to N.
  • The attention network unit 1030 performs channel concatenation on the m1-th encoded intermediate image and the (n-1)-th detection intermediate image to obtain a concatenated image; adds the concatenated image and the m2-th segmented intermediate image to obtain an added image; processes the added image with an activation function to obtain an attention feature image; multiplies the attention feature image and the concatenated image to obtain an attention-enhanced image; and adds the attention-enhanced image and the concatenated image to obtain the n-th detection intermediate input image.
  • the medical image is a three-dimensional image
  • the encoding neural network unit 1010, the segmentation neural network unit 1020, and the detection neural network unit 1040 are three-dimensional convolutional neural networks.
  • the medical image is a computed tomography angiography image
  • the first feature is an artery feature
  • the second feature is at least one of aneurysm feature, arterial vessel wall calcification feature, and arterial vessel occlusion feature.
  • The detection neural network unit 1040 may output the detection result of the second feature, where the detection result includes: the prediction frame parameters of the second feature and the prediction probability that the second feature is contained in the prediction frame.
  • The medical image processing device may further include a display unit configured to display a candidate frame on the image containing the first feature, the candidate frame being the prediction frame of the second feature detected by the detection neural network unit 1040.
  • the medical image processing apparatus may further include a training unit.
  • The training unit may be configured to train the segmentation neural network unit 1020 and the coding neural network unit 1010 according to the Dice loss function and the cross-entropy loss function, and to train the detection neural network unit 1040 and the coding neural network unit 1010 according to the classification loss function and the regression loss function.
  • The training unit training the segmentation neural network unit 1020 and the coding neural network unit 1010 according to the Dice loss function and the cross-entropy loss function includes: calculating the Dice loss value according to the Dice loss function, based on the real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network unit 1020; calculating the cross-entropy loss value according to the cross-entropy loss function, based on the real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network unit 1020; and training based on the Dice loss value and the cross-entropy loss value according to a preset threshold, where the Dice loss function $L_{dice}$ and the cross-entropy loss function $L_{ce}$ are as denoted above, $s_i$ represents the real segmentation label of the i-th pixel in the medical image, $q_i$ represents the predicted segmentation label of the i-th pixel output by the segmentation neural network unit 1020, and $V$ represents the total number of pixels included in the medical image.
  • The training unit performing training based on the Dice loss value and the cross-entropy loss value according to a preset threshold includes: training based on the Dice loss value in the case where the cross-entropy loss value is less than the preset threshold; and training based on the cross-entropy loss value in the case where the cross-entropy loss value is not less than the preset threshold.
  • The segmentation neural network unit 1020 may include M processing layers, and the training unit training the segmentation neural network unit 1020 and the coding neural network unit 1010 according to the Dice loss function and the cross-entropy loss function further includes: calculating the intermediate Dice loss value according to the Dice loss function, based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network unit 1020; calculating the intermediate cross-entropy loss value according to the cross-entropy loss function, based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network unit 1020; and training based on the intermediate Dice loss value and the intermediate cross-entropy loss value according to the preset threshold, where m and M are positive integers, and m is greater than 1 and less than M.
  • Training the detection neural network unit 1040 and the coding neural network unit 1010 according to the classification loss function and the regression loss function includes: using the coding neural network unit 1010, the segmentation neural network unit 1020, and the detection neural network unit 1040 to process training samples and obtain detection results, where the prediction frame parameters include the center-point position coordinates and the size of the prediction frame; calculating the classification loss value based on the prediction probabilities according to the classification loss function; calculating the regression loss value based on the prediction frame parameters and the real frame parameters of the second feature according to the regression loss function; and training based on the classification loss value and the regression loss value.
  • Training the detection neural network unit 1040 and the coding neural network unit 1010 according to the classification loss function and the regression loss function further includes: sampling the medical image to obtain at least one training sample; calculating the area ratio of the bounding box of the at least one training sample to the bounding box of the second feature; and determining the training samples whose area ratio is greater than the first threshold as positive training samples and the training samples whose area ratio is less than the second threshold as negative training samples, where the positive training samples are used for training the classification loss and the regression loss, and the negative training samples are used for training the classification loss.
  • FIG. 6 shows a schematic block diagram of an artificial intelligence-based medical device 2000 according to an embodiment of the present disclosure.
  • the device 2000 may include an image acquisition device 2010, a processor 2020, and a memory 2030.
  • the memory 2030 stores computer readable codes, and when the computer readable codes are executed by the processor 2020, the medical image processing method based on artificial intelligence as described above can be executed.
  • the image acquisition device 2010 may be a CT device, and may acquire, for example, an intracranial artery angiography image as the medical image described above.
  • the processor 2020 can be connected to the image acquisition device 2010 by wire and/or wirelessly to receive the above-mentioned medical images; then, the processor 2020 can execute the computer-readable codes stored in the memory 2030, and when the computer-readable codes are executed, the medical image processing method based on artificial intelligence as described above can be performed to obtain the artery segmentation result and the aneurysm detection result based on the medical image.
  • the medical device 2000 may also include a display device such as a display screen for displaying the arterial segmentation result and the aneurysm detection result; for the display effect, reference may be made to FIG. 2.
  • the computing device 3000 may include a bus 3010, one or more CPUs 3020, a read-only memory (ROM) 3030, a random access memory (RAM) 3040, a communication port 3050 connected to a network, an input/output component 3060, a hard disk 3070, etc.
  • the storage device in the computing device 3000 such as the ROM 3030 or the hard disk 3070, can store various data or files used in the processing and/or communication of the artificial intelligence-based medical image processing method provided by the present disclosure, and program instructions executed by the CPU.
  • the computing device 3000 may also include a user interface 3080.
  • the architecture shown in FIG. 7 is only exemplary. When implementing different devices, one or more components of the computing device shown in FIG. 7 may be omitted according to actual needs.
  • FIG. 8 shows a schematic diagram 4000 of a storage medium according to the present disclosure.
  • the computer storage medium 4020 stores computer readable instructions 4010.
  • when the computer-readable instructions 4010 are executed by the processor, the artificial intelligence-based medical image processing method according to the embodiments of the present disclosure described with reference to the above drawings can be performed.
  • the computer-readable storage medium includes, but is not limited to, for example, volatile memory and/or non-volatile memory.
  • the volatile memory may include random access memory (RAM) and/or cache memory (cache), for example.
  • the non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc.
  • the computer storage medium 4020 may be connected to a computing device such as a computer; then, when the computing device executes the computer-readable instructions 4010 stored on the computer storage medium 4020, the artificial intelligence-based medical image processing method according to the present disclosure as described above can be performed.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Quality & Reliability (AREA)
  • Molecular Biology (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • Image Analysis (AREA)
  • Apparatus For Radiation Diagnosis (AREA)

Abstract

The present disclosure provides an artificial intelligence-based medical image processing method, a medical device, and a storage medium. The artificial intelligence-based medical image processing method includes: processing the medical image to generate encoded intermediate images; processing the encoded intermediate images to segment a first feature and generate segmentation intermediate images; processing the encoded intermediate images and the segmentation intermediate images based on an attention mechanism to generate detection intermediate input images; and processing the detection intermediate input images to detect a second feature included in the first feature.

Description

Artificial intelligence-based medical image processing method, medical device and storage medium
This application claims priority to Chinese Patent Application No. 201910752632.6, entitled "Artificial intelligence-based medical image processing method, medical device and storage medium", filed with the China National Intellectual Property Administration on August 15, 2019, which is incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates to the field of intelligent healthcare, and in particular to an artificial intelligence-based medical image processing method, a medical device, and a storage medium.
Background
Artificial intelligence (AI) is the theory, method, technology, and application system that uses digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, AI is a comprehensive discipline of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a manner similar to human intelligence. AI studies the design principles and implementation methods of various intelligent machines, so that the machines can perceive, reason, and make decisions. AI technology can be widely applied in the traditional medical field; for example, neural networks can process medical images acquired by medical equipment to perform feature detection faster and more accurately. Conventional AI-based medical image processing methods only involve two-dimensional images and do not fully exploit the three-dimensional spatial characteristics of disease-related features, which reduces the accuracy of detection results.
Summary
The present disclosure provides an artificial intelligence-based medical image processing method for performing feature detection based on medical prior knowledge and improving the accuracy of detection results.
According to one aspect of the present disclosure, an artificial intelligence-based medical image processing method is provided, including: processing the medical image to generate encoded intermediate images representing structural features of the medical image; segmenting the encoded intermediate images according to a first feature to generate segmentation intermediate images; processing the encoded intermediate images and the segmentation intermediate images based on an attention mechanism to generate attention-enhanced detection intermediate input images; and performing detection of a second feature on the detection intermediate input images, so as to determine whether the image content where the first feature is located includes the second feature.
According to another aspect of the present disclosure, an artificial intelligence-based medical device is provided, including: an image acquisition apparatus configured to acquire a medical image; a processor; and a memory, where the memory stores computer-readable code that, when executed by the processor, performs the artificial intelligence-based medical image processing method described above.
According to yet another aspect of the present disclosure, a computer-readable storage medium is provided, storing instructions that, when executed by a processor, cause the processor to perform the artificial intelligence-based medical image processing method described above.
With the artificial intelligence-based medical image processing method provided by the present disclosure, feature detection can be performed based on the medical prior knowledge that the first feature includes the second feature to be detected. An encoding neural network processes the medical image to generate encoded intermediate images, a segmentation neural network performs segmentation of the first feature, and a detection neural network performs detection of the second feature. During processing, the segmentation neural network and the detection neural network share the encoded intermediate images output by the encoding neural network, and the segmentation intermediate images output by the segmentation neural network are introduced into the processing of the detection neural network, so that the detection neural network focuses more on the first feature, thereby improving the accuracy of the detection result for the second feature.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the accompanying drawings required for describing the embodiments or the prior art are briefly introduced below. Apparently, the accompanying drawings in the following description show only some embodiments of the present disclosure, and a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
FIG. 1 shows a flowchart of an artificial intelligence-based medical image processing method according to an embodiment of the present disclosure;
FIG. 2 shows a schematic overall flowchart of a multi-task processing method according to an embodiment of the present disclosure;
FIG. 3 shows a schematic overall structural diagram of a multi-task processing network according to an embodiment of the present disclosure;
FIG. 4 shows a schematic structural diagram of an attention network according to an embodiment of the present disclosure;
FIG. 5 shows a schematic block diagram of an artificial intelligence-based medical image processing apparatus according to an embodiment of the present disclosure;
FIG. 6 shows a schematic block diagram of an artificial intelligence-based medical device according to an embodiment of the present disclosure;
FIG. 7 shows a schematic architectural diagram of an exemplary computing device according to an embodiment of the present disclosure;
FIG. 8 shows a schematic diagram of a computer storage medium according to an embodiment of the present disclosure.
Mode for Carrying Out the Invention
The technical solutions in the embodiments of the present disclosure are described clearly and completely below with reference to the accompanying drawings. Apparently, the described embodiments are only some of the embodiments of the present disclosure rather than all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present disclosure without creative effort fall within the protection scope of the present disclosure.
As used in the present disclosure, "first", "second", and similar words do not denote any order, quantity, or importance, but are only used to distinguish different components. Likewise, words such as "include" or "comprise" mean that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. Words such as "connect" or "connected" are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect.
Flowcharts are used in the present disclosure to illustrate the steps of methods according to embodiments of the present disclosure. It should be understood that the preceding or following steps are not necessarily performed in exact order. Rather, various steps may be processed in reverse order or simultaneously, and other operations may also be added to these processes.
The present disclosure provides an artificial intelligence-based medical image processing method that uses a multi-task processing network including an encoding neural network, a segmentation neural network, and a detection neural network to process medical images and improve the accuracy of feature detection.
FIG. 1 shows a flowchart of an artificial intelligence-based medical image processing method according to an embodiment of the present disclosure. The method of FIG. 1 may be performed by one or more computing devices, such as a PC, a server, a server cluster, or a cloud computing network device. As shown in FIG. 1, first, in step S101, the medical image is processed to generate encoded intermediate images.
According to an embodiment of the present disclosure, the processing in step S101 may be performed by an encoding neural network. The encoding neural network is a three-dimensional convolutional neural network, that is, its input image is a three-dimensional image. According to an embodiment of the present disclosure, the encoding neural network may include structures such as one or more convolutional networks, pooling layers, and residual networks, and is used to encode the input medical image to extract feature images and output one or more encoded intermediate images. In the embodiments, an encoded intermediate image may refer to an image representing the structural features of the medical image, extracted by parsing the features of the medical image with preset encoding operations. In addition, the encoded intermediate images generated by the encoding neural network from a three-dimensional image are also three-dimensional images. The specific structure and processing of the encoding neural network will be described in detail below.
According to an embodiment of the present disclosure, the medical image may be a computed tomography angiography (CTA) image. For example, an intracranial angiography image may be acquired by a CT device as the medical image. The intracranial angiography image obtained by the CT device includes images of intracranial vessels at different depth positions, which together form a three-dimensional image. For example, the size of the intracranial angiography image may be expressed as 512*512*256, where 512*512 indicates that the image has 512*512 pixels in the two-dimensional plane, and there are 256 image slices in total, corresponding to 256 depth positions. In other embodiments of the present disclosure, the medical image may also be a magnetic resonance angiography (MRA) image. Compared with MRA images, CTA images have the advantages of lower cost and faster imaging; for example, CTA is used in China as the primary means of initial screening for intracranial aneurysms.
The obtained CTA image may be preprocessed before being input to the encoding neural network. For example, for an intracranial angiography image of size 512*512*256, its spatial resolution may first be unified to 0.5×0.5×0.5 mm³ by interpolation, after which windowing is applied, which may be written as:
$i_w=\min\left(\max\left(I,\ WL-\tfrac{WW}{2}\right),\ WL+\tfrac{WW}{2}\right)$
where $i_w$ denotes the intensity after windowing and $I$ denotes the intensity before windowing. For intracranial angiography images, WL and WW are usually set to WL=300 and WW=600. This windowing step adjusts the contrast of the obtained CTA image according to the imaging characteristics of vascular tomography images, so as to display vessel features prominently.
As shown in FIG. 1, in step S102, the encoded intermediate images are processed to segment the first feature and generate segmentation intermediate images. A segmentation intermediate image may refer to an image determined to contain the first feature; that is, the encoded intermediate images are segmented according to the first feature to generate segmentation intermediate images including the first feature. According to an embodiment of the present disclosure, the processing in step S102 may be performed by a segmentation neural network. The segmentation neural network is a three-dimensional convolutional neural network, i.e., it can process input three-dimensional images. Specifically, the first feature may be a feature of a body organ, such as a brain feature, a heart feature, or an artery feature.
Next, in step S103, the encoded intermediate images and the segmentation intermediate images are processed based on an attention mechanism to generate detection intermediate input images. A detection intermediate input image refers to an attention-enhanced image generated from the encoded intermediate images and the segmentation intermediate images. According to an embodiment of the present disclosure, the processing in step S103 may be performed by an attention network.
Next, in step S104, the detection intermediate input images are processed to detect the second feature included in the first feature. That is, detection of the second feature is performed on the detection intermediate input images to determine whether the image region where the first feature is located includes the second feature. According to an embodiment of the present disclosure, the processing in step S104 may be performed by a detection neural network. For example, the detection neural network may output a detection result of the second feature, where the detection result includes: prediction box parameters of the second feature and a predicted probability that the prediction box contains the second feature. The prediction box of the second feature indicates the region of the image in which the second feature is located.
According to an embodiment of the present disclosure, the detection neural network is a three-dimensional convolutional neural network, i.e., it can process input three-dimensional images. According to an embodiment of the present disclosure, when the first feature is an artery feature, the second feature may be at least one of an aneurysm feature, an arterial wall calcification feature, and an arterial occlusion feature.
In the medical image processing method according to the present disclosure, the overall network structure including the encoding neural network, the segmentation neural network, and the detection neural network may be called a multi-task processing network, where the multiple tasks may include the segmentation task of segmenting the first feature performed by the segmentation neural network and the detection task of detecting the second feature performed by the detection neural network. Both the segmentation neural network and the detection neural network process the feature images output by the encoding neural network, i.e., the encoded intermediate images. Because the first feature includes the second feature, the segmentation task and the detection task are correlated. For example, when the first feature is an arterial vessel feature and the second feature is an aneurysm feature, it is known from basic medical knowledge that an aneurysm is formed by the long-term impact of arterial blood flow on a weak part of the vessel wall; an aneurysm is therefore an abnormal bulge on an artery and can only appear on an artery. In other words, the aneurysm feature is included in the arterial vessel feature. Based on this medical knowledge, the segmentation task and the detection task are correlated, and the processing of the segmentation task helps improve the accuracy of the detection task. The specific structure and processing of the multi-task processing network according to the present disclosure will be described in detail below with reference to the drawings.
The process of performing multi-task processing with the above multi-task network may be called a multi-task processing method. FIG. 2 shows a schematic overall flowchart of a multi-task processing method according to an embodiment of the present disclosure. The method of FIG. 2 may likewise be performed by one or more computing devices, such as a PC, a server, a server cluster, or a cloud computing network device. Specifically, FIG. 2 takes intracranial aneurysm detection using the multi-task method as a specific embodiment.
First, an input CTA image may be obtained, for example, an intracranial angiography image of a patient acquired by a CT device, which includes artery features and aneurysm features. The input CTA image (for example, of size 512*512*256) may be input to the multi-task processing network as a whole, or may be divided into multiple sub-images that are separately input to the multi-task processing network for processing, so as to reduce the size of the image to be processed at one time, thereby reducing the amount of computation and increasing the computation speed; no limitation is imposed here.
With the multi-task processing network, two sets of task processing results can be output, including the artery segmentation result output by the segmentation neural network and the aneurysm detection result output by the detection neural network. According to an embodiment of the present disclosure, when the CTA image is divided into multiple sub-images for processing, the segmentation neural network outputs, for one CTA image, multiple artery segmentation results based on the multiple sub-images; according to the position parameters of the sub-images in the CTA image, the multiple artery segmentation results can be stitched into an artery segmentation result corresponding to the entire CTA image.
The aneurysm detection result includes, for example, aneurysm prediction box parameters and a predicted probability that the prediction box contains an aneurysm. According to an embodiment of the present disclosure, for each pixel in the input image, the detection neural network may output aneurysm prediction box parameters corresponding to that pixel and a predicted probability that the prediction box contains an aneurysm; the prediction box parameters may include the position coordinates of the center point of the prediction box (i.e., the position coordinates of the pixel in the input image) and the size of the prediction box (such as the side length). The multiple detection results output by the detection neural network may be processed with non-maximum suppression (NMS) to obtain the final aneurysm candidate boxes.
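As a concrete illustration of the NMS step, the hedged sketch below greedily keeps the highest-probability cubic candidate and suppresses overlapping ones. The box layout (cx, cy, cz, side, score) and the 0.1 overlap threshold are assumptions for illustration, not values from the disclosure.

```python
import numpy as np

def cube_iou(a: np.ndarray, b: np.ndarray) -> float:
    # overlap of two axis-aligned cubes given as (cx, cy, cz, side, ...)
    lo = np.maximum(a[:3] - a[3] / 2, b[:3] - b[3] / 2)
    hi = np.minimum(a[:3] + a[3] / 2, b[:3] + b[3] / 2)
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    return inter / (a[3] ** 3 + b[3] ** 3 - inter + 1e-9)

def nms_3d(boxes: np.ndarray, iou_thresh: float = 0.1) -> list:
    order = np.argsort(boxes[:, 4])[::-1]   # sort candidates by score, descending
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))                 # keep the best remaining candidate
        rest = order[1:]
        ious = np.array([cube_iou(boxes[i], boxes[j]) for j in rest])
        order = rest[ious <= iou_thresh]    # drop candidates overlapping it
    return keep
```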
According to an embodiment of the present disclosure, the medical image processing method may further include: displaying candidate boxes on the image containing the first feature, where the candidate boxes include the prediction boxes of the second feature detected by the detection neural network. For example, as shown in FIG. 2, based on the multi-task processing network, the predicted aneurysm candidate boxes can be visualized in the segmented artery feature image, so that aneurysms detected in the CTA image are displayed quickly and intuitively.
Note that FIG. 2 only takes the artery feature and the aneurysm feature as specific examples of the first feature and the second feature, which does not limit the method of the present disclosure. The method may also be applied to other types of features. For example, the second feature may also be an arterial wall calcification feature or an arterial occlusion feature as described above, and the first feature may be a feature such as a vein or a bone, which are not enumerated here.
FIG. 3 shows a schematic overall structural diagram of a multi-task processing network according to an embodiment of the present disclosure. The medical image processing method according to the present disclosure is described in detail below with reference to FIG. 3.
According to an embodiment of the present disclosure, the encoding neural network (Encoder) includes M processing layers and the segmentation neural network (SegDecoder) includes M processing layers, where M is a positive integer. In other words, the encoding neural network and the segmentation neural network include the same number of processing layers, so that the size of the image output by the segmentation neural network is the same as the size of the input image of the encoding neural network.
According to an embodiment of the present disclosure, a processing layer includes at least one of a convolutional network, a transposed convolutional network, and a pooling layer. The specific network structures of the processing layers in the encoding neural network and the segmentation neural network may be the same or different and are arranged according to actual application needs. The structure shown in FIG. 3 is only an example, and processing structures may be added or removed according to actual application requirements.
According to an embodiment of the present disclosure, processing the medical image with the encoding neural network includes: processing the medical image with the 1st processing layer of the encoding neural network to output the 1st encoded intermediate image; and processing the (m1-1)-th encoded intermediate image output by the (m1-1)-th processing layer of the encoding neural network with the m1-th processing layer of the encoding neural network to output the m1-th encoded intermediate image, where m1 is a positive integer greater than 1 and less than or equal to M.
As shown in FIG. 3, the encoding neural network may include 4 processing layers, i.e., M=4. Each processing layer in the encoding neural network may consist of a pooling layer (Pooling) and a residual block (ResBlock). The pooling layer reduces the image size, and the residual block in each processing layer may consist of one or more convolutional networks, normalization functions, and activation functions. Note that the specific structures of the residual blocks may be the same or different, which is not limited here. In addition, as shown in FIG. 3, the 1st processing layer may further include a three-dimensional convolutional network block, denoted ConvBlock_V1; specifically, it may consist of a convolutional network, a normalization function (such as Batch Normalization), and an activation function (such as a rectified linear unit, ReLU), and is used to perform preliminary processing on the input image. The processing layers extract features and output feature images, i.e., encoded intermediate images.
After the medical image is input to the encoding neural network, it may first be processed by ConvBlock_V1 of the 1st processing layer, then by the pooling layer (Pooling_1) and residual block (ResBlock_E1) of the 1st processing layer, which output the 1st encoded intermediate image. Next, the m1=2-th processing layer of the encoding neural network (including Pooling_2 and ResBlock_E2) processes the 1st encoded intermediate image output by the 1st processing layer and outputs the 2nd encoded intermediate image, and so on. Thus, the encoding neural network in FIG. 3 can generate 4 encoded intermediate images, each representing a feature map of a different size.
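A minimal sketch of one such encoder processing layer (Pooling followed by a ResBlock, as in FIG. 3) might look as follows; the channel widths, pooling type, and block depth are assumptions, since the disclosure leaves the exact residual-block internals open.

```python
import torch.nn as nn

class ResBlock3D(nn.Module):
    """Conv-BN-ReLU-Conv-BN residual block, one plausible instantiation."""
    def __init__(self, ch_in: int, ch_out: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv3d(ch_in, ch_out, 3, padding=1),
            nn.BatchNorm3d(ch_out),
            nn.ReLU(inplace=True),
            nn.Conv3d(ch_out, ch_out, 3, padding=1),
            nn.BatchNorm3d(ch_out),
        )
        self.skip = nn.Conv3d(ch_in, ch_out, 1) if ch_in != ch_out else nn.Identity()
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.relu(self.body(x) + self.skip(x))

class EncoderLayer(nn.Module):
    """One encoder processing layer: pooling halves the size, ResBlock extracts features."""
    def __init__(self, ch_in: int, ch_out: int):
        super().__init__()
        self.pool = nn.MaxPool3d(2)
        self.res = ResBlock3D(ch_in, ch_out)

    def forward(self, x):
        return self.res(self.pool(x))  # outputs one encoded intermediate image
```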
According to an embodiment of the present disclosure, processing the encoded intermediate images with the segmentation neural network to perform image segmentation according to the first feature and generate segmentation intermediate images includes: processing the M-th encoded intermediate image output by the M-th processing layer of the encoding neural network with the 1st processing layer of the segmentation neural network to output the 1st segmentation intermediate image; processing the (m2-1)-th segmentation intermediate image output by the (m2-1)-th processing layer of the segmentation neural network and the (M-m2+1)-th encoded intermediate image output by the (M-m2+1)-th processing layer of the encoding neural network with the m2-th processing layer of the segmentation neural network to output the m2-th segmentation intermediate image, where m2 is a positive integer greater than 1 and less than or equal to M; and processing the M-th segmentation intermediate image output by the M-th processing layer of the segmentation neural network with a convolutional network to generate the segmentation result of the first feature.
As shown in FIG. 3, the segmentation neural network includes M=4 processing layers. Specifically, the 1st processing layer may include a transposed convolutional network (TConvBlock_S4), which receives the 4th encoded intermediate image output by the 4th processing layer of the encoding neural network and outputs the 1st segmentation intermediate image. In contrast to a pooling layer, the transposed convolutional network processes the input feature image and increases the image size. Next, the 2nd (m2=2) processing layer of the segmentation neural network may include a transposed convolutional network (denoted TConvBlock_S3), a residual block (ResBlock_S3), and an association module. Specifically, the 2nd processing layer may receive the 1st (m2-1=1) segmentation intermediate image and the 3rd (M-m2+1=3) encoded intermediate image of the encoding neural network, and the association module performs channel concatenation on the 1st segmentation intermediate image and the 3rd encoded intermediate image to obtain a concatenated image, where the two images have the same image size. For example, for intermediate images of size a*b*c, the concatenated image may be expressed as a*b*2c; that is, the two images are concatenated into one image by increasing the number of channels, which differs from adding the corresponding parameters of the two images, as illustrated in the sketch below. The 2nd processing layer then processes the concatenated image and outputs the 2nd segmentation intermediate image. The processing of the other processing layers in the segmentation neural network, such as the 3rd and 4th processing layers, is similar to that of the 2nd processing layer and is not repeated here.
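The difference between channel concatenation and element-wise addition can be seen directly on tensor shapes; the shapes below are hypothetical and use the (N, C, D, H, W) layout of 3D convolutions.

```python
import torch

e3 = torch.randn(1, 64, 16, 32, 32)   # encoded intermediate image (hypothetical shape)
s1 = torch.randn(1, 64, 16, 32, 32)   # segmentation intermediate image, same size
concat = torch.cat([e3, s1], dim=1)   # (1, 128, 16, 32, 32): channel counts add up
summed = e3 + s1                      # contrast: element-wise addition keeps 64 channels
```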
For the segmentation neural network shown in FIG. 3, the segmentation output may be written as $Q\in\mathbb{R}^{W\times H\times D}$, with $q_i\in Q,\ i=1,\ldots,WHD,\ 0\le q_i\le 1$, where W×H×D is the size of the medical image input to the encoding neural network, i.e., the number of pixels it includes. $q_i$ denotes the probability that the i-th pixel of the medical image is an arterial vessel. In other words, the output of the segmentation neural network is, for each pixel of the medical image, the probability that the pixel is an arterial vessel, which serves as the segmentation result. For example, pixels whose $q_i$ is greater than a preset segmentation threshold may be determined to be arterial vessels; the preset segmentation threshold is not specifically limited here and may be set dynamically based on the actual application.
As shown in FIG. 3, a deep supervision (DSV) mechanism may also be applied in the segmentation neural network to supervise the accuracy of the intermediate processing results of the segmentation neural network; the specific implementation is described below in connection with the training process.
According to an embodiment of the present disclosure, the detection neural network includes N processing layers, where N is a positive integer. Processing the encoded intermediate images and the segmentation intermediate images with the detection neural network includes: processing the M-th encoded intermediate image output by the M-th processing layer of the encoding neural network with the 1st processing layer of the detection neural network to output the 1st detection intermediate image. According to an embodiment of the present disclosure, processing the encoded intermediate images and the segmentation intermediate images with the attention network to generate detection intermediate input images may include: processing the (n-1)-th detection intermediate image output by the (n-1)-th processing layer of the detection neural network, the m1-th encoded intermediate image output by the m1-th processing layer of the encoding neural network, and the m2-th segmentation intermediate image output by the m2-th processing layer of the segmentation neural network with the attention network to output the n-th detection intermediate input image. Then, the n-th detection intermediate input image is processed with the n-th processing layer of the detection neural network to output the n-th detection intermediate image, where the m1-th encoded intermediate image and the m2-th segmentation intermediate image have the same image size as the (n-1)-th detection intermediate image, and n is a positive integer greater than 1 and less than or equal to N.
As shown in FIG. 3, the detection neural network includes N=3 processing layers. Specifically, the 1st processing layer may include a transposed convolutional network (TConvBlock_D2), which receives the 4th encoded intermediate image output by the 4th processing layer of the encoding neural network and outputs the 1st detection intermediate image. In contrast to a pooling layer, the transposed convolutional network processes the input feature image and increases the image size. Next, the 2nd (n=2) processing layer of the detection neural network may include a transposed convolutional network (denoted TConvBlock_D1), a residual block (ResBlock_D2), and an attention network (described below). Specifically, the attention network may process the 1st (n-1=1) detection intermediate image output by the 1st processing layer of the detection neural network, the 3rd encoded intermediate image output by the 3rd processing layer of the encoding neural network, and the 1st segmentation intermediate image output by the 1st processing layer of the segmentation neural network to output the 2nd detection intermediate input image; TConvBlock_D1 and ResBlock_D2 in the 2nd processing layer of the detection neural network then process the 2nd detection intermediate input image and output the 2nd detection intermediate image, where the 3rd encoded intermediate image and the 1st segmentation intermediate image have the same image size as the 1st detection intermediate image. The 3rd processing layer of the detection neural network includes a three-dimensional convolutional network block, denoted ConvBlock_V3, a residual block (ResBlock_D1), and an attention network. ConvBlock_V3 may adopt the Region Proposal Network (RPN) structure of R-CNN and is used to output the detection result of the aneurysm feature, i.e., the prediction box parameters and probability, which will be described below.
According to an embodiment of the present disclosure, a coordinate tensor mechanism may also be introduced into the detection neural network, for example, as shown in FIG. 3. In the input medical image, pixels with different coordinate values have different probabilities of containing the second feature, i.e., the aneurysm feature. For example, a pixel at the edge of the image has a lower probability than a pixel at the center of the image. The coordinate tensor mechanism introduces this position-dependent spatial probability into the processing of the detection neural network, for instance by setting position coordinate weights, thereby further improving the accuracy of the detection result.
According to an embodiment of the present disclosure, outputting the n-th detection intermediate input image with the attention network includes: performing channel concatenation on the m1-th encoded intermediate image and the (n-1)-th detection intermediate image to obtain a concatenated image; adding the concatenated image and the m2-th segmentation intermediate image to obtain an added image; processing the added image with an activation function to obtain an attention feature image; multiplying the attention feature image by the concatenated image to obtain an attention-enhanced image; and adding the attention-enhanced image and the concatenated image to obtain the n-th detection intermediate input image.
FIG. 4 shows a schematic structural diagram of an attention network according to an embodiment of the present disclosure. As shown in FIG. 4, the inputs of the attention network include a detection intermediate image output by the detection neural network (denoted map D), an encoded intermediate image output by the encoding neural network (denoted map E), and a segmentation intermediate image output by the segmentation neural network (denoted map S). First, channel concatenation is performed on map E and map D to obtain a concatenated image, map C; the concatenation process is as described above and is not repeated here. Next, to reduce the amount of computation, both map C and map S may be processed by 1×1×1 convolutional networks (denoted ConvBlock_1) to reduce the dimensionality, i.e., the size, of the images; the dimension-reduced images are denoted map C' and map S', respectively. Note that the convolutional networks for reducing the image size may be configured according to the specific image size; for example, when maps C and S are small, this step may be omitted. Next, map C' and map S' are added (denoted Add) to obtain an added image, and the added image is processed with an activation function (denoted ReLU) to obtain the attention feature image, map A. In addition, as shown in FIG. 4, another 1×1×1 convolutional network (denoted ConvBlock_2) may be provided in the attention network to further reduce the dimensionality of the image, for example to one dimension, i.e., an image with only one channel; another activation function (denoted Sigmoid) may also be included to normalize the values of map A to between 0 and 1. Next, map A is multiplied by map C to obtain the attention-enhanced image, map B. The multiplication may be written as:
$B = A \otimes C$
where $\otimes$ denotes multiplying each channel of map C by map A. Then, map B and map C may be added to obtain the detection intermediate input image, which may be written as $B + C$.
With the attention network shown in FIG. 4, the segmentation intermediate images output by the processing layers of the segmentation neural network are introduced into the images processed by the detection neural network, and the feature maps of the segmentation neural network used for segmenting artery features serve as attention to enhance the arterial vessel features in the detection neural network; this may be called an attention mechanism. The attention mechanism is inspired by the human visual process: when reading, humans focus only on the salient part of the entire visual region and ignore interference from other parts. Adding the attention mechanism to the detection neural network makes it possible to focus on arterial vessel features while performing the aneurysm detection task, i.e., to enhance the saliency of arterial vessels in the feature maps of the detection task. A detection neural network including the above attention network is equivalent to introducing the medical prior knowledge that aneurysms exist on arteries into the processing of the neural network, so that the detection task focuses more on artery features and pays less attention to non-arterial features such as noise, thereby improving the accuracy of the detection result.
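Assembled into code, the attention block of FIG. 4 could be sketched as below. Channel sizes and module names are assumptions; the two 1×1×1 convolutions play the roles of ConvBlock_1 and ConvBlock_2, and the single-channel map A broadcasts over the channels of map C when multiplying.

```python
import torch
import torch.nn as nn

class AttentionGate3D(nn.Module):
    """Hedged sketch of the FIG. 4 attention network for 3D feature maps."""
    def __init__(self, ch_concat: int, ch_seg: int, ch_mid: int):
        super().__init__()
        self.reduce_c = nn.Conv3d(ch_concat, ch_mid, kernel_size=1)  # ConvBlock_1 on C
        self.reduce_s = nn.Conv3d(ch_seg, ch_mid, kernel_size=1)     # ConvBlock_1 on S
        self.relu = nn.ReLU(inplace=True)
        self.to_one = nn.Conv3d(ch_mid, 1, kernel_size=1)            # ConvBlock_2 -> 1 channel
        self.sigmoid = nn.Sigmoid()                                  # normalize A to (0, 1)

    def forward(self, feat_e, feat_d, feat_s):
        c = torch.cat([feat_e, feat_d], dim=1)        # channel concatenation: map C
        a = self.relu(self.reduce_c(c) + self.reduce_s(feat_s))  # Add + ReLU: map A
        a = self.sigmoid(self.to_one(a))              # single-channel attention map
        b = a * c                                     # A ⊗ C: channel-wise enhancement
        return b + c                                  # detection intermediate input: B + C
```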
For the detection neural network shown in FIG. 3, the size of the aneurysms to be detected may vary within a range of 2–70 mm. Therefore, multiple prediction boxes of different sizes may be preset to detect aneurysms of different sizes; for example, prediction box sizes corresponding to 2.5 mm, 5 mm, 10 mm, 15 mm, and 30 mm may be set. For an input CTA image denoted $X\in\mathbb{R}^{W\times H\times D}$ (for example, W=H=D=128), the detection output of the detection neural network may be written as $\hat{Y}\in\mathbb{R}^{\frac{W}{4}\times\frac{H}{4}\times\frac{D}{4}\times A\times M}$, where A=5 denotes the number of anchors and M=5 denotes the number of parameters corresponding to the center (anchor) of each prediction box. For example, for each CTA image of size 128³, the detection neural network may output a detection result $\hat{Y}$ containing 32³ data points. Each data point corresponds to one pixel position in $X$; at each data point, prediction data for 5 anchors are output, and each anchor contains the 5 parameters $(p_i, t_x, t_y, t_z, t_b)$, where $p_i$ denotes the probability that this pixel position contains an aneurysm and $t_x, t_y, t_z, t_b$ denote the prediction box parameters. Specifically, $t_x, t_y, t_z$ denote the relative position parameters of this pixel position in the input image, and $t_b$ denotes the relative size parameter of the prediction box, where the relative size is related to the side length of the prediction box.
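Decoding one anchor's five numbers into an absolute cubic box might look like the hedged sketch below. The disclosure does not spell out the decoding equations, so the RPN-style parameterization (grid offsets scaled by the stride, log-scale relative size) is an assumption for illustration only.

```python
import numpy as np

# Hypothetical anchor sizes corresponding to the five preset prediction boxes.
ANCHOR_SIZES_MM = [2.5, 5.0, 10.0, 15.0, 30.0]

def decode_anchor(grid_xyz, t, anchor_size, stride=4.0):
    """Assumed decoding of (tx, ty, tz, tb) at one output-grid position."""
    tx, ty, tz, tb = t
    cx = (grid_xyz[0] + tx) * stride   # center in input-image voxels (128 = 32 * stride)
    cy = (grid_xyz[1] + ty) * stride
    cz = (grid_xyz[2] + tz) * stride
    side = anchor_size * np.exp(tb)    # log-scale relative size parameter
    return np.array([cx, cy, cz, side])
```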
Note that the multi-task processing network shown in FIG. 3 only detects aneurysm features. By changing the configuration of the ConvBlock_V3 network and the corresponding training data, the detection task can be extended to the simultaneous detection of multiple cerebrovascular diseases, such as arterial wall calcification, which is not limited here.
The multi-task processing method and multi-task processing network described above with reference to FIG. 2 and FIG. 3 can use the prior knowledge between related tasks to improve the prediction accuracy of the prediction task. Multi-task methods are widely used in computer vision and natural language processing, and multi-task processing can bring better results than single-task processing. Multi-task processing takes two forms: hard parameter sharing and soft parameter sharing. In hard parameter sharing, different tasks share part of the same network, but each task has its own branch network that produces output independently. In soft parameter sharing, each task has its own independent complete network, but there are connections and interactions between the networks that impose corresponding constraints or selectively share intermediate feature images. Soft parameter sharing can avoid the performance degradation caused by forced sharing when the tasks are weakly correlated, but since each task has an independent complete network, it greatly increases the number of model parameters and the amount of computation. Hard parameter sharing reduces network redundancy by sharing part of the same network between different tasks, and requires a strong correlation between the tasks.
In the embodiments of the present disclosure, the multiple tasks include the detection task performed by the detection neural network and the segmentation task of the segmentation neural network. Using the correlation between the segmentation task and the detection task, the detection neural network and the segmentation neural network share the structure of the encoding neural network and generate the detection output and segmentation output respectively. This hard-parameter-sharing multi-task processing network reduces the overall complexity of the network while enhancing the vessel features extracted by the encoding neural network, thereby improving the detection accuracy of the detection neural network.
The medical image processing method according to the present disclosure may further include a training step, i.e., optimizing the parameters of the multi-task processing network. The training step includes: training the segmentation neural network and the encoding neural network according to a Dice loss function and a cross-entropy loss function; and training the detection neural network and the encoding neural network according to a classification loss function and a regression loss function.
According to an embodiment of the present disclosure, training the segmentation neural network and the encoding neural network according to the Dice loss function and the cross-entropy loss function includes: calculating a Dice loss value according to the Dice loss function based on the real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network; calculating a cross-entropy loss value according to the cross-entropy loss function based on the real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network; and performing training based on the Dice loss value and the cross-entropy loss value according to a preset threshold, where the Dice loss function $\mathcal{L}_{dice}$ and the cross-entropy loss function $\mathcal{L}_{bce}$ may be written respectively as:
$\mathcal{L}_{dice}=1-\frac{2\sum_{i=1}^{V}s_i q_i}{\sum_{i=1}^{V}s_i+\sum_{i=1}^{V}q_i}$
$\mathcal{L}_{bce}=-\frac{1}{V}\sum_{i=1}^{V}\left[s_i\log q_i+(1-s_i)\log(1-q_i)\right]$
where $s_i$ denotes the real segmentation label of the i-th pixel in the medical image, $q_i$ denotes the predicted segmentation label of the i-th pixel output by the segmentation neural network, V denotes the total number of pixels included in the medical image, the summation is over the processing results of every pixel in the training image, and log is the natural logarithm.
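In code, the two losses could be written as the following sketch, assuming s and q are flattened tensors of ground-truth labels (0/1) and predicted probabilities, and that the equations above are the intended Dice and binary cross-entropy forms.

```python
import torch

def dice_loss(q: torch.Tensor, s: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # 1 - 2*sum(s*q) / (sum(s) + sum(q)); eps guards against empty labels
    inter = (q * s).sum()
    return 1.0 - 2.0 * inter / (q.sum() + s.sum() + eps)

def bce_loss(q: torch.Tensor, s: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    # -(1/V) * sum(s*ln(q) + (1-s)*ln(1-q)); clamp keeps the logs finite
    q = q.clamp(eps, 1.0 - eps)
    return -(s * q.log() + (1.0 - s) * (1.0 - q).log()).mean()
```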
In the training step, for a training image, the real segmentation labels of the artery features it contains are known, denoted $s_i\in L_s,\ i=1,\ldots,W{*}H{*}D$, where $s_i$ takes the value 0 or 1: $s_i=1$ indicates that pixel i is an arterial vessel, $s_i=0$ indicates that it is not, and W*H*D denotes the number of pixels in the training image. $L_s$ serves as the real segmentation labels of the training image to verify the accuracy of the output $Q\in\mathbb{R}^{W\times H\times D}$ of the segmentation neural network.
According to an embodiment of the present disclosure, performing training based on the Dice loss value and the cross-entropy loss value according to a preset threshold includes: training based on the Dice loss value when the cross-entropy loss value is less than the preset threshold; and training based on the cross-entropy loss value when the cross-entropy loss value is not less than the preset threshold. In each forward pass, the two values $\mathcal{L}_{dice}$ and $\mathcal{L}_{bce}$ may be calculated according to the above formulas. If $\mathcal{L}_{bce}$ is less than a preset threshold g, the $\mathcal{L}_{dice}$ value is used to train the network; otherwise the $\mathcal{L}_{bce}$ value is used, which may be written as:
$\mathcal{L}_{seg}=\begin{cases}\mathcal{L}_{dice}, & \mathcal{L}_{bce}<g\\ \mathcal{L}_{bce}, & \text{otherwise}\end{cases}$
where $\mathcal{L}_{seg}$ denotes the segmentation loss function.
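The threshold switch then amounts to one comparison per forward pass. Continuing the loss sketch above, with a hypothetical threshold value for g:

```python
# Continuing the sketch above: threshold-switched segmentation loss.
def seg_loss(q, s, g: float = 0.3):   # g is a hypothetical preset threshold
    l_bce = bce_loss(q, s)
    return dice_loss(q, s) if float(l_bce) < g else l_bce
```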
According to an embodiment of the present disclosure, the segmentation neural network includes M processing layers, and training the segmentation neural network and the encoding neural network according to the Dice loss function and the cross-entropy loss function further includes: calculating an intermediate Dice loss value according to the Dice loss function based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network; calculating an intermediate cross-entropy loss value according to the cross-entropy loss function based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network; and performing training based on the intermediate Dice loss value and the intermediate cross-entropy loss value according to the preset threshold, where m and M are positive integers, and m is greater than 1 and less than M.
The structure of the segmentation neural network including M processing layers is shown in FIG. 3, where the intermediate segmentation results generated by the 2nd and 3rd processing layers may be denoted $A_2$ and $A_3$. During training, loss values for the intermediate segmentation results $A_2$ and $A_3$ may be calculated according to the $\mathcal{L}_{seg}$ function and used to train the network together with the loss value calculated from the segmentation output $Q$, which may be written as:
$\mathcal{L}_{seg}^{total}=\varepsilon_0\,\mathcal{L}_{seg}(Q,L_s)+\varepsilon_3\,\mathcal{L}_{seg}(A_3,L_s)+\varepsilon_2\,\mathcal{L}_{seg}(A_2,L_s)$
where $\mathcal{L}_{seg}^{total}$ denotes the total segmentation loss function of the segmentation neural network, a weighted sum of the loss values between the outputs $Q$, $A_i$ of the segmentation neural network and the real segmentation labels $L_s$; for example, $\varepsilon_0=0.7$, $\varepsilon_3=0.2$, $\varepsilon_2=0.1$.
This manner of calculating loss values from intermediate segmentation results and using them to train the segmentation neural network may be called the deep supervision mechanism mentioned above; it strengthens the supervision of intermediate processing during training and helps improve the training effect of, for example, neural networks with many layers.
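The deep-supervision total is then a three-term weighted sum. The sketch below assumes the intermediate outputs A2 and A3 have already been brought to the label resolution by their DSV branches, which the disclosure leaves unspecified.

```python
# Deep supervision: weighted sum of the final and intermediate segmentation losses.
def seg_loss_total(q, a3, a2, s, eps0=0.7, eps3=0.2, eps2=0.1, g=0.3):
    return (eps0 * seg_loss(q, s, g)
            + eps3 * seg_loss(a3, s, g)
            + eps2 * seg_loss(a2, s, g))
```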
According to an embodiment of the present disclosure, training the detection neural network and the encoding neural network according to the classification loss function and the regression loss function includes: processing training samples with the encoding neural network, the segmentation neural network, and the detection neural network to obtain detection results, where the prediction box parameters include the position coordinates of the center point and the size of the prediction box; calculating a classification loss value according to the classification loss function based on the predicted probability, and calculating a regression loss value according to the regression loss function based on the prediction box parameters and the real box parameters of the second feature; and performing training based on the classification loss value and the regression loss value.
According to an embodiment of the present disclosure, the training samples used in the above training process may be the preprocessed CTA images described above, or training samples more favorable for training may be obtained by sampling from the CTA images.
According to an embodiment of the present disclosure, obtaining training samples may include: sampling in the medical image to obtain at least one training sample; calculating the area ratio between the bounding box of the at least one training sample and the bounding box of the second feature; and determining training samples whose area ratio is greater than a first threshold as positive training samples and training samples whose area ratio is less than a second threshold as negative training samples, where the positive training samples are used for training the classification loss and the regression loss, and the negative training samples are used for training the classification loss.
As an example, the intersection-over-union (IoU) function may be used to calculate the area ratio; the IoU function computes the ratio of the intersection to the union of two bounding boxes. For example, training samples with an area ratio greater than 0.5 may be determined as positive training samples, and training samples with an area ratio less than 0.02 may be determined as negative training samples; the classification loss and regression loss are then determined based on the positive and negative training samples respectively, e.g., the positive training samples are used for training the classification loss and the regression loss, and the negative training samples are used for training the classification loss.
During training, to maintain a reasonable ratio of positive to negative training samples, the following sampling strategy may be adopted. For a CTA image containing ground-truth candidate boxes, where a ground-truth candidate box corresponds to the bounding box of the second feature described above, samples may be taken within a certain pixel offset range around the center point of each ground-truth candidate box to obtain training sample images of different sizes, thereby ensuring that every ground-truth candidate box in the CTA image is included in some training sample during training. In addition, samples may also be taken randomly in the CTA image; training samples sampled in this way generally do not include the ground-truth candidate boxes. For the training samples obtained by this sampling, the area ratio may be calculated with the IoU function described above, so that the obtained training samples are divided into the positive and negative training samples.
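The IoU test used to label sampled patches can be sketched as follows, assuming axis-aligned 3D boxes given as corner pairs (x1, y1, z1, x2, y2, z2); the thresholds come from the text above.

```python
import numpy as np

def iou_3d(a: np.ndarray, b: np.ndarray) -> float:
    # intersection volume of two axis-aligned boxes, divided by union volume
    lo = np.maximum(a[:3], b[:3])
    hi = np.minimum(a[3:], b[3:])
    inter = np.prod(np.clip(hi - lo, 0.0, None))
    vol_a = np.prod(a[3:] - a[:3])
    vol_b = np.prod(b[3:] - b[:3])
    return float(inter / (vol_a + vol_b - inter + 1e-9))

def label_sample(sample_box, gt_box) -> str:
    iou = iou_3d(sample_box, gt_box)
    if iou > 0.5:    # first threshold: positive sample
        return "positive"
    if iou < 0.02:   # second threshold: negative sample
        return "negative"
    return "ignored"
```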
According to an embodiment of the present disclosure, a positive training sample set $S_{pos}$ and a negative sample set $S_{neg}$ can be obtained in the above manner. In real CTA images, the number of negative training samples obtained may far exceed the number of positive training samples. To avoid this numerical imbalance between positive and negative training samples, a subset of the negative sample set may be selected as training samples. For example, a portion of harder-to-distinguish negative training samples may be selected from $S_{neg}$ to form a hard negative training sample set, denoted $S_{hard}$, where $S_{hard}\subset S_{neg}$.
The classification loss function and regression loss function used to train the detection neural network and the encoding neural network are described in detail below.
The classification loss function characterizes the accuracy of the predicted probability values of the detection results. As an example, the classification loss function may be written as:
$\mathcal{L}_{cls}=-\xi_1\,\frac{1}{|S_{pos}|}\sum_{t\in S_{pos}}\log p_t-\xi_2\,\frac{1}{|S_{hard}|}\sum_{t\in S_{hard}}\log(1-p_t)$
where $\mathcal{L}_{cls}$ denotes the classification loss function; the positive and negative training sample weight coefficients are $\xi_1=\xi_2=0.5$; $\sum_{t\in S_{pos}}$ denotes summing the processing results obtained for the training samples in $S_{pos}$; $\sum_{t\in S_{hard}}$ denotes summing the processing results obtained for the training samples in $S_{hard}$; and $|\cdot|$ denotes the number of training samples in the corresponding set.
The regression loss function characterizes the accuracy of the predicted prediction box parameter values of the detection results. As an example, the regression loss function (Smooth L1 loss function) may be written as:
$\mathcal{L}_{reg}=\sum_{t\in S_{pos}}\sum_{j\in\{x,y,z,b\}}\mathrm{smooth}_{L1}(t_j-v_j)$
$\mathrm{smooth}_{L1}(x)=\begin{cases}0.5x^2, & |x|<1\\ |x|-0.5, & \text{otherwise}\end{cases}$
where $\mathcal{L}_{reg}$ denotes the regression loss function, $t\in S_{pos}$ indicates that the calculation is performed only on positive samples, and $(v_x,v_y,v_z,v_b)$ denote the coordinate parameters of the ground-truth candidate box.
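A sketch of the two detection losses follows; p_pos and p_hard stand for the predicted probabilities of the positive and hard-negative samples, t_pos and v_pos for the predicted and ground-truth box parameters (t_x, t_y, t_z, t_b) of the positives, and the per-set averaging mirrors the 1/|·| normalization above.

```python
import torch
import torch.nn.functional as F

def cls_loss(p_pos, p_hard, xi1=0.5, xi2=0.5, eps=1e-6):
    # weighted log-likelihood over positives plus hard negatives
    return (-xi1 * torch.log(p_pos.clamp_min(eps)).mean()
            - xi2 * torch.log((1.0 - p_hard).clamp_min(eps)).mean())

def reg_loss(t_pos, v_pos):
    # F.smooth_l1_loss with the default beta=1.0 matches the piecewise form above
    return F.smooth_l1_loss(t_pos, v_pos, reduction="sum")
```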
Thus, the overall loss function of the multi-task processing network may be written as:
$\mathcal{L}=\mathcal{L}_{seg}^{total}+\alpha\left(\mathcal{L}_{cls}+\mathcal{L}_{reg}\right)$
where $\alpha$ is a weighting constant.
According to an embodiment of the present disclosure, other training schemes may also be used to train the multi-task processing network. For example, the stochastic gradient descent (SGD) method may be used. Specifically, its parameters may be set as follows: the momentum may be 0.9, the weight decay may be 1e-4, and training runs for 200 epochs. The initial learning rate is 1e-2, and after 100 epochs the learning rate is decayed to 0.1 of its original value.
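The stated SGD configuration maps directly onto a standard optimizer setup; `model` below is a stand-in for the multi-task network.

```python
import torch

model = torch.nn.Conv3d(1, 1, 3)  # placeholder module standing in for the network
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2,
                            momentum=0.9, weight_decay=1e-4)
# train for 200 epochs; multiply the learning rate by 0.1 after epoch 100
scheduler = torch.optim.lr_scheduler.MultiStepLR(optimizer, milestones=[100], gamma=0.1)
```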
The artificial intelligence-based medical image processing method according to the present disclosure adopts the structure of a multi-task processing network: an encoding neural network processes the input medical image to generate encoded intermediate images; a segmentation neural network processes the encoded intermediate images to segment the first feature and generate segmentation intermediate images; and a detection neural network processes the encoded intermediate images and the segmentation intermediate images to detect the second feature included in the first feature and output the detection result of the second feature. Because the second feature is included in the first feature, the segmentation task and the detection task are task-correlated. The task-correlated detection neural network and segmentation neural network share the processing results of the encoding neural network in the form of hard parameter sharing, enhancing the first feature processed in the detection task and thereby improving the accuracy of the detection result. In addition, an attention mechanism is introduced into the detection neural network to strengthen the attention of the detection neural network to the first feature, further improving the accuracy of the detection result.
The present disclosure also provides an artificial intelligence-based medical image processing apparatus. FIG. 5 shows a schematic block diagram of an artificial intelligence-based medical image processing apparatus according to an embodiment of the present disclosure. As shown in FIG. 5, the apparatus 1000 may include an encoding neural network unit 1010, a segmentation neural network unit 1020, an attention network unit 1030, and a detection neural network unit 1040.
According to an embodiment of the present disclosure, the encoding neural network unit 1010 is configured to process the medical image to generate encoded intermediate images. The segmentation neural network unit 1020 is configured to process the encoded intermediate images, segment the encoded intermediate images according to the first feature, and generate segmentation intermediate images including the first feature. The attention network unit 1030 is configured to process the encoded intermediate images and the segmentation intermediate images to generate detection intermediate input images. The detection neural network unit 1040 is configured to process the detection intermediate input images to detect whether the image content where the first feature is located includes the second feature.
According to an embodiment of the present disclosure, the encoding neural network unit 1010 includes M processing layers and the segmentation neural network unit 1020 includes M processing layers, where M is a positive integer and a processing layer includes at least one of a convolutional network, a transposed convolutional network, and a pooling layer. The 1st processing layer of the encoding neural network unit 1010 processes the medical image and outputs the 1st encoded intermediate image; then the m1-th processing layer of the encoding neural network unit 1010 processes the (m1-1)-th encoded intermediate image output by the (m1-1)-th processing layer of the encoding neural network unit 1010 and outputs the m1-th encoded intermediate image, where m1 is a positive integer greater than 1 and less than or equal to M. The 1st processing layer of the segmentation neural network unit 1020 processes the M-th encoded intermediate image output by the M-th processing layer of the encoding neural network unit 1010 and outputs the 1st segmentation intermediate image; the m2-th processing layer of the segmentation neural network unit 1020 processes the (m2-1)-th segmentation intermediate image output by the (m2-1)-th processing layer of the segmentation neural network unit 1020 and the (M-m2+1)-th encoded intermediate image output by the (M-m2+1)-th processing layer of the encoding neural network unit 1010 and outputs the m2-th segmentation intermediate image, where m2 is a positive integer greater than 1 and less than or equal to M. Then, the M-th segmentation intermediate image output by the M-th processing layer of the segmentation neural network unit 1020 is processed to generate the segmentation result of the first feature.
According to an embodiment of the present disclosure, the detection neural network unit 1040 includes N processing layers, where N is a positive integer. The 1st processing layer of the detection neural network unit 1040 processes the M-th encoded intermediate image output by the M-th processing layer of the encoding neural network unit 1010 and outputs the 1st detection intermediate image. According to an embodiment of the present disclosure, the attention network unit 1030 processes the (n-1)-th detection intermediate image output by the (n-1)-th processing layer of the detection neural network unit 1040, the m1-th encoded intermediate image output by the m1-th processing layer of the encoding neural network unit 1010, and the m2-th segmentation intermediate image output by the m2-th processing layer of the segmentation neural network unit 1020, and outputs the n-th detection intermediate input image. The n-th processing layer of the detection neural network unit 1040 processes the n-th detection intermediate input image and outputs the n-th detection intermediate image, where the m1-th encoded intermediate image and the m2-th segmentation intermediate image have the same image size as the (n-1)-th detection intermediate image, and n is a positive integer greater than 1 and less than or equal to N.
According to an embodiment of the present disclosure, the attention network unit 1030 performs channel concatenation on the m1-th encoded intermediate image and the (n-1)-th detection intermediate image to obtain a concatenated image; adds the concatenated image and the m2-th segmentation intermediate image to obtain an added image; processes the added image with an activation function to obtain an attention feature image; multiplies the attention feature image by the concatenated image to obtain an attention-enhanced image; and adds the attention-enhanced image and the concatenated image to obtain the n-th detection intermediate input image.
According to an embodiment of the present disclosure, the medical image is a three-dimensional image, and the encoding neural network unit 1010, the segmentation neural network unit 1020, and the detection neural network unit 1040 are three-dimensional convolutional neural networks.
According to an embodiment of the present disclosure, the medical image is a computed tomography angiography image, the first feature is an artery feature, and the second feature is at least one of an aneurysm feature, an arterial wall calcification feature, and an arterial occlusion feature.
According to an embodiment of the present disclosure, the detection neural network unit 1040 may output a detection result of the second feature, where the detection result includes: prediction box parameters of the second feature and a predicted probability that the prediction box contains the second feature. The medical image processing apparatus may further include a display unit configured to display candidate boxes on the image containing the first feature, where the candidate boxes include the prediction boxes of the second feature detected by the detection neural network unit 1040.
According to an embodiment of the present disclosure, the medical image processing apparatus may further include a training unit. The training unit may be configured to train the segmentation neural network unit 1020 and the encoding neural network unit 1010 according to the Dice loss function and the cross-entropy loss function, and to train the detection neural network unit 1040 and the encoding neural network unit 1010 according to the classification loss function and the regression loss function.
According to an embodiment of the present disclosure, the training unit training the segmentation neural network unit 1020 and the encoding neural network unit 1010 according to the Dice loss function and the cross-entropy loss function includes: calculating a Dice loss value according to the Dice loss function based on the real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network unit 1020; calculating a cross-entropy loss value according to the cross-entropy loss function based on the real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network unit 1020; and performing training based on the Dice loss value and the cross-entropy loss value according to a preset threshold, where the Dice loss function $\mathcal{L}_{dice}$ and the cross-entropy loss function $\mathcal{L}_{bce}$ may be written respectively as:
$\mathcal{L}_{dice}=1-\frac{2\sum_{i=1}^{V}s_i q_i}{\sum_{i=1}^{V}s_i+\sum_{i=1}^{V}q_i}$
$\mathcal{L}_{bce}=-\frac{1}{V}\sum_{i=1}^{V}\left[s_i\log q_i+(1-s_i)\log(1-q_i)\right]$
where $s_i$ denotes the real segmentation label of the i-th pixel in the medical image, $q_i$ denotes the predicted segmentation label of the i-th pixel output by the segmentation neural network unit 1020, and V denotes the total number of pixels included in the medical image.
According to an embodiment of the present disclosure, the training unit performing training based on the Dice loss value and the cross-entropy loss value according to a preset threshold includes: training based on the Dice loss value when the cross-entropy loss value is less than the preset threshold; and training based on the cross-entropy loss value when the cross-entropy loss value is not less than the preset threshold.
According to an embodiment of the present disclosure, the segmentation neural network unit 1020 may include M processing layers, and the training unit training the segmentation neural network unit 1020 and the encoding neural network unit 1010 according to the Dice loss function and the cross-entropy loss function further includes: calculating an intermediate Dice loss value according to the Dice loss function based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network unit 1020; calculating an intermediate cross-entropy loss value according to the cross-entropy loss function based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network unit 1020; and performing training based on the intermediate Dice loss value and the intermediate cross-entropy loss value according to the preset threshold, where m and M are positive integers, and m is greater than 1 and less than M.
According to an embodiment of the present disclosure, training the detection neural network unit 1040 and the encoding neural network unit 1010 according to the classification loss function and the regression loss function includes: processing training samples with the encoding neural network unit 1010, the segmentation neural network unit 1020, and the detection neural network unit 1040 to obtain detection results, where the prediction box parameters include the position coordinates of the center point and the size of the prediction box; calculating a classification loss value according to the classification loss function based on the predicted probability, and calculating a regression loss value according to the regression loss function based on the prediction box parameters and the real box parameters of the second feature; and performing training based on the classification loss value and the regression loss value.
According to an embodiment of the present disclosure, training the detection neural network unit 1040 and the encoding neural network unit 1010 according to the classification loss function and the regression loss function further includes: sampling in the medical image to obtain at least one training sample; calculating the area ratio between the bounding box of the at least one training sample and the bounding box of the second feature; and determining training samples whose area ratio is greater than a first threshold as positive training samples and training samples whose area ratio is less than a second threshold as negative training samples, where the positive training samples are used for training the classification loss and the regression loss, and the negative training samples are used for training the classification loss.
According to yet another aspect of the present disclosure, an artificial intelligence-based medical device is also provided. FIG. 6 shows a schematic block diagram of an artificial intelligence-based medical device 2000 according to an embodiment of the present disclosure.
As shown in FIG. 6, the device 2000 may include an image acquisition apparatus 2010, a processor 2020, and a memory 2030, where the memory 2030 stores computer-readable code that, when executed by the processor 2020, can perform the artificial intelligence-based medical image processing method described above.
As a specific embodiment, the image acquisition apparatus 2010 may be a CT device and may acquire, for example, an intracranial artery angiography image as the medical image described above. The processor 2020 may be connected to the image acquisition apparatus 2010 by wire and/or wirelessly to receive the medical image; the processor 2020 may then execute the computer-readable code stored in the memory 2030, which, when executed by the processor 2020, can perform the artificial intelligence-based medical image processing method described above to obtain the artery segmentation result and the aneurysm detection result based on the medical image. In addition, the medical device 2000 may further include a display device such as a display screen for displaying the artery segmentation result and the aneurysm detection result; for the display effect, reference may be made to FIG. 2.
The method or apparatus according to the embodiments of the present disclosure may also be implemented by means of the architecture of the computing device 3000 shown in FIG. 7. As shown in FIG. 7, the computing device 3000 may include a bus 3010, one or more CPUs 3020, a read-only memory (ROM) 3030, a random access memory (RAM) 3040, a communication port 3050 connected to a network, an input/output component 3060, a hard disk 3070, and the like. A storage device in the computing device 3000, such as the ROM 3030 or the hard disk 3070, can store various data or files used in the processing and/or communication of the artificial intelligence-based medical image processing method provided by the present disclosure, as well as program instructions executed by the CPU. The computing device 3000 may also include a user interface 3080. Of course, the architecture shown in FIG. 7 is only exemplary; when implementing different devices, one or more components of the computing device shown in FIG. 7 may be omitted according to actual needs.
According to yet another aspect of the present disclosure, a computer-readable storage medium is also provided. FIG. 8 shows a schematic diagram 4000 of a storage medium according to the present disclosure.
As shown in FIG. 8, the computer storage medium 4020 stores computer-readable instructions 4010. When the computer-readable instructions 4010 are executed by a processor, the artificial intelligence-based medical image processing method according to the embodiments of the present disclosure described with reference to the above drawings can be performed. The computer-readable storage medium includes, but is not limited to, for example, volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache). The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, and flash memory. For example, the computer storage medium 4020 may be connected to a computing device such as a computer; then, when the computing device executes the computer-readable instructions 4010 stored on the computer storage medium 4020, the artificial intelligence-based medical image processing method according to the present disclosure as described above can be performed.
Those skilled in the art will understand that various variations and improvements may be made to the content disclosed in the present disclosure. For example, the various devices or components described above may be implemented by hardware, or by software, firmware, or a combination of some or all of the three.
In addition, although the present disclosure makes various references to certain units in the system according to the embodiments of the present disclosure, any number of different units may be used and run on a client and/or a server. The units are only illustrative, and different aspects of the system and method may use different units.
A person of ordinary skill in the art will understand that all or part of the steps in the above methods may be completed by instructing relevant hardware through a program, and the program may be stored in a computer-readable storage medium, such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or part of the steps of the above embodiments may also be implemented using one or more integrated circuits. Accordingly, the modules/units in the above embodiments may be implemented in the form of hardware or in the form of software functional modules. The present disclosure is not limited to any specific combination of hardware and software.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meanings as commonly understood by a person of ordinary skill in the art to which the present disclosure belongs. It should also be understood that terms such as those defined in common dictionaries should be interpreted as having meanings consistent with their meanings in the context of the relevant technology, and should not be interpreted in an idealized or extremely formalized sense unless explicitly so defined herein.
The above is a description of the present disclosure and should not be regarded as limiting it. Although several exemplary embodiments of the present disclosure have been described, those skilled in the art will readily understand that many modifications may be made to the exemplary embodiments without departing from the novel teachings and advantages of the present disclosure. Therefore, all such modifications are intended to be included within the scope of the present disclosure as defined by the claims. It should be understood that the above is a description of the present disclosure and should not be regarded as being limited to the specific embodiments disclosed, and modifications to the disclosed embodiments as well as other embodiments are intended to be included within the scope of the appended claims. The present disclosure is defined by the claims and their equivalents.

Claims (15)

  1. An artificial intelligence-based medical image processing method, applied to one or more computing devices, comprising:
    processing the medical image to generate encoded intermediate images representing structural features of the medical image;
    segmenting the encoded intermediate images according to a first feature to generate segmentation intermediate images comprising the first feature;
    processing the encoded intermediate images and the segmentation intermediate images based on an attention mechanism to generate attention-enhanced detection intermediate input images; and
    performing detection of a second feature on the detection intermediate input images to determine whether image content where the first feature is located comprises the second feature.
  2. The medical image processing method according to claim 1, wherein processing the medical image to generate encoded intermediate images comprises:
    processing the medical image with an encoding neural network to generate encoded intermediate images, the encoding neural network comprising M processing layers, M being a positive integer, a processing layer comprising at least one of a convolutional network, a transposed convolutional network, and a pooling layer,
    the processing the medical image with the encoding neural network comprising:
    processing the medical image with the 1st processing layer of the encoding neural network to output the 1st encoded intermediate image;
    processing the (m1-1)-th encoded intermediate image output by the (m1-1)-th processing layer of the encoding neural network with the m1-th processing layer of the encoding neural network to output the m1-th encoded intermediate image, wherein m1 is a positive integer greater than 1 and less than or equal to M;
    the processing the encoded intermediate images to segment the first feature and generate segmentation intermediate images comprising:
    processing the encoded intermediate images with a segmentation neural network to segment the first feature and generate segmentation intermediate images, the segmentation neural network comprising M processing layers, wherein the 1st processing layer of the segmentation neural network processes the M-th encoded intermediate image output by the M-th processing layer of the encoding neural network to output the 1st segmentation intermediate image;
    processing the (m2-1)-th segmentation intermediate image output by the (m2-1)-th processing layer of the segmentation neural network and the (M-m2+1)-th encoded intermediate image output by the (M-m2+1)-th processing layer of the encoding neural network with the m2-th processing layer of the segmentation neural network to output the m2-th segmentation intermediate image, wherein m2 is a positive integer greater than 1 and less than or equal to M;
    processing the M-th segmentation intermediate image output by the M-th processing layer of the segmentation neural network with a convolutional network to generate a segmentation result of the first feature.
  3. The medical image processing method according to claim 2, wherein processing the detection intermediate input images to detect the second feature included in the first feature comprises:
    processing the detection intermediate input images with a detection neural network to detect the second feature included in the first feature, the detection neural network comprising N processing layers, N being a positive integer, wherein
    the 1st processing layer of the detection neural network processes the M-th encoded intermediate image output by the M-th processing layer of the encoding neural network to output the 1st detection intermediate image;
    processing the encoded intermediate images and the segmentation intermediate images based on the attention mechanism to generate detection intermediate input images comprises:
    processing the encoded intermediate images and the segmentation intermediate images with an attention network based on the attention mechanism to generate detection intermediate input images, wherein
    the attention network processes the (n-1)-th detection intermediate image output by the (n-1)-th processing layer of the detection neural network, the m1-th encoded intermediate image output by the m1-th processing layer of the encoding neural network, and the m2-th segmentation intermediate image output by the m2-th processing layer of the segmentation neural network, and outputs the n-th detection intermediate input image;
    the n-th processing layer of the detection neural network processes the n-th detection intermediate input image and outputs the n-th detection intermediate image,
    wherein the m1-th encoded intermediate image and the m2-th segmentation intermediate image have the same image size as the (n-1)-th detection intermediate image, and n is a positive integer greater than 1 and less than or equal to N.
  4. The medical image processing method according to claim 3, wherein outputting the n-th detection intermediate input image with the attention network comprises:
    performing channel concatenation on the m1-th encoded intermediate image and the (n-1)-th detection intermediate image to obtain a concatenated image;
    adding the concatenated image and the m2-th segmentation intermediate image to obtain an added image;
    processing the added image with an activation function to obtain an attention feature image;
    multiplying the attention feature image by the concatenated image to obtain an attention-enhanced image;
    adding the attention-enhanced image and the concatenated image to obtain the n-th detection intermediate input image.
  5. The medical image processing method according to claim 1, wherein the medical image is a computed tomography angiography image, the first feature is an artery feature, and the second feature is at least one of an aneurysm feature, an arterial wall calcification feature, and an arterial occlusion feature.
  6. The medical image processing method according to claim 1, wherein detecting the second feature included in the first feature comprises: outputting a detection result of the second feature, the detection result comprising: prediction box parameters of the second feature and a predicted probability that the prediction box contains the second feature; the method further comprising: displaying candidate boxes on an image containing the first feature, the candidate boxes comprising the prediction boxes of the second feature detected by the detection neural network.
  7. The medical image processing method according to claim 2, further comprising:
    training the segmentation neural network and the encoding neural network according to a Dice loss function and a cross-entropy loss function, wherein the segmentation neural network and the encoding neural network are three-dimensional convolutional neural networks.
  8. The medical image processing method according to claim 3, further comprising:
    training the detection neural network and the encoding neural network according to a classification loss function and a regression loss function, wherein the detection neural network is a three-dimensional convolutional neural network.
  9. The medical image processing method according to claim 8, wherein training the segmentation neural network and the encoding neural network according to the Dice loss function and the cross-entropy loss function comprises:
    calculating a Dice loss value according to the Dice loss function based on real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network;
    calculating a cross-entropy loss value according to the cross-entropy loss function based on the real segmentation labels and the segmentation labels of the first feature output by the segmentation neural network;
    performing training based on the Dice loss value and the cross-entropy loss value according to a preset threshold, wherein
    the Dice loss function $\mathcal{L}_{dice}$ and the cross-entropy loss function $\mathcal{L}_{bce}$ are respectively expressed as:
    $\mathcal{L}_{dice}=1-\frac{2\sum_{i=1}^{V}s_i q_i}{\sum_{i=1}^{V}s_i+\sum_{i=1}^{V}q_i}$
    $\mathcal{L}_{bce}=-\frac{1}{V}\sum_{i=1}^{V}\left[s_i\log q_i+(1-s_i)\log(1-q_i)\right]$
    wherein $s_i$ denotes the real segmentation label of the i-th pixel in the medical image, $q_i$ denotes the predicted segmentation label of the i-th pixel output by the segmentation neural network, and V denotes the total number of pixels included in the medical image.
  10. The medical image processing method according to claim 9, wherein performing training based on the Dice loss value and the cross-entropy loss value according to a preset threshold comprises:
    training based on the Dice loss value when the cross-entropy loss value is less than the preset threshold;
    training based on the cross-entropy loss value when the cross-entropy loss value is not less than the preset threshold.
  11. The medical image processing method according to claim 8, wherein the segmentation neural network comprises M processing layers, and training the segmentation neural network and the encoding neural network according to the Dice loss function and the cross-entropy loss function further comprises:
    calculating an intermediate Dice loss value according to the Dice loss function based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network;
    calculating an intermediate cross-entropy loss value according to the cross-entropy loss function based on the real segmentation labels and the segmentation labels of the first feature output by the m-th processing layer of the segmentation neural network;
    performing training based on the intermediate Dice loss value and the intermediate cross-entropy loss value according to the preset threshold, wherein m and M are positive integers, and m is greater than 1 and less than M.
  12. The medical image processing method according to claim 8, wherein training the detection neural network and the encoding neural network according to the classification loss function and the regression loss function comprises:
    processing training samples with the encoding neural network, the segmentation neural network, and the detection neural network to obtain detection results, the prediction box parameters comprising the position coordinates of the center point and the size of the prediction box;
    calculating a classification loss value according to the classification loss function based on the predicted probability, and calculating a regression loss value according to the regression loss function based on the prediction box parameters and the real box parameters of the second feature;
    performing training based on the classification loss value and the regression loss value.
  13. The medical image processing method according to claim 12, wherein training the detection neural network and the encoding neural network according to the classification loss function and the regression loss function further comprises:
    sampling in the medical image to obtain at least one training sample;
    calculating the area ratio between the bounding box of the at least one training sample and the bounding box of the second feature;
    determining training samples whose area ratio is greater than a first threshold as positive training samples, and determining training samples whose area ratio is less than a second threshold as negative training samples,
    wherein the positive training samples are used for training the classification loss and the regression loss, and the negative training samples are used for training the classification loss.
  14. An artificial intelligence-based medical device, comprising:
    an image acquisition apparatus configured to acquire a medical image;
    a processor; and
    a memory, wherein the memory stores computer-readable code that, when executed by the processor, performs the artificial intelligence-based medical image processing method according to any one of claims 1-13.
  15. A computer-readable storage medium storing instructions that, when executed by a processor, cause the processor to perform the artificial intelligence-based medical image processing method according to any one of claims 1-13.
PCT/CN2020/105461 2019-08-15 2020-07-29 Artificial intelligence-based medical image processing method, medical device and storage medium WO2021027571A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/503,160 US11941807B2 (en) 2019-08-15 2021-10-15 Artificial intelligence-based medical image processing method and medical device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910752632.6 2019-08-15
CN201910752632.6A CN110458833B (zh) 2019-08-15 2023-07-11 Artificial intelligence-based medical image processing method, medical device and storage medium

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/503,160 Continuation US11941807B2 (en) 2019-08-15 2021-10-15 Artificial intelligence-based medical image processing method and medical device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021027571A1 true WO2021027571A1 (zh) 2021-02-18

Family

ID=68486812

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/105461 WO2021027571A1 (zh) 2019-08-15 2020-07-29 基于人工智能的医学图像处理方法、医学设备和存储介质

Country Status (3)

Country Link
US (1) US11941807B2 (zh)
CN (1) CN110458833B (zh)
WO (1) WO2021027571A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906685A (zh) * 2021-03-04 2021-06-04 Chongqing Saidi Qizhi Artificial Intelligence Technology Co., Ltd. Target detection method and apparatus, electronic device, and storage medium
CN113159147A (zh) * 2021-04-08 2021-07-23 Ping An Technology (Shenzhen) Co., Ltd. Neural network-based image recognition method and apparatus, and electronic device
CN116721159A (zh) * 2023-08-04 2023-09-08 Beijing Academy of Artificial Intelligence Method for predicting center-point coordinates of the carotid artery in ultrasound and method for tracking carotid cross-sections
CN118072976A (zh) * 2024-04-22 2024-05-24 Jilin University System and method for predicting pediatric respiratory diseases based on data analysis

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110458833B (zh) * 2019-08-15 2023-07-11 Tencent Technology (Shenzhen) Co., Ltd. Artificial intelligence-based medical image processing method, medical device and storage medium
CN111222466B (zh) * 2020-01-08 2022-04-01 Wuhan University Automatic landslide detection method for remote sensing images based on a 3D space-channel attention mechanism
CN111724365B (zh) * 2020-06-16 2021-11-09 Institute of Automation, Chinese Academy of Sciences Method, system and apparatus for detecting interventional instruments in endovascular aneurysm repair surgery
CN112102204B (zh) * 2020-09-27 2022-07-08 Suzhou Keda Technology Co., Ltd. Image enhancement method and apparatus, and electronic device
US20220351366A1 (en) * 2021-05-03 2022-11-03 Rakuten Group, Inc. Deep learning model to predict data from an image
CN112966792B (zh) * 2021-05-19 2021-08-13 Tencent Technology (Shenzhen) Co., Ltd. Blood vessel image classification processing method, apparatus, device and storage medium
CN113888475A (zh) * 2021-09-10 2022-01-04 Shanghai SenseTime Intelligent Technology Co., Ltd. Image detection method, training method for related models, and related apparatus and device
US20230177747A1 (en) * 2021-12-06 2023-06-08 GE Precision Healthcare LLC Machine learning generation of low-noise and high structural conspicuity images
CN116823833B (zh) * 2023-08-30 2023-11-10 Shandong University of Science and Technology Method, system and device for intracranial aneurysm detection in omnidirectional MIP images
CN117911418B (zh) * 2024-03-20 2024-06-21 Changshu Institute of Technology Lesion detection method, system and storage medium based on an improved YOLO algorithm

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9519868B2 (en) * 2012-06-21 2016-12-13 Microsoft Technology Licensing, Llc Semi-supervised random decision forests for machine learning using mahalanobis distance to identify geodesic paths
KR102592076B1 (ko) * 2015-12-14 2023-10-19 Samsung Electronics Co., Ltd. Deep learning-based image processing apparatus and method, and learning apparatus
US9589374B1 (en) * 2016-08-01 2017-03-07 12 Sigma Technologies Computer-aided diagnosis system for medical images using deep convolutional neural networks
US10600185B2 (en) * 2017-03-08 2020-03-24 Siemens Healthcare Gmbh Automatic liver segmentation using adversarial image-to-image network
CN110838124B (zh) * 2017-09-12 2021-06-18 Shenzhen Keya Medical Technology Co., Ltd. Method, system and medium for segmenting images of sparsely distributed objects
US11918333B2 (en) * 2017-12-29 2024-03-05 Analytics For Life Inc. Method and system to assess disease using phase space tomography and machine learning
US10902585B2 (en) * 2018-03-19 2021-01-26 General Electric Company System and method for automated angiography utilizing a neural network
CN109598728B (zh) * 2018-11-30 2019-12-27 Tencent Technology (Shenzhen) Co., Ltd. Image segmentation method, apparatus, diagnostic system and storage medium
CN109598722B (zh) * 2018-12-10 2020-12-08 Hangzhou Dishi Technology Co., Ltd. Image analysis method based on recurrent neural networks
CN109685819B (zh) * 2018-12-11 2021-02-26 Xiamen University 3D medical image segmentation method based on feature enhancement
CN109815850B (zh) * 2019-01-02 2020-11-10 Institute of Automation, Chinese Academy of Sciences Deep learning-based iris image segmentation and localization method, system and apparatus
CN109872306B (zh) * 2019-01-28 2021-01-08 Tencent Technology (Shenzhen) Co., Ltd. Medical image segmentation method, apparatus and storage medium
CN109886282B (zh) * 2019-02-26 2021-05-28 Tencent Technology (Shenzhen) Co., Ltd. Object detection method and apparatus, computer-readable storage medium and computer device
CN110522465A (zh) * 2019-07-22 2019-12-03 GE Precision Healthcare LLC Hemodynamic parameter estimation based on image data

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180204088A1 (en) * 2017-01-13 2018-07-19 Beihang University Method for salient object segmentation of image by aggregating multi-linear exemplar regressors
CN109685813A (zh) * 2018-12-27 2019-04-26 Jiangxi University of Science and Technology U-shaped retinal vessel segmentation method with adaptive scale information
CN109993726A (zh) * 2019-02-21 2019-07-09 Shanghai United Imaging Intelligence Co., Ltd. Medical image detection method, apparatus, device and storage medium
CN109978037A (zh) * 2019-03-18 2019-07-05 Tencent Technology (Shenzhen) Co., Ltd. Image processing method, model training method, apparatus, and storage medium
CN110458833A (zh) * 2019-08-15 2019-11-15 Tencent Technology (Shenzhen) Co., Ltd. Artificial intelligence-based medical image processing method, medical device and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU MING; YE HONGWEI: "Study on the Segmentation Method of Liver and Liver Tumor Based on CT Image", PROCEEDINGS OF CHINA MEDICAL EQUIPMENT CONFERENCE AND 2019 MEDICAL EQUIPMENT EXHIBITION; 2019-07-18, 18 July 2019 (2019-07-18), pages 7 - 11, XP009526033 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906685A (zh) * 2021-03-04 2021-06-04 Chongqing Saidi Qizhi Artificial Intelligence Technology Co., Ltd. Target detection method and apparatus, electronic device, and storage medium
CN112906685B (zh) * 2021-03-04 2024-03-26 Chongqing Saidi Qizhi Artificial Intelligence Technology Co., Ltd. Target detection method and apparatus, electronic device, and storage medium
CN113159147A (zh) * 2021-04-08 2021-07-23 Ping An Technology (Shenzhen) Co., Ltd. Neural network-based image recognition method and apparatus, and electronic device
CN113159147B (zh) * 2021-04-08 2023-09-26 Ping An Technology (Shenzhen) Co., Ltd. Neural network-based image recognition method and apparatus, and electronic device
CN116721159A (zh) * 2023-08-04 2023-09-08 Beijing Academy of Artificial Intelligence Method for predicting center-point coordinates of the carotid artery in ultrasound and method for tracking carotid cross-sections
CN116721159B (zh) * 2023-08-04 2023-11-03 Beijing Academy of Artificial Intelligence Method for predicting center-point coordinates of the carotid artery in ultrasound and method for tracking carotid cross-sections
CN118072976A (zh) * 2024-04-22 2024-05-24 Jilin University System and method for predicting pediatric respiratory diseases based on data analysis

Also Published As

Publication number Publication date
CN110458833B (zh) 2023-07-11
CN110458833A (zh) 2019-11-15
US20220036550A1 (en) 2022-02-03
US11941807B2 (en) 2024-03-26

Similar Documents

Publication Publication Date Title
WO2021027571A1 (zh) Artificial intelligence-based medical image processing method, medical device and storage medium
US10706333B2 (en) Medical image analysis method, medical image analysis system and storage medium
US10600185B2 (en) Automatic liver segmentation using adversarial image-to-image network
CN111429421B (zh) Model generation method, medical image segmentation method, apparatus, device, and medium
CN109754403A (zh) Automatic tumor segmentation method and system in CT images
CN110475505A (zh) Automatic segmentation using fully convolutional networks
CN109087306A (zh) Arterial vessel image model training method, segmentation method, apparatus, and electronic device
US10997724B2 (en) System and method for image segmentation using a joint deep learning model
JP7250166B2 (ja) Image segmentation method and apparatus, and training method and apparatus for image segmentation model
CN111899244B (zh) Image segmentation and network model training method and apparatus, and electronic device
Yang et al. A deep learning segmentation approach in free‐breathing real‐time cardiac magnetic resonance imaging
CN114897780A (zh) Superior mesenteric artery vessel reconstruction method based on MIP sequences
Cheng et al. DDU-Net: A dual dense U-structure network for medical image segmentation
CN112669247A (zh) Prior-guided network for multi-task medical image synthesis
Sirjani et al. Automatic cardiac evaluations using a deep video object segmentation network
CN117152442A (zh) Automatic delineation method and apparatus for image target regions, electronic device, and readable storage medium
WO2024051018A1 (zh) PET parametric image enhancement method, apparatus, device, and storage medium
Wu et al. Pneumothorax segmentation in routine computed tomography based on deep neural networks
US20240095885A1 (en) Performing denoising on an image
EP4009268A1 (en) Performing denoising on an image
CN115375706A (zh) Image segmentation model training method, apparatus, device, and storage medium
Al-antari et al. Deep learning myocardial infarction segmentation framework from cardiac magnetic resonance images
Lang et al. LCCF-Net: Lightweight contextual and channel fusion network for medical image segmentation
CN112837318A (zh) Generation method and synthesis method for ultrasound image generation models, medium, and terminal
Hua et al. Dual attention based multi-scale feature fusion network for indoor RGBD semantic segmentation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20852707

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20852707

Country of ref document: EP

Kind code of ref document: A1