WO2024041058A1

WO2024041058A1 - Follow-up case data processing method and apparatus, device, and storage medium

Info

Publication number: WO2024041058A1
Application number: PCT/CN2023/095932
Authority: WO
Inventors: 唐雯; 王大为; 王少康; 陈宽
Original assignee: 推想医疗科技股份有限公司
Priority date: 2022-08-25
Filing date: 2023-05-24
Publication date: 2024-02-29
Also published as: CN115359010A

Abstract

The present disclosure provides a follow-up case data processing method and apparatus, a device, and a storage medium. The processing method comprises: according to position information and size information of a target lesion in a first medical image, determining a first detection result of the target lesion in a first space; according to a registration transformation matrix between the first medical image and a second medical image, performing transformation processing on the first detection result to obtain an initial lesion matching result of the first detection result in a second space; and inputting the first medical image, the second medical image, the first detection result, and the initial lesion matching result into a feature extraction model, and outputting a lesion feature extraction result of the second medical image. In this way, according to the present disclosure, the model can effectively use anatomical structure information of the lesion on the basis of medical image information, so that the feature extraction accuracy of the model for the same lesion in different medical images and the accuracy of matching and positioning of the same lesion in follow-up case data are improved.

Description

A method, device, equipment and storage medium for processing follow-up case data

Cross-references to related applications

This disclosure requests the priority of the Chinese patent application with application number 202211026079.6 and titled "A processing method, device, equipment and storage medium for follow-up case data" submitted to the China Patent Office on August 25, 2022, and the entire content thereof incorporated by reference into this disclosure.

Technical field

The present disclosure relates to the field of image processing technology, and specifically, to a processing method, device, equipment and storage medium for follow-up case data.

Background technique

Lesions refer to the diseased parts of the body, which are more common in acute lung infections such as COVID-19. If a certain part of the lung is damaged by tuberculosis bacteria, the damaged part is a tuberculosis focus. In the medical field, lesions are often treated using follow-up tracking methods. Doctors can grasp the changes in lesions over time based on the medical image data continuously collected for the same patient at different times (i.e., the patient’s follow-up case data); among them, When analyzing patient follow-up case data, since medical images at different times contain different spatial information, it is often necessary to match and locate the position of the same target lesion in different medical images.

Currently, the matching and positioning of target lesions in follow-up cases is mainly achieved by comparing the similarity between lesions in different medical images. When the first lesion in the first medical image and the second lesion in the second medical image When the similarity between them is higher than the preset threshold, it is determined that the first lesion and the second lesion belong to the same lesion, thereby achieving matching and positioning of the same target lesion in the follow-up case data. However, in this way, when patients undergo treatment such as surgery, the characteristics of the lesions will change drastically in the follow-up case data, resulting in obvious differences between the same lesions in different medical images, making The accuracy of the lesion matching results based on similarity comparison is greatly reduced.

Contents of the invention

In view of this, the purpose of the present disclosure is to provide a processing method, device, equipment and storage medium for follow-up case data, so that the model can effectively combine the anatomical structure information of the lesion on the basis of medical image information, and improve the model's accuracy in different situations. The accuracy of feature extraction of the same lesion in medical images is helpful to improve the accuracy of matching and positioning of the same lesion in follow-up case data.

In order to make the above-mentioned objects, features and advantages of the present disclosure more obvious and understandable, preferred embodiments are given below and described in detail with reference to the accompanying drawings.

In a first aspect, embodiments of the present disclosure provide a method for processing follow-up case data, where the follow-up case data includes at least a first medical image and a second medical image; the first medical image and the second medical image They are medical images collected for the same object at different times; the processing method includes:

According to the position information and size information of the target lesion in the first medical image, the first detection result of the target lesion in the first space is determined; wherein the first space represents the location of the first medical image. coordinate space;

According to the registration transformation matrix between the first medical image and the second medical image, the first detection result is transformed and the initial lesion matching result of the first detection result in the second space is obtained. ; Wherein, the second space represents the coordinate space in which the second medical image is located;

The first medical image, the second medical image, the first detection result and the initial lesion matching result are input into the feature extraction model, and the lesion feature extraction result of the second medical image is output; wherein, The lesion feature extraction result is at least used for a lesion segmentation task for the second medical image.

In a second aspect, embodiments of the present disclosure provide a processing device for follow-up case data, where the follow-up case data includes at least a first medical image and a second medical image; the first medical image and the second medical image They are medical images collected for the same object at different times; the processing device includes:

Determining module, configured to determine the first detection result of the target lesion in the first space according to the position information and size information of the target lesion in the first medical image; wherein the first space represents the first detection result of the target lesion in the first medical image. 1. The coordinate space in which the medical image is located;

a registration module, configured to perform transformation processing on the first detection result according to the registration transformation matrix between the first medical image and the second medical image, and obtain the first detection result in the second space The initial lesion matching result under; wherein, the second space represents the coordinate space in which the second medical image is located;

A processing module configured to input the first medical image, the second medical image, the first detection result and the initial lesion matching result into a feature extraction model, and output the lesion features of the second medical image. Extraction results; wherein the lesion feature extraction results are used at least for a lesion segmentation task for the second medical image.

In a third aspect, embodiments of the present disclosure provide a computer device, including a memory, a processor, and a computer program stored on the memory and executable on the processor. When the processor executes the computer program Steps to implement the above-mentioned processing method of follow-up case data.

In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium. A computer program is stored on the computer-readable storage medium. The computer program executes the above-mentioned processing method of follow-up case data when run by a processor. step.

The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:

Embodiments of the present disclosure provide a method, device, equipment and storage medium for processing follow-up case data, which determines the first detection of the target lesion in the first space based on the position information and size information of the target lesion in the first medical image. Result: According to the registration transformation matrix between the first medical image and the second medical image, the first detection result is transformed to obtain the initial lesion matching result of the first detection result in the second space; the first medical image is , the second medical image, the first detection result and the initial lesion matching result are input into the feature extraction model, and the lesion feature extraction result of the second medical image is output. In this way, the present disclosure enables the model to effectively combine the anatomical structure information of the lesion on the basis of medical image information, improves the model's feature extraction accuracy for the same lesion in different medical images, and improves the model's matching and positioning of the same lesion in follow-up case data. accuracy.

Description of drawings

In order to explain the technical solutions of the embodiments of the present disclosure more clearly, the drawings needed to be used in the embodiments will be briefly introduced below. It should be understood that the following drawings only show some embodiments of the present disclosure and therefore do not It should be regarded as a limitation of the scope. For those of ordinary skill in the art, other relevant drawings can be obtained based on these drawings without exerting creative efforts.

Figure 1 shows a schematic flow chart of a method for processing follow-up case data provided by an embodiment of the present disclosure;

Figure 2 shows a schematic model structure diagram of a feature extraction model provided by an embodiment of the present disclosure;

Figure 3 shows a schematic flowchart of a method for obtaining the first enhanced feature corresponding to the first input data within the first feature extraction window provided by an embodiment of the present disclosure;

Figure 4 shows a schematic flowchart of a method for obtaining second enhanced features corresponding to second input data within the first feature extraction window provided by an embodiment of the present disclosure;

Figure 5 shows a schematic flowchart of a method for obtaining the third enhanced feature corresponding to the third input data within the first feature extraction window provided by an embodiment of the present disclosure;

Figure 6 shows a schematic flowchart of a method for obtaining the fourth enhanced feature corresponding to the fourth input data within the first feature extraction window provided by an embodiment of the present disclosure;

Figure 7 shows a schematic flowchart of a first method for using lesion feature extraction results of a second medical image provided by an embodiment of the present disclosure;

Figure 8 shows a schematic flowchart of the second method of using the lesion feature extraction results of the second medical image provided by the embodiment of the present disclosure;

Figure 9 shows a schematic structural diagram of a follow-up case data processing device provided by an embodiment of the present disclosure;

FIG. 10 is a schematic structural diagram of a computer device 1000 provided by an embodiment of the present disclosure.

Detailed ways

In order to make the purpose, technical solutions and advantages of the embodiments of the present disclosure clearer, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present disclosure. It should be understood that the technical solutions attached in the embodiments of the present disclosure are The drawings are for illustration and description purposes only and are not intended to limit the scope of the present disclosure. Additionally, it should be understood that the schematic drawings are not drawn to scale. The flowcharts used in this disclosure illustrate operations implemented in accordance with some embodiments of the disclosure. It should be understood that the operations of the flowchart may be implemented out of sequence, and steps without logical context may be implemented in reverse order or simultaneously. In addition, those skilled in the art can add one or more other operations to the flowchart, and can also remove one or more operations from the flowchart under the guidance of this disclosure.

In addition, the described embodiments are only some, not all, of the embodiments of the present disclosure. The components of the embodiments of the present disclosure generally described and illustrated in the figures herein may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the disclosure provided in the appended drawings is not intended to limit the scope of the claimed disclosure, but rather to represent selected embodiments of the disclosure. Based on the embodiments of the present disclosure, all other information obtained by those skilled in the art without any creative efforts Other embodiments all fall within the protection scope of this disclosure.

It should be noted that the term "comprising" will be used in the embodiments of the present disclosure to indicate the existence of the features stated subsequently, but does not exclude the addition of other features.

Based on this, embodiments of the present disclosure provide a method, device, equipment and storage medium for processing follow-up case data, which determines the location of the target lesion in the first space based on the position information and size information of the target lesion in the first medical image. The first detection result; according to the registration transformation matrix between the first medical image and the second medical image, perform transformation processing on the first detection result to obtain the initial lesion matching result of the first detection result in the second space; convert the first detection result to the second medical image. The first medical image, the second medical image, the first detection result and the initial lesion matching result are input into the feature extraction model, and the lesion feature extraction result of the second medical image is output. In this way, the present disclosure enables the model to effectively combine the anatomical structure information of the lesion on the basis of medical image information, improves the model's feature extraction accuracy for the same lesion in different medical images, and improves the model's matching and positioning of the same lesion in follow-up case data. accuracy.

It should be noted that the method for processing follow-up case data provided by the embodiments of the present disclosure is applicable to a processing device for follow-up case data, and the processing device can be integrated in a computer device.

Specifically, the above-mentioned computer equipment can be a terminal device, such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, etc.; the above-mentioned computer equipment can also be a server, and the server can be an independent physical server, or it can be composed of multiple physical servers. Server clusters or distributed systems can also provide cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, CDN (Content Delivery Network, content Distribution network), and cloud servers for basic cloud computing services such as big data and artificial intelligence platforms, but are not limited to this.

In order to facilitate understanding of the embodiments of the present disclosure, a detailed introduction is given below to a method, device, equipment and storage medium for processing follow-up case data provided by the embodiments of the present disclosure.

Referring to Figure 1, Figure 1 shows a schematic flow chart of a method for processing follow-up case data provided by an embodiment of the present disclosure. The processing method includes steps S101-S103; specifically:

S101. Determine the first detection result of the target lesion in the first space based on the position information and size information of the target lesion in the first medical image.

Here, the follow-up case data includes at least a first medical image and a second medical image; wherein the first medical image and the second medical image are medical images collected for the same object at different times. That is, the follow-up case data is used to represent multiple medical images continuously collected for the target lesions of the same patient at different times. The embodiment of the present disclosure does not make any limit on the specific number of medical images included in the follow-up case data.

It should be noted that different medical images in the follow-up case data can be of the same type of medical images. For example, the medical images in the follow-up case data can be CT (Computed Tomography, CT) collected from the target lesions of the same patient at different times. Computerized tomography (computed tomography) images, the embodiments of the present disclosure do not make any restrictions on the specific image types to which the medical images in the follow-up case data belong.

Here, two different medical images can be arbitrarily selected from the follow-up case data as the above-mentioned first medical image and the second medical image, wherein the position information and size information of the target lesion in each medical image (that is, the location of the target lesion in the medical image) Information such as shape and size in the image) is known information, but there are differences in spatial information based on different medical images. Therefore, in the embodiment of the present disclosure, it is necessary to place the same lesion (i.e., the above-mentioned target lesion) in different spaces (i.e., Image information under different medical images) is registered into the same reference space, thereby obtaining anatomical structure information about the target lesion (that is, structural information such as the shape and size of the target lesion itself).

Specifically, the above-mentioned first space represents the coordinate space in which the first medical image is located. When performing step S101, based on the position information and size information of the target lesion in the first medical image, a coordinate space that can represent the position of the target lesion in the first space can be generated. The first spatial matrix of position and size information is used as the above-mentioned first detection result.

Exemplary explanation, based on the fact that medical images usually exist in the form of three-dimensional image data, as an optional embodiment, the above-mentioned first spatial matrix (that is, the first detection result) can be a 3D (three-dimensional, three-dimensional) Gaussian matrix, The spherical center of the 3D Gaussian matrix is the center point of the target lesion in the first medical image, and the radius of the 3D Gaussian matrix is determined according to the size of the target lesion in the first medical image.

It should be noted that, in addition to the above-mentioned 3D Gaussian matrix, the first detection result can also be other types of spatial matrices, as long as the first detection result can represent the position and size information of the target lesion in the first space, The embodiment of the present disclosure does not impose any limitation on the specific existence form of the above-mentioned first detection result.

S102, perform transformation processing on the first detection result according to the registration transformation matrix between the first medical image and the second medical image, and obtain the initial lesion of the first detection result in the second space. Matching results.

Here, the second space represents the coordinate space in which the second medical image is located.

Specifically, before performing step S102, using the second medical image as the reference image and the first medical image as the floating image, through a rigid registration method, it is possible to determine a method that can transform the first medical image (ie, the floating image) into A graphics transformation matrix in the same coordinate space as the second medical image (ie, the reference image); wherein the determined graphics transformation matrix is the above-mentioned registration transformation matrix.

It should be noted that in the embodiments of the present disclosure, the above-mentioned registration transformation matrix is not uniquely obtained. For example, in addition to the above-mentioned rigid registration method, the above-mentioned registration transformation can also be obtained through non-rigid registration. Matrix, among which, the difference between rigid registration and non-rigid registration is: rigid registration usually takes the entire image as the registration operation object, and the images can generally be aligned through operations such as translation, rotation, and scaling of the entire image; rather than The rigid matching principle is to perform separate transformation processing on each local area (or even each pixel) in the image. Based on this, whether through rigid registration or non-rigid registration, the registration transformation matrix between the first medical image and the second medical image can be obtained. Regarding the specific acquisition method of the above-mentioned registration transformation matrix, embodiments of the present disclosure Without any qualification.

Specifically, based on the registration transformation matrix, the image information in the first space can be registered and transformed into the second space. Therefore, when performing step S102, based on the above registration transformation matrix, the first detection result (representation The anatomical structure information such as the position and size of the target lesion in the first space is registered and transformed to the second space, and the initial lesion matching result of the first detection result in the second space is obtained.

S103: Input the first medical image, the second medical image, the first detection result and the initial lesion matching result into a feature extraction model, and output the lesion feature extraction result of the second medical image.

Here, in the feature extraction model, the feature extraction model is used to perform the same feature extraction operation on each type of input data to obtain the feature extraction results corresponding to each type of input data. That is, the feature extraction model can output the first medical image. The image feature vector, the image feature vector of the second medical image, the feature vector of the first detection result, and the feature vector of the initial lesion matching result.

It should be noted that in the embodiment of the present disclosure, the obtained initial lesion matching result is equivalent to the anatomical structure information (such as position information, size information, etc.) of the target lesion in the second space. At this time, the initial lesion matching result is equal to The actual detection results of the target lesion in the second medical image (such as the real position and real size of the target lesion in the second medical image) are irrelevant.

Based on this, when the feature extraction model performs feature extraction on the input image information (ie, the first medical image, the second medical image), the first detection result and the initial lesion matching result are used to provide the feature extraction model with information about the target lesion itself. Anatomical structure information. In this way, the embodiments of the present disclosure enable the feature extraction model to effectively combine the anatomical structure information of the target lesion on the basis of medical image information, improve the feature extraction accuracy of the feature extraction model for the same lesion in different medical images, and help improve The accuracy of matching and positioning of the same lesion in follow-up case data.

Specifically, the above-mentioned lesion feature extraction result is the image feature vector of the second medical image output by the feature extraction model, wherein the lesion feature extraction result is at least used for the lesion segmentation task of the second medical image. At this time, the lesion feature extraction result It can replace the second medical image as the model input data of the lesion segmentation model, and obtain the second medical image through the output of the lesion segmentation model. The lesion segmentation prediction result of the medical image is based on the degree of matching (that is, the similarity) between the actual detection result of the target lesion in the second medical image (i.e., the image area where the target lesion is located in the second medical image) and the lesion segmentation prediction result. degree), the feature extraction ability of the feature extraction model and the lesion segmentation ability of the lesion segmentation model can be evaluated.

It should be noted that, in addition to the lesion segmentation task, the above-mentioned lesion feature extraction results can also be used for the lesion detection task for the second medical image (that is, classifying the pixels belonging to the lesion area in the second medical image + for the second (locating the location of the lesion in the medical image), at this time, it is only necessary to adaptively replace the above-mentioned lesion segmentation model with the lesion detection model. Based on this, the embodiments of the present disclosure do not impose any limitations on the specific use of the lesion feature extraction results obtained by the above output.

The specific implementation process of each of the above steps in the embodiment of the present disclosure will be described in detail below:

In the embodiment of the present disclosure, in addition to the existing commonly used feature extraction models (such as deep learning models based on Unet network structure or convolutional neural network), in a preferred implementation, by Some Unet network structures have undergone overall structural improvements. For the feature extraction model in the above step S103, the embodiment of the present disclosure also provides a brand new model structure as shown in Figure 2, specifically:

Referring to Figure 2, Figure 2 shows a schematic model structure diagram of a feature extraction model provided by an embodiment of the present disclosure. The feature extraction model adopts the Unet network structure with the swin-transformer module as the core, as shown in Figure 2 As shown, the Unet network structure includes multiple sets of symmetrical encoders 201 and decoders 202. The encoder 201 is used to downsample the input image data, and the decoder 202 is used to upsample the input image data. Each set of symmetrical encoders 201 and decoders 202 corresponds to a processing scale of image data. Different encoders 201 correspond to different processing scales, and different decoders 202 correspond to different processing scales.

Specifically, regarding the encoder part in the feature extraction model, as shown in Figure 2, the encoder of each layer includes at least one swin-transformer module with four inputs and four outputs, where each swin-transformer module has a function for each The input data of each channel performs the same operation, that is, when there are n swin-transformer modules in each layer of encoder, then in this layer of encoder, n repeated swin-transformer is performed on the input data of each channel. Just operate. The embodiment of the present disclosure does not make any limit on the specific number of swin-transformer modules included in each layer of encoder.

Here, regarding the existing swin-transformer module, it should be noted that the existing swin-transformer module usually only has one input data. The main operation of the swin-transformer module is: first split the input data into multiple sub-data, where, Each sub-data corresponds to a data processing window. Therefore, by performing the same data processing operation on each sub-data in each data processing window, the window characteristics corresponding to each sub-data are obtained; then the window characteristics corresponding to each sub-data are spliced. After processing, the data characteristics corresponding to the complete input data can be obtained.

Based on this, in the embodiment of the present disclosure, the swin-transformer module also contains multiple feature extraction windows. At this time, what is different from the existing swin-transformer module is that in the embodiment of the present disclosure, based on the swin-transformer module There are 4 data transmission channels. Therefore, within each feature extraction window, the swin-transformer module in the embodiment of the present disclosure performs data processing on the respective sub-data of the 4 input data, thereby obtaining the input data of each channel. Corresponding data characteristics.

Specifically, in the embodiment of the present disclosure, each swin-transformer module performs the same operation steps. Taking a swin-transformer module as an example, the swin-transformer module described in the encoder part is specifically used to perform the following steps a1-step a3:

Step a1: Receive the output data of each channel in the upper layer encoder as the input data of the same channel in the current layer encoder.

For example, for the first layer encoder, the input data of the four channels in the first layer encoder are the first medical image, the second medical image, the first detection result and the initial input in step S103. Lesion matching results; for the second-layer encoder, the input data of the 4 channels in the second-layer encoder are the first medical image, the second medical image, the first detection result and the initial The lesion matching results are the data processing results output after data processing respectively.

Step a2: Divide the input data of each channel into multiple sub-data, and perform the same feature extraction and feature enhancement processing on the sub-data of different input data in each feature extraction window to obtain the result of each sub-data. Corresponding enhanced features within each feature extraction window.

It should be noted that in step a2, the same as the existing swin-transformer module, in each feature extraction window, a feature extraction operation can be performed for each sub-data input, and each sub-data is obtained in the feature extraction inside window An initial data feature; on this basis, the swin-transformer module in the embodiment of the present disclosure will also perform a feature enhancement process on the initial data feature of each sub-data in each feature extraction window, so as to obtain each sub-data The corresponding enhanced features of the data within the feature extraction window thus enhance the feature extraction capability of the data within each feature extraction window.

Step a3: perform splicing processing on the enhanced features of the sub-data belonging to the same channel, and output the result of the splicing processing to the corresponding channel in the lower-layer encoder.

For example, taking the first channel corresponding to the first medical image as an example, the first medical image is input to the first-layer encoder through the first channel. In the first-layer encoder, when the swin-transformer module When 5 feature extraction windows are included, the swin-transformer module can divide the first medical image into 5 sub-data, and through the same feature extraction and feature enhancement processing operations in each feature extraction window, obtain each sub-data in For the corresponding enhanced features in each feature extraction window, the enhanced features corresponding to the five sub-data are spliced to obtain the enhanced features of the first medical image, and the obtained enhanced features are input to the second layer of coding through the first channel. In the encoder, the same operations as the above steps a1 to a3 are performed in the second layer encoder.

Specifically, regarding the decoder part in the feature extraction model, each set of symmetrical encoders 201 and decoders 202 corresponds to a processing scale of image data. Based on this, each decoder 202 corresponds to a scale feature of image processing, In the same channel of each decoder 202, the decoder 202 first splices its corresponding scale features and the data features output by the previous stage decoder in the same channel in the feature dimension to obtain the same channel of the current layer decoder. Corresponding input data; then, repeat the reverse operation of the swin-transformer module of the encoder part above for the input data corresponding to each channel. In the last layer of encoder 202, the corresponding input data of the second medical image can be obtained from The feature extraction result of the second medical image is output in the second channel as the lesion feature extraction result in step S103.

In addition to the above improvements to the overall model structure of the feature extraction model, for the specific feature extraction and feature enhancement operations performed within each feature extraction window, the first feature extraction in the swin-transformer module in the first layer encoder is Taking the window as an example, the first sub-data segmented from the first medical image is used as the first input data of the first feature extraction window, and the first sub-data segmented from the second medical image is used as the first input data of the first feature extraction window. The second input data, the first sub-data segmented from the first detection result are used as the third input data of the first feature extraction window, and the first sub-data segmented from the initial lesion matching result are used as the first feature extraction window. For the fourth input data, the following is a detailed description of how to output the corresponding enhanced features of different sub-data input from different channels within the first feature extraction window:

For the above-mentioned first input data (that is, the first sub-data segmented from the first medical image), in an optional implementation, refer to FIG. 3. FIG. 3 shows a diagram provided by an embodiment of the present disclosure. A flowchart of a method for obtaining the first enhanced feature corresponding to the first input data in the first feature extraction window. The method includes steps S301-S306; specifically:

S301. In the first feature extraction window, perform feature extraction on the first input data, the second input data, the third input data and the fourth input data respectively to obtain the first A first data characteristic of the input data, a second data characteristic of the second input data, a third data characteristic of the third input data, and a fourth data characteristic of the fourth input data.

For example, taking the first input data as I1, the second input data as I2, the third input data as G1, and the fourth input data as G2, then in the first feature extraction window, the first input data are respectively The data I1, the second input data I2, the third input data G1 and the fourth input data G2 perform feature extraction to obtain the first data feature i1 of the first input data I1, the first data feature i2 of the second input data I2, and the first data feature i2 of the second input data I2. The third data feature g1 of the third input data G1 and the fourth data feature g2 of the fourth input data G2.

S302: Calculate the first self-attention feature of the first data feature under the self-attention mechanism using the Q feature matrix and K feature matrix of the first data feature under the attention mechanism.

Here, as an optional embodiment, the first data feature can be input into the first neural network, and the Q feature matrix, K feature matrix and V feature matrix of the first data feature under the attention mechanism can be obtained through the first neural network output.

Specifically, when performing step S302, the first self-attention feature A11 of the first data feature i1 under the self-attention mechanism (that is, only focusing on its own data features) can be calculated by the following formula:
A11＝Q1 ^T ×K1;

Among them, Q1 represents the Q feature matrix of the first data feature i1 under the attention mechanism;

K1 represents the K feature matrix of the first data feature i1 under the attention mechanism;

Q1 ^T represents the transposed matrix of Q1.

S303, use the Q feature matrix of the first data feature under the attention mechanism and the K feature matrix of the second data feature under the attention mechanism to calculate the Q feature matrix of the first data feature under the mutual attention mechanism. The first mutual attention feature.

Here, as an optional embodiment, the second data feature can be input into the second neural network, and the Q feature matrix, K feature matrix and V feature matrix of the second data feature under the attention mechanism can be obtained through the second neural network output.

It should be noted that since the first data feature comes from the first medical image and the second data feature comes from the second medical image, both the first medical image and the second medical image belong to image information, that is, when paying attention Under the force mechanism, the focus on the first data feature and the second data feature is the same (both focus on the image information side).

Based on this, in a preferred embodiment, the second neural network may be a neural network that shares parameters with the first neural network, so as to improve the Q feature matrix for the first data feature/second data feature under the attention mechanism, Extraction accuracy of K feature matrix and V feature matrix.

Specifically, when performing step S303, the first data feature i1 can be calculated by the following formula under the mutual attention mechanism (that is, on the image information side, pay attention to the data feature between the first data feature and the second data feature) The first mutual attention feature A12:
A12＝Q1 ^T ×K2;

K2 represents the K feature matrix of the second data feature i2 under the attention mechanism;

Q1 ^T represents the transposed matrix of Q1.

S304: Calculate the third self-attention feature of the third data feature under the self-attention mechanism using the Q feature matrix and K feature matrix of the third data feature under the attention mechanism.

Here, as an optional embodiment, the third data feature can be input into the third neural network, and the Q feature matrix, K feature matrix and V feature matrix of the third data feature under the attention mechanism can be obtained through the third neural network output.

Specifically, when performing step S304, the third self-attention feature Ag11 of the third data feature g1 under the self-attention mechanism (that is, only focusing on its own data features) can be calculated by the following formula:
Ag11=Qg1 ^T ×Kg1;

Among them, Qg1 represents the Q feature matrix of the third data feature g1 under the attention mechanism;

Kg1 represents the K feature matrix of the third data feature g1 under the attention mechanism;

Qg1 ^T represents the transposed matrix of Qg1.

S305, use the Q feature matrix of the third data feature under the attention mechanism and the K feature matrix of the fourth data feature under the attention mechanism to calculate the Q feature matrix of the third data feature under the mutual attention mechanism. The third mutual attention feature.

Here, as an optional embodiment, the fourth data feature can be input into the fourth neural network, and the Q feature matrix, K feature matrix and V feature matrix of the fourth data feature under the attention mechanism can be obtained through the fourth neural network output.

It should be noted that since the third data feature is derived from the first detection result and the fourth data feature is derived from the initial lesion matching result, both the first detection result and the initial lesion matching result belong to the anatomical structure information (location) of the target lesion. information, size information, etc.), that is, under the attention mechanism, the third data feature and the fourth data feature have the same focus (both focus on the anatomical structure information side).

Based on this, in a preferred embodiment, the fourth neural network may be a neural network that shares parameters with the third neural network, so as to improve the Q feature matrix for the third data feature/fourth data feature under the attention mechanism, Extraction accuracy of K feature matrix and V feature matrix.

Specifically, when performing step S305, the third data feature g1 can be calculated by the following formula in the mutual attention mechanism (that is, on the anatomical structure information side, pay attention to the data feature between the third data feature and the fourth data feature) The third mutual attention feature Ag12 below:
Ag12=Qg1 ^T ×Kg2;

Kg2 represents the K feature matrix of the fourth data feature g2 under the attention mechanism;

Qg1 ^T represents the transposed matrix of Qg1.

S306: Use the first data feature, the first self-attention feature, the first mutual attention feature, the third self-attention feature, the third mutual attention feature, the first The first enhanced feature is calculated by calculating the V feature matrix of the data feature under the attention mechanism and the V feature matrix of the second data feature under the attention mechanism.

Specifically, when performing step S306, the first enhanced feature T1 can be calculated by the following formula:
T1＝i1+softmax(A11+softmax(Ag11))×V1+softmax(A12+softmax(Ag12))×V2;

Among them, i1 represents the first data feature, A11 represents the first self-attention feature, A12 represents the first mutual attention feature, Ag11 represents the third self-attention feature, Ag12 represents the third mutual attention feature, and V1 represents the first data The V feature matrix of feature i1 under the attention mechanism and V2 represent the V feature matrix of the second data feature i2 under the attention mechanism.

It should be noted that when calculating the first enhanced feature T1, in addition to the softmax function, other types of functions can also be used for calculation (for example, the softmax function in the above formula can also be replaced by the sigmoid function). For calculating the third The embodiment of the present disclosure does not impose any limitation on the specific function type used when enhancing feature T1.

For the above-mentioned second input data (that is, the first sub-data segmented from the second medical image), in an optional implementation, refer to Figure 4. Figure 4 shows a diagram provided by an embodiment of the present disclosure. A flowchart of a method for obtaining the second enhanced feature corresponding to the second input data in the first feature extraction window. The method includes steps S401-S405; specifically:

S401: Calculate the second self-attention feature of the second data feature under the self-attention mechanism using the Q feature matrix and K feature matrix of the second data feature under the attention mechanism.

Here, the Q feature matrix and K feature matrix of the second data feature under the attention mechanism are obtained in the same manner as described in the above step S303, and the repeated points will not be repeated here.

Specifically, when performing step S401, the second self-attention feature A22 of the second data feature i2 under the self-attention mechanism (that is, only focusing on its own data features) can be calculated by the following formula:
A22＝Q2 ^T ×K2;

Among them, Q2 represents the Q feature matrix of the second data feature i2 under the attention mechanism;

Q2 ^T represents the transposed matrix of Q2.

S402, use the Q feature matrix of the second data feature under the attention mechanism and the K feature matrix of the first data feature under the attention mechanism to calculate the Q feature matrix of the second data feature under the mutual attention mechanism. Second mutual attention feature.

Specifically, when performing step S402, the second data feature i2 can be calculated by the following formula under the mutual attention mechanism (that is, on the image information side, pay attention to the data feature between the first data feature and the second data feature) The second mutual attention feature A21:
A21＝Q2 ^T ×K1;

Q2 ^T represents the transposed matrix of Q2.

S403: Calculate the fourth self-attention feature of the fourth data feature under the self-attention mechanism using the Q feature matrix and K feature matrix of the fourth data feature under the attention mechanism.

Here, the acquisition method of the Q feature matrix and the K feature matrix of the fourth data feature under the attention mechanism is the same as that described in the above step S305, and the duplication will not be repeated here.

Specifically, when performing step S403, the fourth self-attention feature Ag22 of the fourth data feature g2 under the self-attention mechanism (that is, only focusing on its own data features) can be calculated by the following formula:
Ag22=Qg2 ^T ×Kg2;

Among them, Qg2 represents the Q feature matrix of the fourth data feature g2 under the attention mechanism;

Qg2 ^T represents the transpose matrix of Qg2.

S404, use the Q feature matrix of the fourth data feature under the attention mechanism and the K feature matrix of the third data feature under the attention mechanism to calculate the Q feature matrix of the fourth data feature under the mutual attention mechanism. The fourth mutual attention feature.

Specifically, when performing step S404, the fourth data feature g2 can be calculated by the following formula in the mutual attention mechanism (that is, on the anatomical structure information side, pay attention to the data feature between the third data feature and the fourth data feature) The fourth mutual attention feature Ag21 below:
Ag21=Qg2 ^T ×Kg1;

Qg2 ^T represents the transpose matrix of Qg2.

S405: Use the second data feature, the second self-attention feature, the second mutual attention feature, the fourth self-attention feature, the fourth mutual attention feature, the first The V feature matrix of the data feature under the attention mechanism and the V feature matrix of the second data feature under the attention mechanism are calculated to obtain the second enhanced feature.

Specifically, when performing step S405, the second enhanced feature T2 can be calculated by the following formula:
T2＝i2+softmax(A22+softmax(Ag22))×V2+softmax(A21+softmax(Ag21))×V1;

Among them, i2 represents the second data feature, A22 represents the second self-attention feature, A21 represents the second mutual attention feature, Ag22 represents the fourth self-attention feature, Ag21 represents the fourth mutual attention feature, and V1 represents the first data The V feature matrix of feature i1 under the attention mechanism and V2 represent the V feature matrix of the second data feature i2 under the attention mechanism.

It should be noted that when calculating the second enhanced feature T2, in addition to the softmax function, other types of functions can also be used for calculation (for example, the softmax function in the above formula can also be replaced by the sigmoid function). For calculating the third The embodiment of the present disclosure also does not limit the specific function type used when enhancing feature T2.

Regarding the above-mentioned third input data (that is, the first sub-data segmented from the first detection result), in an optional implementation manner, refer to FIG. 5 . FIG. 5 shows the method provided by the embodiment of the present disclosure. A flowchart of a method for obtaining the third enhanced feature corresponding to the third input data in the first feature extraction window. The method includes steps S501-S502; specifically:

S501. Obtain the first self-attention feature, the first mutual attention feature, the third self-attention feature and the third mutual attention feature.

Specifically, for the specific acquisition method of the first self-attention feature A11, the first mutual attention feature A12, the third self-attention feature Ag11 and the third mutual attention feature Ag12, please refer to the above-mentioned steps S302-S305. The repetition is as follows: This will not be described again.

S502: Use the third data feature, the first self-attention feature, the first mutual attention feature, the third self-attention feature, the third mutual attention feature, the third The V feature matrix of the data feature under the attention mechanism and the V feature matrix of the fourth data feature under the attention mechanism are calculated to obtain the third enhanced feature.

Specifically, when performing step S502, the third enhanced feature T3 can be calculated by the following formula:
T3＝g1+softmax(Ag11+softmax(A11))×Vg1+softmax(Ag12+softmax(A12))×
Vg2;

Among them, g1 represents the third data feature, A11 represents the first self-attention feature, A12 represents the first mutual attention feature, Ag11 represents the third self-attention feature, Ag12 represents the third mutual attention feature, and Vg1 represents the third data. The V feature matrix and Vg2 of the feature under the attention mechanism represent the V feature matrix of the fourth data feature under the attention mechanism.

It should be noted that when calculating the third enhanced feature T3, in addition to the softmax function, other types of functions can also be used for calculation. For the specific function type used when calculating the third enhanced feature T3 (for example, the above-mentioned The softmax function in the formula is replaced by the sigmoid function), and the embodiments of this disclosure also do not impose any limitations.

For the above-mentioned fourth input data (that is, the first sub-data segmented from the initial lesion matching result), in an optional implementation, refer to FIG. 6 , which shows the method provided by the embodiment of the present disclosure. A schematic flow chart of a method for obtaining the fourth enhanced feature corresponding to the fourth input data in the first feature extraction window. The method includes steps S601-S602; specifically:

S601. Obtain the second self-attention feature, the second mutual attention feature, the fourth self-attention feature and the fourth mutual attention feature.

Specifically, for the specific acquisition method of the second self-attention feature A22, the second mutual attention feature A21, the fourth self-attention feature Ag22 and the fourth mutual attention feature Ag21, please refer to the above-mentioned steps S401-S404. The repetition is as follows: This will not be described again.

S602: Use the fourth data feature, the second self-attention feature, the second mutual attention feature, the fourth self-attention feature, the fourth mutual attention feature, the third The fourth enhanced feature is calculated by calculating the V feature matrix of the data feature under the attention mechanism and the V feature matrix of the fourth data feature under the attention mechanism.

Specifically, when executing step S602, the fourth enhanced feature T4 can be calculated by the following formula:
T4＝g2+softmax(Ag22+softmax(A22))×Vg2+softmax(Ag21+softmax(A21))×
Vg1;

Among them, g2 represents the fourth data feature, A22 represents the second self-attention feature, A21 represents the second mutual attention feature, Ag22 represents the fourth self-attention feature, Ag21 represents the fourth mutual attention feature, and Vg1 represents the third data. The V feature matrix and Vg2 of the feature under the attention mechanism represent the V feature matrix of the fourth data feature under the attention mechanism.

It should be noted that when calculating the fourth enhanced feature T4, in addition to the softmax function, other types of functions can also be used for calculation (for example, the softmax function in the above formula can also be replaced by the sigmoid function). For calculating the third The embodiment of the present disclosure also does not impose any limitation on the specific function type used to enhance the feature T4.

Regarding the lesion feature extraction result of the second medical image output from the feature extraction model (i.e., the image feature vector of the second medical image), based on the analysis content at the above step S103, it can be known that the above lesion feature extraction result can replace the second medical image. It serves as model input data for models such as the lesion segmentation model/lesion detection model, thereby helping the lesion segmentation model complete the lesion segmentation task for the second medical image/helping the lesion detection model complete the lesion detection task for the second medical image.

Specifically, for example, the above-mentioned lesion feature extraction result can replace the second medical image as the model input data of the lesion segmentation model. In an optional implementation, when the above-mentioned lesion feature extraction result replaces the second medical image as the lesion segmentation model, When the model inputs data in the model training phase, refer to FIG. 7 , which shows a schematic flowchart of the first method for using the lesion feature extraction results of the second medical image provided by an embodiment of the present disclosure. The method includes steps S701-S702; specifically:

S701: Input the lesion feature extraction result of the second medical image into the first lesion segmentation model, and output the lesion segmentation prediction result of the second medical image.

Here, the first lesion segmentation model represents the lesion segmentation model in the training stage; wherein, the specific model structure of the first lesion segmentation model is not subject to any limitation; for example, it can be a single-layer convolutional neural network structure, or It can be other more complex multi-layer neural network structures.

Here, the lesion segmentation prediction result represents the prediction result for the image area where the target lesion is located in the second medical image; for example, the lesion segmentation prediction result can be the labeling result of the second medical image based on 0-1 labeling, where the lesion segmentation prediction The image area marked 1 in the result represents the prediction result for the image area where the target lesion is located in the second medical image.

S702: According to the segmentation loss between the lesion segmentation prediction result and the second detection result of the target lesion in the second space, perform model parameters of the first lesion segmentation model and the feature extraction model. Adjust to obtain the first lesion segmentation model and feature extraction model including the adjusted parameters.

Here, the second detection result is determined based on the position information and size information of the target lesion in the second medical image.

It should be noted that when calculating the segmentation loss above, the cross-entropy loss function or other loss functions such as the focal loss function may be used. The embodiment of the present disclosure does not impose any restrictions on the specific calculation method of the segmentation loss.

Specifically, in another optional implementation, when the above-mentioned lesion feature extraction result replaces the second medical image as the model input data of the lesion segmentation model in the model application stage, refer to Figure 8, which shows A schematic flowchart of the second method for using the lesion feature extraction results of the second medical image provided by the embodiment of the present disclosure. The method includes step S801; specifically:

S801: Input the lesion feature extraction result of the second medical image into the second lesion segmentation model, and output the lesion segmentation result of the target lesion in the second medical image.

Here, the second lesion segmentation model represents the lesion segmentation model in the application stage, that is, the second lesion segmentation model represents the pre-trained lesion segmentation model. At this time, since the second lesion segmentation model has completed the model training process, , different from the above-mentioned steps S701-S702, step S801 no longer involves adjusting the model parameters of the second lesion segmentation model and the feature extraction model.

Based on the different usage methods of the lesion feature extraction results of the second medical image shown in the above steps S701-S702 and the above step S801, it should be noted that the embodiment of the present disclosure essentially uses the execution method shown in steps S101-S103. , provides a data processing method for any two different medical images in the follow-up case data, whether the data processing results (ie, the lesion feature extraction results of the second medical image) are specifically applied to the training phase of the model or to the application of the model stage, the embodiments of this disclosure do not impose any limitations.

Through the above-mentioned processing method of follow-up case data provided by the embodiments of the present disclosure, the first detection result of the target lesion in the first space is determined according to the position information and size information of the target lesion in the first medical image; according to the first medical image and the second medical image, transform the first detection result, and obtain the initial lesion matching result of the first detection result in the second space; combine the first medical image, the second medical image, and the second medical image. The first detection result and the initial lesion matching result are input into the feature extraction model, and the lesion feature extraction result of the second medical image is output. In this way, the present disclosure enables the model to effectively combine the anatomical structure information of the lesion on the basis of medical image information, improves the model's feature extraction accuracy for the same lesion in different medical images, and improves the model's matching and positioning of the same lesion in follow-up case data. accuracy.

Based on the same inventive concept, the present disclosure also provides a processing device corresponding to the above-mentioned processing method of follow-up case data, because the principle of solving the problem of the processing device in the embodiment of the present disclosure is similar to the above-mentioned processing method of follow-up case data in the embodiment of the present disclosure. , therefore the implementation of the processing device can be referred to the implementation of the above-mentioned processing method, and repeated details will not be repeated.

Referring to FIG. 9 , FIG. 9 shows a schematic structural diagram of a device for processing follow-up case data provided by an embodiment of the present disclosure. The follow-up case data includes at least a first medical image and a second medical image; The first medical image and the second medical image are medical images collected for the same object at different times; the processing device includes:

Determining module 901, configured to determine the first detection result of the target lesion in the first space according to the position information and size information of the target lesion in the first medical image; wherein the first space represents the The coordinate space where the first medical image is located;

The registration module 902 is configured to perform transformation processing on the first detection result according to the registration transformation matrix between the first medical image and the second medical image, so as to obtain the first detection result in the second The initial lesion matching result in space; wherein the second space represents the coordinate space in which the second medical image is located;

Processing module 903, configured to input the first medical image, the second medical image, the first detection result and the initial lesion matching result into a feature extraction model, and output the lesions of the second medical image. Feature extraction results; wherein the lesion feature extraction results are used at least for a lesion segmentation task for the second medical image.

In an optional implementation, the feature extraction model adopts the Unet network structure with the swin-transformer module as the core; wherein, the Unet network structure includes multiple groups of symmetrical encoders and decoders, each of which At least in the above encoder It includes a swin-transformer module with four inputs and four outputs. The swin-transformer module contains multiple feature extraction windows. The swin-transformer module is used for:

Receive the output data of each channel in the upper layer encoder as the input data of the same channel in the current layer encoder;

Divide the input data of each channel into multiple sub-data, and perform the same feature extraction and feature enhancement processing on the sub-data of different input data in each feature extraction window, so as to obtain the characteristics of each sub-data in each location. The corresponding enhanced features within the feature extraction window;

The enhanced features of the sub-data belonging to the same channel are spliced, and the results of the splicing processing are output to the corresponding channels in the lower-layer encoder.

In an optional implementation, in the feature extraction model, the first sub-data segmented from the first medical image is used as the first part of the first feature extraction window in the swin-transformer module. Input data, the first sub-data segmented from the second medical image are used as the second input data of the first feature extraction window, and the first sub-data segmented from the first detection result are used as the The third input data of the first feature extraction window and the first sub-data segmented from the initial lesion matching result are used as the fourth input data of the first feature extraction window. The swin-transformer module is used to pass the following The method obtains the first enhanced feature corresponding to the first input data within the first feature extraction window:

In the first feature extraction window, perform feature extraction on the first input data, the second input data, the third input data and the fourth input data respectively to obtain the first input data The first data feature of the second input data, the second data feature of the second input data, the third data feature of the third input data, and the fourth data feature of the fourth input data;

Using the Q feature matrix and K feature matrix of the first data feature under the attention mechanism, calculate the first self-attention feature of the first data feature under the self-attention mechanism;

Using the Q feature matrix of the first data feature under the attention mechanism and the K feature matrix of the second data feature under the attention mechanism, the first data feature under the mutual attention mechanism is calculated. Mutual attention features;

Using the Q feature matrix and K feature matrix of the third data feature under the attention mechanism, calculate the third self-attention feature of the third data feature under the self-attention mechanism;

Using the Q feature matrix of the third data feature under the attention mechanism and the K feature matrix of the fourth data feature under the attention mechanism, the third data feature of the third data feature under the mutual attention mechanism is calculated. Mutual attention features;

Using the first data feature, the first self-attention feature, the first mutual attention feature, the third self-attention feature, the third mutual attention feature, and the first data feature The first enhanced feature is calculated by calculating the V feature matrix under the attention mechanism and the V feature matrix of the second data feature under the attention mechanism.

In an optional implementation, the swin-transformer module is used to obtain the second enhanced feature corresponding to the second input data within the first feature extraction window through the following method:

Using the Q feature matrix and K feature matrix of the second data feature under the attention mechanism, calculate the second self-attention feature of the second data feature under the self-attention mechanism;

Using the Q feature matrix of the second data feature under the attention mechanism and the K feature matrix of the first data feature under the attention mechanism, the second data feature under the mutual attention mechanism is calculated. Mutual attention features;

Using the Q feature matrix and K feature matrix of the fourth data feature under the attention mechanism, calculate the fourth self-attention feature of the fourth data feature under the self-attention mechanism;

Using the Q feature matrix of the fourth data feature under the attention mechanism and the K feature matrix of the third data feature under the attention mechanism, the fourth data feature of the fourth data feature under the mutual attention mechanism is calculated. Mutual attention features;

Utilize the second data feature, the second self-attention feature, the second mutual attention feature, the fourth self-attention feature, the fourth mutual attention feature, and the first data The V feature matrix of the feature under the attention mechanism and the V feature matrix of the second data feature under the attention mechanism are calculated to obtain the second enhanced feature.

In an optional implementation, the swin-transformer module is used to obtain the third enhanced feature corresponding to the third input data within the first feature extraction window through the following method:

Obtain the first self-attention feature, the first mutual attention feature, the third self-attention feature and the third mutual attention feature;

Utilizing the third data feature, the first self-attention feature, the first mutual attention feature, the third self-attention feature, the third mutual attention feature, and the third data feature The V feature matrix under the attention mechanism and the V feature matrix of the fourth data feature under the attention mechanism are calculated to obtain the third enhanced feature.

In an optional implementation, the swin-transformer module is used to obtain the fourth enhanced feature corresponding to the fourth input data within the first feature extraction window through the following method:

Obtain the second self-attention feature, the second mutual attention feature, the fourth self-attention feature and the fourth mutual attention feature;

Using the fourth data feature, the second self-attention feature, the second mutual attention feature, the fourth self-attention feature, the fourth mutual attention feature, and the third data feature The fourth enhanced feature is calculated by calculating the V feature matrix under the attention mechanism and the V feature matrix of the fourth data feature under the attention mechanism.

In an optional implementation, the processing device further includes:

The first output module is used to input the lesion feature extraction result of the second medical image into the first lesion segmentation model, and output the lesion segmentation prediction result of the second medical image; wherein, the first lesion segmentation The model represents the lesion segmentation model in the training stage; the lesion segmentation prediction result represents the prediction result for the image area where the target lesion is located in the second medical image;

A training module configured to train the first lesion segmentation model and the feature extraction model based on the segmentation loss between the lesion segmentation prediction result and the second detection result of the target lesion in the second space. The model parameters are adjusted to obtain a first lesion segmentation model and a feature extraction model including the adjusted parameters; wherein the second detection result is based on the position information and size of the target lesion in the second medical image. Information confirmed.

In an optional implementation, the processing device further includes:

The second output module is used to input the lesion feature extraction result of the second medical image into the second lesion segmentation model, and output the lesion segmentation result of the target lesion in the second medical image; wherein, The second lesion segmentation model described above represents the lesion segmentation model in the application stage.

Based on the same inventive concept, as shown in Figure 10, an embodiment of the present disclosure provides a computer device 1000 for executing the processing method of follow-up case data in the present disclosure. The device includes a memory 1001, a processor 1002, and a computer device 1000 stored in the present disclosure. A computer program on the memory 1001 that can be run on the processor 1002, wherein the memory 1001 and the processor 1002 communicate through a bus. When the processor 1002 executes the above computer program, the above-mentioned processing method of follow-up case data is implemented. step.

Specifically, the above-mentioned memory 1001 and processor 1002 can be general-purpose memories and processors, which are not specifically limited here. When the processor 1002 runs the computer program stored in the memory 1001, it can execute the above-mentioned processing method of follow-up case data.

Corresponding to the processing method of follow-up case data in the present disclosure, embodiments of the present disclosure also provide a computer-readable storage medium. The computer-readable storage medium stores a computer program. The computer program executes the above when run by a processor. Steps of processing method for follow-up case data.

Specifically, the storage medium can be a general-purpose storage medium, such as a removable disk, a hard disk, etc. When the computer program on the storage medium is run, the above-mentioned processing method of follow-up case data can be executed.

In the embodiments provided by this disclosure, it should be understood that the disclosed systems and methods can be implemented in other ways. The system embodiments described above are only illustrative. For example, the division of the units is only a logical function division. In actual implementation, there may be other division methods. For example, multiple units or components may be combined or can be integrated into another system, or some features can be ignored, or not implemented. On the other hand, the coupling or direct coupling or communication connection between each other shown or discussed may be through some communication interface, the indirect coupling or communication connection of the system or unit, which may be in electrical, mechanical or other forms.

The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments provided by the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.

If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present disclosure is essentially or the part that contributes to the relevant technology or the part of the technical solution can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several The instructions are used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the methods described in various embodiments of the present disclosure. The aforementioned storage media include: U disk, mobile hard disk, read-only memory (ROM), random access memory (Random Access Memory, RAM), magnetic disk or optical disk and other media that can store program code. .

It should be noted that similar reference numerals and letters represent similar items in the following drawings. Therefore, once an item is defined in one drawing, it does not need further definition and explanation in subsequent drawings. In addition, the terms "first", "second", "third", etc. are only used to distinguish descriptions and shall not be understood as indicating or implying relative importance.

Finally, it should be noted that the above-mentioned embodiments are only specific implementations of the present disclosure and are used to illustrate the technical solutions of the present disclosure rather than to limit them. The protection scope of the present disclosure is not limited thereto. Although refer to the foregoing The embodiments describe the present disclosure in detail. Those of ordinary skill in the art should understand that any person familiar with the technical field can still modify the technical solutions recorded in the foregoing embodiments within the technical scope disclosed in the present disclosure. It is possible to easily think of changes or equivalent substitutions of some of the technical features; however, these modifications, changes or substitutions do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions of the embodiments of the present disclosure. All are covered by the protection scope of this disclosure. Therefore, the protection scope of the present disclosure should be subject to the protection scope of the claims.

Claims

A method for processing follow-up case data, wherein the follow-up case data includes at least a first medical image and a second medical image; the first medical image and the second medical image are respectively for the same subject at different times. Collected medical images; the processing method includes:

According to the position information and size information of the target lesion in the first medical image, the first detection result of the target lesion in the first space is determined; wherein the first space represents the location of the first medical image. coordinate space;

According to the registration transformation matrix between the first medical image and the second medical image, the first detection result is transformed and the initial lesion matching result of the first detection result in the second space is obtained. ; Wherein, the second space represents the coordinate space in which the second medical image is located;

The first medical image, the second medical image, the first detection result and the initial lesion matching result are input into the feature extraction model, and the lesion feature extraction result of the second medical image is output; wherein, The lesion feature extraction result is at least used for a lesion segmentation task for the second medical image.
The processing method according to claim 1, wherein the feature extraction model adopts a Unet network structure with a swin-transformer module as the core; wherein the Unet network structure includes multiple groups of symmetrical encoders and decoders, each 1. The encoder includes at least one swin-transformer module with four inputs and four outputs. The swin-transformer module contains multiple feature extraction windows. The swin-transformer module is used for:

Receive the output data of each channel in the upper layer encoder as the input data of the same channel in the current layer encoder;

Divide the input data of each channel into multiple sub-data, and perform the same feature extraction and feature enhancement processing on the sub-data of different input data in each feature extraction window, so as to obtain the characteristics of each sub-data in each location. The corresponding enhanced features within the feature extraction window;

The enhanced features of the sub-data belonging to the same channel are spliced, and the results of the splicing processing are output to the corresponding channels in the lower-layer encoder.
The processing method according to claim 2, wherein in the feature extraction model, the first sub-data segmented from the first medical image is used as the first feature extraction window in the swin-transformer module. The first input data, the first sub-data segmented from the second medical image are used as the second input data of the first feature extraction window, and the first sub-data segmented from the first detection result are used as The third input data of the first feature extraction window and the first sub-data segmented from the initial lesion matching result are used as the fourth input data of the first feature extraction window. The first data is obtained by the following method The first enhanced feature corresponding to the input data within the first feature extraction window:

In the first feature extraction window, perform feature extraction on the first input data, the second input data, the third input data and the fourth input data respectively to obtain the first input data The first data feature of the second input data, the second data feature of the second input data, the third data feature of the third input data, and the fourth data feature of the fourth input data;

Using the Q feature matrix and K feature matrix of the first data feature under the attention mechanism, calculate the first self-attention feature of the first data feature under the self-attention mechanism;

Using the Q feature matrix of the first data feature under the attention mechanism and the K feature matrix of the second data feature under the attention mechanism, the first data feature under the mutual attention mechanism is calculated. Mutual attention features;

Using the Q feature matrix and K feature matrix of the third data feature under the attention mechanism, calculate the third self-attention feature of the third data feature under the self-attention mechanism;

Using the Q feature matrix of the third data feature under the attention mechanism and the K feature matrix of the fourth data feature under the attention mechanism, the third data feature of the third data feature under the mutual attention mechanism is calculated. Mutual attention features;

Using the first data feature, the first self-attention feature, the first mutual attention feature, the third self-attention feature, the third mutual attention feature, and the first data feature The first enhanced feature is calculated by calculating the V feature matrix under the attention mechanism and the V feature matrix of the second data feature under the attention mechanism.
The processing method according to claim 3, wherein the second enhanced feature corresponding to the second input data within the first feature extraction window is obtained by the following method:

Using the Q feature matrix and K feature matrix of the second data feature under the attention mechanism, calculate the second self-attention feature of the second data feature under the self-attention mechanism;

Using the Q feature matrix of the second data feature under the attention mechanism and the K feature matrix of the first data feature under the attention mechanism, the second data feature under the mutual attention mechanism is calculated. Mutual attention features;

Using the Q feature matrix and K feature matrix of the fourth data feature under the attention mechanism, calculate the fourth self-attention feature of the fourth data feature under the self-attention mechanism;

Using the Q feature matrix of the fourth data feature under the attention mechanism and the K feature matrix of the third data feature under the attention mechanism, the fourth data feature of the fourth data feature under the mutual attention mechanism is calculated. Mutual attention features;

Using the second data feature, the second self-attention feature, the second mutual attention feature, the fourth self-attention feature, the fourth mutual attention feature, and the first data feature The second enhanced feature is calculated by calculating the V feature matrix under the attention mechanism and the V feature matrix of the second data feature under the attention mechanism.
The processing method according to claim 3, wherein the third enhanced feature corresponding to the third input data within the first feature extraction window is obtained by the following method:

Obtain the first self-attention feature, the first mutual attention feature, the third self-attention feature and the third mutual attention feature;

Utilizing the third data feature, the first self-attention feature, the first mutual attention feature, the third self-attention feature, the third mutual attention feature, and the third data feature The V feature matrix under the attention mechanism and the V feature matrix of the fourth data feature under the attention mechanism are calculated to obtain the third enhanced feature.
The processing method according to claim 1, wherein the fourth enhanced feature corresponding to the fourth input data within the first feature extraction window is obtained by the following method:

Obtain the second self-attention feature, the second mutual attention feature, the fourth self-attention feature and the fourth mutual attention feature;

Using the fourth data feature, the second self-attention feature, the second mutual attention feature, the fourth self-attention feature, the fourth mutual attention feature, and the third data feature The fourth enhanced feature is calculated by calculating the V feature matrix under the attention mechanism and the V feature matrix of the fourth data feature under the attention mechanism.
The processing method according to claim 1, wherein the processing method further includes:

The lesion feature extraction result of the second medical image is input into the first lesion segmentation model, and the lesion segmentation prediction result of the second medical image is output; wherein the first lesion segmentation model represents the lesion in the training stage Segmentation model; the lesion segmentation prediction result represents the prediction result for the image area where the target lesion is located in the second medical image;

adjusting the model parameters of the first lesion segmentation model and the feature extraction model according to the segmentation loss between the lesion segmentation prediction result and the second detection result of the target lesion in the second space, A first lesion segmentation model and a feature extraction model including adjusted parameters are obtained; wherein the second detection result is determined based on the position information and size information of the target lesion in the second medical image.
The processing method according to claim 1, wherein the processing method further includes:

Input the lesion feature extraction result of the second medical image into the second lesion segmentation model, and output the lesion segmentation result of the target lesion in the second medical image; wherein, the second lesion segmentation model represents Lesion segmentation model in application stage.
A processing device for follow-up case data, wherein the follow-up case data includes at least a first medical image and a second medical image; the first medical image and the second medical image are respectively for the same subject at different times. Collected medical images; the processing device includes:

Determining module, configured to determine the first detection result of the target lesion in the first space according to the position information and size information of the target lesion in the first medical image; wherein the first space represents the first detection result of the target lesion in the first medical image. 1. The coordinate space in which the medical image is located;

a registration module, configured to perform transformation processing on the first detection result according to the registration transformation matrix between the first medical image and the second medical image, and obtain the first detection result in the second space The initial lesion matching result under; wherein, the second space represents the coordinate space in which the second medical image is located;

A processing module configured to input the first medical image, the second medical image, the first detection result and the initial lesion matching result into a feature extraction model, and output the lesion features of the second medical image. Extraction results; wherein the lesion feature extraction results are used at least for a lesion segmentation task for the second medical image.
An electronic device, which includes: a processor, a memory, and a bus. The memory stores machine-readable instructions executable by the processor. When the electronic device is running, the processor and the memory are connected via Bus communication, when the machine-readable instructions are executed by the processor, the steps of the processing method according to any one of claims 1 to 8 are performed.
A computer-readable storage medium, wherein a computer program is stored on the computer-readable storage medium, and the computer program executes the steps of the processing method according to any one of claims 1 to 8 when run by a processor.