WO2022095258A1 - Image object classification method and apparatus, device, storage medium and program - Google Patents


Info

Publication number
WO2022095258A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
classified
target object
feature information
initial
Prior art date
Application number
PCT/CN2020/139913
Other languages
French (fr)
Chinese (zh)
Inventor
朱雅靖
陈翼男
罗祥德
任家敏
Original Assignee
上海商汤智能科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 上海商汤智能科技有限公司
Publication of WO2022095258A1


Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/253Fusion techniques of extracted features

Definitions

  • the present disclosure relates to the technical field of image processing, and in particular, to an image object classification method, apparatus, device, storage medium and program.
  • the scan image categories often include the time-series-related pre-contrast scan, early arterial phase, late arterial phase, portal venous phase and delayed phase, etc.
  • the scan image categories can also include the scan-parameter-related T1-weighted out-of-phase imaging, T1-weighted in-phase imaging, T2-weighted imaging, diffusion-weighted imaging and apparent diffusion coefficient imaging, etc.
  • in the process of disease diagnosis and treatment, a doctor usually needs to repeatedly review signs of a target object such as a tumor on medical images, which makes the determination of the tumor type overly dependent on the doctor's professional level and makes that determination inefficient.
  • the present disclosure provides at least an image object classification method, apparatus, device, storage medium and program.
  • the embodiment of the present disclosure provides an image object classification method, and the image object classification method includes:
  • the at least one image to be classified is a medical image belonging to at least one scanned image category
  • using a classification model, target classification is performed on the at least one image to be classified to obtain the type of the target object.
  • the classification model is used to classify the at least one image to be classified to obtain the type of the target object. Because a classification model performs the classification, intelligent target classification is realized and no manual target classification is required, which can reduce the dependence on manual work and improve the efficiency of target classification.
  • performing target classification on the at least one image to be classified to obtain the type of the target object includes:
  • the final feature information is classified to obtain the type of the target object.
  • the final feature information can be classified to obtain the type of the target object, so the target classification is realized by using the feature information of the target object.
  • before target classification is performed on the at least one image to be classified to obtain the type of the target object, the method further includes:
  • using the final area, performing several layers of feature extraction on the at least one image to be classified to correspondingly obtain several sets of initial feature information; wherein, during feature extraction, the weight of the region corresponding to the final area in the image to be classified is higher than the weight of other regions in the image to be classified; and/or the features corresponding to the final region in the initial feature information are more abundant than the features of other regions.
  • because the weight of the region corresponding to the final area in the image to be classified is higher than the weights of other regions, the classification model tends to extract more detailed features for the final region; and/or the features corresponding to the final region in the initial feature information are richer than those of other regions. Thus, by using the initial feature information of the image to be classified, the classification model can better learn the feature information of the target object itself, reducing to a certain extent the impact of noise around the target object on target classification.
  • obtaining the final area of the target object based on the initial area corresponding to the target object in the image to be classified includes:
  • the union of the initial regions corresponding to the target object in the at least one image to be classified is obtained as the final region of the target object.
  • the final area of the target object is the union of the initial areas of the target object in the images to be classified;
  • the final area is therefore greater than or equal to any single initial area, which ensures that the final area can contain the area corresponding to the target object in each image to be classified, so that the feature information of the target object can be attended to as much as possible when feature extraction is performed on the images to be classified.
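For illustration only (the disclosure does not fix a region representation), the union of the per-image initial regions can be sketched as the smallest axis-aligned bounding box enclosing all of them:

```python
def union_region(initial_regions):
    """Union of per-image initial regions as one enclosing bounding box.

    Hypothetical helper: each region is assumed to be an axis-aligned
    box (x0, y0, x1, y1); the resulting final region is greater than
    or equal to every initial region, as described above.
    """
    x0 = min(r[0] for r in initial_regions)
    y0 = min(r[1] for r in initial_regions)
    x1 = max(r[2] for r in initial_regions)
    y1 = max(r[3] for r in initial_regions)
    return (x0, y0, x1, y1)

# The final region encloses the initial region from every phase image.
final = union_region([(10, 12, 40, 44), (8, 15, 38, 50)])
# final == (8, 12, 40, 50)
```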
  • the at least one image to be classified includes a first image to be classified on which no initial region of the target object is marked and a second image to be classified on which an initial region of the target object is marked; before the final area of the target object is obtained based on the initial area corresponding to the target object in the image to be classified, the method further includes:
  • detecting, by the classification model, that the first image to be classified is not marked with the initial area of the target object, and determining the initial area of the target object on the first image to be classified based on the initial area of the target object marked on the second image to be classified and the registration relationship between the second image to be classified and the first image to be classified.
  • the classification model can thus determine the initial area of the target object for a first image to be classified on which it is not labeled, completing the labeling so that every image to be classified includes an initial area.
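As a sketch of this labeling-completion step, assume the registration relationship between the two images reduces to a simple translation; real medical-image registration would yield a richer spatial transform, so this is purely illustrative:

```python
def transfer_region(region, offset):
    """Map an initial region from a labeled image onto an unlabeled one.

    Sketch only: `region` is an assumed (x0, y0, x1, y1) box on the
    second (labeled) image, and `offset` is a (dx, dy) translation
    standing in for the registration relationship.
    """
    x0, y0, x1, y1 = region
    dx, dy = offset
    return (x0 + dx, y0 + dy, x1 + dx, y1 + dy)
```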
  • before the final feature information is obtained based on at least one set of initial feature information in the several sets of initial feature information, the method further includes:
  • obtaining the final feature information based on at least one set of initial feature information in the several sets of initial feature information includes:
  • the at least one set of initial feature information is fused to obtain the final feature information.
  • each group of initial feature information is uniformly converted into a preset dimension, which facilitates subsequent acquisition of final feature information.
  • the weights of the at least one set of initial feature information can be used to fuse initial feature information of different sizes, extracted by at least one layer of feature extraction, into the final feature information;
  • this avoids the risk that small-size initial feature information is compressed and important features are removed.
  • the weight of each set of the initial feature information is determined during the training process of the classification model.
  • the weight of the initial feature information for fusion is determined through the iterative training of the classification model, so that the final feature information obtained by using the weight fusion can better reflect the characteristics of the target object and further improve the classification performance.
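A minimal sketch of this weight-based fusion, assuming each group of initial feature information has already been converted to the same preset one-dimensional length and that the (normally learned) per-group weights are given here as constants:

```python
def fuse_features(feature_groups, weights):
    """Weighted fusion of several groups of initial feature information.

    Assumed setup: every group is a flat list of equal length (the
    preset dimension); `weights` would be determined during training
    of the classification model, not hand-picked as here.
    """
    fused = [0.0] * len(feature_groups[0])
    for group, w in zip(feature_groups, weights):
        for i, v in enumerate(group):
            fused[i] += w * v  # accumulate the weighted contribution
    return fused

final_info = fuse_features([[1.0, 2.0], [3.0, 4.0]], weights=[0.25, 0.75])
# final_info == [2.5, 3.5]
```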
  • the preset dimension is one dimension.
  • each group of initial feature information can be converted into one-dimensional, data unification is realized, and subsequent fusion is facilitated.
  • the classification model uses an ArcFace loss function during the training process to determine the loss value of the classification model; and/or, the batch sample data selected for each training of the classification model is sample data, selected from the sample data set by a data generator, in which the numbers of different target types are in a preset proportion.
  • using the ArcFace loss function to determine the loss value of the classification model can pull together the feature information of target objects of the same type and push apart the feature information of target objects of different types, thereby improving the classification performance for target objects.
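The ArcFace formulation sketched below is the published one (scale the cosine similarity, adding an angular margin on the target class); the hyperparameters used here (margin 0.5, scale 64) are common defaults in the literature, not values stated in this disclosure:

```python
import math

def arcface_logit(cos_theta, target, margin=0.5, scale=64.0):
    """ArcFace-style logits: add an angular margin to the target class.

    `cos_theta` holds the cosine similarity between the sample's
    feature and each class center; the target class logit becomes
    scale * cos(theta + margin), others stay scale * cos(theta).
    """
    logits = []
    for i, c in enumerate(cos_theta):
        if i == target:
            theta = math.acos(max(-1.0, min(1.0, c)))  # clamp for safety
            logits.append(scale * math.cos(theta + margin))
        else:
            logits.append(scale * c)
    return logits
```

Penalizing the target-class angle this way forces same-type features closer to their class center than a plain softmax would, which matches the aggregation/separation effect described above.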
  • the data generator selects, from the sample data set, sample data in which the different target types appear in a preset proportion as the batch sample data, so that the target types in the batch sample data used to train the classification model are more balanced.
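A hedged sketch of such a data generator, assuming an equal preset proportion and a sample set pre-grouped by target type (the disclosure does not specify the grouping or the proportion):

```python
import random

def balanced_batch(dataset_by_type, per_type, seed=0):
    """Select a batch with a preset (here: equal) share of each type.

    `dataset_by_type` maps a target-type label to its sample list;
    `per_type` samples are drawn from each type without replacement,
    so no single type dominates the training batch.
    """
    rng = random.Random(seed)
    batch = []
    for label, samples in sorted(dataset_by_type.items()):
        batch.extend((label, s) for s in rng.sample(samples, per_type))
    return batch
```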
  • the acquiring at least one image to be classified including the target object includes:
  • the to-be-classified images containing the target object are respectively extracted from a plurality of original medical images.
  • because the image to be classified is extracted from the original medical image, the image size for subsequent classification can be reduced and some of the background in the original medical image can be avoided to a certain extent; therefore, the processing resources consumed by subsequent classification can be reduced and the classification performance improved.
  • the image to be classified containing the target object is extracted from multiple original medical images, including:
  • the image data in the to-be-extracted area is extracted from the original medical image to obtain the to-be-classified image.
  • the initial area is the area containing the target object; expanding the initial area of the target object according to a preset ratio makes the obtained to-be-extracted area contain both the target object and some background information around it, so that the extracted image to be classified includes the target object together with some background.
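One possible convention for the preset-ratio expansion; the disclosure does not fix how the ratio is applied, so here, as an assumption, each side is padded by `ratio` times the region's width/height and clipped to the image bounds:

```python
def expand_region(region, ratio, bounds):
    """Expand an initial region by a preset ratio into the area to be
    extracted, clipped to the image bounds.

    Hypothetical convention: ratio=0.2 pads every side by 20% of the
    region's width/height, keeping the tumor plus some background.
    """
    x0, y0, x1, y1 = region
    pad_x = int((x1 - x0) * ratio)
    pad_y = int((y1 - y0) * ratio)
    w, h = bounds
    return (max(0, x0 - pad_x), max(0, y0 - pad_y),
            min(w, x1 + pad_x), min(h, y1 + pad_y))
```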
  • before the image to be classified containing the target object is extracted from the multiple original medical images, the method further includes at least one of the following steps:
  • if the initial area of the target object is not marked in the first original medical image, the initial area of the target object on the first original medical image is determined using the initial area of the target object marked on the second original medical image and the registration relationship between the second original medical image and the first original medical image.
  • the original medical image can be preprocessed before the image to be classified is extracted from it, unifying the image parameters of the images to be classified and improving their quality.
  • the original medical image and the image to be classified are two-dimensional images; or, the original medical image is a three-dimensional image, and the image to be classified is a two-dimensional image or a three-dimensional image.
  • the image to be classified is extracted from the original medical image. If the original medical image is a two-dimensional image, the image to be classified is a two-dimensional image; if the original medical image is a three-dimensional image, the image to be classified can be either two-dimensional or three-dimensional.
  • the original medical image is a three-dimensional image
  • the image to be classified is a two-dimensional image obtained by extracting a layer where the target object has the largest area in the original medical image.
  • the layer where the target object has the largest area in the original medical image can be extracted as the image to be classified, so that the extracted image covers a larger extent of the target object, contains more information about it, and improves the classification accuracy for the target object.
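The largest-area layer selection can be sketched as follows, under the assumption that a per-slice binary mask of the target object is available (the disclosure does not specify how the area is measured):

```python
def largest_area_slice(mask_volume):
    """Index of the 2-D layer where the target object's area is largest.

    `mask_volume` is assumed to be a list of 2-D binary masks (one per
    slice of the 3-D image); area is the count of foreground pixels.
    """
    areas = [sum(sum(row) for row in layer) for layer in mask_volume]
    return max(range(len(areas)), key=areas.__getitem__)
```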
  • the embodiment of the present disclosure also provides an image object classification device, and the image object classification device includes:
  • an image acquisition module configured to acquire at least one image to be classified including the target object, wherein the at least one image to be classified is a medical image belonging to at least one scanned image category;
  • the target classification module is configured to use a classification model to perform target classification on the at least one image to be classified to obtain the type of the target object.
  • the target classification module is configured to:
  • the final feature information is classified to obtain the type of the target object.
  • the target classification module is configured to:
  • the target classification module is configured to:
  • using the final area, performing several layers of feature extraction on the at least one image to be classified to correspondingly obtain several sets of initial feature information; wherein, during feature extraction, the weight of the region corresponding to the final area in the image to be classified is higher than the weight of other regions in the image to be classified; and/or the features corresponding to the final region in the initial feature information are more abundant than the features of other regions.
  • the target classification module is configured to:
  • the union of the initial regions corresponding to the target object in the at least one image to be classified is obtained as the final region of the target object.
  • the target classification module is configured to:
  • detecting, by the classification model, that the first image to be classified is not marked with the initial area of the target object, and determining the initial area of the target object on the first image to be classified based on the initial area of the target object marked on the second image to be classified and the registration relationship between the second image to be classified and the first image to be classified.
  • the target classification module is configured to:
  • the target classification module is configured to:
  • the at least one set of initial feature information is fused to obtain the final feature information.
  • the weight of each set of the initial feature information is determined during the training process of the classification model.
  • the preset dimension is one dimension.
  • the classification model uses an ArcFace loss function during the training process to determine the loss value of the classification model; and/or, the batch sample data selected for each training of the classification model is sample data, selected from the sample data set by a data generator, in which the numbers of different target types are in a preset proportion.
  • the image acquisition module is configured to:
  • the to-be-classified images containing the target object are respectively extracted from a plurality of original medical images.
  • the image acquisition module is configured to:
  • the image data in the to-be-extracted area is extracted from the original medical image to obtain the to-be-classified image.
  • the image acquisition module is configured to:
  • if the initial area of the target object is not marked in the first original medical image, the initial area of the target object on the first original medical image is determined using the initial area of the target object marked on the second original medical image and the registration relationship between the second original medical image and the first original medical image.
  • the original medical image and the image to be classified are two-dimensional images; or, the original medical image is a three-dimensional image, and the image to be classified is a two-dimensional image or a three-dimensional image.
  • the original medical image is a three-dimensional image
  • the image to be classified is a two-dimensional image obtained by extracting a layer where the target object has the largest area in the original medical image.
  • An embodiment of the present disclosure also provides an electronic device, including a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory, so as to implement the image object classification method provided in any of the preceding embodiments.
  • The embodiments of the present disclosure also provide a computer-readable storage medium storing program instructions which, when executed by a processor, implement the image object classification method provided in any of the preceding embodiments.
  • An embodiment of the present disclosure also provides a computer program including computer-readable code; when the computer-readable code runs in an electronic device, a processor of the electronic device executes the image object classification method described in any of the preceding embodiments.
  • an image target classification method based on artificial intelligence technology is proposed to achieve intelligent target classification. Because a classification model is used to classify the images to be classified, the target classification process is simpler, dependence on doctors is reduced, and the speed and accuracy of target classification are improved; combining artificial intelligence technology to achieve target classification can also assist doctors in intelligent disease diagnosis and treatment.
  • FIG. 1 is a schematic flowchart of an image target classification method provided by an embodiment of the present disclosure
  • FIG. 2 is a schematic diagram of a system architecture to which the image object classification method according to the embodiment of the present disclosure can be applied;
  • FIG. 3 is a schematic flowchart of obtaining at least one image to be classified according to an embodiment of the present disclosure
  • FIG. 4 is a schematic flowchart of a target classification for at least one image to be classified according to an embodiment of the present disclosure
  • FIG. 5 is a schematic diagram of a network architecture used by a classification model in the image target classification method according to an embodiment of the present disclosure
  • FIG. 6 is a schematic diagram of a framework of an image object classification apparatus 60 provided by an embodiment of the present disclosure.
  • FIG. 7 is a schematic frame diagram of an electronic device 70 provided by an embodiment of the present disclosure.
  • FIG. 8 is a schematic diagram of a framework of a computer-readable storage medium 80 provided by an embodiment of the present disclosure.
  • three-dimensional imaging technology based on CT and MR plays a crucial role in medical imaging diagnosis, and is one of the main imaging examination methods for diagnosing, for example, liver diseases.
  • the scanning sequence of a CT examination mainly includes the plain (non-contrast) scan phase and the dynamic-enhancement phases, namely the arterial phase, the portal venous phase and the delayed phase.
  • the plain scan phase is generally used to observe changes in the liver surface and whether diseases such as fatty liver, liver fibrosis or cirrhosis are present.
  • phase images with dynamic enhancement can show the specific image features of the lesion.
  • HCC (Hepatocellular Carcinoma)
  • HCC mainly occurs in patients with chronic liver disease and liver cirrhosis, and the corresponding changes in liver surface morphology can be observed in the plain scan phase, where the tumor may appear at low density or at the same density as the liver parenchyma. After enhanced scanning, HCC shows in each phase: marked or inhomogeneous enhancement in the arterial phase, accompanied by a low-density capsule; contrast-agent washout in the portal venous phase together with an enhanced capsule; and a delayed enhanced capsule in the delayed phase. Therefore, in a feasible implementation, whether the target tumor is HCC can be determined by identifying the imaging features exhibited by the images across multiple phases. Compared with a single-phase image, the judgment accuracy is higher, because in the plain and arterial phases the image characteristics of small liver metastases with rich blood supply are similar to those of small HCC; treating this as a multi-phase classification task can further improve the accuracy of image classification.
  • medical image analysis generally has problems such as less labeled data, complex and difficult tasks, and at the same time, in order to better characterize lesions, it is necessary to analyze the correlation between sequences.
  • the existence of these problems limits the complexity and depth of deep learning networks to a certain extent, and some other strategies need to be introduced to solve the task of medical image analysis.
  • the image features of the tumor itself are the main basis for judging its type, yet various kinds of noise around the target tumor may mislead the deep learning network into learning wrong features;
  • liver tumors vary greatly in size, so the network needs to take the characteristics of liver tumors into account and improve the identification of small tumors while ensuring high-precision classification and identification of large tumors;
  • limited by the resolution of CT scan images, the imaging features of liver tumors are not necessarily obvious.
  • the present disclosure provides at least one image target classification method, which uses a classification model to classify images to be classified, which not only makes the target classification process simpler, reduces the dependence on doctors, and improves the speed and accuracy of target classification, And combined with artificial intelligence technology to achieve target classification, in order to assist doctors in intelligent disease diagnosis and treatment.
  • FIG. 1 is a schematic flowchart of an image object classification method provided by an embodiment of the present disclosure. Specifically, the following steps can be included:
  • Step S11: Acquire at least one image to be classified that includes the target object.
  • At least one image to be classified is a medical image belonging to at least one scanned image category.
  • the images to be classified may be medical images, including but not limited to CT images and MR images, which are not limited herein.
  • the images to be classified may all be CT images, may all be MR images, and may also be partly CT images and partly MR images, which are not specifically limited herein.
  • CT images and MR images are multi-phase or multi-sequence images; each phase image or sequence shows different image information of the area where the target object is located or of other areas, and when effectively combined, they allow the nature of the lesions to be defined more precisely.
  • the images to be classified may be obtained by scanning the abdomen, chest and other regions.
  • the image to be classified obtained by scanning the abdomen may include tissues and organs such as liver, spleen, and kidney
  • the image to be classified obtained by scanning the chest may include tissues and organs such as the heart and lung.
  • the images to be classified may be images scanned according to the actual application, which is not limited here.
  • the target object may be, but is not limited to, a liver tumor and other objects that need to be classified using the image object classification method of the embodiment of the present disclosure.
  • the at least one image to be classified may be a medical image belonging to at least one category of scanned images. Medical images of different scanned image categories can be used to display different characteristic information of target objects, thus improving the accuracy of image target classification.
  • the scanned image category may also be referred to as the above-described phase and/or sequence.
  • the images of the different scanned image categories may be timing-dependent and/or scan-parameter-dependent images.
  • the scan image category may include the time-series-related pre-contrast scan, early arterial phase, late arterial phase, portal venous phase and delayed phase, etc.; alternatively, the scan image category may also include T1-weighted out-of-phase imaging, T1-weighted in-phase imaging, T2-weighted imaging, diffusion-weighted imaging and apparent diffusion coefficient imaging, etc.
  • the early arterial stage can indicate that the portal vein has not been enhanced
  • the late arterial stage can indicate that the portal vein has been enhanced
  • the portal venous phase can indicate that the portal vein has been sufficiently enhanced and the liver blood vessels have been enhanced by forward blood flow.
  • the delayed phase can indicate that the portal vein and arteries are in an enhanced state weaker than in the portal venous phase, and that the liver parenchyma is in an enhanced state weaker than in the portal venous phase.
  • other scanning image categories will not be listed one by one here.
  • Step S12: Using the classification model, perform target classification on at least one image to be classified to obtain the type of the target object.
  • the classification model is used to classify the at least one image to be classified, so as to obtain the type of the target object.
  • the classification model performs target classification on at least one image to be classified, obtains probabilities that the target objects belong to different types, and uses the types that satisfy the preset probability conditions as the types of the target objects.
  • the preset probability conditions include but are not limited to the maximum probability value and the like.
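A minimal sketch of the "maximum probability" condition: convert classification logits to per-type probabilities with a softmax and pick the most probable type. The type names below are placeholders, not classes named by the disclosure:

```python
import math

def predict_type(logits, type_names):
    """Turn classification outputs into probabilities per target type
    and select the type satisfying the preset condition (here: the
    maximum probability value)."""
    exps = [math.exp(v) for v in logits]
    total = sum(exps)
    probs = [e / total for e in exps]            # softmax over types
    best = max(range(len(probs)), key=probs.__getitem__)
    return type_names[best], probs
```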
  • the probability that the target objects belong to different types can be obtained by training the classification model.
  • the batch sample data selected for each training of the classification model is sample data, selected from the sample data set by the data generator, in which the numbers of different target types are in a preset proportion. Since the data generator randomly selects sample data containing equal proportions of different target types as batch sample data, unbalanced classification performance caused by too few samples of a certain target type is avoided; and because the classification model used to perform target classification on the at least one image to be classified is trained on a large number of such batches, the classification performance of the classification model can be improved.
  • Using the classification model to obtain the type of the target object can assist the doctor in determining the type of the target object, save the doctor's time for reviewing the images to be classified, and thus can speed up the output of the report.
  • when target classification is performed on at least one image to be classified to obtain the type of the target object, several layers of feature extraction are performed on the at least one image to be classified, correspondingly obtaining several sets of initial feature information; final feature information is obtained based on at least one set of the initial feature information; and the final feature information is classified to obtain the type of the target object.
  • the number of layers for feature extraction may be one layer, two layers or even more layers.
  • which layers to perform feature extraction on can be obtained through artificial settings, or can be determined through a large number of experiments when training a classification model, which is not specifically limited here.
  • a layer of feature extraction is performed on at least one image to be classified, and a set of initial feature information is correspondingly obtained.
  • Multi-layer feature extraction is performed on at least one image to be classified, and multiple sets of initial feature information are correspondingly obtained, wherein the multi-layer feature extraction may be continuous or discontinuous.
  • the initial feature information may be a feature map of the target object, reflecting the feature information of the target object in the image to be classified.
  • the classification model is a deep learning network
  • the deep learning network may include an encoder or its variants, ResNet or its variants, a Visual Geometry Group network (VGG-16) or its variants, or other network model structures used for classification.
  • the classification model performs feature extraction on at least one image to be classified through a convolution layer, and different convolution layers correspond to different layers of feature extraction to obtain different groups of initial feature information.
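A schematic of tapping several (not necessarily consecutive) layers for initial feature information; the `stages` callables below merely stand in for convolution layers, since the actual network structure is left open by the disclosure:

```python
def extract_initial_features(image, stages, tap_layers):
    """Run stacked feature-extraction stages and keep the outputs of
    the tapped layers as the several groups of initial feature
    information.

    `stages` is a list of callables standing in for convolution
    layers; `tap_layers` holds the depths to tap, which need not be
    consecutive.
    """
    groups, x = [], image
    for depth, stage in enumerate(stages):
        x = stage(x)                 # one layer of feature extraction
        if depth in tap_layers:
            groups.append(x)         # one group of initial features
    return groups
```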
  • the classification model is used to classify the at least one image to be classified to obtain the type of the target object. An image target classification method based on artificial intelligence technology is thus proposed, which realizes intelligent target classification without manual target classification, reducing the dependence on manual work and improving the efficiency of target classification.
  • at least one image to be classified containing a liver tumor is acquired, and the classification model is used to perform target classification on it to obtain the type of the liver tumor; no manual classification of the images is required, and the classification model realizes the classification of liver tumors so that the doctor can obtain the tumor type.
  • FIG. 2 is a schematic diagram of a system architecture to which the image object classification method according to an embodiment of the present disclosure can be applied; as shown in FIG. 2 , the system architecture includes an image acquisition terminal 201 , a network 202 and an object classification terminal 203 .
  • the image acquisition terminal 201 and the target classification terminal 203 establish a communication connection through the network 202, and the image acquisition terminal 201 reports at least one image to be classified containing the target object to the target classification terminal 203 through the network 202, and the target classification The terminal 203 responds to the received at least one image to be classified, and uses the classification model to perform target classification on the at least one image to be classified to obtain the type of the target object.
  • the target classification terminal 203 uploads the type of the target object to the network 202 and sends it to the image acquisition terminal 201 through the network 202 .
  • the image acquisition terminal 201 may include an image acquisition device, and the target classification terminal 203 may include a vision processing device or a remote server with visual information processing capability.
  • Network 202 may employ wired or wireless connections.
  • the image acquisition terminal 201 can be connected to the visual processing device through a wired connection, such as data communication through a bus; when the target classification terminal 203 is a remote server, the image acquisition terminal 201 can perform data interaction with a remote server through a wireless network.
  • the image acquisition terminal 201 may be a vision processing device with an image acquisition module, which is specifically implemented as a host with a camera.
  • the image object classification method according to the embodiment of the present disclosure may be executed by the image acquisition terminal 201 , and the above-mentioned system architecture may not include the network 202 and the object classification terminal 203 .
  • FIG. 3 is a schematic flowchart of acquiring at least one image to be classified according to an embodiment of the present disclosure. Specifically, the following steps can be included:
  • Step S111 Resampling the original medical image to a preset resolution.
• the preset resolution can be customized, and a preset resolution can be set for each kind of target object, so that the resolution of the original medical images is unified to the resolution that best displays the target object.
  • Step S112 Adjust the pixel value range in the original medical image.
• the brightness and contrast of the original medical image are adjusted so that the target object is displayed more clearly.
  • the categories of the original medical images include, but are not limited to, CT images, MR images, and other images that can reflect the feature information of the target object, which are not limited herein.
• when the original medical image is a CT image, the original medical images can be unified to a preset window width and window level;
• when the original medical image is an MR image, the gray value corresponding to a preset ratio (for example, 99.9%) under the grayscale cumulative distribution function can be used as the clamp value for normalization preprocessing, so that the contrast of the MR image data is enhanced and the accuracy of subsequent image target classification is improved.
  • Step S113 Normalize the original medical image.
  • the raw medical images may be normalized.
  • the normalization process includes, but is not limited to, normalizing the intensity or pixel values of the original medical image to a preset range (eg, a range of 0 to 1).
• Step S114 Upon detecting a first original medical image on which the initial area of the target object is not marked, determine the initial area of the target object on the first original medical image by using the initial area of the target object marked on a second original medical image and the registration relationship between the second original medical image and the first original medical image.
• not all original medical images may be marked with the initial area of the target object. Therefore, in order to use more images to be classified containing the target object and improve the accuracy of image target classification, the missing initial areas can be filled in: after detecting a first original medical image on which the initial area of the target object is not marked, the initial area of the target object marked on a second original medical image, together with the registration relationship between the second and first original medical images, is used to determine the initial area of the target object on the first original medical image.
  • the above step of determining the initial area of the target object on the first original medical image may be performed by using a registration network.
• the image target classification method may include any subset of the above steps S111 to S114.
• the above steps S111 to S114 are only exemplary. In the disclosed embodiments, any number of these steps can be selected as required to preprocess the original medical image, which is not specifically limited herein.
• the original medical image can be preprocessed before the image to be classified is extracted from it, which unifies the image parameters and improves the quality of the images to be classified.
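The pixel-value adjustment and normalization of steps S112 and S113 can be sketched as follows. This is a minimal NumPy illustration, not the disclosed implementation: the CT window center/width values are illustrative defaults (the patent only states that a preset window width and level are applied), the 99.9% clamp ratio is taken from the example in the text, and resampling (step S111), which typically relies on an imaging library such as SimpleITK, is omitted.

```python
import numpy as np

def preprocess_ct(volume, window_center=50.0, window_width=350.0):
    """Step S112/S113 for CT: clip to a preset window, then normalize to [0, 1].

    The abdominal soft-tissue window used here is an assumed default."""
    lo = window_center - window_width / 2.0
    hi = window_center + window_width / 2.0
    clipped = np.clip(volume.astype(np.float32), lo, hi)
    return (clipped - lo) / (hi - lo)

def preprocess_mr(volume, clamp_ratio=0.999):
    """Step S112/S113 for MR: clamp at the gray value reached by a preset
    ratio (e.g. 99.9%) of the cumulative intensity distribution, then
    normalize to [0, 1] to enhance contrast."""
    v = volume.astype(np.float64)
    clamp = np.quantile(v, clamp_ratio)        # clamp value from the CDF
    v = np.clip(v, v.min(), clamp)
    return (v - v.min()) / max(clamp - v.min(), 1e-8)
```

In practice the same window/clamp parameters would be fixed per target object so that all original medical images are unified before extraction.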
  • the images to be classified including the target object can be extracted from the multiple original medical images respectively. For details, refer to steps S115 and S116 below.
  • Step S115 Determine the initial area of the target object in the original medical image, and expand the initial area according to a preset ratio to obtain the area to be extracted.
  • the characteristics of the target object itself are the main basis for judging its type, and there may be a variety of noise interference around the target object, which will mislead the classification of the target object.
• taking a liver tumor as the target object for example, a background of chronic liver disease or cirrhosis, other types of tumors, and blood vessels close to the liver tumor will all affect the classification accuracy of the target object.
• the expanded initial area is used as the area to be extracted, so that the area to be extracted contains the target object.
  • the initial area of the target object in the original medical image is determined.
  • the initial area may be expanded according to a preset ratio to obtain the area to be extracted.
  • the initial region is used to delineate the position of the target object in the original medical image.
  • an image segmentation technique can be used to determine the boundary contour of the target object in the original medical image, and mark the boundary contour to form an initial area.
  • Step S116 Extract the image data in the area to be extracted from the original medical image to obtain the image to be classified.
  • the image data is extracted from the original medical image by using the area to be extracted, and the obtained image to be classified includes the target object.
  • the original medical image can be a two-dimensional image or a three-dimensional image.
  • the image to be classified is a two-dimensional image.
  • the image to be classified may be a three-dimensional image, or the image to be classified may be a two-dimensional image.
• the two-dimensional image of the layer on which the target object has the largest area may be, but is not limited to, used as the image to be classified.
  • the original medical image is a three-dimensional image
• the image to be classified is a two-dimensional image obtained by extracting, from the original medical image, the layer on which the target object has its maximum area.
• on this layer the extraction range of the target object is largest and contains the most information about the target object, thereby improving the classification accuracy of the target object.
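Selecting the layer of maximum target area from a three-dimensional image can be sketched as follows (a NumPy illustration; `volume` and `mask` are assumed to be aligned 3-D arrays, with the binary mask marking the target object, and slicing along axis 0 is an assumption):

```python
import numpy as np

def max_area_slice(volume, mask, axis=0):
    """Return the 2-D slice of `volume` on which the target object
    (per `mask`) has its largest cross-sectional area, plus its index."""
    other_axes = tuple(i for i in range(mask.ndim) if i != axis)
    areas = mask.sum(axis=other_axes)          # per-layer target area
    k = int(np.argmax(areas))                  # layer of maximum area
    return np.take(volume, k, axis=axis), k
```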
• the initial area of the target object in the original medical image is determined, and the initial area is expanded according to a preset ratio to obtain the area to be extracted; then the image data in the area to be extracted is extracted from the original medical image to obtain the image to be classified.
• since the initial area is the area containing the target object, expanding it according to a preset ratio means the resulting area to be extracted contains both the target object and some background information around the target object.
• the image to be classified, extracted from this area, therefore includes the target object together with some background information.
• the images to be classified containing the target object are extracted from the multiple original medical images respectively, realizing the acquisition of the images to be classified.
• because the images to be classified are cropped from the original medical images, part of the background noise in the original medical images is avoided to a certain extent, which reduces the processing resource consumption of subsequent classification and improves classification performance.
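Steps S115 and S116 can be sketched as follows for the 2-D case (NumPy illustration; the bounding-box representation of the initial area and the 20% expansion ratio are assumptions, since the patent only specifies that a preset ratio is used):

```python
import numpy as np

def expand_and_crop(image, bbox, ratio=0.2):
    """Step S115/S116 sketch: expand the initial bounding box of the
    target object by `ratio` on each side, clamp to the image bounds,
    and crop the area to be extracted.

    bbox = (y0, y1, x0, x1), half-open pixel coordinates."""
    y0, y1, x0, x1 = bbox
    dy = int(round((y1 - y0) * ratio))
    dx = int(round((x1 - x0) * ratio))
    y0 = max(0, y0 - dy); y1 = min(image.shape[0], y1 + dy)
    x0 = max(0, x0 - dx); x1 = min(image.shape[1], x1 + dx)
    return image[y0:y1, x0:x1]   # image to be classified
```

The crop retains the target object plus a band of surrounding background proportional to the target's size, matching the rationale given above.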
  • FIG. 4 is a schematic flowchart of a target classification for at least one image to be classified according to an embodiment of the present disclosure. Specifically, the following steps can be included:
  • Step S121 extracting several layers of features on at least one image to be classified, and correspondingly obtaining several sets of initial feature information.
  • the size of each group of initial feature information is different.
  • the number of layers for feature extraction may be one layer, two layers or even more layers.
  • Feature extraction can be implemented by convolutional layers, and each convolutional layer performs feature extraction on at least one image to be classified to obtain initial feature information.
  • which layers to perform feature extraction on can be obtained through manual settings, or can be determined through a large number of experiments when training a classification model, which is not specifically limited here.
• in some embodiments, one layer of feature extraction is performed on the at least one image to be classified, and one set of initial feature information is correspondingly obtained; this layer can be any layer, for example but not limited to the last layer, and the initial feature information extracted there serves as the basis for subsequent target classification.
  • Multi-layer feature extraction is performed on at least one image to be classified, and multiple sets of initial feature information are correspondingly obtained, wherein the multi-layer feature extraction may be continuous or discontinuous.
  • the initial feature information may be a feature map of the target object, reflecting the feature information of the target object in the image to be classified.
  • the size of each set of initial feature information is different, wherein the size includes dimension and/or resolution, so that the multiple sets of initial feature information respectively reflect different feature information of the target object.
  • the classification model is a deep learning network
  • the included network model structure can be an encoder or its variant, Resnet or its variant, VGG16 or its variant, or other network model structures for classification.
  • the classification model performs feature extraction on at least one image to be classified through a convolution layer, and different convolution layers correspond to different layers of feature extraction to obtain different groups of initial feature information.
• for the target object in the image to be classified, there may be noise interference around the target object.
• taking a liver tumor as the target object for example, a background of chronic liver disease or cirrhosis, other types of tumors, and blood vessels close to the liver tumor will all affect classification accuracy. Therefore, before using the classification model to perform target classification on the at least one image to be classified, the final area of the target object can be obtained based on the initial area corresponding to the target object in the image to be classified.
  • an initial area can be used as the final area of the target object, or the final area of the target object can be obtained by combining the initial areas corresponding to the target object in at least one image to be classified.
  • the union of the initial regions corresponding to the target object in the image to be classified is regarded as the final region of the target object, which is not limited here.
  • the initial feature information of the image to be classified (such as the global features of the image to be classified, etc.) can be extracted.
• during feature extraction, the weight of the corresponding final area in the image to be classified is higher than the weight of other areas, which makes the classification model tend to extract richer, more detailed features from the final region, so that the initial feature information output by the classification model for the final region is as feature-rich as possible; and/or, the features corresponding to the final region in the initial feature information are richer than those of other regions.
• in this way the classification model is guided to pay more attention to the target object in the final area, so that it learns the feature information of the target object itself, reducing the influence of surrounding noise interference on target classification.
• a union of the initial regions corresponding to the target object in the at least one image to be classified is obtained as the final region of the target object; several layers of feature extraction can then be performed on the at least one image to be classified by using this final region, correspondingly obtaining several sets of initial feature information.
• since the final area of the target object is the union of its initial areas across the images to be classified, the final area is greater than or equal to any single initial area, ensuring that it covers the areas corresponding to the target object in the different images to be classified; therefore, during feature extraction, the feature information of the target object can be attended to as fully as possible.
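Forming the final region as the union of the initial regions can be sketched as follows (the per-image initial regions are assumed to be binary masks in a common, registered coordinate space):

```python
import numpy as np

def final_region(initial_masks):
    """Union of the initial regions of the target object across the
    images to be classified (e.g. one binary mask per scan phase)."""
    union = np.zeros_like(initial_masks[0], dtype=bool)
    for m in initial_masks:
        union |= m.astype(bool)
    return union
```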
  • the at least one image to be classified includes a first image to be classified without an initial area of the target object and a second image to be classified with an initial area of the target object marked;
• the classification model can also be used to detect that the first image to be classified is not marked with the initial area of the target object, and to determine the initial area of the target object on the first image to be classified based on the initial area of the target object marked on the second image to be classified and the registration relationship between the second image to be classified and the first image to be classified.
• in this way, for a first image to be classified whose initial area of the target object is not labeled, the classification model completes the labeling, so that all images to be classified include an initial area.
• a final area map containing the final area of the target object may be generated, and the final area map and the image to be classified may be input into the classification model, so that the classification model performs target classification on the at least one image to be classified.
• using the final region of the target object contained in the final region map when performing several layers of feature extraction on the at least one image to be classified can guide the network to pay more attention to learning the features of the final region, avoid the network learning a large amount of wrong feature information to a certain extent, and reduce the interference of noise around the target object on feature extraction. It can be understood that, before inputting the final area map and the image to be classified into the classification model, their sizes may be adjusted to a uniform size.
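One plausible way to feed the final area map together with the multi-phase images to be classified into the classification model is to stack them as channels of a single input; this channel layout is an assumption for illustration (the disclosure only states that both are input to the model after being adjusted to a uniform size):

```python
import numpy as np

def build_model_input(phase_images, final_region_map):
    """Stack the images to be classified with the final-region map as one
    multi-channel input, assuming all arrays share a common (H, W) size."""
    channels = [np.asarray(p, dtype=np.float32) for p in phase_images]
    channels.append(np.asarray(final_region_map, dtype=np.float32))
    return np.stack(channels, axis=0)   # shape (m + 1, H, W)
```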
  • Step S122 Obtain final feature information based on at least one set of initial feature information in several sets of initial feature information.
  • Several layers of feature extraction are performed on at least one image to be classified, and after several sets of initial feature information are obtained, the final feature information can be obtained based on at least one set of initial feature information in several sets of initial feature information, and the selected initial feature information The information is different, and the final feature information obtained is different.
  • the number of groups of initial feature information and the parameter information such as the convolution layer corresponding to the classification model may be manually set, or may be determined during the training process of the classification model, which is not limited here. Fusion of multiple sets of initial feature information can improve the performance of the classification model and the accuracy of target classification, but the fusion of too much initial feature information will cause overfitting problems.
  • each set of initial feature information is different in dimension and resolution, and reflects different feature information of the target object, at least one set of initial feature information can be fused to obtain the final feature information.
• if only the last-layer high-dimensional feature map is used as the final feature information, some important feature information may have been compressed away after multiple convolutions; in particular, target objects with small areas and blurred image features may be missed.
• therefore, the initial feature information obtained at different feature extraction stages can be spliced together to improve the accuracy of image target classification.
  • the weight of at least one set of initial feature information is used to fuse at least one set of initial feature information to obtain final feature information.
  • the weight of each set of initial feature information may be manually set, or may be determined during the training process of the classification model, which is not limited here. For example, first initialize the weight of each group of initial feature information, and continuously update the weight during the training process of the classification model. The above steps of updating weights are continuously repeated by using the training classification model, so that the training classification model continuously learns and updates the weight of each group of initial feature information, and obtains the trained classification model and the weight of each group of initial feature information.
• the weights of each set of initial feature information may be the same or different, and the sum of the weights of each set of initial feature information is 1.
  • the weight of the initial feature information for fusion is determined, so that the final feature information obtained by using the weight fusion can better reflect the characteristics of the target object and further improve the classification performance.
  • the weights of different groups of initial feature information may be the same or different, and the sum of the weights of each group of initial feature information is 1.
• the weights of the initial feature information can be used to fuse the initial feature information of different sizes extracted at the at least one layer, so that the final feature information also takes the smaller-size initial feature information into account.
  • a feature fusion network can be used to obtain final feature information based on at least one set of initial feature information in several sets of initial feature information, and initial feature information of multiple sizes can be spliced together as the final feature of the classification task.
• each piece of initial feature information is given a weight; after initialization, the weight is continuously updated during model training, so that multiple pieces of initial feature information are fused into a better feature representation of the target object, thereby improving target classification performance.
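The weighted fusion of several sets of one-dimensional initial feature information can be sketched as follows. The softmax normalization that keeps the learnable weights positive and summing to 1 is one plausible choice — the disclosure only states that each feature has a learnable weight coefficient and that the weights sum to 1:

```python
import numpy as np

def fuse_features(features, raw_weights):
    """Scale each one-dimensional feature vector by its normalized weight
    and splice (concatenate) them into the final feature information.

    features:    list of 1-D arrays (possibly different lengths)
    raw_weights: unnormalized learnable coefficients, one per feature."""
    w = np.exp(raw_weights - np.max(raw_weights))   # softmax normalization
    w = w / w.sum()                                  # weights now sum to 1
    fused = np.concatenate([wi * f for wi, f in zip(w, features)])
    return fused, w
```

During training, `raw_weights` would be updated by backpropagation along with the rest of the classification model.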
  • each set of initial feature information may be converted into a preset dimension to facilitate subsequent acquisition of final feature information.
  • a feature extraction network is used to convert each set of initial feature information into a preset dimension.
  • the preset dimension can be set as required, for example, but not limited to, the preset dimension is one dimension.
  • Step S123 Classify the final feature information to obtain the type of the target object.
  • the final feature information carries the features of the target object, so that the final feature information is classified to obtain the type of the target object.
• when determining the type of the target object, the classification model performs, including but not limited to, target classification on the at least one image to be classified to obtain the probabilities that the target object belongs to different types, and the type satisfying a preset probability condition is taken as the type of the target object.
  • the preset probability conditions include but are not limited to the maximum probability value and the like.
• during training, the classification model uses the ArcFace loss function to determine the loss value; the ArcFace loss function pulls target objects of the same type closer together and pushes target objects of different types further apart, thereby improving the ability to classify easily confused target objects.
  • the ArcFace loss function is simple and easy to use, and can be well applied to the network structure of the classification model without being combined with other loss functions. At the same time, the overfitting problem is reduced to a certain extent, thereby improving the classification performance of the target object.
• the training output of the classification model can be the cosine of the angle between the weights of the first fully connected layer and the features entering the first fully connected layer.
• the dot product between the features entering the first fully connected layer of the classification model and the weights of that layer equals the cosine distance between the normalized features and normalized weights, so that the arc-cosine function can be used to compute the angle between them.
• taking a liver tumor as the target object for example, the feature information of the liver tumor itself is the main basis for judging its type, but liver tumors vary greatly in size, ranging from less than 0.5 cm to more than 20 cm, and there are additional influencing factors outside the target object, such as the low resolution of the image to be classified, other types of tumors around the liver tumor, blood vessels with characteristics similar to the target object, and a background of chronic liver disease or liver cirrhosis.
• in this setting the ArcFace loss function can learn a better feature representation of liver tumors, aggregating tumors of the same type and separating tumors of different types, and can effectively improve tumor classification performance.
  • the effect of using the ArcFace loss function to determine the loss value of the classification model in the training process of the classification model is similar, and no examples will be given here.
  • the ArcFace loss function is a loss function that uses margin to expand the distance between different classes.
  • the predicted value is the cosine of the angle between the weight of the first fully connected layer and the feature entering the first fully connected layer.
• the principle and operation process are as follows: first, the dot product between the features entering the first fully connected layer and the weights of that layer equals the cosine distance between the normalized features and normalized weights; second, the arc-cosine function is used to calculate the angle between the normalized features and normalized weights; then, an additive angular margin is added to the target angle, and the logit of the target is obtained again through the cosine function; finally, all logits are rescaled by a fixed feature norm, and the subsequent steps are exactly the same as in the softmax loss.
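The ArcFace logit computation described above can be sketched for a single sample as follows (NumPy illustration; the margin of 0.5 and scale of 64 follow the common defaults of the ArcFace method and are assumptions here, not values stated in the disclosure):

```python
import numpy as np

def arcface_logits(feature, fc_weights, target, margin=0.5, scale=64.0):
    """ArcFace logits for one sample.

    feature:    (d,) feature entering the first fully connected layer
    fc_weights: (num_classes, d) weights of that layer
    target:     index of the ground-truth class."""
    f = feature / np.linalg.norm(feature)
    w = fc_weights / np.linalg.norm(fc_weights, axis=1, keepdims=True)
    cos = w @ f                                   # normalized dot product = cos(theta)
    theta = np.arccos(np.clip(cos, -1.0, 1.0))    # arc-cosine of the angle
    logits = cos.copy()
    logits[target] = np.cos(theta[target] + margin)  # additive angular margin
    return scale * logits                         # rescale by fixed feature norm
```

The rescaled logits would then go through softmax and cross-entropy exactly as in the standard softmax loss.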
• the writing order of the steps does not imply a strict execution order nor constitute any limitation on the implementation process; the specific execution order of each step should be determined by its function and possible internal logic.
• at least one image to be classified is preprocessed, and the corresponding two-dimensional or three-dimensional multi-phase tumor sub-image blocks are extracted, that is, multi-phase tumor patch images; these, together with the corresponding mask patch image, are fed into a deep learning classification network.
• FIG. 5 is a schematic diagram of the network architecture used by the classification model in the image target classification method according to the embodiment of the present disclosure; each batch of data input to the classification model 501 randomly includes equal proportions of different tumor types, comprising phase image 1, phase image 2, ..., phase image m, and the union of the lesion masks of the multi-phase images.
  • 502 is the CNN backbone network, that is, CNN backbone, which can be the encoder of U-Net or its variant, Resnet or its variant, VGG16 or its variant, or other CNN structures for classification;
• 503 is the Feature Block, which includes adaptive average pooling, FC and ReLU; the previously obtained feature map is subjected to adaptive average pooling, a fully connected layer and ReLU activation to obtain a one-dimensional feature; correspondingly, each Feature Block outputs one feature (feature_1, feature_2, ...).
• 504 is Feature Fusion, a feature fusion layer.
• Feature Fusion splices multiple one-dimensional features, each with a corresponding learnable weight coefficient: weight coefficient_1 of feature_1, weight coefficient_2 of feature_2, ..., weight coefficient_n of feature_n.
• which feature maps from the convolutional layers of the CNN backbone enter the feature block and feature fusion layer can be determined by experiment during training. In experiments with this scheme it was found that introducing feature fusion layers can improve model performance; however, fusing too many feature maps causes over-fitting, especially when the feature maps of the earlier convolutional layers are fused.
  • 505 is a fully connected layer (Fully Connected), that is, the fused features are sent to FC, and converted into classification probability values of each tumor category through softmax.
  • 506 is the predicted probability value.
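The Feature Block (503) and the softmax head (505/506) described above can be sketched as follows (NumPy illustration; the layer shapes and weights are illustrative placeholders, not the disclosed architecture):

```python
import numpy as np

def feature_block(feature_map, fc_w, fc_b):
    """503: adaptive average pooling to 1x1, a fully connected layer,
    then ReLU, turning a (C, H, W) feature map into a 1-D feature."""
    pooled = feature_map.mean(axis=(1, 2))   # adaptive average pool -> (C,)
    return np.maximum(fc_w @ pooled + fc_b, 0.0)

def softmax(logits):
    """505/506: convert the output of the final FC into classification
    probability values for each tumor category."""
    e = np.exp(logits - np.max(logits))
    return e / e.sum()
```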
  • FIG. 6 is a schematic frame diagram of an image object classification apparatus 60 provided by an embodiment of the present disclosure.
  • the image object classification device 60 includes an image acquisition module 61 and an object classification module 62 .
  • the image acquisition module 61 is configured to: acquire at least one image to be classified including the target object, wherein at least one image to be classified is a medical image belonging to at least one scanned image category;
• the target classification module 62 is configured to: use the classification model to perform target classification on the at least one image to be classified to obtain the type of the target object.
• the target classification module 62 is configured to: perform several layers of feature extraction on the at least one image to be classified to correspondingly obtain several sets of initial feature information, wherein the size of each set of initial feature information is different; obtain final feature information based on at least one of the several sets of initial feature information; and classify the final feature information to obtain the type of the target object.
  • the target classification module 62 is configured to: obtain the final area of the target object based on the initial area corresponding to the target object in the image to be classified; correspondingly, the target classification module 62 is configured to: use the final area to At least one image to be classified is subjected to several layers of feature extraction, corresponding to several sets of initial feature information; wherein, in the feature extraction process, the weight of the corresponding final area in the image to be classified is higher than the weight of other areas in the image to be classified; and/ Or, the features corresponding to the final region in the initial feature information are more abundant than the features of other regions.
  • the target classification module 62 is configured to obtain a union of initial regions corresponding to the target object in at least one image to be classified, as the final region of the target object.
  • the target classification module 62 is configured to: use the classification model to detect the initial area of the first image to be classified that is not marked with the target object, and based on the initial area of the target object marked on the second image to be classified and the registration relationship between the second to-be-classified image and the first to-be-classified image to determine the initial area of the target object on the first to-be-classified image.
  • the target classification module 62 is configured to: convert each set of initial feature information into a preset dimension; and/or, the target classification module 62 is configured to: use the weight of at least one set of initial feature information to classify At least one set of initial feature information is fused to obtain final feature information.
  • the weight of each set of initial feature information is determined during the training process of the classification model.
  • the preset dimension is one dimension.
• the classification model adopts the ArcFace loss function during training to determine the loss value of the classification model; and/or, the batch sample data selected for each training iteration of the classification model is selected from the sample data set using a data generator, with the numbers of sample data of different target types in a preset ratio.
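The data generator that selects batch sample data with a preset ratio of target types can be sketched as follows (here an equal number per type, matching the equal-proportion batches described for FIG. 5; the helper names and fixed seed are illustrative):

```python
import random

def balanced_batch(samples_by_type, per_type=2, seed=0):
    """Draw the same (preset-ratio) number of samples of each target
    type from the sample data set to form one training batch."""
    rng = random.Random(seed)
    batch = []
    for target_type, samples in samples_by_type.items():
        batch.extend((target_type, s) for s in rng.sample(samples, per_type))
    rng.shuffle(batch)   # mix types within the batch
    return batch
```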
  • the image acquisition module 61 is configured to: extract images to be classified including the target object from a plurality of original medical images, respectively.
  • the image acquisition module 61 is configured to: determine the initial area of the target object in the original medical image, expand the initial area according to a preset ratio to obtain the area to be extracted; extract the area to be extracted from the original medical image image data to obtain the image to be classified.
• the image acquisition module 61 is configured to: resample the original medical image to a preset resolution; adjust the range of pixel values in the original medical image; normalize the original medical image; and, upon detecting a first original medical image not marked with the initial area of the target object, determine the initial area of the target object on the first original medical image by using the initial area of the target object marked on a second original medical image and the registration relationship between the second and first original medical images.
  • the original medical image and the image to be classified are two-dimensional images; or, the original medical image is a three-dimensional image, and the image to be classified is a two-dimensional image or a three-dimensional image.
  • the object classification module 62 uses the classification model to perform object classification on the at least one image to be classified to obtain the type of the target object. An image object classification method based on artificial intelligence technology is therefore proposed to achieve intelligent object classification. Because a classification model classifies the images to be classified, the classification process is simpler, the dependence on doctors is reduced, and classification is faster; combining artificial intelligence technology with object classification also assists doctors in intelligent disease diagnosis and treatment.
  • FIG. 7 is a schematic diagram of a framework of an embodiment of an electronic device 70 of the present disclosure.
  • the electronic device 70 includes a memory 71 and a processor 72 coupled to each other, and the processor 72 is configured to execute program instructions stored in the memory 71 to implement the steps of any of the image object classification method embodiments described above.
  • the electronic device 70 may include, but is not limited to, a microcomputer and a server.
  • the electronic device 70 may also include mobile devices such as a notebook computer and a tablet computer, which are not limited herein.
  • the processor 72 is configured to control itself and the memory 71 to implement the steps of any of the image object classification method embodiments described above.
  • the processor 72 may also be referred to as a central processing unit (Central Processing Unit, CPU).
  • the processor 72 may be an integrated circuit chip with signal processing capability.
  • the processor 72 may also be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
  • a general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • the processor 72 may also be implemented jointly by a plurality of integrated circuit chips.
  • after acquiring at least one image to be classified containing a target object, the electronic device 70 uses a classification model to classify the at least one image to be classified and obtains the type of the target object. An image object classification method based on artificial intelligence technology is therefore proposed to achieve intelligent object classification. Because a classification model classifies the images to be classified, the classification process is simpler, the dependence on doctors is reduced, and classification is faster; combining artificial intelligence technology with object classification also assists doctors in intelligent disease diagnosis and treatment.
  • FIG. 8 is a schematic diagram of a framework of an embodiment of a computer-readable storage medium 80 of the present disclosure.
  • the computer-readable storage medium 80 stores program instructions 801 that can be executed by the processor, and the program instructions 801 are used to implement the steps of any of the foregoing image object classification method embodiments.
  • the classification model is used to classify the at least one image to be classified to obtain the type of the target object, so the image object classification method based on artificial intelligence technology realizes intelligent object classification. Because a classification model classifies the images to be classified, the classification process is simpler, the dependence on doctors is reduced, and classification is faster; combining artificial intelligence technology with object classification also assists doctors in intelligent disease diagnosis and treatment.
  • An embodiment of the present disclosure further provides a computer program that includes computer-readable code; when the computer-readable code runs in an electronic device, the processor of the electronic device executes it to implement the image object classification method of any of the foregoing embodiments. The method may be implemented by hardware, software or a combination thereof.
  • the computer program product may be embodied as a computer storage medium; in some embodiments of the present disclosure, the computer program product is embodied as a software product, such as a software development kit (Software Development Kit, SDK).
  • the functions or modules included in the image object classification apparatus provided in the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments; for their specific implementation, refer to the above method embodiments, and for brevity the details are not repeated here.
  • the disclosed method and apparatus may be implemented in other manners.
  • the device implementations described above are only illustrative.
  • the division into modules or units is only a division by logical function; in actual implementation there may be other divisions, for example, units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the shown or discussed mutual coupling or direct coupling or communication connection may be through some interfaces, indirect coupling or communication connection of devices or units, which may be in electrical, mechanical or other forms.
  • each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
  • the integrated unit if implemented as a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium.
  • the technical solutions of the present disclosure may be embodied, in essence or in the part contributing to the prior art, or in whole or in part, in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the various embodiments of the present disclosure.
  • the aforementioned storage medium includes media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk or an optical disc.
  • the present disclosure provides an image object classification method, apparatus, device, storage medium and program, wherein the image object classification method includes: acquiring at least one image to be classified including a target object, wherein the at least one image to be classified is a medical image belonging to at least one type of scan image; and performing object classification on the at least one image to be classified by using a classification model to obtain the type of the target object.

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed in embodiments are an image object classification method and apparatus, a device, a storage medium and a program. The image object classification method comprises: acquiring at least one image to be classified that includes a target object, wherein the at least one image is a medical image belonging to at least one type of scan image; and performing object classification on the at least one image by using a classification model to obtain the type of the target object. The described solution can be applied to a medical image including at least one phase of a tumor, so as to determine the type of the tumor in the medical image; that is, it can realize intelligent object classification and improve object classification efficiency.

Description

Image object classification method, apparatus, device, storage medium and program
CROSS-REFERENCE TO RELATED APPLICATIONS
The present disclosure is based on, and claims priority to, Chinese patent application No. 202011212261.1 filed on November 3, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the technical field of image processing, and in particular to an image object classification method, apparatus, device, storage medium and program.
Background
Medical images such as computed tomography (CT) and magnetic resonance (MR) images are of great clinical significance. Taking liver-related clinical practice as an example, the scan image categories often include the time-related pre-contrast scan, early arterial phase, late arterial phase, portal venous phase and delayed phase; in addition, the scan image categories may also include categories related to scan parameters, such as T1-weighted out-of-phase imaging, T1-weighted in-phase imaging, T2-weighted imaging, diffusion-weighted imaging and apparent diffusion coefficient (ADC) imaging. Identifying these medical images helps clinicians understand the disease.
In the related art, during disease diagnosis and treatment, a doctor usually needs to repeatedly examine the signs of a target object such as a tumor on medical images. This makes determining the type of the tumor overly dependent on the doctor's professional expertise, and the determination is inefficient.
Summary of the Invention
The present disclosure provides at least an image object classification method, apparatus, device, storage medium and program.
An embodiment of the present disclosure provides an image object classification method, including:
acquiring at least one image to be classified containing a target object, where the at least one image to be classified is a medical image belonging to at least one scan image category; and
performing object classification on the at least one image to be classified by using a classification model to obtain the type of the target object. In this way, after at least one image to be classified containing the target object is acquired, the classification model classifies it to obtain the type of the target object. Because a classification model performs the classification, intelligent object classification is achieved without manual classification, which reduces the dependence on manual work and improves classification efficiency.
In some embodiments of the present disclosure, performing object classification on the at least one image to be classified to obtain the type of the target object includes:
performing several layers of feature extraction on the at least one image to be classified to obtain several corresponding groups of initial feature information, where each group of initial feature information has a different size;
obtaining final feature information based on at least one of the several groups of initial feature information; and
classifying the final feature information to obtain the type of the target object.
In this way, initial feature information is obtained through feature extraction, final feature information is derived from it, and the final feature information is then classified to obtain the type of the target object, so object classification is performed using the target object's feature information.
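As a rough illustration of the idea that several layers of feature extraction yield groups of initial feature information of different sizes, the toy sketch below stands in for a real feature extractor. In an actual classification model the layers would be learned (e.g. convolutional) layers; the 2x2 average pooling here is purely an assumption for demonstration.

```python
import numpy as np

def extract_initial_features(image, num_layers=3):
    """Toy stand-in for multi-layer feature extraction: each "layer" halves
    the spatial size via 2x2 average pooling, so every group of initial
    feature information has a different size, as in the disclosure."""
    features = []
    current = image.astype(np.float64)
    for _ in range(num_layers):
        h = current.shape[0] // 2 * 2  # crop to even dimensions
        w = current.shape[1] // 2 * 2
        current = current[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
        features.append(current)
    return features
```

For a 16x16 input this produces three groups of sizes 8x8, 4x4 and 2x2; a downstream step would then derive the final feature information from one or more of these groups.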
In some embodiments of the present disclosure, before performing object classification on the at least one image to be classified to obtain the type of the target object, the method further includes:
obtaining a final region of the target object based on the initial region corresponding to the target object in the image to be classified.
Correspondingly, performing several layers of feature extraction on the at least one image to be classified to obtain several corresponding groups of initial feature information includes:
performing several layers of feature extraction on the at least one image to be classified by using the final region to obtain several corresponding groups of initial feature information, where, during feature extraction, the weight of the part of the image to be classified corresponding to the final region is higher than the weights of other regions of the image; and/or, in the initial feature information, the features corresponding to the final region are richer than those of other regions.
In this way, when the final region is used for feature extraction, the part of the image to be classified corresponding to the final region is weighted more heavily than other regions, so the classification model tends to extract more detailed features for the final region; and/or the features corresponding to the final region in the initial feature information are richer than those of other regions. The classification model can thus better learn the feature information of the target object itself from the initial feature information of the image to be classified, reducing, to some extent, the influence of noise around the target object on classification.
In some embodiments of the present disclosure, obtaining the final region of the target object based on the initial region corresponding to the target object in the image to be classified includes:
obtaining the union of the initial regions corresponding to the target object in the at least one image to be classified as the final region of the target object.
In this way, when the final region of the target object is the union of the target object's initial regions in the images to be classified, the final region is at least as large as any single initial region, which ensures that it covers the regions corresponding to the target object in the different images to be classified, so that feature extraction can focus as much as possible on the target object's feature information.
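The union of initial regions described above is straightforward when the regions are represented as binary masks of the same shape; a minimal sketch, assuming that mask representation:

```python
import numpy as np

def final_region(initial_masks):
    """Union of the target object's initial regions across the images to be
    classified; the result is at least as large as any single mask, so it
    covers the target object's region in every image."""
    union = np.zeros_like(initial_masks[0], dtype=bool)
    for m in initial_masks:
        union |= m.astype(bool)
    return union
```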
In some embodiments of the present disclosure, the at least one image to be classified includes a first image to be classified on which the initial region of the target object is not annotated and a second image to be classified on which the initial region of the target object is annotated. Before the final region of the target object is obtained based on the initial region corresponding to the target object in the image to be classified, the method further includes:
detecting, by using the classification model, that the first image to be classified is not annotated with the initial region of the target object, and determining the initial region of the target object on the first image to be classified based on the initial region of the target object annotated on the second image to be classified and the registration relationship between the second image to be classified and the first image to be classified.
In this way, the classification model can determine the initial region of the target object for a first image to be classified on which the initial region is not annotated, thereby completing the annotations so that every image to be classified includes an initial region.
In some embodiments of the present disclosure, before obtaining the final feature information based on at least one of the several groups of initial feature information, the method further includes:
converting each group of initial feature information into a preset dimension;
and/or, obtaining the final feature information based on at least one of the several groups of initial feature information includes:
fusing the at least one group of initial feature information by using the weight of the at least one group of initial feature information to obtain the final feature information.
In this way, uniformly converting each group of initial feature information into a preset dimension facilitates the subsequent derivation of the final feature information. In addition, since each group of initial feature information reflects features of the target object, the weights of at least one group of initial feature information can be used to fuse the differently sized initial feature information extracted at the various layers into the final feature information. Considering that small-sized initial feature information may have had important features compressed away, combining feature information of different sizes yields more comprehensive and useful final feature information, thereby improving subsequent classification performance.
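The two steps above (converting each group to a preset one-dimensional length, then weighted fusion) can be sketched as follows. The linear resampling of each flattened group to `dim` values is a hypothetical conversion chosen only for illustration, and the weights passed in are example constants; per the disclosure, the actual weights would be learned during training.

```python
import numpy as np

def fuse_features(feature_groups, weights, dim=16):
    """Convert each group of initial feature information to a preset
    one-dimensional length `dim`, then fuse the groups as a weighted sum."""
    fused = np.zeros(dim)
    for feat, w in zip(feature_groups, weights):
        flat = np.asarray(feat, dtype=np.float64).ravel()
        # hypothetical 1-D conversion: linearly resample to `dim` values
        idx = np.linspace(0, flat.size - 1, dim)
        fused += w * np.interp(idx, np.arange(flat.size), flat)
    return fused
```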
In some embodiments of the present disclosure, the weight of each group of initial feature information is determined during the training of the classification model.
In this way, the weights used for fusing the initial feature information are determined through iterative training of the classification model, so that the final feature information obtained with these weights better reflects the target object's features, further improving classification performance.
In some embodiments of the present disclosure, the preset dimension is one dimension.
In this way, each group of initial feature information can be converted into one dimension, which unifies the data and facilitates subsequent fusion.
In some embodiments of the present disclosure, the classification model uses the ArcFace loss function during training to determine the loss value of the classification model; and/or, the batch of sample data selected for each training iteration of the classification model is sample data selected from the sample data set by a data generator such that the numbers of samples of different target types are in a preset ratio.
In this way, using the ArcFace loss function to determine the loss value of the classification model draws the feature information of target objects of the same type together and pushes the feature information of target objects of different types apart, thereby improving classification performance. In addition, the data generator selects sample data from the sample data set so that each batch contains different target types in a preset ratio, making the target types of the batches used to train the classification model more balanced.
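The ArcFace loss mentioned above can be sketched for a single sample as below: an additive angular margin is applied to the angle between the embedding and its target class center before the softmax cross-entropy, which is what pulls same-type features together and pushes different-type features apart. The scale `s` and margin `m` defaults follow commonly published ArcFace values, not values fixed by the disclosure, and `class_centers` stands for a hypothetical learned classifier weight matrix.

```python
import numpy as np

def arcface_loss(embedding, class_centers, label, s=8.0, m=0.5):
    """ArcFace loss for one sample: cosine logits, with an additive angular
    margin m applied to the target class, scaled by s before softmax."""
    e = embedding / np.linalg.norm(embedding)
    w = class_centers / np.linalg.norm(class_centers, axis=1, keepdims=True)
    cos = w @ e                                   # cosine to each class center
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    logits = s * cos
    logits[label] = s * np.cos(theta[label] + m)  # add margin to target class
    logits = logits - logits.max()                # numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    return float(-np.log(p[label]))
```

With the margin enabled, the loss for a correctly aligned sample is strictly larger than without it, which is exactly the extra pressure that tightens intra-class clusters.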
In some embodiments of the present disclosure, acquiring the at least one image to be classified containing the target object includes:
extracting the images to be classified containing the target object from a plurality of original medical images, respectively.
In this way, the images to be classified are obtained by extraction from the original medical images. Compared with using the original medical images directly, this reduces the size of the images that are subsequently classified and, to some extent, avoids some of the background noise in the original medical images, thereby reducing the processing resources consumed by subsequent classification and improving classification performance.
In some embodiments of the present disclosure, extracting the images to be classified containing the target object from the plurality of original medical images includes:
determining the initial region of the target object in an original medical image, and expanding the initial region according to a preset ratio to obtain a region to be extracted; and
extracting the image data in the region to be extracted from the original medical image to obtain the image to be classified.
In this way, since the initial region is the region containing the target object, expanding it by a preset ratio makes the resulting region to be extracted contain both the target object and some of the background information around it, so that once the image data in that region is extracted as the image to be classified, the image to be classified covers the target object and part of the background.
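A minimal sketch of this extraction step, assuming the initial region is given as a bounding box and the preset ratio is 1.2 (the disclosure leaves the ratio as a preset value; 1.2 is only an example):

```python
import numpy as np

def crop_expanded(image, bbox, ratio=1.2):
    """Expand the initial region's bounding box (y0, x0, y1, x1) about its
    center by a preset ratio, clip to the image bounds, and extract the
    image data in the expanded region as the image to be classified."""
    y0, x0, y1, x1 = bbox
    cy, cx = (y0 + y1) / 2.0, (x0 + x1) / 2.0
    half_h = (y1 - y0) * ratio / 2.0
    half_w = (x1 - x0) * ratio / 2.0
    ny0 = max(0, int(round(cy - half_h)))
    ny1 = min(image.shape[0], int(round(cy + half_h)))
    nx0 = max(0, int(round(cx - half_w)))
    nx1 = min(image.shape[1], int(round(cx + half_w)))
    return image[ny0:ny1, nx0:nx1]
```

The resulting crop is larger than the initial box (24x24 versus 20x20 in the test below), so it keeps the target object plus a margin of surrounding background.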
In some embodiments of the present disclosure, before extracting the images to be classified containing the target object from the plurality of original medical images, the method further includes at least one of the following steps:
resampling the original medical image to a preset resolution;
adjusting the range of pixel values in the original medical image;
normalizing the original medical image; and
upon detecting that a first original medical image is not annotated with the initial region of the target object, determining the initial region of the target object on the first original medical image by using the initial region of the target object annotated on a second original medical image and the registration relationship between the second original medical image and the first original medical image.
In this way, through operations such as unifying the resolution, adjusting the pixel value range, normalization, and determining the initial region of the target object, the original medical images can be preprocessed before the images to be classified are extracted from them, unifying the image parameters of the images to be classified and improving their quality.
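The first three preprocessing steps can be sketched as one small pipeline. The nearest-neighbour resampling, the example intensity window, and min-max normalisation are illustrative assumptions; the disclosure only specifies a preset resolution, a pixel-value range adjustment, and normalisation, not these particular choices.

```python
import numpy as np

def preprocess(image, out_shape=(64, 64), pixel_range=(-200.0, 250.0)):
    """Sketch of the preprocessing: nearest-neighbour resampling to a preset
    resolution, clipping pixel values to a preset window (the window here is
    an illustrative CT-like choice), and min-max normalisation to [0, 1]."""
    ys = (np.arange(out_shape[0]) * image.shape[0] / out_shape[0]).astype(int)
    xs = (np.arange(out_shape[1]) * image.shape[1] / out_shape[1]).astype(int)
    resampled = image[np.ix_(ys, xs)].astype(np.float64)
    lo, hi = pixel_range
    clipped = np.clip(resampled, lo, hi)
    return (clipped - lo) / (hi - lo)
```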
In some embodiments of the present disclosure, the original medical image and the image to be classified are two-dimensional images; or the original medical image is a three-dimensional image and the image to be classified is a two-dimensional image or a three-dimensional image.
In this way, the image to be classified is extracted from the original medical image: when the original medical image is a two-dimensional image, the image to be classified is two-dimensional; and when the original medical image is a three-dimensional image, the image to be classified may be either two-dimensional or three-dimensional.
In some embodiments of the present disclosure, the original medical image is a three-dimensional image, and the image to be classified is a two-dimensional image extracted from the layer of the original medical image on which the target object has the largest area.
In this way, when the original medical image is three-dimensional and the image to be classified is two-dimensional, the layer on which the target object has the largest area can be extracted as the image to be classified, so that the extraction covers a larger extent of the target object, contains more information about it, and improves classification accuracy.
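Selecting the layer with the largest target area is a simple reduction over a segmentation mask; a minimal sketch, assuming the target object is given as a boolean mask aligned with the volume:

```python
import numpy as np

def max_area_slice(volume, mask):
    """Pick the 2-D slice of a 3-D volume on which the target object's mask
    covers the largest area, as described for extracting a 2-D image to be
    classified from a 3-D original medical image."""
    areas = mask.reshape(mask.shape[0], -1).sum(axis=1)  # area per slice
    k = int(np.argmax(areas))
    return k, volume[k]
```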
For descriptions of the effects of the following apparatus, electronic device and the like, refer to the description of the above method, which is not repeated here.
An embodiment of the present disclosure further provides an image object classification apparatus, including:
an image acquisition module, configured to acquire at least one image to be classified containing a target object, where the at least one image to be classified is a medical image belonging to at least one scan image category; and
an object classification module, configured to perform object classification on the at least one image to be classified by using a classification model to obtain the type of the target object.
In some embodiments of the present disclosure, the object classification module is configured to:
perform several layers of feature extraction on the at least one image to be classified to obtain several corresponding groups of initial feature information, where each group of initial feature information has a different size;
obtain final feature information based on at least one of the several groups of initial feature information; and
classify the final feature information to obtain the type of the target object.
In some embodiments of the present disclosure, the object classification module is configured to:
obtain the final region of the target object based on the initial region corresponding to the target object in the image to be classified.
Correspondingly, the object classification module is configured to:
perform several layers of feature extraction on the at least one image to be classified by using the final region to obtain several corresponding groups of initial feature information, where, during feature extraction, the weight of the part of the image to be classified corresponding to the final region is higher than the weights of other regions of the image; and/or, in the initial feature information, the features corresponding to the final region are richer than those of other regions.
In some embodiments of the present disclosure, the object classification module is configured to:
obtain the union of the initial regions corresponding to the target object in the at least one image to be classified as the final region of the target object.
In some embodiments of the present disclosure, the object classification module is configured to:
detect, by using the classification model, that the first image to be classified is not annotated with the initial region of the target object, and determine the initial region of the target object on the first image to be classified based on the initial region of the target object annotated on the second image to be classified and the registration relationship between the second image to be classified and the first image to be classified.
In some embodiments of the present disclosure, the target classification module is configured to:
convert each set of the initial feature information into a preset dimension;
and/or, the target classification module is configured to:
fuse the at least one set of initial feature information by using the weights of the at least one set of initial feature information, to obtain the final feature information.
In some embodiments of the present disclosure, the weight of each set of the initial feature information is determined during the training process of the classification model.
In some embodiments of the present disclosure, the preset dimension is one dimension.
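A minimal sketch of the two operations above, assuming each set of initial feature information is a C×H×W feature map, the preset one-dimensional form is reached by global average pooling, and the per-set weights have already been learned during training (all names and shapes below are illustrative, not from the disclosure):

```python
import numpy as np

rng = np.random.default_rng(0)

def to_preset_dim(feature_map):
    # Global average pooling: (C, H, W) -> (C,), i.e. one dimension
    return feature_map.mean(axis=(1, 2))

def fuse(initial_feature_sets, weights):
    # Weighted fusion of the one-dimensional feature sets
    vecs = [to_preset_dim(f) for f in initial_feature_sets]
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalise the learned weights
    return sum(wi * v for wi, v in zip(w, vecs))

# Two hypothetical sets of initial feature information, 8 channels each
f1 = rng.normal(size=(8, 16, 16))
f2 = rng.normal(size=(8, 16, 16))
final_feature = fuse([f1, f2], weights=[0.7, 0.3])
```

Pooling before fusion keeps every set in the same dimension, so the weighted sum is well defined regardless of the spatial size of each feature map.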
In some embodiments of the present disclosure, the classification model uses an ArcFace loss function during training to determine the loss value of the classification model; and/or, the batch sample data selected for each training iteration of the classification model is sample data in which the numbers of samples of different target types, selected from a sample data set by a data generator, are in a preset ratio.
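The ArcFace idea named above can be sketched in NumPy: an angular margin m is added to the angle between a sample's embedding and its target-class center before scaling by s, which forces same-type samples to cluster more tightly. This is a simplified single-sample version with hypothetical values, not the disclosure's training code:

```python
import numpy as np

def arcface_logits(embedding, class_centers, label, s=30.0, m=0.5):
    """ArcFace: penalise the target class by adding angular margin m."""
    e = embedding / np.linalg.norm(embedding)
    W = class_centers / np.linalg.norm(class_centers, axis=1, keepdims=True)
    cos = W @ e                                   # cosine similarity per class
    theta = np.arccos(np.clip(cos, -1.0, 1.0))
    logits = s * cos
    logits[label] = s * np.cos(theta[label] + m)  # margin on the target class
    return logits

def cross_entropy(logits, label):
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])

# Deterministic toy setup: embedding aligned with class 0's center
emb = np.zeros(16); emb[0] = 2.0
centers = np.eye(3, 16) * 3.0                     # 3 hypothetical tumour types
loss_with_margin = cross_entropy(arcface_logits(emb, centers, 0), 0)
loss_no_margin = cross_entropy(arcface_logits(emb, centers, 0, m=0.0), 0)
```

With m = 0 the loss reduces to ordinary scaled-softmax cross-entropy; the margin makes the target class harder to satisfy, so the training signal keeps pulling same-type embeddings together and pushing different types apart.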
In some embodiments of the present disclosure, the image acquisition module is configured to:
extract images to be classified containing the target object from a plurality of original medical images, respectively.
In some embodiments of the present disclosure, the image acquisition module is configured to:
determine the initial region of the target object in the original medical image, and expand the initial region according to the preset ratio to obtain a region to be extracted;
extract the image data in the region to be extracted from the original medical image to obtain the image to be classified.
In some embodiments of the present disclosure, the image acquisition module is configured to:
resample the original medical image to a preset resolution;
adjust the pixel value range of the original medical image;
normalize the original medical image;
upon detecting that a first original medical image is not annotated with the initial region of the target object, determine the initial region of the target object on the first original medical image by using the initial region of the target object annotated on a second original medical image and the registration relationship between the second original medical image and the first original medical image.
In some embodiments of the present disclosure, the original medical image and the image to be classified are both two-dimensional images; or, the original medical image is a three-dimensional image, and the image to be classified is a two-dimensional image or a three-dimensional image.
In some embodiments of the present disclosure, the original medical image is a three-dimensional image, and the image to be classified is a two-dimensional image extracted from the slice of the original medical image in which the target object has the largest area.
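A small sketch of this max-area slice extraction, assuming the target object is given as a 3D binary mask aligned with the volume (toy data, hypothetical names):

```python
import numpy as np

def max_area_slice(volume, mask):
    """From a 3D original medical image, extract the 2D slice on which the
    target object (a 3D binary mask) has the largest area."""
    areas = mask.reshape(mask.shape[0], -1).sum(axis=1)  # object area per slice
    k = int(np.argmax(areas))
    return volume[k], k

# Hypothetical 5-slice volume; the object is largest on slice 3
vol = np.arange(5 * 4 * 4).reshape(5, 4, 4).astype(float)
msk = np.zeros((5, 4, 4), dtype=bool)
msk[2, 1:2, 1:2] = True          # 1 pixel on slice 2
msk[3, 1:3, 1:3] = True          # 4 pixels on slice 3
slice2d, idx = max_area_slice(vol, msk)
```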
An embodiment of the present disclosure further provides an electronic device, including a memory and a processor coupled to each other, where the processor is configured to execute program instructions stored in the memory to implement the image object classification method provided in any of the foregoing embodiments.
An embodiment of the present disclosure further provides a computer-readable storage medium storing program instructions that, when executed by a processor, implement the image object classification method provided in any of the foregoing embodiments.
An embodiment of the present disclosure further provides a computer program including computer-readable code; when the computer-readable code runs in an electronic device, a processor of the electronic device executes the image object classification method described in any of the foregoing embodiments.
In the image object classification method, apparatus, device, storage medium, and program provided by the embodiments of the present disclosure, after at least one image to be classified containing a target object is acquired, a classification model is used to perform object classification on the at least one image to be classified to obtain the type of the target object. An image object classification method based on artificial intelligence technology is thus proposed, realizing intelligent object classification. Because a classification model is used to classify the images to be classified, the classification process is simpler, the dependence on doctors is reduced, and classification speed and accuracy are improved; moreover, artificial intelligence technology is combined to achieve object classification, so as to assist doctors in intelligent disease diagnosis and treatment.
It should be understood that the foregoing general description and the following detailed description are merely exemplary and explanatory, and do not limit the present disclosure.
Brief Description of the Drawings
The accompanying drawings, which are incorporated into and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the technical solutions of the present disclosure.
FIG. 1 is a schematic flowchart of an image object classification method provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a system architecture to which the image object classification method of an embodiment of the present disclosure can be applied;
FIG. 3 is a schematic flowchart of acquiring at least one image to be classified provided by an embodiment of the present disclosure;
FIG. 4 is a schematic flowchart of performing object classification on at least one image to be classified provided by an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of a network architecture used by the classification model in the image object classification method of an embodiment of the present disclosure;
FIG. 6 is a schematic diagram of a framework of an image object classification apparatus 60 provided by an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a framework of an electronic device 70 provided by an embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a framework of a computer-readable storage medium 80 provided by an embodiment of the present disclosure.
Detailed Description of the Embodiments
The solutions of the embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
In the following description, for purposes of explanation rather than limitation, specific details such as particular system structures, interfaces, and techniques are set forth in order to provide a thorough understanding of the present disclosure.
The term "and/or" herein merely describes an association relationship between associated objects, indicating that three relationships may exist; for example, "A and/or B" may indicate three cases: A exists alone, both A and B exist, and B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the associated objects. Moreover, "multiple" herein means two or more than two. Furthermore, the term "at least one" herein means any one of multiple items or any combination of at least two of multiple items; for example, "including at least one of A, B, and C" may mean including any one or more elements selected from the set consisting of A, B, and C.
In the related art, three-dimensional imaging technologies based on CT and MR play a crucial role in medical imaging diagnosis and are among the main imaging examination methods for diagnosing, for example, liver diseases. Taking the diagnosis of liver tumors as an example, the scanning sequence of a CT examination mainly includes the plain scan phase, the dynamic enhancement phase, the arterial phase, the portal venous phase, and the delayed phase. The plain scan phase is generally used to observe changes in the liver surface, such as the presence of fatty liver, liver fibrosis, or cirrhosis. The dynamically enhanced phase images can show the specific imaging features of a lesion. Taking hepatocellular carcinoma (HCC) as an example, HCC mainly occurs in patients with chronic liver disease and cirrhosis; the corresponding changes in liver surface morphology can be observed in the plain scan phase, in which the tumor generally appears as low-density or isodense with the liver parenchyma. After enhanced scanning, HCC shows the following in each phase: marked or inhomogeneous enhancement in the arterial phase, accompanied by a low-density capsule; contrast agent washout in the portal venous phase, with an enhanced capsule; and a capsule with delayed enhancement in the delayed phase. Therefore, in a feasible implementation, whether a target tumor is HCC can be determined by identifying the imaging features it exhibits across multiple phases. Compared with a judgment made from a single-phase image, the accuracy is higher, because hypervascular small liver metastases have imaging features in the plain scan and arterial phases similar to those of small HCC; performing the classification task on multi-phase images can therefore further improve the accuracy of image classification.
Clinically, there are two main ways to diagnose the type of a liver tumor. First, a radiologist repeatedly examines the signs of the tumor on CT or MR multi-phase images and then gives the benign/malignant classification or the specific tumor type in the diagnostic report; this process takes the doctor a certain amount of time, and repeatedly comparing the imaging features of the tumor between sequences may take 3 to 5 minutes. Second, tumor lesion specimens are collected for pathological diagnosis; specimen processing is complex and time-consuming and may take 2 to 3 days. To improve doctors' efficiency in reading images, an approach combining artificial intelligence technology is provided to realize auxiliary intelligent diagnosis of tumors.
Medical image analysis generally suffers from problems such as scarce annotated data and complex, difficult tasks; at the same time, to better characterize lesions, the correlation between sequences needs to be analyzed. These problems limit the complexity and depth of deep learning networks to a certain extent, and other strategies need to be introduced to solve medical image analysis tasks. Taking liver tumor classification as an example, the imaging features of the tumor itself are the main basis for judging its type, while various kinds of noise may exist around the target tumor and mislead the deep learning network into learning incorrect features. Liver tumors vary in size, from below 0.5 cm to above 20 cm; the network needs to take this characteristic into account and improve the recognition of small tumors while ensuring highly accurate classification of large tumors. Limited by the resolution of CT scan images, the imaging features of liver tumors are not necessarily obvious. Many difficulties exist in the liver tumor classification task, and certain strategies need to be introduced to learn better feature representations, so that samples of the same class cluster together and samples of different classes are kept apart.
Based on the above research, the present disclosure provides at least an image object classification method that uses a classification model to classify the images to be classified. This not only makes the classification process simpler, reduces the dependence on doctors, and improves classification speed and accuracy, but also combines artificial intelligence technology to achieve object classification, so as to assist doctors in intelligent disease diagnosis and treatment.
Please refer to FIG. 1, which is a schematic flowchart of an image object classification method provided by an embodiment of the present disclosure. Specifically, the method may include the following steps:
Step S11: Acquire at least one image to be classified that contains a target object.
Here, the at least one image to be classified is a medical image belonging to at least one scanned image category.
In the embodiments of the present disclosure, the image to be classified may be a medical image, including but not limited to a CT image or an MR image, which is not limited here. The images to be classified may all be CT images, may all be MR images, or may be partly CT images and partly MR images; this is not specifically limited here. In medical imaging diagnosis, CT images and MR images are multi-phase or multi-sequence imaging; each phase or sequence shows different image information of the region where the target object is located or of other regions, and effectively combining the features of multiple phases or sequences allows the nature of a lesion to be determined more precisely.
The images to be classified may be obtained by scanning regions such as the abdomen or chest. For example, an image to be classified obtained by scanning the abdomen may include tissues and organs such as the liver, spleen, and kidneys, and an image to be classified obtained by scanning the chest may include tissues and organs such as the heart and lungs; the images to be classified may be scanned according to the actual application, which is not limited here. The target object may be, but is not limited to, an object such as a liver tumor that needs to be classified by the image object classification method of the embodiments of the present disclosure.
The at least one image to be classified may be a medical image belonging to at least one scanned image category. Medical images of different scanned image categories can display different feature information of the target object and can therefore improve the accuracy of image object classification. In some disclosed embodiments, a scanned image category may also be referred to as the phase and/or sequence described above. Images of different scanned image categories may be related to timing and/or to scanning parameters. For example, the scanned image categories may include timing-related categories such as the pre-contrast scan, early arterial phase, late arterial phase, portal venous phase, and delayed phase; alternatively, the scanned image categories may also include scanning-parameter-related categories such as T1-weighted out-of-phase imaging, T1-weighted in-phase imaging, T2-weighted imaging, diffusion-weighted imaging, and apparent diffusion coefficient imaging.
Taking the liver as an example, the early arterial phase may indicate that the portal vein has not yet been enhanced; the late arterial phase may indicate that the portal vein has been enhanced; the portal venous phase may indicate that the portal vein has been fully enhanced, the hepatic blood vessels have been enhanced by antegrade blood flow, and the enhancement of the liver parenchyma has reached its peak under the contrast agent; and the delayed phase may indicate that the portal vein and arteries are enhanced but more weakly than in the portal venous phase, and that the liver parenchyma is enhanced but more weakly than in the portal venous phase. Other scanned image categories are not enumerated here. When the image to be classified is a medical image obtained by scanning other organs, this can be applied by analogy and will not be exemplified one by one here.
Step S12: Use a classification model to perform object classification on the at least one image to be classified, to obtain the type of the target object.
After the at least one image to be classified containing the target object is acquired, the classification model is used to perform object classification on the at least one image to be classified, thereby obtaining the type of the target object.
In a disclosed embodiment, the classification model performs object classification on the at least one image to be classified to obtain the probabilities that the target object belongs to different types, and the type satisfying a preset probability condition is taken as the type of the target object. The preset probability condition includes, but is not limited to, having the maximum probability value. The probabilities that the target object belongs to the different types may be produced by the trained classification model. The batch sample data selected for each training iteration of the classification model is sample data in which the numbers of samples of different target types, selected from a sample data set by a data generator, are in a preset ratio. Because the data generator randomly selects sample data containing equal proportions of the different target types as batch sample data, unbalanced classification performance caused by too few samples of a certain target type is avoided; the classification model is trained on a large amount of such batch sample data, which can improve its classification performance. Using the classification model to obtain the type of the target object can assist doctors in determining that type, save the time doctors spend reviewing the images to be classified, and thus speed up the output of diagnostic reports.
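The data generator described above can be sketched as follows; this is an assumed equal-ratio implementation with hypothetical sample data, not the disclosure's actual generator:

```python
import random
from collections import Counter

def balanced_batches(samples, batch_size, rng=random.Random(0)):
    """Data generator: every batch holds an equal (preset-ratio) number of
    samples of each target type, so no type dominates training."""
    by_type = {}
    for image, label in samples:
        by_type.setdefault(label, []).append((image, label))
    per_type = batch_size // len(by_type)
    while True:
        batch = []
        for items in by_type.values():
            # assumes each type has at least per_type samples available
            batch.extend(rng.sample(items, per_type))
        rng.shuffle(batch)
        yield batch

# Hypothetical sample set: type 0 heavily outnumbers type 1
data = [(f"img{i}", 0) for i in range(90)] + [(f"img{i}", 1) for i in range(10)]
gen = balanced_batches(data, batch_size=8)
batch = next(gen)
counts = Counter(label for _, label in batch)
```

Even though type 0 is nine times more frequent in the data set, each batch still contains the two types in a 1:1 ratio.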
In a disclosed embodiment, when object classification is performed on the at least one image to be classified to obtain the type of the target object, several layers of feature extraction are performed on the at least one image to be classified to correspondingly obtain several sets of initial feature information; final feature information is obtained based on at least one of the several sets of initial feature information; and the final feature information is classified to obtain the type of the target object.
When feature extraction is performed on the at least one image to be classified, the number of feature extraction layers may be one, two, or more. Which specific layers are used for feature extraction may be set manually or may be determined through extensive experiments when training the classification model, and is not specifically limited here. Performing one layer of feature extraction on the at least one image to be classified correspondingly yields one set of initial feature information; performing multiple layers of feature extraction correspondingly yields multiple sets of initial feature information, where the multiple layers may be consecutive or non-consecutive. The initial feature information may be a feature map of the target object, reflecting the feature information of the target object in the image to be classified. In a disclosed embodiment, the classification model is a deep learning network, which may include an encoder or a variant thereof, ResNet or a variant thereof, a Visual Geometry Group network (VGG) 16 or a variant thereof, or another network model structure for classification. The classification model performs feature extraction on the at least one image to be classified through convolutional layers, with different convolutional layers corresponding to different layers of feature extraction, to obtain the different sets of initial feature information.
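As a toy illustration of "several layers of feature extraction, each yielding one set of initial feature information", the sketch below stacks hand-written 3×3 convolution + ReLU layers in NumPy and keeps every layer's output; a real model would use many trained kernels per layer (e.g. ResNet or VGG16), so this only shows the data flow:

```python
import numpy as np

def conv3x3(image, kernel):
    """Minimal 'valid' 3x3 convolution standing in for one conv layer."""
    h, w = image.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = (image[i:i+3, j:j+3] * kernel).sum()
    return out

def extract_initial_features(image, num_layers=3):
    """Several layers of feature extraction; each layer's output is kept
    as one set of initial feature information."""
    edge = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)
    feats, x = [], image
    for _ in range(num_layers):
        x = np.maximum(conv3x3(x, edge), 0.0)   # conv + ReLU
        feats.append(x)
    return feats

img = np.random.default_rng(2).normal(size=(16, 16))
feature_sets = extract_initial_features(img)
```

Each "valid" convolution shrinks the map by two pixels per axis, so the three sets of initial feature information have progressively smaller spatial extents, mirroring how deeper layers of a CNN summarize larger receptive fields.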
In the above scheme, after at least one image to be classified containing a target object is acquired, a classification model is used to perform object classification on the at least one image to be classified to obtain the type of the target object. An image object classification method based on artificial intelligence technology is thus proposed, which realizes intelligent object classification without manual classification, reduces the dependence on manual work, and improves classification efficiency.
In an application embodiment, to classify liver tumors, at least one image to be classified including a liver tumor is acquired, and a classification model is used to perform object classification on the at least one image to be classified to obtain the type of the liver tumor. Without manually classifying the images to be classified, the classification of liver tumors can be achieved by using the classification model, so that doctors can obtain the tumor type.
FIG. 2 is a schematic diagram of a system architecture to which the image object classification method of an embodiment of the present disclosure can be applied. As shown in FIG. 2, the system architecture includes an image acquisition terminal 201, a network 202, and an object classification terminal 203. To support an exemplary application, the image acquisition terminal 201 and the object classification terminal 203 establish a communication connection through the network 202; the image acquisition terminal 201 reports at least one image to be classified containing a target object to the object classification terminal 203 through the network 202; and, in response to the received at least one image to be classified, the object classification terminal 203 uses a classification model to perform object classification on the at least one image to be classified to obtain the type of the target object. Finally, the object classification terminal 203 uploads the type of the target object to the network 202 and sends it to the image acquisition terminal 201 through the network 202.
As an example, the image acquisition terminal 201 may include an image acquisition device, and the object classification terminal 203 may include a vision processing device with visual information processing capability or a remote server. The network 202 may use a wired or wireless connection. When the object classification terminal 203 is a vision processing device, the image acquisition terminal 201 may communicate with the vision processing device through a wired connection, for example, data communication over a bus; when the object classification terminal 203 is a remote server, the image acquisition terminal 201 may exchange data with the remote server through a wireless network.
Alternatively, in some scenarios, the image acquisition terminal 201 may be a vision processing device with an image acquisition module, specifically implemented as a host with a camera. In this case, the image object classification method of the embodiments of the present disclosure may be executed by the image acquisition terminal 201, and the above system architecture need not include the network 202 and the object classification terminal 203.
To make the at least one image to be classified more uniform, before the image to be classified is extracted from the original medical image, image preprocessing may be performed on the original medical image, and then images to be classified containing the target object are extracted from multiple original medical images respectively, so as to acquire at least one image to be classified containing the target object. Please refer to FIG. 3, which is a schematic flowchart of acquiring at least one image to be classified provided by an embodiment of the present disclosure. Specifically, the following steps may be included:
Step S111: Resample the original medical image to a preset resolution.
The preset resolution can be set as desired; a preset resolution corresponding to each target object can be set according to the different target objects, so that the resolutions of the original medical images are unified to the resolution with the best image effect.
Step S112: Adjust the pixel value range of the original medical image.
Adjusting the pixel value range of the original medical image makes its brightness and color display the target object more clearly. The categories of the original medical image include, but are not limited to, CT images, MR images, and other images that can reflect the feature information of the target object, which are not limited here. If the original medical image is a CT image, the original medical image can be unified to a preset window width and window level. If the original medical image is an MR image, since the dynamic range of the pixel distribution of MR images varies greatly, in one implementation scenario of the present disclosure, the gray value corresponding to a preset proportion (for example, 99.9%) of the cumulative gray-level distribution function can be used as the clamping value for normalization preprocessing; this enhances the contrast of the MR image data and helps improve the accuracy of subsequent image object classification.
Step S113: Normalize the original medical image.
In a disclosed embodiment, the original medical image may be normalized. Normalization includes, but is not limited to, normalizing the intensity or pixel values of the original medical image to a preset range (for example, the range 0 to 1).
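Steps S111 to S113 can be sketched for a CT volume as below; the window values, voxel spacings, and nearest-neighbour resampling are illustrative assumptions (for MR, the 99.9% cumulative-histogram clamping value described above would replace the fixed window):

```python
import numpy as np

def preprocess_ct(volume, spacing, target_spacing=1.0, window=(-100.0, 200.0)):
    """S111: resample to a preset resolution; S112: clamp the pixel value
    range to a preset window; S113: normalise to [0, 1].
    Window and spacing values are illustrative, not from the disclosure."""
    # S111: nearest-neighbour resampling of each axis to the target spacing
    factors = [s / target_spacing for s in spacing]
    idx = [np.clip((np.arange(int(round(n * f))) / f).astype(int), 0, n - 1)
           for n, f in zip(volume.shape, factors)]
    vol = volume[np.ix_(*idx)]
    # S112: clamp the pixel value range (window width/level for CT)
    lo, hi = window
    vol = np.clip(vol, lo, hi)
    # S113: normalise the clamped values to [0, 1]
    return (vol - lo) / (hi - lo)

raw = np.array([[[-500.0, 0.0], [100.0, 300.0]]])   # toy 1x2x2 volume in HU
out = preprocess_ct(raw, spacing=(1.0, 2.0, 2.0))
```

With 2 mm in-plane spacing resampled to 1 mm, the 2×2 plane becomes 4×4; values below −100 HU or above 200 HU are clamped before the 0-to-1 normalization.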
Step S114: Upon detecting that a first original medical image is not annotated with the initial region of the target object, determine the initial region of the target object on the first original medical image by using the initial region of the target object annotated on a second original medical image and the registration relationship between the second original medical image and the first original medical image.
In some embodiments of the present disclosure, not all original medical images may be annotated with the initial region of the target object. Therefore, in order to use more images to be classified containing the target object for image object classification and to improve its accuracy, the missing initial regions of the original medical images can be completed. Upon detecting that a first original medical image is not annotated with the initial region of the target object, the initial region of the target object on the first original medical image is determined by using the initial region of the target object annotated on a second original medical image and the registration relationship between the second original medical image and the first original medical image. In a disclosed embodiment, to make determining the initial region of the target object more convenient, a registration network may be used to perform the above step of determining the initial region of the target object on the first original medical image.
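A minimal sketch of this label completion, assuming the registration relationship reduces to a known pixel translation from the second image to the first (a real registration network would produce a full affine or deformable transform; all values here are hypothetical):

```python
import numpy as np

def transfer_initial_region(mask_second, offset):
    """Propagate the initial region annotated on the second image to the
    first image, given a purely translational registration relationship
    mapping second-image coordinates to first-image coordinates."""
    out = np.zeros_like(mask_second)
    ys, xs = np.nonzero(mask_second)
    ys2, xs2 = ys + offset[0], xs + offset[1]
    keep = (ys2 >= 0) & (ys2 < out.shape[0]) & (xs2 >= 0) & (xs2 < out.shape[1])
    out[ys2[keep], xs2[keep]] = True
    return out

annotated = np.zeros((6, 6), dtype=bool)
annotated[1:3, 1:3] = True                 # initial region on the second image
derived = transfer_initial_region(annotated, offset=(2, 1))
```

The derived mask serves as the initial region of the unannotated first image, so that image can still contribute to classification.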
Before the to-be-classified images containing the target object are extracted from the multiple original medical images, the image object classification method may include any number of the above steps S111 to S114, which are merely illustrative. In a disclosed embodiment, any subset of these steps may be selected as required to preprocess the original medical images; that is, how many of steps S111 to S114 are performed may be chosen freely and is not specifically limited herein. Through operations such as unifying the resolution, adjusting the pixel value range, normalization, and determining the initial region of the target object, the original medical images can be preprocessed before the to-be-classified images are extracted from them, unifying the image parameters of the to-be-classified images and improving their quality.
After the original medical images are preprocessed, the to-be-classified images containing the target object can be extracted from the multiple original medical images; for details, refer to steps S115 and S116 below.
Step S115: determine the initial region of the target object in the original medical image, and enlarge the initial region according to a preset ratio to obtain a region to be extracted.
The characteristics of the target object itself are the main basis for judging its type, while various kinds of noise around the target object may mislead its classification. Taking a liver tumor as the target object as an example, noise such as a background of chronic liver disease or cirrhosis, other types of tumors, and blood vessels close to the tumor all degrade classification accuracy. Therefore, the initial region of the target object in the original medical image is determined so that the region to be extracted contains the target object. In a disclosed embodiment, in order to use the background information around the target object as auxiliary information for classification, or to tolerate errors in determining the initial region and thus improve the accuracy of acquiring the to-be-classified image, the initial region may be enlarged according to a preset ratio after it is determined, yielding the region to be extracted. The initial region delineates the position of the target object in the original medical image. In a disclosed embodiment, an image segmentation technique may be used to determine the boundary contour of the target object in the original medical image, and the boundary contour is marked to form the initial region.
Step S116: extract the image data in the region to be extracted from the original medical image to obtain the to-be-classified image.
Image data is extracted from the original medical image according to the region to be extracted, so the resulting to-be-classified image includes the target object.
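Steps S115 and S116 can be sketched as follows for a two-dimensional image (the bounding-box representation of the initial region, the clipping to image bounds, and the function name are illustrative assumptions, not part of the original disclosure):

```python
import numpy as np

def expand_and_crop(image, bbox, ratio=1.5):
    """Enlarge an initial region (y0, y1, x0, x1) about its center by `ratio`,
    clip it to the image bounds, and crop the region to be extracted."""
    y0, y1, x0, x1 = bbox
    cy, cx = (y0 + y1) / 2.0, (x0 + x1) / 2.0
    hh, hw = (y1 - y0) * ratio / 2.0, (x1 - x0) * ratio / 2.0
    ny0 = max(0, int(round(cy - hh)))
    ny1 = min(image.shape[0], int(round(cy + hh)))
    nx0 = max(0, int(round(cx - hw)))
    nx1 = min(image.shape[1], int(round(cx + hw)))
    return image[ny0:ny1, nx0:nx1], (ny0, ny1, nx0, nx1)

img = np.arange(100).reshape(10, 10)                     # toy 10x10 "image"
patch, roi = expand_and_crop(img, (4, 6, 4, 6), ratio=2.0)
```

Here the 2x2 initial box centered at (5, 5) is doubled to a 4x4 region to be extracted, which keeps the target object plus part of its background in the cropped patch.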
The original medical image may be a two-dimensional image or a three-dimensional image. When the original medical image is two-dimensional, the to-be-classified image is two-dimensional. When the original medical image is three-dimensional, the to-be-classified image may be either three-dimensional or two-dimensional. In some embodiments of the present disclosure, since a three-dimensional image is composed of several layers of two-dimensional images, a two-dimensional to-be-classified image may be determined as, for example but without limitation: the layer in which the target object has the largest area; the layer in which the target object has the largest diameter; the middle layer of all the two-dimensional layers; or any one of the two-dimensional layers, which is not specifically limited herein. In an application embodiment, the original medical image is a three-dimensional image, and the to-be-classified image is the two-dimensional image extracted from the layer in which the target object has its largest area, so that the extracted layer covers a large extent of the target object and contains more of its information, thereby improving the classification accuracy of the target object. In the above manner, after the original medical image is preprocessed, the initial region of the target object in the original medical image is determined and enlarged according to a preset ratio to obtain the region to be extracted; the image data in the region to be extracted is then extracted from the original medical image to obtain the to-be-classified image. The initial region is a region containing the target object, and enlarging it according to the preset ratio yields a region to be extracted that contains both the target object and some of the background information around it, so that after the image data in the region to be extracted is taken as the to-be-classified image, the to-be-classified image covers the target object together with part of its background.
In addition, extracting the to-be-classified images containing the target object from the multiple original medical images realizes the acquisition of the to-be-classified images. Compared with directly using the original medical images, this reduces the size of the images handled in subsequent classification and avoids, to a certain extent, some of the background noise in the original medical images, thereby reducing the processing resources consumed by subsequent classification and improving classification performance.
In the embodiments of the present disclosure, it is proposed to perform target classification on the at least one to-be-classified image with a classification model based on artificial intelligence, which can greatly improve the efficiency of determining the type of the target object. Please refer to FIG. 4, which is a schematic flowchart of performing target classification on at least one to-be-classified image according to an embodiment of the present disclosure. Specifically, the following steps may be included:
Step S121: perform several layers of feature extraction on the at least one to-be-classified image to obtain corresponding sets of initial feature information.
The sets of initial feature information differ in size.
When feature extraction is performed on the at least one to-be-classified image, the number of feature extraction layers may be one, two, or more. Feature extraction may be implemented by convolutional layers, each of which extracts features from the at least one to-be-classified image to obtain initial feature information. Which layers are used for feature extraction may be set manually or determined through extensive experiments when training the classification model, and is not specifically limited herein. Performing one layer of feature extraction on the at least one to-be-classified image yields one corresponding set of initial feature information; that layer may be any layer, and for example but without limitation, the initial feature information from the last layer may serve as the basis for subsequent target classification. Performing multiple layers of feature extraction yields multiple corresponding sets of initial feature information, and the layers may be consecutive or non-consecutive. The initial feature information may be a feature map of the target object, reflecting the feature information of the target object in the to-be-classified image. The sets of initial feature information differ in size, where size includes dimensionality and/or resolution, so that the multiple sets respectively reflect different feature information of the target object.
In a disclosed embodiment, the classification model is a deep learning network whose structure may be an encoder or a variant thereof, ResNet or a variant thereof, VGG16 or a variant thereof, or another network structure used for classification. The classification model extracts features from the at least one to-be-classified image through convolutional layers; different convolutional layers correspond to different layers of feature extraction and yield different sets of initial feature information.
In the to-be-classified image, noise may surround the target object. Taking a liver tumor as the target object as an example, noise such as a background of chronic liver disease or cirrhosis, other types of tumors, and blood vessels close to the liver tumor all affect the classification accuracy of the target object. Therefore, before the classification model performs target classification on the at least one to-be-classified image to obtain the type of the target object, a final region of the target object may be obtained based on the initial regions corresponding to the target object in the to-be-classified images. When determining the final region, one of the initial regions may be taken as the final region, or the final region may be obtained by combining the initial regions corresponding to the target object in the at least one to-be-classified image, for example by taking the union of those initial regions as the final region of the target object, which is not limited herein. In order to enable the classification model to learn important features of the target object itself while reducing, to a certain extent, the influence of surrounding noise on its classification, supervision by the final region of the target object may be added when extracting the initial feature information of the to-be-classified image (for example, its global features). For instance, during feature extraction, the part of the to-be-classified image corresponding to the final region is weighted more heavily than the other regions, so that the classification model tends to extract features with richer detail for the final region; accordingly, in the initial feature information output by the classification model, the features corresponding to the final region are as rich as possible, and/or richer than those of the other regions. When the initial feature information is obtained by feature extraction from the to-be-classified image, not only are the global features of the to-be-classified image extracted, but the supervision mechanism of the final region also guides the classification model to pay more attention to the target object within the final region, so that the model learns the feature information of the target object itself and the influence of noise around the target object on target classification is reduced.
In a disclosed embodiment, when obtaining the final region of the target object based on the initial regions corresponding to the target object in the to-be-classified images, the union of the initial regions corresponding to the target object in the at least one to-be-classified image is taken as the final region of the target object, and the final region is then used when performing several layers of feature extraction on the at least one to-be-classified image to obtain the corresponding sets of initial feature information. Since the final region is the union of the initial regions of the target object in the to-be-classified images, it is greater than or equal to any single initial region, ensuring that the final region covers the regions corresponding to the target object in the different to-be-classified images, so that feature extraction can focus on the feature information of the target object as much as possible. In a disclosed embodiment, the at least one to-be-classified image includes a first to-be-classified image not annotated with an initial region of the target object and a second to-be-classified image annotated with an initial region of the target object. Before the final region of the target object is obtained based on the initial regions, the classification model may further detect that the first to-be-classified image is not annotated with an initial region of the target object, and determine the initial region of the target object on the first to-be-classified image based on the initial region annotated on the second to-be-classified image and the registration relationship between the second to-be-classified image and the first to-be-classified image. Thus, the classification model can determine the initial region of the target object for a first to-be-classified image that lacks one, completing the annotations so that every to-be-classified image includes an initial region.
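The union of the per-image initial regions described above can be sketched as follows (representing each initial region as a binary mask is an illustrative assumption, not part of the original disclosure):

```python
import numpy as np

def final_region(initial_masks):
    """Final region of the target object = union of the per-image initial-region masks."""
    union = np.zeros_like(initial_masks[0], dtype=bool)
    for m in initial_masks:
        union |= m.astype(bool)   # pixel is in the final region if it is in any initial region
    return union

# Toy initial regions from two phase images
m1 = np.array([[1, 0], [0, 0]])
m2 = np.array([[0, 1], [0, 0]])
mask = final_region([m1, m2])
```

By construction the union covers every individual initial region, matching the guarantee stated above that the final region is greater than or equal to any single initial region.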
In a disclosed embodiment, a final region map including the final region of the target object may be generated and input into the classification model together with the to-be-classified images. When the classification model performs target classification on the at least one to-be-classified image to obtain the type of the target object, using the final region included in the final region map during the several layers of feature extraction guides the network to focus on learning the features of the final region, avoids to a certain extent the network learning erroneous feature information, and reduces the interference of noise around the target object on feature extraction. It can be understood that, before the final region map and the to-be-classified images are input into the classification model, they may be resized to a uniform size.
Step S122: obtain final feature information based on at least one of the sets of initial feature information.
After several layers of feature extraction are performed on the at least one to-be-classified image and the corresponding sets of initial feature information are obtained, the final feature information may be obtained based on at least one of those sets; different selections of initial feature information yield different final feature information. Parameters such as the number of sets of initial feature information and the convolutional layers of the classification model they correspond to may be set manually or determined during training of the classification model, which is not limited herein. Fusing multiple sets of initial feature information can improve the performance of the classification model and the accuracy of target classification, but fusing too many sets causes overfitting; therefore, reasonably adjusting the number of sets to be fused both improves classification performance and reduces overfitting. Since the sets of initial feature information differ in size information such as dimensionality and resolution and respectively reflect different feature information of the target object, at least one set of initial feature information may be fused to obtain the final feature information. In existing approaches that take the last high-dimensional feature map as the final feature information, some important feature information may be compressed away after multiple convolutions, and target objects with small areas and blurred image features are especially likely to be missed. By fusing at least one set of initial feature information, the embodiments of the present disclosure splice together the initial feature information obtained at different feature extraction stages, improving the accuracy of image object classification.
In a disclosed embodiment, the at least one set of initial feature information is fused using weights of the sets of initial feature information to obtain the final feature information. The weight of each set of initial feature information may be set manually or determined during training of the classification model, which is not limited herein. For example, the weight of each set of initial feature information is first initialized and then continuously updated during training, such as by updating the weights according to a comparison between the training results of the classification model and the ground truth, and repeating this weight update so that the model continuously learns and updates the weight of each set of initial feature information, yielding the trained classification model together with the weight of each set. It can be understood that the initialized weights of the sets of initial feature information may be the same or different, and that the weights of the sets sum to 1. Through iterative training of the classification model, the weights used for fusion are determined such that the final feature information obtained with these weights better reflects the characteristics of the target object, further improving classification performance. When the final feature information is obtained from multiple sets of initial feature information, the weights allow initial feature information of different sizes, extracted at different layers, to be fused into the final feature information; considering that important features may be compressed out of the smaller-sized initial feature information, synthesizing feature information of different sizes yields more comprehensive and useful final feature information and thus improves subsequent classification performance. In a disclosed embodiment, a feature fusion network may be used to fuse at least one of the sets of initial feature information into the final feature information: initial feature information of multiple sizes is spliced together as the final feature information for the classification task, and each piece of initial feature information is given a weight that, after initialization, is continuously updated during model training, so that multiple pieces of initial feature information are combined into a better feature representation of the target object, improving the performance of target classification.
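One way to realize the weighted fusion described above, with the weights normalized to sum to 1, is sketched below (the softmax normalization of the learnable weights and the toy feature sizes are assumptions for illustration; in the disclosure the weights are learned during training):

```python
import numpy as np

def fuse_features(features, weight_logits):
    """Scale each 1-D feature by a weight (softmax-normalized so weights sum to 1)
    and splice the weighted features together into the final feature information."""
    w = np.exp(weight_logits - np.max(weight_logits))
    w = w / w.sum()                                   # weights sum to 1
    fused = np.concatenate([wi * f for wi, f in zip(w, features)])
    return fused, w

# Hypothetical 1-D features from two feature extraction stages
f1 = np.ones(4)
f2 = 2.0 * np.ones(6)
fused, w = fuse_features([f1, f2], weight_logits=np.array([0.0, 0.0]))
```

With equal logits both weights are 0.5, and the fused vector simply concatenates the two half-weighted features; during training, the logits would be updated by backpropagation.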
In a disclosed embodiment, before the final feature information is obtained based on at least one of the sets of initial feature information, each set of initial feature information may be converted into a preset dimensionality to facilitate the subsequent acquisition of the final feature information. For example, in an application scenario, a feature extraction network converts each set of initial feature information into the preset dimensionality. The preset dimensionality may be set as required; for example but without limitation, it is one-dimensional.
Step S123: classify the final feature information to obtain the type of the target object.
The final feature information carries the features of the target object, so classifying the final feature information yields the type of the target object. When determining the type of the target object, the classification model may, for example but without limitation, perform target classification on the at least one to-be-classified image to obtain the probabilities that the target object belongs to the various types, and take the type satisfying a preset probability condition as the type of the target object. The preset probability condition includes, but is not limited to, having the largest probability value.
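The preset probability condition of taking the type with the largest probability can be sketched as follows (the class names and the softmax conversion from classifier outputs to probabilities are illustrative assumptions):

```python
import numpy as np

def predict_type(logits, class_names):
    """Convert classifier outputs to per-type probabilities (softmax) and select
    the type with the largest probability as the type of the target object."""
    e = np.exp(logits - np.max(logits))   # shift for numerical stability
    probs = e / e.sum()
    return class_names[int(np.argmax(probs))], probs

types = ["type_A", "type_B", "type_C"]    # hypothetical target-object types
name, probs = predict_type(np.array([0.2, 2.1, -0.5]), types)
```

Other preset probability conditions, such as requiring the largest probability to exceed a threshold, would only change the final selection step.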
In a disclosed embodiment, the classification model uses the ArcFace loss function during training to determine the loss value of the classification model. The ArcFace loss pulls together target objects of the same type and pushes apart target objects of different types, improving the ability to classify easily confused target objects. The ArcFace loss is simple to use and can be applied directly to the network structure of the classification model without being combined with other loss functions, while also reducing overfitting to a certain extent and thereby improving the classification performance for the target object. Compared with loss functions such as softmax, when the ArcFace loss is used to determine the loss value, the training output of the classification model may be the cosine of the angle between the weights of the first fully connected layer and the features entering that layer. Specifically, the dot product between the features entering the first fully connected layer of the classification model and the weights of that layer equals the cosine distance of the normalized features and weights; the arc-cosine function is thus used to compute the target angle between the normalized features and the normalized weights, an additive angular margin is added to the target angle, the target logit is obtained through the cosine function, and all logits are then rescaled by a fixed feature norm, with the subsequent steps similar to those of the softmax loss. Taking a liver tumor as the target object as an example, the feature information of the liver tumor itself is the main basis for judging its type, but liver tumors vary greatly in size, from under 0.5 cm to over 20 cm, and there are influencing factors beyond the target object, such as the low resolution of the to-be-classified image, other types of tumors around the liver tumor, blood vessels with features similar to the target object, and a background of chronic liver disease or cirrhosis. In the embodiments of the present disclosure, the ArcFace loss can learn a better feature representation of liver tumors, drawing tumors of the same type together and pushing tumors of different types apart, effectively improving tumor classification performance. For other kinds of target objects, using the ArcFace loss to determine the loss value of the classification model during training has a similar effect, and the examples are not repeated here.
It should be noted that the ArcFace loss is a loss function that uses a margin to enlarge the distance between different classes; the predicted value is the cosine of the angle between the weights of the first fully connected layer and the features entering that layer. The principle and procedure are as follows: first, the dot product between the features entering the first fully connected layer and the weights of that layer equals the cosine distance of the normalized features and weights; second, the arc-cosine function is used to compute the angle between the normalized features and the normalized weights; then, an additive angular margin is added to the target angle, the target logit is obtained through the cosine function, and all logits are rescaled by a fixed feature norm, with the subsequent steps exactly the same as those of the softmax loss.
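The ArcFace logit computation described above can be sketched as follows for a single feature vector (a minimal NumPy version; the scale s and margin m are typical values, and the toy class weights are assumptions for illustration):

```python
import numpy as np

def arcface_logits(feat, weights, label, s=64.0, m=0.5):
    """ArcFace logits: cosine of the angle between the L2-normalized feature and
    each L2-normalized class weight; an additive angular margin m is applied to
    the target class angle only, then all logits are rescaled by a fixed norm s."""
    f = feat / np.linalg.norm(feat)
    w = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    cos = w @ f                                   # per-class cosine similarity
    theta = np.arccos(np.clip(cos, -1.0, 1.0))    # per-class angle
    logits = cos.copy()
    logits[label] = np.cos(theta[label] + m)      # additive angular margin on target
    return s * logits

feat = np.array([1.0, 0.0, 0.0])
W = np.array([[1.0, 0.0, 0.0],    # class 0 weight, aligned with the feature
              [1.0, 1.0, 0.0],    # class 1 weight, 45 degrees away
              [0.0, 1.0, 0.0]])   # class 2 weight, orthogonal
logits = arcface_logits(feat, W, label=1)
```

The rescaled logits would then be passed through softmax and cross-entropy exactly as in the softmax loss; the margin lowers the target-class logit during training, forcing tighter intra-class clusters.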
In the above manner, the classification model performs several layers of feature extraction on the at least one to-be-classified image to obtain corresponding sets of initial feature information; the final feature information is obtained based on at least one of those sets; and the final feature information is classified to obtain the type of the target object, thereby realizing target classification using the feature information of the target object.
Those skilled in the art will understand that, in the above method of the specific implementations, the order in which the steps are written does not imply a strict execution order or impose any limitation on the implementation process; the specific execution order of the steps should be determined by their functions and possible internal logic.
在本公开的一些实施例中，对至少一张待分类图像进行预处理，并提取到对应的二维或者三维的多期像肿瘤子图像块，也就是多期像肿瘤patch图像，以及对应的掩膜图像，即mask patch图像，将其一起输入到深度学习分类网络中。如图5所示，是本公开实施例的图像目标分类方法中分类模型所使用的网络架构示意图；其中，501输入至分类模型的批数据中随机包含等比例的不同类型肿瘤的数据，包括期像1，期像2，…，期像m以及多期像病灶mask的并集。In some embodiments of the present disclosure, at least one image to be classified is preprocessed, and the corresponding two-dimensional or three-dimensional multi-phase tumor sub-image blocks, i.e., multi-phase tumor patch images, are extracted together with the corresponding mask images, i.e., mask patch images, and input together into the deep learning classification network. FIG. 5 is a schematic diagram of the network architecture used by the classification model in the image target classification method according to an embodiment of the present disclosure; wherein 501 is the batch data input to the classification model, which randomly contains equal proportions of data of different tumor types, including phase image 1, phase image 2, …, phase image m, and the union of the multi-phase lesion masks.
502 is the CNN backbone, which may be the encoder of U-Net or a variant thereof, ResNet or a variant thereof, VGG16 or a variant thereof, or another CNN structure used for classification. 503 is a Feature Block, which contains adaptive average pooling, an FC layer and ReLU; the feature map obtained earlier passes through adaptive average pooling, a fully connected layer and ReLU activation to yield a one-dimensional feature, and each Feature Block corresponds to one such feature (feature_1). 504 is the feature fusion layer (Feature Fusion), which concatenates multiple one-dimensional features; each feature has a corresponding weight coefficient, and these coefficients are learnable, including weight coefficient_1 of feature_1, weight coefficient_2 of feature_2, …, weight coefficient_n of feature_n. Whether the feature map output by a given convolutional layer of the CNN backbone enters a Feature Block and the feature fusion layer can be decided by experiments during training. In experiments with this scheme, it was found that introducing the feature fusion layer improves model performance, but fusing too many feature maps causes overfitting, especially feature maps from earlier convolutional layers; reasonably adjusting the number of feature maps entering the feature fusion structure both improves classification performance and reduces overfitting. 505 is a fully connected layer (Fully Connected): the fused features are sent to the FC layer and converted by softmax into classification probability values for each tumor category. 506 is the predicted probability value.
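The Feature Block (503) and feature fusion layer (504) described above can be sketched in NumPy as follows; the layer sizes, random weights and fixed fusion coefficients are illustrative assumptions (in the disclosure the fusion coefficients are learnable), so this is a shape-level sketch, not the implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def feature_block(feature_map, fc_weight):
    """503: adaptive average pooling over the spatial dims, then FC + ReLU -> 1-D feature."""
    pooled = feature_map.mean(axis=(1, 2))        # (channels,) global average pool
    return np.maximum(fc_weight @ pooled, 0.0)    # fully connected layer, then ReLU

def feature_fusion(features, fusion_weights):
    """504: concatenate the 1-D features, each scaled by its (learnable) coefficient."""
    return np.concatenate([w * f for w, f in zip(fusion_weights, features)])

# Two feature maps taken from different backbone depths: (channels, H, W).
fmap1, fmap2 = rng.normal(size=(16, 32, 32)), rng.normal(size=(64, 8, 8))
fc1, fc2 = rng.normal(size=(128, 16)), rng.normal(size=(128, 64))
f1, f2 = feature_block(fmap1, fc1), feature_block(fmap2, fc2)
fused = feature_fusion([f1, f2], fusion_weights=[0.7, 0.3])

# 505/506: the fused feature goes through a final FC and softmax over tumor classes.
fc_out = rng.normal(size=(4, 256))
logits = fc_out @ fused
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

Note how each Feature Block maps feature maps of different sizes to a common one-dimensional length, which is what makes the concatenation in 504 well defined.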
请参阅图6，图6是本公开实施例提供的一种图像目标分类装置60的框架示意图。图像目标分类装置60包括图像获取模块61和目标分类模块62。图像获取模块61，配置为：获取包含目标对象的至少一张待分类图像，其中，至少一张待分类图像为属于至少一种扫描图像类别的医学图像；目标分类模块62，配置为：利用分类模型，对至少一张待分类图像进行目标分类，得到目标对象的类型。Please refer to FIG. 6, which is a schematic frame diagram of an image target classification apparatus 60 provided by an embodiment of the present disclosure. The image target classification apparatus 60 includes an image acquisition module 61 and a target classification module 62. The image acquisition module 61 is configured to acquire at least one image to be classified containing a target object, wherein the at least one image to be classified is a medical image belonging to at least one scanned image category; the target classification module 62 is configured to perform target classification on the at least one image to be classified by using a classification model, to obtain the type of the target object.
在本公开的一些实施例中，目标分类模块62配置为：对至少一张待分类图像进行若干层特征提取，对应得到若干组初始特征信息；其中，每组初始特征信息的尺寸不同；基于若干组初始特征信息中的至少一组初始特征信息，得到最终特征信息；对最终特征信息进行分类，得到目标对象的类型。In some embodiments of the present disclosure, the target classification module 62 is configured to: perform several layers of feature extraction on the at least one image to be classified, correspondingly obtaining several sets of initial feature information, wherein the size of each set of initial feature information is different; obtain final feature information based on at least one of the several sets of initial feature information; and classify the final feature information to obtain the type of the target object.
在本公开的一些实施例中，目标分类模块62配置为：基于待分类图像中目标对象对应的初始区域，得到目标对象的最终区域；相应地，目标分类模块62，配置为：利用最终区域对至少一张待分类图像进行若干层特征提取，对应得到若干组初始特征信息；其中，在特征提取过程中，待分类图像中对应最终区域的权重高于待分类图像中其他区域的权重；和/或，初始特征信息中对应最终区域的特征比其他区域的特征更丰富。In some embodiments of the present disclosure, the target classification module 62 is configured to obtain the final region of the target object based on the initial region corresponding to the target object in the image to be classified; correspondingly, the target classification module 62 is configured to use the final region to perform several layers of feature extraction on the at least one image to be classified, correspondingly obtaining several sets of initial feature information; wherein, during feature extraction, the region of the image to be classified corresponding to the final region is weighted higher than other regions of the image to be classified; and/or, in the initial feature information, the features corresponding to the final region are richer than those of other regions.
在本公开的一些实施例中,目标分类模块62配置为:获取至少一张待分类图像中目标对象对应的初始区域的并集,以作为目标对象的最终区域。In some embodiments of the present disclosure, the target classification module 62 is configured to obtain a union of initial regions corresponding to the target object in at least one image to be classified, as the final region of the target object.
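Taking the union of the target object's initial regions across the images to be classified can be sketched with binary masks; a minimal illustration (the function name and toy masks are assumptions for demonstration):

```python
import numpy as np

def final_region(masks):
    """Union of the target object's initial regions over all images to be classified."""
    union = np.zeros_like(masks[0], dtype=bool)
    for m in masks:
        union |= m.astype(bool)
    return union

# Lesion masks from two phase images that only partially overlap.
m1 = np.zeros((4, 4), dtype=bool); m1[0:2, 0:2] = True
m2 = np.zeros((4, 4), dtype=bool); m2[1:3, 1:3] = True
mask = final_region([m1, m2])
```

The union guarantees that a lesion region visible in any phase image is covered by the final region used for weighting during feature extraction.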
在本公开的一些实施例中，目标分类模块62配置为：利用分类模型检测到第一待分类图像未标注有目标对象的初始区域，并基于第二待分类图像上标注的目标对象的初始区域以及第二待分类图像与第一待分类图像的配准关系，确定第一待分类图像上目标对象的初始区域。In some embodiments of the present disclosure, the target classification module 62 is configured to: detect, by using the classification model, that the first image to be classified is not annotated with an initial region of the target object, and determine the initial region of the target object on the first image to be classified based on the initial region of the target object annotated on the second image to be classified and the registration relationship between the second image to be classified and the first image to be classified.
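The disclosure does not specify how the registration relationship is represented; assuming, purely for illustration, a 2-D affine transform mapping second-image coordinates to first-image coordinates, the region transfer could be sketched as (nearest-neighbour rounding, hypothetical function name):

```python
import numpy as np

def transfer_region(mask2, affine):
    """Map the region annotated on the second image onto the first image's grid,
    given a 2-D affine registration (second -> first); nearest-neighbour rounding."""
    mask1 = np.zeros_like(mask2, dtype=bool)
    ys, xs = np.nonzero(mask2)
    pts = np.stack([ys, xs, np.ones_like(ys)])      # homogeneous pixel coordinates
    mapped = np.rint(affine @ pts).astype(int)      # (2, n) coordinates on image 1
    inside = ((mapped[0] >= 0) & (mapped[0] < mask1.shape[0]) &
              (mapped[1] >= 0) & (mapped[1] < mask1.shape[1]))
    mask1[mapped[0, inside], mapped[1, inside]] = True
    return mask1

# Region annotated on the second image; the registration shifts it by (+1, +2).
mask2 = np.zeros((5, 5), dtype=bool)
mask2[0:2, 0:2] = True
affine = np.array([[1.0, 0.0, 1.0],
                   [0.0, 1.0, 2.0]])
mask1 = transfer_region(mask2, affine)
```

In practice a medical-image registration toolkit would produce the transform and resample the mask; the sketch only shows the coordinate mapping at the heart of the step.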
在本公开的一些实施例中，目标分类模块62配置为：将每组初始特征信息转换为预设维度；和/或，目标分类模块62配置为：利用至少一组初始特征信息的权重，将至少一组初始特征信息进行融合，得到最终特征信息。In some embodiments of the present disclosure, the target classification module 62 is configured to convert each set of initial feature information into a preset dimension; and/or, the target classification module 62 is configured to fuse at least one set of initial feature information by using the weights of the at least one set of initial feature information, to obtain the final feature information.
在本公开的一些实施例中,每组初始特征信息的权重是在分类模型训练过程中确定的。In some embodiments of the present disclosure, the weight of each set of initial feature information is determined during the training process of the classification model.
在本公开的一些实施例中,预设维度为一维。In some embodiments of the present disclosure, the preset dimension is one dimension.
在本公开的一些实施例中，分类模型在训练过程中采用ArcFace损失函数确定分类模型的损失值；和/或，分类模型每次训练选择的批样本数据是利用数据生成器从样本数据集中选择的不同目标类型的数量为预设比例的样本数据。In some embodiments of the present disclosure, the classification model uses the ArcFace loss function during training to determine the loss value of the classification model; and/or, the batch sample data selected for each training iteration of the classification model is sample data selected from the sample data set by a data generator such that the numbers of different target types follow a preset ratio.
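A data generator that draws each batch with a preset per-type ratio can be sketched as below; the generator name, the equal ratio, and the tumor-type labels in the demo are illustrative assumptions, not details from this disclosure:

```python
import numpy as np

def balanced_batches(samples_by_type, per_type, rng=None):
    """Data generator: each batch draws a preset number of samples per target
    type from the sample set (with replacement, so scarce types are not exhausted)."""
    rng = rng or np.random.default_rng(0)
    while True:
        batch = []
        for label, samples in samples_by_type.items():
            picks = rng.choice(len(samples), size=per_type, replace=True)
            batch.extend((samples[i], label) for i in picks)
        rng.shuffle(batch)  # mix the types within the batch
        yield batch

# Hypothetical tumor-type labels and sample IDs for demonstration.
gen = balanced_batches({"type_a": ["s1", "s2"], "type_b": ["s3", "s4", "s5"]}, per_type=2)
batch = next(gen)
```

Drawing a fixed number of samples per type keeps each training batch class-balanced even when the underlying data set is heavily skewed toward one tumor type.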
在本公开的一些实施例中,图像获取模块61配置为:分别从多张原始医学图像提取得到包含目标对象的待分类图像。In some embodiments of the present disclosure, the image acquisition module 61 is configured to: extract images to be classified including the target object from a plurality of original medical images, respectively.
在本公开的一些实施例中，图像获取模块61配置为：确定原始医学图像中目标对象的初始区域，按照预设比例扩大初始区域，得到待提取区域；从原始医学图像中提取待提取区域中的图像数据，得到待分类图像。In some embodiments of the present disclosure, the image acquisition module 61 is configured to: determine the initial region of the target object in the original medical image, and expand the initial region according to a preset ratio to obtain the region to be extracted; and extract the image data within the region to be extracted from the original medical image to obtain the image to be classified.
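Expanding the initial region by a preset ratio and cropping the patch can be sketched in 2-D as follows; the function name, bounding-box convention and the 0.2 ratio are assumptions for illustration:

```python
import numpy as np

def crop_expanded_roi(image, bbox, expand_ratio=0.2):
    """Expand the target's initial bounding box by a preset ratio on each side,
    clamp it to the image bounds, and extract the image data as the patch."""
    y0, y1, x0, x1 = bbox
    dy = int((y1 - y0) * expand_ratio)
    dx = int((x1 - x0) * expand_ratio)
    y0, y1 = max(0, y0 - dy), min(image.shape[0], y1 + dy)
    x0, x1 = max(0, x0 - dx), min(image.shape[1], x1 + dx)
    return image[y0:y1, x0:x1]

img = np.arange(100).reshape(10, 10)
patch = crop_expanded_roi(img, (2, 7, 2, 7), expand_ratio=0.2)
```

The margin around the lesion gives the classifier surrounding-tissue context while keeping the patch centered on the target object.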
在本公开的一些实施例中，图像获取模块61配置为：将原始医学图像重采样至预设分辨率；调整原始医学图像中的像素值范围；将原始医学图像进行归一化处理；检测到第一原始医学图像未标注有目标对象的初始区域，利用第二原始医学图像上标注的目标对象的初始区域以及第二原始医学图像与第一原始医学图像的配准关系，确定第一原始医学图像上目标对象的初始区域。In some embodiments of the present disclosure, the image acquisition module 61 is configured to: resample the original medical image to a preset resolution; adjust the range of pixel values in the original medical image; normalize the original medical image; and, upon detecting that a first original medical image is not annotated with an initial region of the target object, determine the initial region of the target object on the first original medical image by using the initial region of the target object annotated on a second original medical image and the registration relationship between the second original medical image and the first original medical image.
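The pixel-value-range adjustment and normalization steps can be sketched together as a window-then-scale operation; the window endpoints below are illustrative, not values specified by this disclosure:

```python
import numpy as np

def preprocess(volume, value_range=(-100.0, 300.0)):
    """Adjust the pixel-value range (e.g. clip to an intensity window) and
    normalize the result to [0, 1]. The window endpoints here are illustrative."""
    lo, hi = value_range
    clipped = np.clip(volume.astype(np.float32), lo, hi)
    return (clipped - lo) / (hi - lo)

volume = np.array([[-500.0, 0.0], [150.0, 800.0]])
normalized = preprocess(volume)  # values clipped to the window, then scaled to [0, 1]
```

Clipping before scaling keeps extreme intensities from dominating the normalized range, so tissue contrast inside the window is preserved for the classifier.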
在本公开的一些实施例中,原始医学图像和待分类图像为二维图像;或者,原始医学图像为三维图像,待分类图像为二维图像或三维图像。In some embodiments of the present disclosure, the original medical image and the image to be classified are two-dimensional images; or, the original medical image is a three-dimensional image, and the image to be classified is a two-dimensional image or a three-dimensional image.
本公开实施例提供的图像目标分类装置60，图像获取模块61获取包含目标对象的至少一张待分类图像后，目标分类模块62利用分类模型对至少一张待分类图像进行目标分类，得到目标对象的类型，因此提出基于人工智能技术的图像目标分类方法，实现智能化目标分类。由于利用分类模型对待分类图像进行目标分类，不仅使得目标分类过程更加简单，减小对医生的依赖，提高目标分类速度，而且结合人工智能技术实现目标分类，以便辅助医生进行智能化疾病诊疗。In the image target classification apparatus 60 provided by the embodiments of the present disclosure, after the image acquisition module 61 acquires at least one image to be classified containing the target object, the target classification module 62 performs target classification on the at least one image to be classified by using the classification model, to obtain the type of the target object; an image target classification method based on artificial intelligence technology is thus proposed, realizing intelligent target classification. Using a classification model to classify the images to be classified not only makes the target classification process simpler, reduces dependence on doctors and increases classification speed, but also combines artificial intelligence technology to achieve target classification, so as to assist doctors in intelligent disease diagnosis and treatment.
请参阅图7,图7是本公开电子设备70一实施例的框架示意图。电子设备70包括相互耦接的存储器71和处理器72,处理器72用于执行存储器71中存储的程序指令,以实现上述任一图像目标分类方法实施例的步骤。在本公开的一些实施场景中,电子设备70可以包括但不限于:微型计算机、服务器,此外,电子设备70还可以包括笔记本电脑、平板电脑等移动设备,在此不做限定。Please refer to FIG. 7 , which is a schematic diagram of a framework of an embodiment of an electronic device 70 of the present disclosure. The electronic device 70 includes a memory 71 and a processor 72 coupled to each other, and the processor 72 is configured to execute program instructions stored in the memory 71 to implement the steps of any of the image object classification method embodiments described above. In some implementation scenarios of the present disclosure, the electronic device 70 may include, but is not limited to, a microcomputer and a server. In addition, the electronic device 70 may also include mobile devices such as a notebook computer and a tablet computer, which are not limited herein.
在本公开的一些实施例中，处理器72用于控制其自身以及存储器71以实现上述任一图像目标分类方法实施例的步骤。处理器72还可以称为中央处理单元(Central Processing Unit，CPU)。处理器72可能是一种集成电路芯片，具有信号的处理能力。处理器72还可以是通用处理器、数字信号处理器(Digital Signal Processor，DSP)、专用集成电路(Application Specific Integrated Circuit，ASIC)、现场可编程门阵列(Field-Programmable Gate Array，FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。另外，处理器72可以由集成电路芯片共同实现。In some embodiments of the present disclosure, the processor 72 is configured to control itself and the memory 71 to implement the steps of any of the image target classification method embodiments described above. The processor 72 may also be referred to as a central processing unit (CPU). The processor 72 may be an integrated circuit chip with signal processing capability. The processor 72 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 72 may be implemented jointly by integrated circuit chips.
本公开实施例提供的电子设备70，获取包含目标对象的至少一张待分类图像后，利用分类模型对至少一张待分类图像进行目标分类，得到目标对象的类型，因此提出基于人工智能技术的图像目标分类方法，实现智能化目标分类。由于利用分类模型对待分类图像进行目标分类，不仅使得目标分类过程更加简单，减小对医生的依赖，提高目标分类速度，而且结合人工智能技术实现目标分类，以便辅助医生进行智能化疾病诊疗。The electronic device 70 provided by the embodiments of the present disclosure, after acquiring at least one image to be classified containing a target object, performs target classification on the at least one image to be classified by using a classification model, to obtain the type of the target object; an image target classification method based on artificial intelligence technology is thus proposed, realizing intelligent target classification. Using a classification model to classify the images to be classified not only makes the target classification process simpler, reduces dependence on doctors and increases classification speed, but also combines artificial intelligence technology to achieve target classification, so as to assist doctors in intelligent disease diagnosis and treatment.
请参阅图8,图8是本公开计算机可读存储介质80一实施例的框架示意图。计算机可读存储介质80存储有能够被处理器运行的程序指令801,程序指令801用于实现上述任一图像目标分类方法实施例的步骤。Please refer to FIG. 8 , which is a schematic diagram of a framework of an embodiment of a computer-readable storage medium 80 of the present disclosure. The computer-readable storage medium 80 stores program instructions 801 that can be executed by the processor, and the program instructions 801 are used to implement the steps of any of the foregoing image object classification method embodiments.
本公开实施例提供的计算机可读存储介质80，获取包含目标对象的至少一张待分类图像后，利用分类模型对至少一张待分类图像进行目标分类，得到目标对象的类型，因此提出基于人工智能技术的图像目标分类方法，实现智能化目标分类。由于利用分类模型对待分类图像进行目标分类，不仅使得目标分类过程更加简单，减小对医生的依赖，提高目标分类速度，而且结合人工智能技术实现目标分类，以便辅助医生进行智能化疾病诊疗。With the computer-readable storage medium 80 provided by the embodiments of the present disclosure, after at least one image to be classified containing a target object is acquired, target classification is performed on the at least one image to be classified by using a classification model, to obtain the type of the target object; an image target classification method based on artificial intelligence technology is thus proposed, realizing intelligent target classification. Using a classification model to classify the images to be classified not only makes the target classification process simpler, reduces dependence on doctors and increases classification speed, but also combines artificial intelligence technology to achieve target classification, so as to assist doctors in intelligent disease diagnosis and treatment.
本公开实施例还提供一种计算机程序，该计算机程序包括计算机可读代码，该计算机可读代码在电子设备中运行的情况下，电子设备的处理器执行用于实现前述任一实施例提供的图像目标分类方法。该计算机程序产品可以具体通过硬件、软件或其结合的方式实现。在本公开的一些实施例中，计算机程序产品具体体现为计算机存储介质，在本公开的一些实施例中，计算机程序产品具体体现为软件产品，例如软件开发包(Software Development Kit，SDK)等等。An embodiment of the present disclosure further provides a computer program including computer-readable code; when the computer-readable code runs in an electronic device, the processor of the electronic device executes the image target classification method provided by any of the foregoing embodiments. The computer program product may be implemented by hardware, software or a combination thereof. In some embodiments of the present disclosure, the computer program product is embodied as a computer storage medium; in some embodiments, it is embodied as a software product, such as a software development kit (SDK).
在一些实施例中，本公开实施例提供的图像目标分类装置具有的功能或包含的模块可以用于执行上文方法实施例描述的方法，其具体实现可以参照上文方法实施例的描述，为了简洁，这里不再赘述。In some embodiments, the functions or modules of the image target classification apparatus provided by the embodiments of the present disclosure may be used to execute the methods described in the above method embodiments; for their specific implementation, reference may be made to the descriptions of the above method embodiments, which, for brevity, are not repeated here.
上文对各个实施例的描述倾向于强调各个实施例之间的不同之处，其相同或相似之处可以互相参考，为了简洁，本文不再赘述。The above descriptions of the various embodiments tend to emphasize the differences between them; for their identical or similar parts, reference may be made to one another, and for brevity, details are not repeated herein.
在本公开所提供的几个实施例中，应该理解到，所揭露的方法和装置，可以通过其它的方式实现。例如，以上所描述的装置实施方式仅仅是示意性的，例如，模块或单元的划分，仅仅为一种逻辑功能划分，实际实现时可以有另外的划分方式，例如单元或组件可以结合或者可以集成到另一个系统，或一些特征可以忽略，或不执行。另一点，所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口，装置或单元的间接耦合或通信连接，可以是电性、机械或其它的形式。In the several embodiments provided in the present disclosure, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus implementations described above are merely illustrative; for instance, the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation: units or components may be combined or integrated into another system, or some features may be ignored or not executed. Furthermore, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses or units, and may be electrical, mechanical or in other forms.
另外,在本公开各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。In addition, each functional unit in each embodiment of the present disclosure may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit. The above-mentioned integrated units may be implemented in the form of hardware, or may be implemented in the form of software functional units.
集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时，可以存储在一个计算机可读取存储介质中。基于这样的理解，本公开的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的全部或部分可以以软件产品的形式体现出来，该计算机软件产品存储在一个存储介质中，包括若干指令用以使得一台计算机设备(可以是个人计算机，服务器，或者网络设备等)或处理器(processor)执行本公开各个实施方式方法的全部或部分步骤。而前述的存储介质包括：U盘、移动硬盘、只读存储器(Read-Only Memory，ROM)、随机存取存储器(Random Access Memory，RAM)、磁碟或者光盘等各种可以存储程序代码的介质。If the integrated unit is implemented as a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solutions of the present disclosure, in essence, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the various embodiments of the present disclosure. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.
工业实用性Industrial Applicability
本公开提供了一种图像目标分类方法、装置、设备、存储介质及程序，其中，图像目标分类方法包括：获取包含目标对象的至少一张待分类图像，其中，所述至少一张待分类图像为属于至少一种扫描图像类别的医学图像；利用分类模型，对所述至少一张待分类图像进行目标分类，得到目标对象的类型。The present disclosure provides an image target classification method, apparatus, device, storage medium and program, wherein the image target classification method includes: acquiring at least one image to be classified containing a target object, wherein the at least one image to be classified is a medical image belonging to at least one scanned image category; and performing target classification on the at least one image to be classified by using a classification model, to obtain the type of the target object.

Claims (18)

  1. 一种图像目标分类方法,包括:An image object classification method, comprising:
    获取包含目标对象的至少一张待分类图像,其中,所述至少一张待分类图像为属于至少一种扫描图像类别的医学图像;acquiring at least one image to be classified containing the target object, wherein the at least one image to be classified is a medical image belonging to at least one scanned image category;
    利用分类模型,对所述至少一张待分类图像进行目标分类,得到所述目标对象的类型。Using a classification model, the at least one image to be classified is subjected to target classification to obtain the type of the target object.
  2. 根据权利要求1所述的方法,所述对所述至少一张待分类图像进行目标分类,得到所述目标对象的类型,包括:The method according to claim 1, wherein the target classification is performed on the at least one image to be classified to obtain the type of the target object, comprising:
    对所述至少一张待分类图像进行若干层特征提取,对应得到若干组初始特征信息;其中,每组所述初始特征信息的尺寸不同;Performing several layers of feature extraction on the at least one image to be classified, correspondingly obtaining several groups of initial feature information; wherein, the size of each group of the initial feature information is different;
    基于所述若干组初始特征信息中的至少一组初始特征信息,得到最终特征信息;Obtain final feature information based on at least one set of initial feature information in the several sets of initial feature information;
    对所述最终特征信息进行分类,得到所述目标对象的类型。The final feature information is classified to obtain the type of the target object.
  3. 根据权利要求1所述的方法,所述对所述至少一张待分类图像进行目标分类,得到所述目标对象的类型之前,所述方法还包括:The method according to claim 1, before the target classification is performed on the at least one image to be classified and the type of the target object is obtained, the method further comprises:
    基于所述待分类图像中所述目标对象对应的初始区域,得到所述目标对象的最终区域;obtaining the final area of the target object based on the initial area corresponding to the target object in the image to be classified;
    相应地,所述对所述至少一张待分类图像进行若干层特征提取,对应得到若干组初始特征信息,包括:Correspondingly, several layers of feature extraction are performed on the at least one image to be classified, and several sets of initial feature information are correspondingly obtained, including:
    利用所述最终区域对所述至少一张待分类图像进行若干层特征提取，对应得到若干组初始特征信息；其中，在特征提取过程中，所述待分类图像中对应所述最终区域的权重高于所述待分类图像中其他区域的权重；和/或，所述初始特征信息中对应所述最终区域的特征比其他区域的特征更丰富。performing several layers of feature extraction on the at least one image to be classified by using the final region, to correspondingly obtain several sets of initial feature information; wherein, during feature extraction, the weight of the region of the image to be classified corresponding to the final region is higher than the weight of other regions in the image to be classified; and/or, in the initial feature information, the features corresponding to the final region are richer than the features of other regions.
  4. 根据权利要求3所述的方法,所述基于所述待分类图像中所述目标对象对应的初始区域,得到所述目标对象的最终区域,包括:The method according to claim 3, wherein the obtaining of the final area of the target object based on the initial area corresponding to the target object in the image to be classified includes:
    获取所述至少一张待分类图像中所述目标对象对应的初始区域的并集,以作为所述目标对象的最终区域。The union of the initial regions corresponding to the target object in the at least one image to be classified is obtained as the final region of the target object.
  5. 根据权利要求3或4所述的方法，所述至少一张待分类图像包括未标注所述目标对象的初始区域的第一待分类图像和标注所述目标对象的初始区域的第二待分类图像；所述基于所述待分类图像中所述目标对象对应的初始区域，得到所述目标对象的最终区域之前，所述方法还包括：The method according to claim 3 or 4, wherein the at least one image to be classified includes a first image to be classified on which the initial region of the target object is not annotated and a second image to be classified on which the initial region of the target object is annotated; before obtaining the final region of the target object based on the initial region corresponding to the target object in the image to be classified, the method further comprises:
    利用所述分类模型检测到所述第一待分类图像未标注有所述目标对象的初始区域，并基于所述第二待分类图像上标注的所述目标对象的初始区域以及所述第二待分类图像与所述第一待分类图像的配准关系，确定所述第一待分类图像上所述目标对象的初始区域。detecting, by using the classification model, that the first image to be classified is not annotated with the initial region of the target object, and determining the initial region of the target object on the first image to be classified based on the initial region of the target object annotated on the second image to be classified and the registration relationship between the second image to be classified and the first image to be classified.
  6. 根据权利要求2至5任一项所述的方法,所述基于所述若干组初始特征信息中的至少一组初始特征信息,得到最终特征信息之前,所述方法还包括:The method according to any one of claims 2 to 5, before obtaining the final feature information based on at least one set of initial feature information in the several sets of initial feature information, the method further comprises:
    将每组所述初始特征信息转换为预设维度;converting each group of the initial feature information into a preset dimension;
    和/或,所述基于所述若干组初始特征信息中的至少一组初始特征信息,得到最终特征信息,包括:And/or, obtaining final feature information based on at least one set of initial feature information in the several groups of initial feature information, including:
    利用所述至少一组初始特征信息的权重,将所述至少一组初始特征信息进行融合,得到所述最终特征信息。Using the weight of the at least one set of initial feature information, the at least one set of initial feature information is fused to obtain the final feature information.
  7. 根据权利要求6所述的方法，每组所述初始特征信息的权重是在所述分类模型训练过程确定的。The method according to claim 6, wherein the weight of each set of the initial feature information is determined during the training process of the classification model.
  8. 根据权利要求6或7所述的方法,所述预设维度为一维。The method according to claim 6 or 7, wherein the preset dimension is one dimension.
  9. 根据权利要求1至8任一项所述的方法，所述分类模型在训练过程中采用ArcFace损失函数确定所述分类模型的损失值；和/或，所述分类模型每次训练选择的批样本数据是利用数据生成器从样本数据集中选择的不同目标类型的数量为预设比例的样本数据。The method according to any one of claims 1 to 8, wherein the classification model uses the ArcFace loss function during training to determine the loss value of the classification model; and/or, the batch sample data selected for each training iteration of the classification model is sample data selected from the sample data set by a data generator such that the numbers of different target types follow a preset ratio.
  10. 根据权利要求1至9任一项所述的方法,所述获取包含目标对象的至少一张待分类图像,包括:The method according to any one of claims 1 to 9, wherein the acquiring at least one image to be classified containing the target object comprises:
    分别从多张原始医学图像提取得到包含所述目标对象的待分类图像。The to-be-classified images containing the target object are respectively extracted from a plurality of original medical images.
  11. 根据权利要求10所述的方法,所述分别从多张原始医学图像提取得到包含所述目标对象的待分类图像,包括:The method according to claim 10, wherein the images to be classified including the target object are extracted from a plurality of original medical images respectively, comprising:
    确定所述原始医学图像中所述目标对象的初始区域,按照所述预设比例扩大所述初始区域,得到待提取区域;determining the initial area of the target object in the original medical image, and expanding the initial area according to the preset ratio to obtain the area to be extracted;
    从所述原始医学图像中提取所述待提取区域中的图像数据,得到所述待分类图像。The image data in the to-be-extracted area is extracted from the original medical image to obtain the to-be-classified image.
  12. 根据权利要求10或11所述的方法,所述分别从多张原始医学图像提取得到包含所述目标对象的待分类图像之前,所述方法还包括以下至少一个步骤:The method according to claim 10 or 11, before the image to be classified containing the target object is extracted from a plurality of original medical images, the method further comprises at least one of the following steps:
    将所述原始医学图像重采样至预设分辨率;resampling the original medical image to a preset resolution;
    调整所述原始医学图像中的像素值范围;adjusting the range of pixel values in the original medical image;
    将所述原始医学图像进行归一化处理;normalizing the original medical image;
    检测到第一原始医学图像未标注有所述目标对象的初始区域，利用第二原始医学图像上标注的所述目标对象的初始区域以及所述第二原始医学图像与所述第一原始医学图像的配准关系，确定所述第一原始医学图像上所述目标对象的初始区域。upon detecting that a first original medical image is not annotated with the initial region of the target object, determining the initial region of the target object on the first original medical image by using the initial region of the target object annotated on a second original medical image and the registration relationship between the second original medical image and the first original medical image.
  13. 根据权利要求10至12任一项所述的方法，所述原始医学图像和所述待分类图像为二维图像；或者，所述原始医学图像为三维图像，所述待分类图像为二维图像或三维图像。The method according to any one of claims 10 to 12, wherein the original medical image and the image to be classified are two-dimensional images; or, the original medical image is a three-dimensional image and the image to be classified is a two-dimensional image or a three-dimensional image.
  14. 根据权利要求13所述的方法,所述原始医学图像为三维图像,所述待分类图像为对所述原始医学图像中所述目标对象最大面积所在层提取得到的二维图像。The method according to claim 13, wherein the original medical image is a three-dimensional image, and the image to be classified is a two-dimensional image obtained by extracting the layer where the target object has the largest area in the original medical image.
  15. 一种图像目标分类装置,包括:An image object classification device, comprising:
    图像获取模块,配置为获取包含目标对象的至少一张待分类图像,其中,所述至少一张待分类图像为属于至少一种扫描图像类别的医学图像;an image acquisition module, configured to acquire at least one image to be classified including the target object, wherein the at least one image to be classified is a medical image belonging to at least one scanned image category;
    目标分类模块,配置为利用分类模型,对所述至少一张待分类图像进行目标分类,得到所述目标对象的类型。The target classification module is configured to use a classification model to perform target classification on the at least one image to be classified to obtain the type of the target object.
  16. 一种电子设备,包括相互耦接的存储器和处理器,所述处理器用于执行所述存储器中存储的程序指令,以实现权利要求1至14任一项所述的图像目标分类方法。An electronic device includes a memory and a processor coupled to each other, the processor is configured to execute program instructions stored in the memory, so as to implement the image object classification method according to any one of claims 1 to 14.
  17. 一种计算机可读存储介质,其上存储有程序指令,所述程序指令被处理器执行时实现权利要求1至14任一项所述的图像目标分类方法。A computer-readable storage medium having program instructions stored thereon, the program instructions implement the image object classification method according to any one of claims 1 to 14 when the program instructions are executed by a processor.
  18. 一种计算机程序，所述计算机程序包括计算机可读代码，在所述计算机可读代码在电子设备中运行的情况下，所述电子设备的处理器执行用于实现如权利要求1至14任一项所述的图像目标分类方法。A computer program comprising computer-readable code which, when run in an electronic device, causes a processor of the electronic device to execute the image target classification method according to any one of claims 1 to 14.
PCT/CN2020/139913 2020-11-03 2020-12-28 Image object classification method and apparatus, device, storage medium and program WO2022095258A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011212261.1 2020-11-03
CN202011212261.1A CN112329844A (en) 2020-11-03 2020-11-03 Image object classification method and related device, equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022095258A1 true WO2022095258A1 (en) 2022-05-12

Family

ID=74324556

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/139913 WO2022095258A1 (en) 2020-11-03 2020-12-28 Image object classification method and apparatus, device, storage medium and program

Country Status (3)

Country Link
CN (1) CN112329844A (en)
TW (1) TW202219832A (en)
WO (1) WO2022095258A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114663715B (en) * 2022-05-26 2022-08-26 浙江太美医疗科技股份有限公司 Medical image quality control and classification model training method and device and computer equipment


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110993094B (en) * 2019-11-19 2023-05-23 中国科学院深圳先进技术研究院 Intelligent auxiliary diagnosis method and terminal based on medical image

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109872306A (en) * 2019-01-28 2019-06-11 腾讯科技(深圳)有限公司 Medical image cutting method, device and storage medium
CN111755118A (en) * 2020-03-16 2020-10-09 腾讯科技(深圳)有限公司 Medical information processing method, medical information processing device, electronic equipment and storage medium
CN111507381A (en) * 2020-03-31 2020-08-07 上海商汤智能科技有限公司 Image recognition method and related device and equipment
CN111476802A (en) * 2020-04-09 2020-07-31 山东财经大学 Medical image segmentation and tumor detection method and device based on dense convolution model and readable storage medium

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116452957A (en) * 2023-06-20 2023-07-18 小米汽车科技有限公司 Quality detection method and device for image annotation data and electronic equipment
CN116452957B (en) * 2023-06-20 2023-09-26 小米汽车科技有限公司 Quality detection method and device for image annotation data and electronic equipment
CN116524135A (en) * 2023-07-05 2023-08-01 方心科技股份有限公司 Three-dimensional model generation method and system based on image
CN116524135B (en) * 2023-07-05 2023-09-15 方心科技股份有限公司 Three-dimensional model generation method and system based on image
CN117689660A (en) * 2024-02-02 2024-03-12 杭州百子尖科技股份有限公司 Vacuum cup temperature quality inspection method based on machine vision
CN117689660B (en) * 2024-02-02 2024-05-14 杭州百子尖科技股份有限公司 Vacuum cup temperature quality inspection method based on machine vision

Also Published As

Publication number Publication date
TW202219832A (en) 2022-05-16
CN112329844A (en) 2021-02-05

Similar Documents

Publication Publication Date Title
WO2022095258A1 (en) Image object classification method and apparatus, device, storage medium and program
Cai et al. A review of the application of deep learning in medical image classification and segmentation
CN110033456B (en) Medical image processing method, device, equipment and system
Zuo et al. R2AU-Net: attention recurrent residual convolutional neural network for multimodal medical image segmentation
WO2021196955A1 (en) Image recognition method and related apparatus, and device
JP2020518915A (en) System and method for automated fundus image analysis
Luo et al. Micro-vessel image segmentation based on the AD-UNet model
Zhao et al. SCOAT-Net: A novel network for segmenting COVID-19 lung opacification from CT images
CN111476793B (en) Dynamic enhanced magnetic resonance imaging processing method, system, storage medium and terminal
WO2022062590A1 (en) Image recognition method and apparatus, device, storage medium and program
CN110956632B (en) Method and device for automatically detecting pectoralis major region in molybdenum target image
JP2023540910A (en) Connected Machine Learning Model with Collaborative Training for Lesion Detection
CN112001895B (en) Thyroid calcification detection device
WO2019184851A1 (en) Image processing method and apparatus, and training method for neural network model
Tan et al. Automated vessel segmentation in lung CT and CTA images via deep neural networks
CN114782307A (en) Enhanced CT image colorectal cancer staging auxiliary diagnosis system based on deep learning
CN111260651A (en) Stomach low-quality MRI image segmentation method based on deep migration learning
WO2019223121A1 (en) Lesion site recognition method and apparatus, and computer apparatus and readable storage medium
CN111062909A (en) Method and equipment for judging benign and malignant breast tumor
CN116109610A (en) Method and system for segmenting breast tumor in ultrasonic examination report image
Rodríguez et al. Color segmentation applied to study of the angiogenesis. Part I
Sridhar et al. Lung Segment Anything Model (LuSAM): A Prompt-integrated Framework for Automated Lung Segmentation on ICU Chest X-Ray Images
CN113870194A (en) Deep layer characteristic and superficial layer LBP characteristic fused breast tumor ultrasonic image processing device
CN114972192B (en) Breast molybdenum target pectoral large muscle region segmentation method based on deep learning
Sreelekshmi et al. A Review on Multimodal Medical Image Fusion

Legal Events

Date Code Title Description
ENP Entry into the national phase

Ref document number: 2021578180

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20960715

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.10.2023)