CN112561926A - Three-dimensional image segmentation method, system, storage medium and electronic device - Google Patents

Three-dimensional image segmentation method, system, storage medium and electronic device Download PDF

Info

Publication number
CN112561926A
CN112561926A CN202011414367.XA CN202011414367A
Authority
CN
China
Prior art keywords
hyper
dimensional image
voxel
image segmentation
voxels
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011414367.XA
Other languages
Chinese (zh)
Inventor
李艺飞
王同乐
周星杰
孙泽懿
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Minglue Artificial Intelligence Group Co Ltd
Original Assignee
Shanghai Minglue Artificial Intelligence Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Minglue Artificial Intelligence Group Co Ltd filed Critical Shanghai Minglue Artificial Intelligence Group Co Ltd
Priority to CN202011414367.XA priority Critical patent/CN112561926A/en
Publication of CN112561926A publication Critical patent/CN112561926A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10004Still image; Photographic image
    • G06T2207/10012Stereo images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a three-dimensional image segmentation method, a three-dimensional image segmentation system, a storage medium and electronic equipment, wherein the three-dimensional image segmentation method comprises the following steps: a hyper-voxel result obtaining step: performing initial segmentation on the three-dimensional image through a hyper-voxel generation algorithm to obtain a hyper-voxel; a hyper-voxel characteristic matrix obtaining step: calculating the characteristics of the hyper-voxels by taking the hyper-voxels as a unit, and then aggregating to obtain a hyper-voxel characteristic matrix; a clustering result obtaining step: obtaining a clustering result through a self-supervision depth subspace clustering network model according to the hyper-voxel characteristic matrix; an image segmentation step: and mapping the clustering result back to the three-dimensional image to finish the image segmentation of the three-dimensional image. Therefore, the invention completes the automatic segmentation of the three-dimensional image on the basis of the hyper-voxel without using the labeled data.

Description

Three-dimensional image segmentation method, system, storage medium and electronic device
Technical Field
The invention relates to the field of three-dimensional image segmentation, in particular to a three-dimensional image segmentation method and system based on self-supervised deep subspace clustering, a storage medium and an electronic device.
Background
Image segmentation is an important research direction in the field of computer vision and an important part of image semantic understanding. Image segmentation refers to the process of dividing an image into several regions with similar properties; from a mathematical point of view, it is the process of dividing an image into mutually disjoint regions. In recent years, with the deepening of deep learning technology, image segmentation has developed rapidly, and related technologies such as scene object segmentation, human foreground/background segmentation, human matting and three-dimensional reconstruction have been widely applied in industries such as autonomous driving, augmented reality and security monitoring.
The emergence of deep learning has driven rapid development in the image and speech fields, and the accuracy records of related tasks are constantly being refreshed. Deep learning can be classified into strongly supervised, weakly supervised and unsupervised learning according to the labeled data it uses. Strongly supervised learning trains the model with a large amount of complete and accurate labeled data, while weakly supervised learning trains with labels that are incomplete or inexact; in contrast, unsupervised learning uses no manually labeled data: labels are generated from the data itself during training to guide network learning. Because unsupervised learning does not depend on a large amount of labeled data, it has broad application scenarios, and many researchers have studied and tried to apply it to practical tasks.
The image segmentation technology is developed to the present, and there are many classical methods, such as a method based on graph theory, a method based on pixel clustering, a method based on depth semantics, etc., and various methods have certain application scenarios according to the algorithm characteristics.
The graph-theory-based methods use theories and methods from the field of graph theory: an image is mapped to a weighted undirected graph with pixels as nodes, the image segmentation problem is treated as a vertex partitioning problem on the graph, and a minimum-cut criterion is used to obtain the optimal segmentation of the image. Such methods associate the image segmentation problem with the min-cut problem of the graph, typically by mapping the image to be segmented to a weighted undirected graph G = (V, E), where V = {v_1, ..., v_n} is the set of vertices and E is the set of edges. Each node v_i ∈ V corresponds to a pixel in the image, each edge (v_i, v_j) ∈ E connects a pair of adjacent pixels, and the edge weight w(v_i, v_j) represents a non-negative similarity in gray scale, color or texture between the neighboring pixels. A segmentation S of the image is then a cut of the graph, and each segmented region C ∈ S corresponds to a sub-graph. The principle of the segmentation is to maximize the similarity within each sub-graph and minimize the similarity between sub-graphs.
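As a concrete, non-authoritative illustration of this graph construction (the 4-neighborhood, the Gaussian weight exp(−|I_i − I_j|²/σ) and the value of σ are choices made here for illustration, not taken from the patent), the following Python sketch builds the weighted graph for a small grayscale image and bipartitions it with a spectral relaxation of the normalized cut:

```python
import numpy as np

def graph_cut_bipartition(img, sigma=0.1):
    """Illustrative 2-way graph cut on a small grayscale image.

    Builds a weighted undirected graph over 4-connected pixels with
    weights w(i, j) = exp(-|I_i - I_j|^2 / sigma) and splits it using the
    sign of the Fiedler vector of the normalized Laplacian.
    """
    h, w = img.shape
    n = h * w
    idx = np.arange(n).reshape(h, w)
    W = np.zeros((n, n))
    # connect each pixel to its right and bottom neighbours
    for di, dj in [(0, 1), (1, 0)]:
        a = idx[: h - di, : w - dj].ravel()
        b = idx[di:, dj:].ravel()
        diff = img.ravel()[a] - img.ravel()[b]
        wgt = np.exp(-(diff ** 2) / sigma)
        W[a, b] = wgt
        W[b, a] = wgt
    d = W.sum(axis=1)
    # normalized Laplacian L = I - D^{-1/2} W D^{-1/2}
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(n) - (d_inv_sqrt[:, None] * W) * d_inv_sqrt[None, :]
    vals, vecs = np.linalg.eigh(L)
    fiedler = vecs[:, 1]                    # second-smallest eigenvector
    return (fiedler > 0).astype(int).reshape(h, w)

# toy image: bright square on a dark background
img = np.zeros((16, 16)); img[4:12, 4:12] = 1.0
print(graph_cut_bipartition(img))
```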
The clustering-based methods use clustering techniques from machine learning to solve the image segmentation problem; typical methods include k-means, spectral clustering, SLIC (Simple Linear Iterative Clustering), and the like.
The method comprises the following general steps:
1) initializing a coarse cluster;
2) using iteration to cluster pixel points with similar characteristics such as color, brightness and texture into the same superpixel or supervoxel, iterating until convergence to obtain the final image segmentation result (a minimal SLIC example is sketched below).
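For instance, a minimal run of the publicly available SLIC implementation in scikit-image (assuming scikit-image is installed; the sample image and parameter values are purely illustrative) produces such an iterative pixel clustering:

```python
from skimage.segmentation import slic
from skimage import data

# Cluster pixels with similar color into superpixels; n_segments sets the
# rough initial clustering and compactness trades off color similarity
# against spatial proximity during the iterations.
img = data.astronaut()                      # sample RGB image
labels = slic(img, n_segments=200, compactness=10, start_label=1)
print(labels.shape, labels.max())           # one integer superpixel id per pixel
```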
The depth-semantics-based segmentation methods train a deep convolutional network model, extract high-level information from the original image through a series of convolution layers, pooling layers, upsampling layers and classification layers, and finally perform pixel-level or voxel-level classification to realize image segmentation. Such methods have been the research focus of recent years, and many classical models have emerged, such as FCN (Fully Convolutional Networks), the DeepLab series, and PSPNet (Pyramid Scene Parsing Network).
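As an illustration only (this uses a public torchvision model, not the method of the invention; the class count and input size are arbitrary), a pre-built FCN can produce the pixel-level class scores described above:

```python
import torch
from torchvision.models.segmentation import fcn_resnet50

model = fcn_resnet50(num_classes=21)        # 21 Pascal-VOC-style classes
model.eval()
x = torch.randn(1, 3, 224, 224)             # dummy RGB image batch
with torch.no_grad():
    logits = model(x)["out"]                # [1, 21, 224, 224] per-pixel scores
pred = logits.argmax(dim=1)                 # pixel-level class map
print(pred.shape)
```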
However, in practical use, the existing image segmentation method has the following defects:
1. the segmentation method based on graph theory is very dependent on the contrast of color gray scale information between pixels or voxels in an image, and when some blurred images with small color gray scale difference are encountered, the processing result is not ideal.
2. The clustering-based method is greatly influenced by the initial segmentation result, and the first rough clustering result may be different each time, so that the results of the later iteration are different, and the method is not stable enough.
3. A deep learning method based on semantic segmentation generally uses a strong supervised learning model, trains the model through a large amount of labeled data, and extracts effective characteristics. However, in the image processing task in real life, the cost of a large amount of annotation data is high, and the use scenes of the annotation data are limited.
Therefore, there is an urgent need to develop a three-dimensional image segmentation method, system, storage medium and electronic device based on self-supervised deep subspace clustering, which overcome the above-mentioned defects, so that effective features can be extracted and image segmentation effectively realized without relying on a large amount of labeled data in actual scenes.
Disclosure of Invention
In view of the above problem, the present invention provides a three-dimensional image segmentation method, including:
a hyper-voxel result obtaining step: performing initial segmentation on the three-dimensional image through a hyper-voxel generation algorithm to obtain a hyper-voxel;
a hyper-voxel characteristic matrix obtaining step: calculating the characteristics of the super voxels by taking the super voxels as a unit, and then aggregating to obtain a super voxel characteristic matrix;
a clustering result obtaining step: obtaining a clustering result through a self-supervision deep subspace clustering network model according to the hyper-voxel characteristic matrix;
an image segmentation step: and mapping the clustering result back to the three-dimensional image to finish the image segmentation of the three-dimensional image.
In the above three-dimensional image segmentation method, the obtaining of the hyper-voxel result includes: performing an initial segmentation of the three-dimensional image using a hyper-voxel generation algorithm, dividing the three-dimensional image into a series of hyper-voxel blocks to obtain a target number of the hyper-voxels.
The three-dimensional image segmentation method described above, wherein the obtaining of the hyper-voxel characteristic matrix includes:
splicing: splicing the features of different categories in the super voxels to obtain the features of the super voxels;
an aggregation step: aggregating all the super voxel characteristic vectors together to obtain the super voxel characteristic matrix.
In the above three-dimensional image segmentation method, the obtaining of the clustering result includes:
training: training the self-supervision deep subspace clustering network model;
and (3) clustering result output step: and outputting a clustering result through the trained self-supervision depth subspace clustering network model according to the hyper-voxel characteristic matrix.
The invention also provides a three-dimensional image segmentation system, which comprises:
the hyper-voxel result obtaining unit is used for carrying out initial segmentation on the three-dimensional image through a hyper-voxel generation algorithm to obtain hyper-voxels;
a hyper-voxel characteristic matrix obtaining unit, which calculates the characteristics of the hyper-voxels by taking the hyper-voxels as a unit and then carries out aggregation to obtain a hyper-voxel characteristic matrix;
the clustering result obtaining unit is used for obtaining a clustering result through a self-supervision deep subspace clustering network model according to the hyper-voxel characteristic matrix;
and the image segmentation unit is used for mapping the clustering result back to the three-dimensional image to complete the image segmentation of the three-dimensional image.
The above three-dimensional image segmentation system, wherein the hyper-voxel result obtaining unit performs initial segmentation on the three-dimensional image by using a hyper-voxel generation algorithm, and divides the three-dimensional image into a series of hyper-voxel blocks to obtain a target number of the hyper-voxels.
The three-dimensional image segmentation system described above, wherein the super voxel characteristic matrix obtaining unit includes:
the splicing module is used for splicing the features of different categories in the super voxel to obtain the features of the super voxel;
and the aggregation module is used for aggregating all the super voxel characteristic vectors to obtain the super voxel characteristic matrix.
The above three-dimensional image segmentation system, wherein the clustering result obtaining unit includes:
the training module is used for training the self-supervision deep subspace clustering network model;
and the clustering result output module outputs a clustering result through the trained self-supervision deep subspace clustering network model according to the hyper-voxel characteristic matrix.
The invention also provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the three-dimensional image segmentation method as described in any one of the above when executing the computer program.
The present invention also provides a storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements a three-dimensional image segmentation method as defined in any one of the above.
In summary, compared with the prior art, the invention has the following effects: the invention completes the automatic segmentation of the three-dimensional image on the basis of the hyper-voxel without using the labeled data.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is a flow chart of a three-dimensional image segmentation method of the present invention;
FIG. 2 is a flowchart illustrating the substeps of step S2 in FIG. 1;
FIG. 3 is a flowchart illustrating the substeps of step S3 in FIG. 1;
FIG. 4 is a flowchart illustrating an application of the three-dimensional image segmentation method of the present invention;
FIG. 5 is a diagram of an auto-supervised deep subspace clustering-based network according to the present invention;
FIG. 6 is a schematic diagram of a three-dimensional image segmentation system according to the present invention;
fig. 7 is a schematic structural diagram of an electronic device according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The exemplary embodiments and descriptions of the present invention are provided to explain the present invention and not to limit the present invention. Additionally, the same or similar numbered elements/components used in the drawings and the embodiments are used to represent the same or similar parts.
As used herein, the terms "first", "second", "S1", "S2", …, etc. do not particularly denote an order or sequential meaning, nor are they intended to limit the present invention, but merely distinguish between elements or operations described in the same technical terms.
With respect to directional terminology used herein, for example: up, down, left, right, front or rear, etc., are simply directions with reference to the drawings. Accordingly, the directional terminology used is intended to be illustrative and is not intended to be limiting of the present teachings.
As used herein, the terms "comprising," "including," "having," "containing," and the like are open-ended terms that mean including, but not limited to.
As used herein, "and/or" includes any and all combinations of the described items.
References to "plurality" herein include "two" and "more than two"; reference to "sets" herein includes "two sets" and "more than two sets".
As used herein, the terms "substantially", "about" and the like are used to modify any slight variation in quantity or error that does not alter the nature of the variation. Generally, the range of slight variations or errors modified by such terms may be 20% in some embodiments, 10% in some embodiments, 5% in some embodiments, or other values. It should be understood by those skilled in the art that the aforementioned values can be adjusted according to actual needs, and are not limited thereto.
Certain words used to describe the present application are discussed below or elsewhere in this specification to provide additional guidance to those skilled in the art in describing the present application.
The following explains the related terms used in the present invention:
hyper-voxel: and (3) collecting adjacent voxel points with similar color, texture and other characteristics in the image.
Depth subspace: the original features are transformed through the nonlinear features peculiar to the neural network, and the space where the features are located is obtained.
Self-supervision learning: under the condition that manual data labeling is not needed, the data generate labels by themselves, the model is supervised for learning and training, and specific tasks are completed.
The invention aims to provide a three-dimensional image segmentation method based on self-supervised deep subspace clustering, which performs three-dimensional image segmentation in an unsupervised learning manner without requiring labeled data.
Referring to fig. 1-3, fig. 1 is a flowchart illustrating a three-dimensional image segmentation method according to the present invention; FIG. 2 is a flowchart illustrating the substeps of step S2 in FIG. 1; fig. 3 is a flowchart illustrating a substep of step S3 in fig. 1. As shown in fig. 1, the three-dimensional image segmentation method of the present invention includes:
the hyper-voxel result obtaining step S1: performing initial segmentation on the three-dimensional image through a hyper-voxel generation algorithm to obtain hyper-voxels;
a hyper-voxel characteristic matrix obtaining step S2: calculating the characteristics of the hyper-voxels by taking the hyper-voxels as a unit, and then aggregating to obtain a hyper-voxel characteristic matrix;
a clustering result obtaining step S3: obtaining a clustering result through a self-supervision depth subspace clustering network model according to the hyper-voxel characteristic matrix;
image segmentation step S4: and mapping the clustering result back to the three-dimensional image to finish the image segmentation of the three-dimensional image.
Further, the hyper-voxel result obtaining step S1 includes: performing an initial segmentation of the three-dimensional image using a hyper-voxel generation algorithm, dividing the three-dimensional image into a series of hyper-voxel blocks to obtain a target number of the hyper-voxels.
Still further, the super voxel characteristic matrix obtaining step S2 includes:
splicing: splicing the features of different categories in the super voxels to obtain the features of the super voxels;
an aggregation step: aggregating all the super voxel characteristic vectors together to obtain the super voxel characteristic matrix.
Still further, the clustering result obtaining step S3 includes:
training: training the self-supervision deep subspace clustering network model;
and (3) clustering result output step: and outputting a clustering result through the trained self-supervision depth subspace clustering network model according to the hyper-voxel characteristic matrix.
Based on the three-dimensional image segmentation method, the automatic segmentation of the three-dimensional image is completed on the basis of the hyper-voxel without using labeled data.
Referring to fig. 4-5, fig. 4 is a flowchart illustrating an application of the three-dimensional image segmentation method according to the present invention; FIG. 5 is a diagram illustrating an auto-supervised deep subspace clustering-based network according to the present invention. The specific operation of the three-dimensional image segmentation method according to the present invention is described in an embodiment with reference to fig. 4-5 as follows.
Step 1: performing initial segmentation on the original three-dimensional image by using a hyper-voxel generation algorithm to obtain hyper-voxels.
Specifically, in this step, the original three-dimensional image is initially segmented using a hyper-voxel generation algorithm and divided into a series of hyper-voxel blocks to produce a target number of hyper-voxels. A hyper-voxel is a set of neighboring voxel points in the image that have similar color, texture and other characteristics. The hyper-voxel generation algorithm used by the invention is the Iterative Spatial Fuzzy Clustering (ISFC) algorithm, which gathers voxel points with high similarity together as a hyper-voxel on the basis of the selected features and similarity measure.
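The ISFC algorithm itself is not reproduced here. As a rough stand-in that yields the same kind of output (one integer supervoxel id per voxel and a controllable target number of supervoxels), the sketch below runs scikit-image's SLIC on a 3-D volume; the toy volume, segment count and compactness value are illustrative assumptions.

```python
import numpy as np
from skimage.segmentation import slic

# Stand-in for the ISFC supervoxel step (scikit-image >= 0.19 for channel_axis):
# SLIC on a single-channel 3-D volume returns one supervoxel id per voxel.
volume = np.random.rand(64, 64, 64).astype(np.float32)   # toy 3-D image in [0, 1]
sv_labels = slic(volume, n_segments=300, compactness=0.1,
                 channel_axis=None, start_label=0)
n_supervoxels = sv_labels.max() + 1
print(sv_labels.shape, n_supervoxels)
```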
Step 2: the base features are calculated in units of hyper-voxels.
Specifically, in this step, features such as the gray histogram, the Local Binary Pattern (LBP) and SIFT are calculated in units of hyper-voxels. The features of different categories within a hyper-voxel are spliced together as the features of that hyper-voxel, and all the hyper-voxel feature vectors are aggregated to obtain the hyper-voxel characteristic matrix. In order to reduce the influence of abnormal voxel points within a hyper-voxel on the whole and to keep all the hyper-voxel features consistent, normalization preprocessing is needed.
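A minimal sketch of this aggregation step follows, computing only the gray histogram per hyper-voxel (LBP and SIFT features would be concatenated the same way); the bin count, intensity range and normalization scheme are assumptions for illustration, and sv_labels refers to the supervoxel map from the previous sketch.

```python
import numpy as np

def supervoxel_feature_matrix(volume, sv_labels, n_bins=16):
    """Build a hyper-voxel feature matrix: one gray histogram per supervoxel,
    stacked into an N x d matrix and normalized."""
    n_sv = sv_labels.max() + 1
    feats = np.zeros((n_sv, n_bins), dtype=np.float64)
    for k in range(n_sv):
        vals = volume[sv_labels == k]
        hist, _ = np.histogram(vals, bins=n_bins, range=(0.0, 1.0))
        feats[k] = hist / max(hist.sum(), 1)          # per-supervoxel normalization
    # zero-mean / unit-variance normalization of each feature column
    feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
    return feats                                       # shape: (N, d)

# X = supervoxel_feature_matrix(volume, sv_labels)     # using the arrays above
```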
Step 3: sending the hyper-voxel characteristic matrix into the self-supervised deep subspace clustering network model for training.
Specifically, in this step, the hyper-voxel characteristic matrix obtained in the previous step is sent to a Self-Supervised deep Subspace Clustering Network (S3CN) model for training, and hyper-voxels of different classes are clustered without label data.
1) The S3CN model consists of three parts, Auto-encoder (AE), Self-Expression Layer (SEL) and Spectral Clustering (SC), as shown in FIG. 5.
2) The autoencoder consists of an encoder and a decoder. The encoder consists of three fully-connected layers with gradually decreasing dimensionality; the decoder also consists of three fully-connected layers with gradually increasing dimensionality, so that the two form a symmetrical structure. The encoder performs nonlinear extraction on the original features, which are then linearly recombined by the self-expression layer and finally sent to the decoder to be restored to the original dimensions.
As shown in FIG. 5, h^(m) denotes the output data of the m-th layer, d_m denotes the data dimension of the m-th layer, and h^(0) denotes the output of the 0-th layer, namely the input hyper-voxel characteristic matrix. The output data h^(m+1) of the (m+1)-th layer is calculated as shown in equation (2.1):

h^(m+1) = σ(W_{m+1} h^(m) + b_{m+1})    (2.1)

where W_{m+1}, b_{m+1} represent the network parameters of the (m+1)-th layer and σ represents the activation function.
The self-expression module is an N×N matrix, where N represents the number of hyper-voxels generated from each three-dimensional image; under the action of the self-expression layer C, the output Z of the last layer of the encoder is reconstructed as CZ, its linear expression in terms of the other hyper-voxels. After passing through the feature extraction and self-expression modules, the input data yields three losses, as shown in formulas (2.2), (2.3) and (2.4):

L0 = ‖X − X̂‖_F²    (2.2)

L1 = ‖C‖_ℓ    (2.3)

L2 = ‖Z − CZ‖_F²    (2.4)

where X is the input hyper-voxel characteristic matrix and X̂ is the reconstruction output by the decoder.
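To make the data flow of equations (2.1)-(2.4) concrete, the following PyTorch sketch wires up a three-layer encoder, the N×N self-expression matrix C and a symmetric decoder, and computes the three losses. It is a minimal illustration, not the patent's implementation: the layer widths, the initialization of C and the use of the squared Frobenius norm for ‖C‖ are assumptions.

```python
import torch
import torch.nn as nn

class S3CNSketch(nn.Module):
    """Autoencoder + self-expression layer (a sketch; layer sizes are illustrative)."""
    def __init__(self, d_in, n_supervoxels, dims=(128, 64, 32)):
        super().__init__()
        enc, last = [], d_in
        for d in dims:                                   # h^(m+1) = sigma(W h^(m) + b), eq. (2.1)
            enc += [nn.Linear(last, d), nn.ReLU()]
            last = d
        self.encoder = nn.Sequential(*enc)
        dec, last = [], dims[-1]
        for d in list(dims[::-1][1:]) + [d_in]:          # symmetric decoder
            dec += [nn.Linear(last, d), nn.ReLU()]
            last = d
        self.decoder = nn.Sequential(*dec[:-1])          # no activation on the output layer
        # self-expression layer: an N x N coefficient matrix C, initialized near zero
        self.C = nn.Parameter(1e-4 * torch.randn(n_supervoxels, n_supervoxels))

    def forward(self, X):                                # X: (N, d_in) hyper-voxel feature matrix
        Z = self.encoder(X)                              # nonlinear feature extraction
        CZ = self.C @ Z                                  # linear recombination through C
        X_hat = self.decoder(CZ)                         # restore to the original dimension
        return Z, CZ, X_hat

def s3cn_ae_losses(X, Z, CZ, X_hat, C):
    """Losses (2.2)-(2.4); the squared Frobenius norm is assumed for ||C||."""
    L0 = ((X - X_hat) ** 2).sum()                        # reconstruction loss
    L1 = (C ** 2).sum()                                  # regularization of C
    L2 = ((Z - CZ) ** 2).sum()                           # self-expression loss
    return L0, L1, L2
```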
After the self-expression coefficient matrix C among the hyper-voxels is obtained, an affinity matrix A is constructed through formula (2.5), and a spectral clustering algorithm, namely the spectral clustering module in FIG. 5, is then applied to the matrix A to obtain a segmentation result Q; the latest result Q may in turn guide the construction of the next affinity matrix A, and the target loss function is shown in equation (2.6):
A = (|C| + |Cᵀ|) / 2    (2.5)

L3 = Σ_{i,j} A_{ij} ‖q_i − q_j‖²    (2.6)
wherein the clustering result Q ∈ {0,1}^(n×N) is the result of clustering the N hyper-voxels into n classes, and q_i, q_j are the i-th and j-th column vectors of Q, representing the probability that hyper-voxels i and j belong to each class. The result Q of each clustering round is not necessarily correct, but it has a certain guiding significance and can be used as a "weak label" to supervise the training of the feature extraction module.
Finally, the loss functions of the S3CN model are as shown in formula (2.7), and the four loss functions are summed, so that the clustering result is gradually improved in the iterative optimization process.
L = L0 + L1 + L2 + L3    (2.7)
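A hedged sketch of the spectral clustering step follows, using scikit-learn's SpectralClustering with a precomputed affinity as a stand-in for the spectral clustering module; the affinity construction and the way L3 would be assembled from Q are assumptions, not the patent's exact procedure.

```python
import numpy as np
from sklearn.cluster import SpectralClustering

def spectral_step(C, n_classes):
    """Affinity construction as in (2.5), then spectral clustering.

    Returns one class id per hyper-voxel; this Q can then serve as the
    'weak label' when forming the guidance loss L3 of (2.6).
    """
    A = 0.5 * (np.abs(C) + np.abs(C.T))                  # affinity matrix
    sc = SpectralClustering(n_clusters=n_classes, affinity="precomputed")
    return sc.fit_predict(A)

# One training iteration then combines the losses as in (2.7):
#   L = L0 + L1 + L2 + L3
# with L0, L1, L2 from the autoencoder/self-expression sketch above and L3
# computed from A and the one-hot form of Q.
# Q = spectral_step(model.C.detach().cpu().numpy(), n_classes=4)
```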
Step 4: mapping the clustering result of the hyper-voxels back to the original three-dimensional image to realize image segmentation.
Specifically, in this step, the clustering result Q obtained by the network model training in step 3 is mapped back to the original three-dimensional image, with all voxels inside the same hyper-voxel receiving the same label; the segmentation of the three-dimensional image is thus completed without requiring label data.
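The mapping itself reduces to an index lookup: every voxel takes the cluster label of the hyper-voxel it belongs to. A short sketch, reusing the illustrative sv_labels array from the supervoxel step and the per-hyper-voxel labels Q:

```python
import numpy as np

def map_back(sv_labels, Q):
    # every voxel inside hyper-voxel k receives Q[k]
    Q = np.asarray(Q)
    return Q[sv_labels]            # fancy indexing gives a (D, H, W) label volume

# segmentation = map_back(sv_labels, Q)
```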
Referring to fig. 6, fig. 6 is a schematic structural diagram of a three-dimensional image segmentation system according to the present invention. As shown in fig. 6, the three-dimensional image segmentation system of the present invention includes:
a hyper-voxel result obtaining unit 11, which initially segments the three-dimensional image by a hyper-voxel generation algorithm to obtain hyper-voxels;
a hyper-voxel characteristic matrix obtaining unit 12, which calculates the characteristics of the hyper-voxels by taking the hyper-voxels as a unit and then aggregates the calculated characteristics to obtain a hyper-voxel characteristic matrix;
a clustering result obtaining unit 13 for obtaining a clustering result through a self-supervision depth subspace clustering network model according to the hyper-voxel characteristic matrix;
and the image segmentation unit 14 is used for mapping the clustering result back to the three-dimensional image to complete the image segmentation of the three-dimensional image.
Wherein the hyper-voxel result obtaining unit 11 performs initial segmentation on the three-dimensional image using a hyper-voxel generation algorithm, and divides the three-dimensional image into a series of hyper-voxel blocks to obtain a target number of the hyper-voxels.
Further, the hyper-voxel characteristic matrix obtaining unit 12 includes:
a splicing module 121, which splices the features of different categories in the hyper-voxels to obtain the features of the hyper-voxels;
and the aggregation module 122 aggregates all the hyper-voxel feature vectors together to obtain the hyper-voxel feature matrix.
Still further, the clustering result obtaining unit 13 includes:
a training module 131, which trains the self-supervised deep subspace clustering network model;
and a clustering result output module 132, which outputs a clustering result according to the hyper-voxel characteristic matrix through the trained self-supervised depth subspace clustering network model.
Referring to fig. 7, fig. 7 is a schematic structural diagram of an electronic device according to the present invention. As shown in fig. 7, the present embodiment discloses a specific implementation of an electronic device. The electronic device may include a processor 81 and a memory 82 storing computer program instructions.
Specifically, the processor 81 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
Memory 82 may include, among other things, mass storage for data or instructions. By way of example, and not limitation, memory 82 may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 82 may include removable or non-removable (or fixed) media, where appropriate. The memory 82 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 82 is a Non-Volatile memory. In particular embodiments, memory 82 includes Read-Only Memory (ROM) and Random Access Memory (RAM). Where appropriate, the ROM may be a mask-programmed ROM, a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), an Electrically Alterable ROM (EAROM), or a FLASH memory, or a combination of two or more of these. Where appropriate, the RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode Dynamic Random Access Memory (FPM DRAM), an Extended Data Output Dynamic Random Access Memory (EDO DRAM), a Synchronous Dynamic Random Access Memory (SDRAM), and the like.
The memory 82 may be used to store or cache various data files for processing and/or communication use, as well as possible computer program instructions executed by the processor 81.
The processor 81 implements any one of the three-dimensional image segmentation methods in the above-described embodiments by reading and executing computer program instructions stored in the memory 82.
In some of these embodiments, the electronic device may also include a communication interface 83 and a bus 80. As shown in fig. 7, the processor 81, the memory 82, and the communication interface 83 are connected via the bus 80 to complete communication therebetween.
The communication interface 83 is used to implement communication between the modules, devices, units and/or equipment in the embodiments of the present application. The communication interface 83 may also carry out data communication with external devices, image/data acquisition devices, databases, external storage, image/data processing workstations, and the like.
The bus 80 includes hardware, software, or both, to couple the components of the electronic device to one another. Bus 80 includes, but is not limited to, at least one of the following: a Data Bus, an Address Bus, a Control Bus, an Expansion Bus, and a Local Bus. By way of example, and not limitation, bus 80 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Extended Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or another suitable bus, or a combination of two or more of these. Bus 80 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
In addition, in combination with the processing methods in the foregoing embodiments, the embodiments of the present application may be implemented by providing a computer-readable storage medium. The computer readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement any of the three-dimensional image segmentation methods of the embodiments described above.
In summary, with the three-dimensional image segmentation method provided by the invention, hyper-voxels are generated first and unsupervised image segmentation is then realized through clustering; by adopting a Self-Supervised deep Subspace Clustering Network (S3CN) model, the automatic segmentation of the three-dimensional image is completed on the basis of the hyper-voxels without using labeled data.
Although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A three-dimensional image segmentation method, comprising:
a hyper-voxel result obtaining step: performing initial segmentation on the three-dimensional image through a hyper-voxel generation algorithm to obtain a hyper-voxel;
a hyper-voxel characteristic matrix obtaining step: calculating the characteristics of the hyper-voxels by taking the hyper-voxels as a unit, and then aggregating to obtain a hyper-voxel characteristic matrix;
a clustering result obtaining step: obtaining a clustering result through a self-supervision depth subspace clustering network model according to the hyper-voxel characteristic matrix;
an image segmentation step: and mapping the clustering result back to the three-dimensional image to finish the image segmentation of the three-dimensional image.
2. A three-dimensional image segmentation method as set forth in claim 1, wherein the hyper-voxel result obtaining step includes: performing an initial segmentation of the three-dimensional image using a hyper-voxel generation algorithm, dividing the three-dimensional image into a series of hyper-voxel blocks to obtain a target number of the hyper-voxels.
3. The three-dimensional image segmentation method according to claim 1, wherein the super voxel characteristic matrix obtaining step includes:
splicing: splicing the features of different categories in the hyper-voxels to obtain the features of the hyper-voxels;
an aggregation step: aggregating all the hyper-voxel characteristic vectors together to obtain the hyper-voxel characteristic matrix.
4. The three-dimensional image segmentation method according to claim 1, wherein the clustering result obtaining step includes:
training: training the self-supervision deep subspace clustering network model;
and (3) clustering result output step: and outputting a clustering result through the trained self-supervision depth subspace clustering network model according to the hyper-voxel characteristic matrix.
5. A three-dimensional image segmentation system, comprising:
the hyper-voxel result obtaining unit is used for carrying out initial segmentation on the three-dimensional image through a hyper-voxel generation algorithm to obtain hyper-voxels;
a hyper-voxel characteristic matrix obtaining unit, which is used for calculating the characteristics of the hyper-voxels by taking the hyper-voxels as a unit and then carrying out aggregation to obtain a hyper-voxel characteristic matrix;
the clustering result obtaining unit is used for obtaining a clustering result through a self-supervision deep subspace clustering network model according to the hyper-voxel characteristic matrix;
and the image segmentation unit is used for mapping the clustering result back to the three-dimensional image and then completing the image segmentation of the three-dimensional image.
6. The three-dimensional image segmentation system of claim 5, wherein the hyper-voxel result obtaining unit performs an initial segmentation on the three-dimensional image using a hyper-voxel generation algorithm to divide the three-dimensional image into a series of hyper-voxel blocks to obtain a target number of the hyper-voxels.
7. The three-dimensional image segmentation system according to claim 5, wherein the hyper-voxel characteristic matrix obtaining unit includes:
the splicing module is used for splicing the features of different categories in the hyper-voxels to obtain the features of the hyper-voxels;
and the aggregation module is used for aggregating all the hyper-voxel characteristic vectors to obtain the hyper-voxel characteristic matrix.
8. The three-dimensional image segmentation system according to claim 5, wherein the clustering result obtaining unit includes:
the training module is used for training the self-supervision deep subspace clustering network model;
and the clustering result output module outputs a clustering result through the trained self-supervision deep subspace clustering network model according to the hyper-voxel characteristic matrix.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the three-dimensional image segmentation method according to any one of claims 1 to 4 when executing the computer program.
10. A storage medium on which a computer program is stored which, when being executed by a processor, carries out a three-dimensional image segmentation method as claimed in any one of claims 1 to 4.
CN202011414367.XA 2020-12-07 2020-12-07 Three-dimensional image segmentation method, system, storage medium and electronic device Pending CN112561926A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011414367.XA CN112561926A (en) 2020-12-07 2020-12-07 Three-dimensional image segmentation method, system, storage medium and electronic device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011414367.XA CN112561926A (en) 2020-12-07 2020-12-07 Three-dimensional image segmentation method, system, storage medium and electronic device

Publications (1)

Publication Number Publication Date
CN112561926A true CN112561926A (en) 2021-03-26

Family

ID=75058866

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011414367.XA Pending CN112561926A (en) 2020-12-07 2020-12-07 Three-dimensional image segmentation method, system, storage medium and electronic device

Country Status (1)

Country Link
CN (1) CN112561926A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113096138A (en) * 2021-04-13 2021-07-09 西安电子科技大学 Weak supervision semantic image segmentation method for selective pixel affinity learning
CN114119981A (en) * 2021-12-09 2022-03-01 成都理工大学 Magnetotelluric inversion enhancement method based on small sample deep learning unsupervised semantic segmentation
CN114240928A (en) * 2021-12-29 2022-03-25 湖南云箭智能科技有限公司 Board quality partition detection method, device and equipment and readable storage medium
CN115359074A (en) * 2022-10-20 2022-11-18 之江实验室 Image segmentation and training method and device based on hyper-voxel clustering and prototype optimization

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296653A (en) * 2016-07-25 2017-01-04 浙江大学 Brain CT image hemorrhagic areas dividing method based on semi-supervised learning and system
CN110473206A (en) * 2019-07-24 2019-11-19 东南大学 It is a kind of based on super voxel and the dispersion tensor image partition method for estimating study
CN110910408A (en) * 2019-11-28 2020-03-24 慧影医疗科技(北京)有限公司 Image segmentation method and device, electronic equipment and readable storage medium

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106296653A (en) * 2016-07-25 2017-01-04 浙江大学 Brain CT image hemorrhagic areas dividing method based on semi-supervised learning and system
CN110473206A (en) * 2019-07-24 2019-11-19 东南大学 It is a kind of based on super voxel and the dispersion tensor image partition method for estimating study
CN110910408A (en) * 2019-11-28 2020-03-24 慧影医疗科技(北京)有限公司 Image segmentation method and device, electronic equipment and readable storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JUNJIAN ZHANG ET AL: "Self-Supervised Convolutional Subspace Clustering Network", arXiv:1905.00149v1 [cs.CV], pages 1 - 10 *
ZHANG YING: "Research on Microvascular Segmentation Method Based on Sparse Representation and Dictionary Training", China Master's Theses Full-text Database, Medicine & Health Sciences, pages 35 - 36 *
CHEN MENG: "Research on Road Environment Point Cloud Segmentation Technology Based on Deep Neural Network", China Master's Theses Full-text Database, Engineering Science & Technology II, page 20 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113096138A (en) * 2021-04-13 2021-07-09 西安电子科技大学 Weak supervision semantic image segmentation method for selective pixel affinity learning
CN113096138B (en) * 2021-04-13 2023-04-28 西安电子科技大学 Weak supervision semantic image segmentation method for selective pixel affinity learning
CN114119981A (en) * 2021-12-09 2022-03-01 成都理工大学 Magnetotelluric inversion enhancement method based on small sample deep learning unsupervised semantic segmentation
CN114240928A (en) * 2021-12-29 2022-03-25 湖南云箭智能科技有限公司 Board quality partition detection method, device and equipment and readable storage medium
CN114240928B (en) * 2021-12-29 2024-03-01 湖南云箭智能科技有限公司 Partition detection method, device and equipment for board quality and readable storage medium
CN115359074A (en) * 2022-10-20 2022-11-18 之江实验室 Image segmentation and training method and device based on hyper-voxel clustering and prototype optimization

Similar Documents

Publication Publication Date Title
Žbontar et al. Stereo matching by training a convolutional neural network to compare image patches
Tong et al. Salient object detection via bootstrap learning
CN112561926A (en) Three-dimensional image segmentation method, system, storage medium and electronic device
CN111612008B (en) Image segmentation method based on convolution network
CN111191583B (en) Space target recognition system and method based on convolutional neural network
CN111461212B (en) Compression method for point cloud target detection model
CN114067107B (en) Multi-scale fine-grained image recognition method and system based on multi-grained attention
CN112183501B (en) Depth counterfeit image detection method and device
CN111652236A (en) Lightweight fine-grained image identification method for cross-layer feature interaction in weak supervision scene
CN110210431B (en) Point cloud semantic labeling and optimization-based point cloud classification method
CN113657560B (en) Weak supervision image semantic segmentation method and system based on node classification
CN108595558B (en) Image annotation method based on data equalization strategy and multi-feature fusion
CN107784288A (en) A kind of iteration positioning formula method for detecting human face based on deep neural network
CN111274981B (en) Target detection network construction method and device and target detection method
CN112287839A (en) SSD infrared image pedestrian detection method based on transfer learning
US11695898B2 (en) Video processing using a spectral decomposition layer
CN113989890A (en) Face expression recognition method based on multi-channel fusion and lightweight neural network
CN114998638A (en) Multi-view three-dimensional point cloud classification method based on dynamic and static convolution fusion neural network
CN110111365B (en) Training method and device based on deep learning and target tracking method and device
CN114782417A (en) Real-time detection method for digital twin characteristics of fan based on edge enhanced image segmentation
CN115482387A (en) Weak supervision image semantic segmentation method and system based on multi-scale class prototype
Alsanad et al. Real-time fuel truck detection algorithm based on deep convolutional neural network
Xu et al. High quality superpixel generation through regional decomposition
Wu et al. A lightweight network for vehicle detection based on embedded system
Turtinen et al. Contextual analysis of textured scene images.

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination