CN113971677A - Image segmentation method and device, electronic equipment and readable medium - Google Patents

Image segmentation method and device, electronic equipment and readable medium Download PDF

Info

Publication number
CN113971677A
CN113971677A (application CN202111237559.2A)
Authority
CN
China
Prior art keywords
layer
convolutional
image
deconvolution
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111237559.2A
Other languages
Chinese (zh)
Inventor
喻庐军
韩森尧
于吉鹏
侯博严
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Taikang Insurance Group Co Ltd
Original Assignee
Taikang Insurance Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Taikang Insurance Group Co Ltd filed Critical Taikang Insurance Group Co Ltd
Priority to CN202111237559.2A priority Critical patent/CN113971677A/en
Publication of CN113971677A publication Critical patent/CN113971677A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20152Watershed segmentation

Abstract

The present disclosure provides an image segmentation method, an apparatus, an electronic device and a readable medium, wherein the image segmentation method comprises: carrying out blocking processing on an image to be segmented to obtain a blocked image; classifying and labeling the block images according to the proportion of the target images in the block images; inputting the classified and labeled block images into a trained convolutional neural network to obtain segmented images; and guiding a machine learning algorithm to carry out edge correction on the image output by the convolutional neural network through the prior knowledge so as to obtain a corrected segmentation image. By the aid of the method and the device, accuracy, reliability and image processing speed of the obtained segmented image are improved.

Description

Image segmentation method and device, electronic equipment and readable medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to an image segmentation method, an image segmentation apparatus, an electronic device, and a readable medium.
Background
At present, lesion analysis, organ contour detection and the like in medical images all require accurate segmentation of the target image, so that doctors can make accurate judgments conveniently.
In the related art, machine vision is widely used in medical image processing scenes, and a machine vision product converts a shot target into an image signal, transmits the image signal to a special image processing system, obtains morphological information of the shot target, and converts the morphological information into a digital signal according to information such as pixel distribution, brightness and color.
However, in most medical images, the colors of the target region and the background region are similar, and the edges are often unclear, incoherent or missing, so that image segmentation of medical images suffers from poor reliability, low accuracy and slow speed.
It is to be noted that the information disclosed in the above background section is only for enhancement of understanding of the background of the present disclosure, and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
An object of the present disclosure is to provide an image segmentation method, apparatus, electronic device and readable medium for overcoming, at least to some extent, the problem of poor image segmentation accuracy due to the limitations and drawbacks of the related art.
According to a first aspect of the embodiments of the present disclosure, there is provided an image segmentation method, including: carrying out blocking processing on an image to be segmented to obtain a blocked image; classifying and labeling the block images according to the proportion of the target images in the block images; inputting the classified and labeled block images into a trained convolutional neural network to obtain segmented images; and guiding a machine learning algorithm to carry out edge correction on the image output by the convolutional neural network through the prior knowledge so as to obtain a corrected segmentation image.
In an exemplary embodiment of the present disclosure, the convolutional neural network includes a feature extraction layer and a deconvolution layer, which are connected in sequence, where the feature extraction layer includes an initial convolutional layer, a maximum pooling layer, and multiple convolutional layers, which are connected in sequence, the feature extraction layer is configured to extract feature vectors of the block images, and the deconvolution layer is configured to perform upsampling processing on the feature vectors to obtain a feature matrix of the block images.
In an exemplary embodiment of the present disclosure, a convolution kernel size of the initial convolutional layer is 7 × 7, a sampling interval of the initial convolutional layer is 2, and a channel number of the initial convolutional layer is 64.
In an exemplary embodiment of the present disclosure, the sampling interval of the maximum pooling layer is 2.
In an exemplary embodiment of the present disclosure, the feature extraction layer includes a maximum pooling layer, a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, and a fifth convolutional layer, an output result of the maximum pooling layer is input to the first convolutional layer, the first convolutional layer includes two bottleneck layers, a convolution kernel size of the first convolutional layer is 3 × 3, a sampling interval of the first convolutional layer is 1, the number of channels of the first convolutional layer is 64, an output result of the first convolutional layer is input to the second convolutional layer, the second convolutional layer includes two bottleneck layers, a convolution kernel size of the second convolutional layer is 3 × 3, a sampling interval of the second convolutional layer is 2, the number of channels of the second convolutional layer is 128, an output result of the second convolutional layer is input to the third convolutional layer, the third convolutional layer includes two bottleneck layers, the convolution kernel size of the third convolutional layer is 3 × 3, the sampling interval of the third convolutional layer is 1, the number of channels of the third convolutional layer is 256, the output result of the third convolutional layer is input to the fourth convolutional layer, the fourth convolutional layer includes two bottleneck layers, the convolution kernel size of the fourth convolutional layer is 3 × 3, the sampling interval of the fourth convolutional layer is 2, and the number of channels of the fourth convolutional layer is 512.
In an exemplary embodiment of the present disclosure, the deconvolution layers include a first deconvolution layer, a second deconvolution layer, a third deconvolution layer, a fourth deconvolution layer and a fifth deconvolution layer. The output result of the feature extraction layer is input to the first deconvolution layer, the first deconvolution layer includes two bottleneck layers, the convolution kernel size of the first deconvolution layer is 3 × 3, the sampling interval of the first deconvolution layer is 2, and the number of channels of the first deconvolution layer is 256. The output result of the first deconvolution layer is input to the second deconvolution layer, the second deconvolution layer includes two bottleneck layers, the convolution kernel size of the second deconvolution layer is 3 × 3, the sampling interval of the second deconvolution layer is 2, and the number of channels of the second deconvolution layer is 128. The output result of the second deconvolution layer is input to the third deconvolution layer, the third deconvolution layer includes two bottleneck layers, the convolution kernel size of the third deconvolution layer is 3 × 3, the sampling interval of the third deconvolution layer is 2, and the number of channels of the third deconvolution layer is 64. The output result of the third deconvolution layer is input to the fourth deconvolution layer, the fourth deconvolution layer includes two bottleneck layers, the convolution kernel size of the fourth deconvolution layer is 3 × 3, the sampling interval of the fourth deconvolution layer is 2, and the number of channels of the fourth deconvolution layer is 64. The output result of the fourth deconvolution layer is input to the fifth deconvolution layer, the fifth deconvolution layer includes two bottleneck layers, the convolution kernel size of the fifth deconvolution layer is 3 × 3, the sampling interval of the fifth deconvolution layer is 2, and the number of channels of the fifth deconvolution layer is 1.
In an exemplary embodiment of the present disclosure, before performing a blocking process on an image to be segmented, the method further includes:
inputting feature matrix samples of the classified and labeled block images into a full connection layer of the convolutional neural network to obtain feature vector samples output by the full connection layer of the convolutional neural network;
and constructing a ternary loss function according to the inter-class distance of the feature vector samples output by the full connection layer.
In an exemplary embodiment of the present disclosure, before performing a blocking process on an image to be segmented, the method further includes: and constructing the multilayer perceptron of each deconvolution layer, wherein the multilayer perceptron comprises an input layer, a hidden layer and an output layer which are sequentially connected, the feature vector samples of the block images are input into the input layer, the hidden layer is used for completing the deep extraction of the feature vector samples, and the output layer is used for executing the classification of the feature vector samples after the deep extraction.
In an exemplary embodiment of the present disclosure, before performing a blocking process on an image to be segmented, the method further includes: performing tiling operation on the feature matrix samples output by the deconvolution layer to obtain two-dimensional vector samples; inputting the two-dimensional vector samples to an input layer of the multilayer perceptron, and outputting classification results of the two-dimensional vector samples by the multilayer perceptron; and determining a classification loss function corresponding to the classification result through a cross entropy algorithm.
In an exemplary embodiment of the present disclosure, before performing a blocking process on an image to be segmented, the method further includes: inputting the classified and labeled block image samples into a convolutional neural network; and determining a regression loss function according to the output result sample of the convolutional neural network and the block image sample.
According to a second aspect of the embodiments of the present disclosure, there is provided an image segmentation apparatus including: the blocking module is used for carrying out blocking processing on the image to be segmented to obtain a block image; the labeling module is used for classifying and labeling the block images according to the proportion of the target images in the block images; the neural network module is used for inputting the classified and labeled block images into the trained convolutional neural network to obtain segmented images; and the correction module is used for guiding a machine learning algorithm to carry out edge correction on the image output by the convolutional neural network through a priori knowledge so as to obtain a corrected segmentation image.
According to a third aspect of the present disclosure, there is provided an electronic device comprising: a memory; and a processor coupled to the memory, the processor configured to perform the method of any of the above based on instructions stored in the memory.
According to a fourth aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon a program which, when executed by a processor, implements the image segmentation method as set forth in any one of the above.
According to the method and the device, the block images labeled by classification are identified through the convolutional neural network to obtain segmented images, the edges of the images output by the neural network are further corrected through the machine learning algorithm to obtain corrected segmented images, and the accuracy, the reliability and the image processing speed of the obtained segmented images are improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure. It is to be understood that the drawings in the following description are merely exemplary of the disclosure, and that other drawings may be derived from those drawings by one of ordinary skill in the art without the exercise of inventive faculty.
FIG. 1 shows a flow chart of an image segmentation method in an example embodiment of the present disclosure;
FIG. 2 shows a flow chart of an image segmentation method in another exemplary embodiment of the present disclosure;
FIG. 3 shows a flow chart of an image segmentation method in another exemplary embodiment of the present disclosure;
FIG. 4 shows a flow chart of an image segmentation method in another exemplary embodiment of the present disclosure;
FIG. 5 shows a flow chart of an image segmentation method in another exemplary embodiment of the present disclosure;
FIG. 6 shows a schematic diagram of a convolutional layer in an image segmentation scheme in an exemplary embodiment of the present disclosure;
FIG. 7 shows a schematic diagram of fully connected layers in an image segmentation scheme in another example embodiment of the present disclosure;
FIG. 8 shows a schematic diagram of mlp layers in an image segmentation scheme in another exemplary embodiment of the present disclosure;
FIG. 9 shows a schematic diagram of an improved unet network in an image segmentation scheme in another exemplary embodiment of the present disclosure;
FIG. 10 shows a schematic diagram of an image processed by an image segmentation scheme in an example embodiment of the present disclosure;
FIG. 11 shows a schematic diagram of an image processed by an image segmentation scheme in another example embodiment of the present disclosure;
FIG. 12 shows a schematic diagram of an image processed by an image segmentation scheme in another example embodiment of the present disclosure;
FIG. 13 shows a schematic diagram of an image processed by an image segmentation scheme in another example embodiment of the present disclosure;
FIG. 14 shows a block diagram of an image segmentation apparatus in an example embodiment of the present disclosure;
FIG. 15 illustrates a block diagram of an electronic device in an exemplary embodiment of the disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. Example embodiments may, however, be embodied in many different forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the disclosure. One skilled in the relevant art will recognize, however, that the subject matter of the present disclosure can be practiced without one or more of the specific details, or with other methods, components, devices, steps, and the like. In other instances, well-known technical solutions have not been shown or described in detail to avoid obscuring aspects of the present disclosure.
Further, the drawings are merely schematic illustrations of the present disclosure, in which the same reference numerals denote the same or similar parts, and thus, a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in the form of software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The following detailed description of exemplary embodiments of the disclosure refers to the accompanying drawings.
Fig. 1 is a flowchart of an image segmentation method in an exemplary embodiment of the present disclosure.
Referring to fig. 1, the image segmentation method may include:
and step S102, carrying out blocking processing on the image to be segmented to obtain a blocked image.
And step S104, classifying and labeling the block images according to the proportion of the target images in the block images.
And step S106, inputting the classified and labeled block images into the trained convolutional neural network to obtain segmented images.
And step S108, guiding a machine learning algorithm to carry out edge correction on the image output by the convolutional neural network through a priori knowledge so as to obtain a corrected segmentation image.
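For illustration, steps S102 and S104 may be sketched in Python as follows: the image to be segmented is split into fixed-size blocks (the 1024 × 1024 block size follows the training description later in this disclosure), and each block is labeled according to the proportion of target pixels it contains. The use of an annotation mask and the 0.5 proportion threshold are assumptions, not values fixed by the text.

```python
import numpy as np

def block_and_label(image, target_mask, block_size=1024, ratio_threshold=0.5):
    """Split an image into fixed-size blocks and label each block by the proportion
    of target pixels it contains (the threshold value is an assumption)."""
    blocks, labels = [], []
    height, width = target_mask.shape[:2]
    for y in range(0, height, block_size):
        for x in range(0, width, block_size):
            patch = image[y:y + block_size, x:x + block_size]
            patch_mask = target_mask[y:y + block_size, x:x + block_size]
            target_ratio = float(patch_mask.astype(bool).mean())  # proportion of target pixels
            blocks.append(patch)
            labels.append(1 if target_ratio >= ratio_threshold else 0)
    return blocks, labels
```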
According to the method and the device, the block images labeled by classification are identified through the convolutional neural network to obtain segmented images, the edges of the images output by the neural network are further corrected through the machine learning algorithm to obtain corrected segmented images, and the accuracy, the reliability and the image processing speed of the obtained segmented images are improved.
In an exemplary embodiment of the present disclosure, the machine learning algorithm may employ a watershed algorithm; a marker-based watershed algorithm is applied on the basis of the labeled image, and the watershed algorithm is guided by prior knowledge so as to obtain a better image segmentation effect.
In an exemplary embodiment of the present disclosure, the block image is referred to as image, and the segmented image obtained through convolutional neural network processing is referred to as segImage (a binarized image). A morphological erosion operation is performed on segImage to obtain a binary image segImage_erode, where segImage_erode is the marker image in the watershed algorithm, and an accurately segmented image is output after the segmented image is corrected by the watershed algorithm.
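A minimal OpenCV sketch of this marker-guided correction is given below: segImage is eroded to obtain segImage_erode, whose connected components serve as the prior-knowledge markers for the watershed algorithm. The structuring-element size and the handling of the background/unknown regions are assumptions not specified in the text.

```python
import cv2
import numpy as np

def watershed_refine(image_bgr, seg_image):
    """Correct the edges of the CNN segmentation (seg_image: uint8 binary image, 0/255)."""
    kernel = np.ones((5, 5), np.uint8)                 # structuring element size is an assumption
    seg_erode = cv2.erode(seg_image, kernel)           # segImage_erode: sure-foreground markers
    sure_bg = cv2.dilate(seg_image, kernel)            # background estimate (assumption)
    unknown = cv2.subtract(sure_bg, seg_erode)         # region left for the watershed to decide
    _, markers = cv2.connectedComponents(seg_erode)    # label each eroded component as a marker
    markers = markers + 1                              # reserve label 0 for the unknown region
    markers[unknown == 255] = 0
    markers = cv2.watershed(image_bgr, markers)        # watershed marks boundaries with -1
    return np.where(markers > 1, 255, 0).astype(np.uint8)
```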
The following describes each step of the image segmentation method in detail.
In an exemplary embodiment of the present disclosure, the convolutional neural network includes a feature extraction layer and a deconvolution layer, which are connected in sequence, where the feature extraction layer includes an initial convolutional layer, a maximum pooling layer, and multiple convolutional layers, which are connected in sequence, the feature extraction layer is configured to extract feature vectors of the block images, and the deconvolution layer is configured to perform upsampling processing on the feature vectors to obtain a feature matrix of the block images.
In an exemplary embodiment of the present disclosure, a convolution kernel size of the initial convolutional layer is 7 × 7, a sampling interval of the initial convolutional layer is 2, and a channel number of the initial convolutional layer is 64.
In an exemplary embodiment of the present disclosure, the sampling interval of the maximum pooling layer is 2.
In an exemplary embodiment of the present disclosure, the feature extraction layer includes a maximum pooling layer, a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, and a fifth convolutional layer, an output result of the maximum pooling layer is input to the first convolutional layer, the first convolutional layer includes two bottleneck layers, a convolution kernel size of the first convolutional layer is 3 × 3, a sampling interval of the first convolutional layer is 1, the number of channels of the first convolutional layer is 64, an output result of the first convolutional layer is input to the second convolutional layer, the second convolutional layer includes two bottleneck layers, a convolution kernel size of the second convolutional layer is 3 × 3, a sampling interval of the second convolutional layer is 2, the number of channels of the second convolutional layer is 128, an output result of the second convolutional layer is input to the third convolutional layer, the third convolutional layer includes two bottleneck layers, the convolution kernel size of the third convolutional layer is 3 × 3, the sampling interval of the third convolutional layer is 1, the number of channels of the third convolutional layer is 256, the output result of the third convolutional layer is input to the fourth convolutional layer, the fourth convolutional layer includes two bottleneck layers, the convolution kernel size of the fourth convolutional layer is 3 × 3, the sampling interval of the fourth convolutional layer is 2, and the number of channels of the fourth convolutional layer is 512.
In an exemplary embodiment of the present disclosure, the deconvolution layers include a first deconvolution layer, a second deconvolution layer, a third deconvolution layer, a fourth deconvolution layer and a fifth deconvolution layer. The output result of the feature extraction layer is input to the first deconvolution layer, the first deconvolution layer includes two bottleneck layers, the convolution kernel size of the first deconvolution layer is 3 × 3, the sampling interval of the first deconvolution layer is 2, and the number of channels of the first deconvolution layer is 256. The output result of the first deconvolution layer is input to the second deconvolution layer, the second deconvolution layer includes two bottleneck layers, the convolution kernel size of the second deconvolution layer is 3 × 3, the sampling interval of the second deconvolution layer is 2, and the number of channels of the second deconvolution layer is 128. The output result of the second deconvolution layer is input to the third deconvolution layer, the third deconvolution layer includes two bottleneck layers, the convolution kernel size of the third deconvolution layer is 3 × 3, the sampling interval of the third deconvolution layer is 2, and the number of channels of the third deconvolution layer is 64. The output result of the third deconvolution layer is input to the fourth deconvolution layer, the fourth deconvolution layer includes two bottleneck layers, the convolution kernel size of the fourth deconvolution layer is 3 × 3, the sampling interval of the fourth deconvolution layer is 2, and the number of channels of the fourth deconvolution layer is 64. The output result of the fourth deconvolution layer is input to the fifth deconvolution layer, the fifth deconvolution layer includes two bottleneck layers, the convolution kernel size of the fifth deconvolution layer is 3 × 3, the sampling interval of the fifth deconvolution layer is 2, and the number of channels of the fifth deconvolution layer is 1.
In an exemplary embodiment of the present disclosure, as shown in fig. 6, in the bottleneck layer 600, NC denotes the number of channels of the data, relu is an activation function, and 3 × 3 is the convolution kernel size.
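Fig. 6 fixes only the channel number NC, the relu activation and the 3 × 3 convolution kernel; the internal layout of the bottleneck layer is not spelled out. The PyTorch sketch below therefore assumes a residual-style block with two 3 × 3 convolutions and a 1 × 1 projection shortcut, which is one plausible reading rather than the exact block of the disclosure.

```python
import torch
from torch import nn

class Bottleneck(nn.Module):
    """Assumed residual layout of the 3x3 bottleneck block (NC = number of channels)."""
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        self.conv1 = nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_ch)
        self.conv2 = nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_ch)
        self.relu = nn.ReLU(inplace=True)
        self.shortcut = None
        if stride != 1 or in_ch != out_ch:   # project the identity path when shapes change
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_ch, out_ch, 1, stride=stride, bias=False),
                nn.BatchNorm2d(out_ch))

    def forward(self, x):
        identity = x if self.shortcut is None else self.shortcut(x)
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + identity)
```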
In an exemplary embodiment of the present disclosure, as shown in fig. 2, before performing the blocking process on the image to be segmented, the method further includes:
step S202, inputting the feature matrix samples of the classified and labeled block images into the full-connection layer of the convolutional neural network to obtain feature vector samples output by the full-connection layer of the convolutional neural network.
And step S204, constructing a ternary loss function according to the inter-class distance of the feature vector samples output by the full connection layer.
In an exemplary embodiment of the present disclosure, as shown in fig. 3, before performing the blocking process on the image to be segmented, the method further includes:
step S302, constructing a multilayer perceptron of each deconvolution layer, wherein the multilayer perceptron comprises an input layer, a hidden layer and an output layer which are sequentially connected, the feature vector samples of the block images are input to the input layer, the hidden layer is used for completing the depth extraction of the feature vector samples, and the output layer is used for executing the classification of the feature vector samples after the depth extraction.
In an exemplary embodiment of the disclosure, as shown in fig. 8, in the multi-layer Perceptron (mlp) module, a circle represents a single neuron, the input layer 802 is connected to each neuron in the hidden layer 804, and the hidden layer 804 is connected to each neuron in the output layer 806, that is, neurons in adjacent layers affect each other; the hidden layer 804 performs deep extraction of features, and the output layer 806 performs classification of categories.
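A minimal PyTorch sketch of one mlp module follows; only the input layer / hidden layer / output layer structure is taken from fig. 8, while the hidden width and the number of output categories are assumptions.

```python
from torch import nn

class MlpHead(nn.Module):
    """Input layer -> hidden layer (deep feature extraction) -> output layer (category scores)."""
    def __init__(self, in_features, hidden_features=256, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden_features),   # hidden layer 804
            nn.ReLU(inplace=True),
            nn.Linear(hidden_features, num_classes),   # output layer 806
        )

    def forward(self, flat_features):                  # flat_features come from the tiling operation
        return self.net(flat_features)
```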
In an exemplary embodiment of the present disclosure, as shown in fig. 4, before performing the blocking process on the image to be segmented, the method further includes:
step S402, performing tiling operation on the feature matrix samples output by the deconvolution layer to obtain two-dimensional vector samples.
Step S404, inputting the two-dimensional vector samples to an input layer of the multilayer perceptron, and outputting the classification results of the two-dimensional vector samples by the multilayer perceptron.
Step S406, determining a classification loss function corresponding to the classification result through a cross entropy algorithm.
In an exemplary embodiment of the present disclosure, as shown in fig. 5, before performing the blocking process on the image to be segmented, the method further includes:
step S502, inputting the block image samples labeled by classification into a convolutional neural network.
Step S504, determining a regression loss function according to the output result sample of the convolutional neural network and the block image sample.
In an exemplary embodiment of the disclosure, as shown in fig. 6, fig. 7, fig. 8, and fig. 9, for the first feature matrix D1 700 in the decoding process, a normalized feature vector V1 is obtained through a Full Connected layer module (FC for short). In the same way, for the second feature matrix D2, the third feature matrix D3 and the fourth feature matrix D4, the corresponding second feature vector V2, third feature vector V3 and fourth feature vector V4 are output through a Full Connected layer. Flatten represents tiling the feature matrix, batchnorm represents batch normalization, Full Connected represents a fully connected layer, and normalization represents normalization of a single feature vector.
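Following the flatten → batchnorm → Full Connected → normalization order of fig. 7, the FC module can be sketched as below; the embedding dimension of the output feature vector is an assumption.

```python
import torch
from torch import nn
import torch.nn.functional as F

class FcEmbedding(nn.Module):
    """Flatten a decoder feature matrix D_i and produce a normalized feature vector V_i."""
    def __init__(self, in_features, embed_dim=128):     # embed_dim is an assumption
        super().__init__()
        self.bn = nn.BatchNorm1d(in_features)
        self.fc = nn.Linear(in_features, embed_dim)

    def forward(self, feature_matrix):
        x = torch.flatten(feature_matrix, start_dim=1)   # flatten: tile the feature matrix
        x = self.bn(x)                                   # batchnorm: batch normalization
        x = self.fc(x)                                   # Full Connected layer
        return F.normalize(x, dim=1)                     # single feature vector normalization
```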
In an exemplary embodiment of the present disclosure, as shown in fig. 9, the convolutional neural network employs a modified unet network, i.e., a full connection layer FC is added to the deconvolution layer of the unet network, and a multi-layer perceptron mlp shown in fig. 8 is constructed for the deconvolution layer.
As shown in fig. 9, the network input is the image block 900 of fixed size, scaled from a resolution of 1024 × 1024 to a resolution of 224 × 224, and the process of training the convolutional neural network includes:
(1) Input of the network: RGB (red, green and blue three-channel) pictures with an image size of 224x224.
(2) Different scale features are extracted through a convolutional neural network, and the specific structure is as follows:
(2.1) 1 convolutional layer, the input is a 224x224x3 picture, the convolution kernel size is 7x7, the sampling interval is 2, the number of channels is 64, and the output is recorded as E1.
(2.2) 1 maxpool layer, the input is E1, the sampling interval is 2, and the output is recorded as maxpool.
(2.3) 2 bottleneck layers, the input is maxpool, the convolution kernel size is 3x3, the sampling interval is 1, the number of channels is 64, and the output is recorded as E2.
(2.4) 2 bottleneck layers, the input is E2, the convolution kernel size is 3x3, the sampling interval is 2, the number of channels is 128, and the output is recorded as E3.
(2.5) 2 bottleneck layers, the input is E3, the convolution kernel size is 3x3, the sampling interval is 1, the number of channels is 256, and the output is recorded as E4.
(2.6) 2 bottleneck layers, the input is E4, the convolution kernel size is 3x3, the sampling interval is 2, the number of channels is 512, and the output is recorded as E5.
(3) The deconvolution layer of the convolutional neural network has the following specific structure:
(3.1) Image upsampling and 2 bottleneck layers, the input is E5, the convolution kernel size is 3x3, the sampling interval is 2, the number of channels is 256, and the output is recorded as D1.
(3.2) Image upsampling and 2 bottleneck layers, the input is D1, the convolution kernel size is 3x3, the sampling interval is 2, the number of channels is 128, and the output is recorded as D2.
(3.3) Image upsampling and 2 bottleneck layers, the input is D2, the convolution kernel size is 3x3, the sampling interval is 2, the number of channels is 64, and the output is recorded as D3.
(3.4) Image upsampling and 2 bottleneck layers, the input is D3, the convolution kernel size is 3x3, the sampling interval is 2, the number of channels is 64, and the output is recorded as D4.
(3.5) Image upsampling and 2 bottleneck layers, the input is D4, the convolution kernel size is 3x3, the sampling interval is 2, the number of channels is 1, and the output is recorded as output.
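Steps (2.1) to (3.5) can be sketched in PyTorch as follows. The kernel sizes, sampling intervals and channel numbers follow the text; the internal layout of each pair of bottleneck layers, the use of bilinear upsampling for the deconvolution stages, the omission of unet skip connections (not described for this variant) and the final resizing back to the input resolution are assumptions.

```python
import torch
from torch import nn
import torch.nn.functional as F

def two_bottleneck_layers(in_ch, out_ch, stride):
    """Stand-in for "2 bottleneck layers"; the internal layout is an assumption (see fig. 6)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, stride=stride, padding=1, bias=False),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, stride=1, padding=1, bias=False),
        nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True))

class ModifiedUnetSketch(nn.Module):
    def __init__(self):
        super().__init__()
        # (2.1)-(2.6) feature extraction: kernel sizes, strides and channels follow the text.
        self.e1 = nn.Sequential(nn.Conv2d(3, 64, 7, stride=2, padding=3, bias=False),
                                nn.BatchNorm2d(64), nn.ReLU(inplace=True))
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.e2 = two_bottleneck_layers(64, 64, 1)
        self.e3 = two_bottleneck_layers(64, 128, 2)
        self.e4 = two_bottleneck_layers(128, 256, 1)
        self.e5 = two_bottleneck_layers(256, 512, 2)
        # (3.1)-(3.5) deconvolution: each stage = image upsampling + 2 bottleneck layers.
        self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)
        self.d1 = two_bottleneck_layers(512, 256, 1)
        self.d2 = two_bottleneck_layers(256, 128, 1)
        self.d3 = two_bottleneck_layers(128, 64, 1)
        self.d4 = two_bottleneck_layers(64, 64, 1)
        self.d5 = two_bottleneck_layers(64, 1, 1)

    def forward(self, x):                       # x: (N, 3, 224, 224) RGB block images
        e5 = self.e5(self.e4(self.e3(self.e2(self.maxpool(self.e1(x))))))
        d1 = self.d1(self.up(e5))
        d2 = self.d2(self.up(d1))
        d3 = self.d3(self.up(d2))
        d4 = self.d4(self.up(d3))
        out = self.d5(self.up(d4))
        # The text does not state how the output size is matched to the mask or whether a final
        # activation is applied; resizing back to the input resolution is an assumption.
        out = F.interpolate(out, size=x.shape[-2:], mode='bilinear', align_corners=False)
        return out, (d1, d2, d3, d4)             # D1-D4 feed the FC and mlp heads
```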
As shown in fig. 9, category classification is performed for the four feature modules of different scales, which provides a certain robustness for classifying target images of different scales. The details are as follows:
(1) A first mlp module is constructed, and a tiling operation is performed on the feature matrix D1 to obtain a two-dimensional vector VF1. VF1 is input to the first mlp module, and the output is the number of categories. Using Cross Entropy as the loss function, the classifier is denoted as CLS1.
(2) A second mlp module is constructed, and a tiling operation is performed on the feature matrix D2 to obtain a two-dimensional vector VF2. VF2 is input to the second mlp module, and the output is the number of categories. Using Cross Entropy as the loss function, the classifier is denoted as CLS2.
(3) A third mlp module is constructed, and a tiling operation is performed on the feature matrix D3 to obtain a two-dimensional vector VF3. VF3 is input to the third mlp module, and the output is the number of categories. Using Cross Entropy as the loss function, the classifier is denoted as CLS3.
(4) A fourth mlp module is constructed, and a tiling operation is performed on the feature matrix D4 to obtain a two-dimensional vector VF4. VF4 is input to the fourth mlp module, and the output is the number of categories. Using Cross Entropy as the loss function, the classifier is denoted as CLS4.
Based on the improved structure of the convolutional neural network, the picture output vector with label 1 is denoted as V1, the picture output vector with label 0 is denoted as V0, and the feature vector of the reference sample is denoted as Vanchor (the reference sample is an output vector with label 1). The principle of the ternary loss is to maximize the inter-class distance and minimize the intra-class distance, and the ternary loss TripletLoss is calculated as shown in formula (one):
TripletLoss = min(‖Vanchor − V1‖² − ‖Vanchor − V0‖²), formula (one),
where V1 denotes a feature vector with label 1 and V0 denotes a feature vector with label 0.
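A hedged PyTorch reading of formula (one) is sketched below; clamping the distance difference at zero (a margin-free hinge) is an assumption about how the min(·) is applied, and batched, L2-normalized embeddings are assumed.

```python
import torch

def triplet_loss(v_anchor, v_label1, v_label0):
    """Formula (one) sketch: v_anchor is the label-1 reference embedding, v_label1 a label-1
    embedding, v_label0 a label-0 embedding; all of shape (N, D)."""
    d_pos = torch.sum((v_anchor - v_label1) ** 2, dim=1)  # intra-class squared distance
    d_neg = torch.sum((v_anchor - v_label0) ** 2, dim=1)  # inter-class squared distance
    return torch.clamp(d_pos - d_neg, min=0.0).mean()     # clamping at zero is an assumption
```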
The ternary losses corresponding to each of the feature vectors obtained above are calculated respectively, and the average of the ternary losses corresponding to the 4 feature vectors is taken as the final ternary loss value AveTriLoss, as shown in formula (two):
AveTriLoss = (TripletLoss1 + TripletLoss2 + TripletLoss3 + TripletLoss4) / 4, formula (two).
Based on the improved structure of the convolutional neural network, the classification loss classificationLoss is calculated from the losses of the four classifiers CLS1, CLS2, CLS3 and CLS4, as shown in formula (three).
as shown in fig. 10, the cross-sectional view 1000 of the glomerulus includes a glomerular region 1002, a non-glomerular region 1004, and a background region 1006 of the medical image.
As shown in fig. 11 and 12, the glomerular sectional view 1000 is processed into a non-glomerular sectional block image 1100 and a glomerular sectional block image 1200, and the glomerular sectional block image 1200 shows a glomerular region 1202.
As shown in fig. 13, a labeled image 1300 is obtained by labeling the glomerular region 1302 of the glomerular sectional block image 1200 and leaving the non-glomerular region 1304 unlabeled.
Based on the improved structure of the convolutional neural network, output denotes the network output and mask denotes the labeled image 1300 (a binary image), and the regression loss is calculated as shown in formula (four):
regressionLoss = ‖output − mask‖², formula (four).
Obtaining the total Loss of the convolutional neural network according to the ternary Loss, the classification Loss and the regression Loss, wherein the calculation method of the total Loss is shown as a formula (five):
Loss = AveTriLoss × Wt + classificationLoss × Wc + regressionLoss × Wr, formula (five).
Wherein Wt is a ternary loss weight, Wc is a classification error weight, and Wr is a regression loss weight.
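The loss combination of formulas (two), (four) and (five) can be sketched as follows; the default weight values Wt, Wc and Wr and the batch reduction are assumptions, and classificationLoss is assumed to be supplied by the cross-entropy classifiers of formula (three).

```python
import torch

def average_triplet_loss(per_scale_losses):
    """Formula (two): mean of the ternary losses computed for the four feature vectors."""
    return torch.stack(list(per_scale_losses)).mean()

def regression_loss(output, mask):
    """Formula (four): squared L2 distance between the network output and the binary mask."""
    return torch.sum((output - mask) ** 2)

def total_loss(ave_tri_loss, classification_loss, reg_loss, wt=1.0, wc=1.0, wr=1.0):
    """Formula (five): weighted sum of the three losses (weight values are assumptions)."""
    return ave_tri_loss * wt + classification_loss * wc + reg_loss * wr
```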
Corresponding to the method embodiment, the present disclosure also provides an image segmentation apparatus, which may be used to execute the method embodiment.
Fig. 14 is a block diagram of an image segmentation apparatus in an exemplary embodiment of the present disclosure.
Referring to fig. 14, the image segmentation apparatus 1400 may include:
the block module 1402 is configured to perform block processing on the image to be segmented to obtain a block image.
And the labeling module 1404 is configured to classify and label the block images according to the proportion of the target image in the block images.
The neural network module 1406 is configured to input the classified and labeled block images into the trained convolutional neural network to obtain a segmented image.
A correction module 1408 configured to direct a machine learning algorithm to perform edge correction on the image output by the convolutional neural network through a priori knowledge to obtain a corrected segmented image.
In an exemplary embodiment of the present disclosure, the convolutional neural network includes a feature extraction layer and a deconvolution layer, which are connected in sequence, where the feature extraction layer includes an initial convolutional layer, a maximum pooling layer, and multiple convolutional layers, which are connected in sequence, the feature extraction layer is configured to extract feature vectors of the block images, and the deconvolution layer is configured to perform upsampling processing on the feature vectors to obtain a feature matrix of the block images.
In an exemplary embodiment of the present disclosure, a convolution kernel size of the initial convolutional layer is 7 × 7, a sampling interval of the initial convolutional layer is 2, and the number of channels of the initial convolutional layer is 64.
In an exemplary embodiment of the present disclosure, the sampling interval of the maximum pooling layer is 2.
In an exemplary embodiment of the present disclosure, the feature extraction layer includes a maximum pooling layer, a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, and a fifth convolutional layer, an output result of the maximum pooling layer is input to the first convolutional layer, the first convolutional layer includes two bottleneck layers, a convolution kernel size of the first convolutional layer is 3 × 3, a sampling interval of the first convolutional layer is 1, the number of channels of the first convolutional layer is 64, an output result of the first convolutional layer is input to the second convolutional layer, the second convolutional layer includes two bottleneck layers, a convolution kernel size of the second convolutional layer is 3 × 3, a sampling interval of the second convolutional layer is 2, the number of channels of the second convolutional layer is 128, an output result of the second convolutional layer is input to the third convolutional layer, the third convolutional layer includes two bottleneck layers, the convolution kernel size of the third convolutional layer is 3 × 3, the sampling interval of the third convolutional layer is 1, the number of channels of the third convolutional layer is 256, the output result of the third convolutional layer is input to the fourth convolutional layer, the fourth convolutional layer includes two bottleneck layers, the convolution kernel size of the fourth convolutional layer is 3 × 3, the sampling interval of the fourth convolutional layer is 2, and the number of channels of the fourth convolutional layer is 512.
In an exemplary embodiment of the present disclosure, the deconvolution layers include a first deconvolution layer, a second deconvolution layer, a third deconvolution layer, a fourth deconvolution layer and a fifth deconvolution layer. The output result of the feature extraction layer is input to the first deconvolution layer, the first deconvolution layer includes two bottleneck layers, the convolution kernel size of the first deconvolution layer is 3 × 3, the sampling interval of the first deconvolution layer is 2, and the number of channels of the first deconvolution layer is 256. The output result of the first deconvolution layer is input to the second deconvolution layer, the second deconvolution layer includes two bottleneck layers, the convolution kernel size of the second deconvolution layer is 3 × 3, the sampling interval of the second deconvolution layer is 2, and the number of channels of the second deconvolution layer is 128. The output result of the second deconvolution layer is input to the third deconvolution layer, the third deconvolution layer includes two bottleneck layers, the convolution kernel size of the third deconvolution layer is 3 × 3, the sampling interval of the third deconvolution layer is 2, and the number of channels of the third deconvolution layer is 64. The output result of the third deconvolution layer is input to the fourth deconvolution layer, the fourth deconvolution layer includes two bottleneck layers, the convolution kernel size of the fourth deconvolution layer is 3 × 3, the sampling interval of the fourth deconvolution layer is 2, and the number of channels of the fourth deconvolution layer is 64. The output result of the fourth deconvolution layer is input to the fifth deconvolution layer, the fifth deconvolution layer includes two bottleneck layers, the convolution kernel size of the fifth deconvolution layer is 3 × 3, the sampling interval of the fifth deconvolution layer is 2, and the number of channels of the fifth deconvolution layer is 1.
In an exemplary embodiment of the disclosure, the neural network module 1406 is further configured to: inputting feature matrix samples of the classified and labeled block images into a full connection layer of the convolutional neural network to obtain feature vector samples output by the full connection layer of the convolutional neural network; and constructing a ternary loss function according to the inter-class distance of the feature vector samples output by the full connection layer.
In an exemplary embodiment of the disclosure, the neural network module 1406 is further configured to: and constructing the multilayer perceptron of each deconvolution layer, wherein the multilayer perceptron comprises an input layer, a hidden layer and an output layer which are sequentially connected, the feature vector samples of the block images are input into the input layer, the hidden layer is used for completing the deep extraction of the feature vector samples, and the output layer is used for executing the classification of the feature vector samples after the deep extraction.
In an exemplary embodiment of the disclosure, the neural network module 1406 is further configured to: performing tiling operation on the feature matrix samples output by the deconvolution layer to obtain two-dimensional vector samples; inputting the two-dimensional vector samples to an input layer of the multilayer perceptron, and outputting classification results of the two-dimensional vector samples by the multilayer perceptron; and determining a classification loss function corresponding to the classification result through a cross entropy algorithm.
In an exemplary embodiment of the disclosure, the neural network module 1406 is further configured to: inputting the classified and labeled block image samples into a convolutional neural network; and determining a regression loss function according to the output result sample of the convolutional neural network and the block image sample.
Since the functions of the apparatus 1400 have been described in detail in the corresponding method embodiments, the disclosure is not repeated herein.
It should be noted that although in the above detailed description several modules or units of the device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit, according to embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into embodiments by a plurality of modules or units.
In an exemplary embodiment of the present disclosure, an electronic device capable of implementing the above method is also provided.
As will be appreciated by one skilled in the art, aspects of the present invention may be embodied as a system, method or program product. Thus, various aspects of the invention may be embodied in the form of: an entirely hardware embodiment, an entirely software embodiment (including firmware, microcode, etc.) or an embodiment combining hardware and software aspects that may all generally be referred to herein as a "circuit," "module," or "system."
An electronic device 1500 according to this embodiment of the invention is described below with reference to fig. 15. The electronic device 1500 shown in fig. 15 is only an example and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 15, electronic device 1500 is in the form of a general purpose computing device. Components of electronic device 1500 may include, but are not limited to: the at least one processing unit 1510, the at least one memory unit 1520, and the bus 1530 that connects the various system components (including the memory unit 1520 and the processing unit 1510).
Wherein the memory unit stores program code that is executable by the processing unit 1510 to cause the processing unit 1510 to perform steps according to various exemplary embodiments of the present invention as described in the above section "exemplary methods" of the present specification. For example, the processing unit 1510 may perform a method as shown in embodiments of the present disclosure.
The storage unit 1520 may include readable media in the form of volatile storage units, such as a random access memory unit (RAM)15201 and/or a cache memory unit 15202, and may further include a read only memory unit (ROM) 15203.
Storage unit 1520 may also include a program/utility 15204 having a set (at least one) of program modules 15205, such program modules 15205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 1530 may be any bus representing one or more of several types of bus structures, including a memory unit bus or memory unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 1500 may also communicate with one or more external devices 1540 (e.g., keyboard, pointing device, bluetooth device, etc.), with one or more devices that enable a user to interact with the electronic device 1500, and/or with any devices (e.g., router, modem, etc.) that enable the electronic device 1500 to communicate with one or more other computing devices. Such communication may occur via input/output (I/O) interface 1550. Also, the electronic device 1500 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) via the network adapter 1560. As shown, the network adapter 1560 communicates with the other modules of the electronic device 1500 over the bus 1530. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 1500, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
Through the above description of the embodiments, those skilled in the art will readily understand that the exemplary embodiments described herein may be implemented by software, or by software in combination with necessary hardware. Therefore, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a usb disk, a removable hard disk, etc.) or on a network, and includes several instructions to enable a computing device (which may be a personal computer, a server, a terminal device, or a network device, etc.) to execute the method according to the embodiments of the present disclosure.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium having stored thereon a program product capable of implementing the above-described method of the present specification. In some possible embodiments, aspects of the invention may also be implemented in the form of a program product comprising program code means for causing a terminal device to carry out the steps according to various exemplary embodiments of the invention described in the above section "exemplary methods" of the present description, when said program product is run on the terminal device.
The program product for implementing the above method according to an embodiment of the present invention may employ a portable compact disc read only memory (CD-ROM) and include program codes, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited in this regard and, in the present document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations for aspects of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., through the internet using an internet service provider).
Furthermore, the above-described figures are merely schematic illustrations of processes involved in methods according to exemplary embodiments of the invention, and are not intended to be limiting. It will be readily understood that the processes shown in the above figures are not intended to indicate or limit the chronological order of the processes. In addition, it is also readily understood that these processes may be performed synchronously or asynchronously, e.g., in multiple modules.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (13)

1. An image segmentation method, comprising:
carrying out blocking processing on an image to be segmented to obtain a blocked image;
classifying and labeling the block images according to the proportion of the target images in the block images;
inputting the classified and labeled block images into a trained convolutional neural network to obtain segmented images;
and guiding a machine learning algorithm to carry out edge correction on the image output by the convolutional neural network through the prior knowledge so as to obtain a corrected segmentation image.
2. The image segmentation method according to claim 1,
the convolutional neural network comprises a feature extraction layer and a deconvolution layer which are sequentially connected, wherein the feature extraction layer comprises an initial convolutional layer, a maximum pooling layer and a plurality of convolutional layers which are sequentially connected, the feature extraction layer is used for extracting feature vectors of the block images, and the deconvolution layer is used for up-sampling the feature vectors to obtain feature matrices of the block images.
3. The image segmentation method of claim 2, wherein the convolution kernel size of the initial convolutional layer is 7x7, the sampling interval of the initial convolutional layer is 2, and the number of channels of the initial convolutional layer is 64.
4. The image segmentation method according to claim 2, characterized in that the sampling interval of the maximum pooling layer is 2.
5. The image segmentation method according to claim 2, wherein the feature extraction layers include a maximum pooling layer, a first convolutional layer, a second convolutional layer, a third convolutional layer, a fourth convolutional layer, and a fifth convolutional layer, an output result of the maximum pooling layer is input to the first convolutional layer, the first convolutional layer includes two bottleneck layers, a convolution kernel size of the first convolutional layer is 3x3, a sampling interval of the first convolutional layer is 1, the number of channels of the first convolutional layer is 64, an output result of the first convolutional layer is input to the second convolutional layer, the second convolutional layer includes two bottleneck layers, a convolution kernel size of the second convolutional layer is 3x3, a sampling interval of the second convolutional layer is 2, the number of channels of the second convolutional layer is 128, an output result of the second convolutional layer is input to the third convolutional layer, the third convolutional layer includes two bottleneck layers, the convolution kernel size of the third convolutional layer is 3x3, the sampling interval of the third convolutional layer is 1, the number of channels of the third convolutional layer is 256, the output result of the third convolutional layer is input to the fourth convolutional layer, the fourth convolutional layer includes two bottleneck layers, the convolution kernel size of the fourth convolutional layer is 3x3, the sampling interval of the fourth convolutional layer is 2, and the number of channels of the fourth convolutional layer is 512.
6. The image segmentation method according to claim 2, wherein the deconvolution layers include a first deconvolution layer, a second deconvolution layer, a third deconvolution layer, a fourth deconvolution layer and a fifth deconvolution layer, the output result of the feature extraction layer is input to the first deconvolution layer, the first deconvolution layer includes two bottleneck layers, the convolution kernel size of the first deconvolution layer is 3x3, the sampling interval of the first deconvolution layer is 2, the number of channels of the first deconvolution layer is 256, the output result of the first deconvolution layer is input to the second deconvolution layer, the second deconvolution layer includes two bottleneck layers, the convolution kernel size of the second deconvolution layer is 3x3, the sampling interval of the second deconvolution layer is 2, the number of channels of the second deconvolution layer is 128, the output result of the second deconvolution layer is input to the third deconvolution layer, the third deconvolution layer includes two bottleneck layers, the convolution kernel size of the third deconvolution layer is 3x3, the sampling interval of the third deconvolution layer is 2, the number of channels of the third deconvolution layer is 64, the output result of the third deconvolution layer is input to the fourth deconvolution layer, the fourth deconvolution layer includes two bottleneck layers, the convolution kernel size of the fourth deconvolution layer is 3x3, the sampling interval of the fourth deconvolution layer is 2, the number of channels of the fourth deconvolution layer is 64, the output result of the fourth deconvolution layer is input to the fifth deconvolution layer, the fifth deconvolution layer includes two bottleneck layers, the convolution kernel size of the fifth deconvolution layer is 3x3, the sampling interval of the fifth deconvolution layer is 2, and the number of channels of the fifth deconvolution layer is 1.
7. The image segmentation method according to any one of claims 1 to 6, further comprising, before the blocking process for the image to be segmented:
inputting feature matrix samples of the classified and labeled block images into a fully connected layer of the convolutional neural network to obtain feature vector samples output by the fully connected layer of the convolutional neural network;
and constructing a triplet loss function according to the inter-class distances of the feature vector samples output by the fully connected layer.
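As a rough illustration of the loss construction in claim 7, a standard triplet loss over fully-connected-layer feature vectors could be written as below; the margin value and the use of the anchor-positive (intra-class) distance alongside the inter-class distance are assumptions drawn from the usual triplet formulation, not from the claim.

import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.3):
    # anchor and positive are feature vectors of block images sharing a class label;
    # negative comes from a different class. All are outputs of the fully connected layer.
    d_pos = F.pairwise_distance(anchor, positive)  # intra-class distance
    d_neg = F.pairwise_distance(anchor, negative)  # inter-class distance
    return F.relu(d_pos - d_neg + margin).mean()   # push classes apart by at least the margin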
8. The image segmentation method of claim 7, wherein before the image to be segmented is subjected to the blocking process, the method further comprises:
constructing a multilayer perceptron for each deconvolution layer, wherein the multilayer perceptron includes an input layer, a hidden layer, and an output layer that are sequentially connected, the feature vector samples of the block images are input to the input layer, the hidden layer is used for deep extraction of the feature vector samples, and the output layer is used for classifying the deeply extracted feature vector samples.
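A minimal sketch of the per-deconvolution-layer multilayer perceptron of claim 8; the hidden width, the ReLU activations, and the two-class output are assumptions, since the claim fixes only the input/hidden/output structure.

import torch.nn as nn

class MLPHead(nn.Module):
    def __init__(self, in_features, hidden=256, num_classes=2):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_features, hidden),  # input layer
            nn.ReLU(inplace=True),
            nn.Linear(hidden, hidden),       # hidden layer: deep extraction of the feature vectors
            nn.ReLU(inplace=True),
            nn.Linear(hidden, num_classes),  # output layer: classification
        )

    def forward(self, x):
        return self.net(x)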
9. The image segmentation method of claim 8, wherein before the image to be segmented is subjected to the blocking process, the method further comprises:
performing a tiling (flattening) operation on the feature matrix samples output by the deconvolution layer to obtain two-dimensional vector samples;
inputting the two-dimensional vector samples to the input layer of the multilayer perceptron, and outputting, by the multilayer perceptron, classification results of the two-dimensional vector samples;
and determining a classification loss function corresponding to the classification results through a cross-entropy algorithm.
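Claim 9 chains the pieces above; a hedged sketch, assuming "tiling" means flattening each feature matrix into a vector and that the labels are integer class indices:

import torch.nn.functional as F

def classification_loss(deconv_features, labels, mlp):
    # deconv_features: (batch, C, H, W) feature matrix samples from a deconvolution layer.
    flat = deconv_features.flatten(start_dim=1)  # tiling/flattening into (batch, C*H*W) vectors
    logits = mlp(flat)                           # multilayer perceptron classification
    return F.cross_entropy(logits, labels)       # classification loss via cross entropy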
10. The image segmentation method according to any one of claims 1 to 6, further comprising, before the blocking process for the image to be segmented:
inputting the classified and labeled block image samples into the convolutional neural network;
and determining a regression loss function according to the output result samples of the convolutional neural network and the block image samples.
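For the regression loss of claim 10, one plausible reading is a pixel-wise error between the network output and the labeled block image sample; mean squared error is used here purely as an assumption, and the way the three losses might be combined is likewise only indicative.

import torch.nn.functional as F

def regression_loss(predicted_mask, target_mask):
    # Pixel-wise mean squared error; the claim only names a regression loss,
    # so MSE is an assumed choice.
    return F.mse_loss(predicted_mask, target_mask)

# One possible overall training objective (the weights are assumptions):
# total_loss = regression_loss(pred, target) + 0.1 * triplet + 0.1 * classification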
11. An image segmentation apparatus, comprising:
the blocking module is used for performing blocking processing on the image to be segmented to obtain block images;
the labeling module is used for classifying and labeling the block images according to the proportion of the target images in the block images;
the neural network module is used for inputting the classified and labeled block images into the trained convolutional neural network to obtain segmented images;
and the correction module is used for guiding a machine learning algorithm, through prior knowledge, to perform edge correction on the image output by the convolutional neural network, so as to obtain a corrected segmentation image.
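Purely as an illustration of how the modules of claim 11 compose, and not as the actual apparatus, the four modules could be wired as below; all class, method, and attribute names are hypothetical.

class ImageSegmenter:
    def __init__(self, blocker, labeler, network, corrector):
        self.blocker = blocker      # blocking module
        self.labeler = labeler      # labeling module
        self.network = network      # neural network module (trained convolutional neural network)
        self.corrector = corrector  # correction module (prior-knowledge guided edge correction)

    def segment(self, image):
        blocks = self.blocker(image)      # block images
        labeled = self.labeler(blocks)    # classified and labeled block images
        masks = self.network(labeled)     # segmented images
        return self.corrector(masks)      # corrected segmentation image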
12. An electronic device, comprising:
a memory; and
a processor coupled to the memory, the processor configured to perform the image segmentation method of any of claims 1-10 based on instructions stored in the memory.
13. A computer-readable storage medium, on which a program is stored which, when being executed by a processor, carries out the image segmentation method according to any one of claims 1 to 10.
CN202111237559.2A 2021-10-21 2021-10-21 Image segmentation method and device, electronic equipment and readable medium Pending CN113971677A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111237559.2A CN113971677A (en) 2021-10-21 2021-10-21 Image segmentation method and device, electronic equipment and readable medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111237559.2A CN113971677A (en) 2021-10-21 2021-10-21 Image segmentation method and device, electronic equipment and readable medium

Publications (1)

Publication Number Publication Date
CN113971677A (en) 2022-01-25

Family

ID=79588020

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111237559.2A Pending CN113971677A (en) 2021-10-21 2021-10-21 Image segmentation method and device, electronic equipment and readable medium

Country Status (1)

Country Link
CN (1) CN113971677A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114724133A (en) * 2022-04-18 2022-07-08 北京百度网讯科技有限公司 Character detection and model training method, device, equipment and storage medium
CN114724133B (en) * 2022-04-18 2024-02-02 北京百度网讯科技有限公司 Text detection and model training method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
US10957079B2 (en) Systems and methods for automated detection of an indication of malignancy in a mammographic image
US10902245B2 (en) Method and apparatus for facial recognition
Zhang et al. Select, supplement and focus for RGB-D saliency detection
CN110689036B (en) Method and system for automatic chromosome classification
US9633282B2 (en) Cross-trained convolutional neural networks using multimodal images
US10650286B2 (en) Classifying medical images using deep convolution neural network (CNN) architecture
US20210142119A1 (en) Image classification using a mask image and neural networks
CN112767329B (en) Image processing method and device and electronic equipment
CN110189336B (en) Image generation method, system, server and storage medium
CN111488826A (en) Text recognition method and device, electronic equipment and storage medium
CN111028246A (en) Medical image segmentation method and device, storage medium and electronic equipment
CN110163205B (en) Image processing method, device, medium and computing equipment
CN111199541A (en) Image quality evaluation method, image quality evaluation device, electronic device, and storage medium
CN110390327B (en) Foreground extraction method and device, computer equipment and storage medium
CN110188766B (en) Image main target detection method and device based on convolutional neural network
US11854209B2 (en) Artificial intelligence using convolutional neural network with hough transform
CN112308866A (en) Image processing method, image processing device, electronic equipment and storage medium
CN109982088B (en) Image processing method and device
CN112700460A (en) Image segmentation method and system
CN113780326A (en) Image processing method and device, storage medium and electronic equipment
CN112883818A (en) Text image recognition method, system, device and storage medium
CN113971677A (en) Image segmentation method and device, electronic equipment and readable medium
Banskota et al. A novel enhanced convolution neural network with extreme learning machine: facial emotional recognition in psychology practices
CN112966687B (en) Image segmentation model training method and device and communication equipment
CN110580499A (en) deep learning target detection method and system based on crowdsourcing repeated labels

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination