CN107945185A

CN107945185A - Image segmentation method and system based on a wide residual pyramid pooling network

Info

Publication number: CN107945185A
Application number: CN201711228818.9A
Authority: CN
Inventors: 王瑜; 朱婷; 马泽源
Original assignee: Beijing Technology and Business University
Current assignee: Beijing Technology and Business University
Priority date: 2017-11-29
Filing date: 2017-11-29
Publication date: 2018-04-20
Anticipated expiration: 2037-11-29
Also published as: CN107945185B

Abstract

The invention discloses an image segmentation method and system based on a wide residual pyramid pooling network (WRN-PPNet). The method comprises: inputting an image to be segmented; standardizing the image to be segmented; obtaining a WRN-PPNet model; preprocessing the training images and increasing their variety and quantity by data augmentation to obtain a training image set; training the WRN-PPNet model on the training image set to generate a WRN-PPNet segmentation model; and obtaining the image segmentation result by applying the WRN-PPNet segmentation model to the image to be segmented. Based on WRN-PPNet, the method segments images fully automatically and separates out the target object, is not restricted by the category of the image to be segmented, is widely applicable, and performs well, thereby effectively improving the accuracy and convenience of image segmentation.

Description

Image segmentation method and system based on a wide residual pyramid pooling network

Technical field

The present invention relates to the technical fields of image processing and computer vision, and in particular to an image segmentation method and system based on a wide residual pyramid pooling network.

Background technology

In the related art, the FCN (fully convolutional network) opened the door to semantic image segmentation with deep learning, and most later deep learning models for semantic segmentation are improvements on the FCN. An FCN uses an existing CNN (convolutional neural network) as the vision model to learn hierarchical features, converts the final fully connected layers of the classification network into convolutional layers so that the network outputs feature maps instead of classification scores, and finally applies deconvolution to those feature maps to produce an output map with dense, pixel-level labels. This network model makes it possible to solve vision problems end to end with CNNs. However, the FCN lacks the ability to perceive different kinds of features and cannot be applied well to specific problems and scenes: because of its inherent spatial invariance, it cannot take contextual information into account, cannot perceive instances, and so on.
To address these shortcomings of the FCN, researchers have proposed many improvements, which generally include decoder variants, integration of contextual information, conditional random fields, dilated convolution, multi-scale aggregation, feature fusion, and recurrent neural networks. A typical decoder variant is SegNet, which consists of an encoder (a convolutional network) and a decoder (a deconvolution network); compared with an ordinary fully convolutional network, it processes the low-resolution feature maps through the decoder network. Methods that integrate contextual information include PSPNet (pyramid scene parsing network), which mainly uses a PPNet module in the network. A CRF (conditional random field) can be used as a post-processing step to improve the model's ability to capture detail. Dilated convolution performs convolution with an increased step of the convolution kernel to obtain a wider receptive field.

However, the models mentioned above are improvements of varying degree made on the basis of the FCN for specific semantic segmentation problems, and no single model handles a wide range of semantic segmentation problems well, so there remains considerable room to explore deep learning architectures for semantic image segmentation. In the related art, conventional segmentation methods first extract suitable features and then segment the image according to those features; such methods can only extract shallow features, are complicated, have limited applicability, and are difficult to generalize.

Summary of the invention

The present invention aims to solve, at least to some extent, one of the technical problems in the related art.

To this end, one object of the present invention is to provide an image segmentation method based on a wide residual pyramid pooling network. The method is widely applicable, performs well, and is robust; it makes segmentation more convenient, effective, and easy to operate, thereby effectively improving the accuracy and convenience of image segmentation.

Another object of the present invention is to provide an image segmentation system based on a wide residual pyramid pooling network.

To achieve the above object, an embodiment of one aspect of the present invention provides an image segmentation method based on a wide residual pyramid pooling network, comprising the following steps: inputting an image to be segmented; standardizing the image to be segmented so that its pixel mean is 0 and its variance is 1; obtaining a WRN-PPNet (wide ResNet and pyramid pooling network, i.e. wide residual pyramid pooling network) model, wherein the WRN-PPNet model includes a WRN module and a PPNet module, and the features extracted by the WRN module are fused with the features extracted by the PPNet module; preprocessing the training images so that their pixel mean is 0 and their variance is 1, and so that pixels of the object to be segmented are labeled 1 and all remaining pixels are labeled 0, and increasing the variety and quantity of the training images by data augmentation to obtain a training image set; performing model training according to the WRN-PPNet model and the training image set to generate a WRN-PPNet segmentation model; and obtaining an image segmentation result from the image to be segmented by means of the WRN-PPNet segmentation model.

The image segmentation method based on a wide residual pyramid pooling network according to the embodiment of the present invention obtains a WRN-PPNet-based image segmentation model through deep residual network theory and deep learning training methods. It thereby performs the segmentation task end to end, segments images fully automatically, and is not restricted by the category of the image to be segmented. The method is widely applicable, performs well, and is robust; it makes segmentation more convenient, effective, and easy to operate, thereby effectively improving the accuracy and convenience of image segmentation.

In addition, the image segmentation method based on a wide residual pyramid pooling network according to the above embodiment of the present invention may also have the following additional technical features:

Further, in one embodiment of the present invention, the WRN module includes a first, a second, and a third wide residual block group, each containing four wide residual blocks. Each wide residual block contains two convolutional layers with a kernel size of 3*3, and each convolutional layer is preceded by a BN (batch normalization) layer. In the first wide residual block group, the size of the feature maps (FMs) output by each wide residual block equals the size of the FMs it receives as input. In the second and third wide residual block groups, the FMs output by the first wide residual block are half the size of its input FMs, and the FMs output by the remaining wide residual blocks are the same size as their input FMs.

Further, in one embodiment of the present invention, the PPNet module includes a first, a second, and a third pooling path. The pooling window size is 4*4 for the first pooling path, 2*2 for the second, and 1*1 for the third, and the pooling mode is average pooling.

Further, in one embodiment of the present invention, obtaining the wide residual pyramid pooling network (WRN-PPNet) model further comprises: performing deconvolution on the pooling paths in the PPNet module, wherein two deconvolution operations are performed on the first pooling path, one deconvolution operation is performed on the second pooling path, the third pooling path has two convolutional layers, and each deconvolution layer is preceded by two convolutional layers with a kernel size of 3*3.

Further, in one embodiment of the present invention, the above method further includes: introducing the original input image into the WRN-PPNet model a second time, by concatenation, before the model performs segmentation, and using two convolutional layers with 3*3 kernels to obtain an optimized combination of the features generated by the model and the hyper-localized features of the original input image.

To achieve the above object, an embodiment of another aspect of the present invention provides an image segmentation system based on a wide residual pyramid pooling network, including: an input module for inputting an image to be segmented; a first preprocessing module for standardizing the image to be segmented so that its pixel mean is 0 and its variance is 1; an acquisition module for obtaining a WRN-PPNet model, wherein the WRN-PPNet model includes a WRN module and a PPNet module, and the features extracted by the WRN module are fused with the features extracted by the PPNet module; a second preprocessing module for preprocessing the training images so that their pixel mean is 0 and their variance is 1, and so that pixels of the object to be segmented are labeled 1 and all remaining pixels are labeled 0, and for increasing the variety and quantity of the training images by data augmentation to obtain a training image set; a training module for performing model training according to the WRN-PPNet model and the training image set to generate a WRN-PPNet segmentation model; and a segmentation module for obtaining an image segmentation result from the image to be segmented by means of the WRN-PPNet segmentation model.

The image segmentation system based on a wide residual pyramid pooling network according to the embodiment of the present invention obtains a WRN-PPNet-based image segmentation model through deep residual network theory and deep learning training methods. It thereby performs the segmentation task end to end, segments images fully automatically, and is not restricted by the category of the image to be segmented. The system is widely applicable, performs well, and is robust; it makes segmentation more convenient, effective, and easy to operate, thereby effectively improving the accuracy and convenience of image segmentation.

In addition, the image segmentation system based on a wide residual pyramid pooling network according to the above embodiment of the present invention may also have the following additional technical features:

Further, in one embodiment of the present invention, the WRN module includes a first, a second, and a third wide residual block group, each containing four wide residual blocks. Each wide residual block contains two convolutional layers with a kernel size of 3*3, and each convolutional layer is preceded by a BN (batch normalization) layer. In the first wide residual block group, the size of the feature maps (FMs) output by each wide residual block equals the size of the FMs it receives as input. In the second and third wide residual block groups, the FMs output by the first wide residual block are half the size of its input FMs, and the FMs output by the remaining wide residual blocks are the same size as their input FMs.

Further, in one embodiment of the present invention, deconvolution is performed on the pooling paths in the PPNet module, wherein two deconvolution operations are performed on the first pooling path, one deconvolution operation is performed on the second pooling path, the third pooling path has two convolutional layers, and each deconvolution layer is preceded by two convolutional layers with a kernel size of 3*3.

Further, in one embodiment of the present invention, the original input image is introduced into the WRN-PPNet model a second time, by concatenation, before the model performs segmentation, and two convolutional layers with 3*3 kernels are used to obtain an optimized combination of the features generated by the model and the hyper-localized features of the original input image.

Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from the description or be learned through practice of the invention.

Brief description of the drawings

The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments with reference to the accompanying drawings, in which:

Fig. 1 is a flowchart of an image segmentation method based on a wide residual pyramid pooling network according to one embodiment of the present invention;

Fig. 2 is a flowchart of an image segmentation method based on a wide residual pyramid pooling network according to another embodiment of the present invention;

Fig. 3 is a schematic diagram of animal pictures against different backgrounds according to one embodiment of the present invention;

Fig. 4 is a schematic diagram of pictures of various animals according to one embodiment of the present invention;

Fig. 5 is a schematic diagram of the WRN-PPNet model framework according to one embodiment of the present invention;

Fig. 6 is a schematic structural diagram of the WRN module according to one embodiment of the present invention;

Fig. 7 is a schematic structural diagram of the wide residual block according to one embodiment of the present invention;

Fig. 8 is a schematic structural diagram of the PPNet module according to one embodiment of the present invention;

Fig. 9 is a schematic structural diagram of the last part of WRN-PPNet according to one embodiment of the present invention;

Fig. 10 is a schematic structural diagram of an image segmentation system based on a wide residual pyramid pooling network according to one embodiment of the present invention.

Detailed description of the embodiments

Embodiments of the present invention are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain the present invention and should not be construed as limiting it.

Before the image segmentation method and device based on a wide residual pyramid pooling network are introduced, traditional image segmentation methods and the importance of deep learning networks for image segmentation are first briefly described.

At present, the segmentation algorithms commonly used in the field of image segmentation fall into the following classes: threshold-based methods, edge-based methods, region-based methods, graph-cut methods, methods based on depth information, and methods based on prior information. Threshold-based methods exploit the difference in gray value between the target object and the background, using one or several thresholds to divide the whole gray range into two or more intervals and thereby separate the target object from the background. Edge-based methods exploit the dissimilarity between different image regions and extract the region containing the target object by edge detection. Region-based methods group pixels into regions according to the similar properties of pixels within the same object region. Graph cut is an interactive method: the user first marks part of the foreground and part of the background by some interactive means, and the algorithm then takes the user's input as the segmentation constraint and automatically computes a segmentation that satisfies it. Methods based on depth information extract depth-map features and classify scene regions according to those features to achieve segmentation. Methods based on prior information require prior knowledge to be introduced.
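As a point of reference for the threshold-based class, the following is a minimal sketch of single-threshold segmentation on a grayscale image stored as a nested list. The function name `threshold_segment`, the sample image, and the threshold value 128 are illustrative assumptions, not part of the patent.

```python
def threshold_segment(gray, threshold):
    """Label pixels brighter than `threshold` as object (1), the rest as background (0)."""
    return [[1 if pixel > threshold else 0 for pixel in row] for row in gray]


# Hypothetical 3x3 grayscale image with a bright object on a dark background.
image = [
    [12, 15, 200],
    [10, 220, 210],
    [11, 14, 13],
]
mask = threshold_segment(image, threshold=128)
```

A single global threshold like this works only when object and background gray values are well separated, which is one reason such methods generalize poorly.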

Most of the traditional image segmentation methods above first extract image features and then map the features into a relevant model. The process is usually complicated, the results are not robust enough, and in many cases no semantic information can be provided. They also have the following defects: the computational cost is too high for real-time systems; prior knowledge must be introduced, so fully automatic segmentation is impossible; and the algorithms contain inherent contradictions. For example, when a Gaussian function is used to smooth the image in edge detection, edges become blurred, so the noise-smoothing ability of the LoG operator conflicts with its edge-localization ability.

With the arrival of the "big data" era and the development of high-performance computing devices, deep learning has achieved striking breakthroughs in artificial intelligence fields such as speech recognition, face recognition, and object detection, and it has also performed remarkably in semantic image segmentation. Semantic image segmentation means that a machine automatically segments and recognizes the content of an image; in essence, it classifies every pixel in the image. The basic idea of semantic segmentation with deep learning is as follows: images with pixel-level annotations are used as training images, and deconvolution layers, upsampling layers, and similar layers in the deep learning network restore the features extracted by the convolutional and pooling layers to the size of the original input image. It is an end-to-end segmentation method. Experiments show that deep learning networks perform very well in image segmentation.

For the above reasons, embodiments of the present invention propose an image segmentation method and device based on a wide residual pyramid pooling network.

The image segmentation method and device based on a wide residual pyramid pooling network according to embodiments of the present invention are described below with reference to the accompanying drawings; the method is described first.

Fig. 1 is a flowchart of the image segmentation method based on a wide residual pyramid pooling network according to one embodiment of the present invention.

As shown in Fig. 1, the image segmentation method based on a wide residual pyramid pooling network comprises the following steps:

In step S101, an image to be segmented is input.

As shown in Fig. 2, this corresponds to step A1: input the image data to be segmented. For example, the image objects to be segmented in the embodiment of the present invention belong to different categories, and objects of each category appear against different backgrounds. Images of objects of the same category against different backgrounds are shown in Fig. 3, and objects of different categories are shown in Fig. 4.

In step S102, the image to be segmented is standardized so that its pixel mean is 0 and its variance is 1.

It is understood that as shown in Fig. 2, step A2：Segmentation figure picture is treated to be standardized.For example, training data is 500 × 375 image, subtracts the average value of all pixels on each image to be split, then divided by standard deviation so that pixel Average is 0, variance 1.

In step S103, a WRN-PPNet model is obtained, wherein the WRN-PPNet model includes a WRN module and a PPNet module, and the features extracted by the WRN module are fused with the features extracted by the PPNet module.

As shown in Fig. 2, this corresponds to step A3: design the WRN-PPNet model, which includes at least a WRN module and a PPNet module. The WRN-PPNet model is shown in Fig. 5, where (1) denotes the WRN module, (B) denotes the feature maps extracted by the WRN module, (2) denotes the PPNet module, and (C) denotes the fusion of the features extracted by the WRN module with the features extracted by the PPNet paths.

Optionally, in one embodiment of the present invention, the WRN module includes a first, a second, and a third wide residual block group, each containing four wide residual blocks. Each wide residual block contains two convolutional layers with a kernel size of 3*3, and each convolutional layer is preceded by a BN (batch normalization) layer. In the first wide residual block group, the size of the feature maps (FMs) output by each wide residual block equals the size of the FMs it receives as input. In the second and third wide residual block groups, the FMs output by the first wide residual block are half the size of its input FMs, and the FMs output by the remaining wide residual blocks are the same size as their input FMs.

As shown in Fig. 5, (1) is the WRN module, which comprises four parts, 1a), 1b), 1c), and 1d); its parameters are listed in Table 1. Specifically, the WRN module is shown in Fig. 6. The input size of the model is M*M*3, where M*M is the spatial size of the input image and "3" is the number of image channels, namely R (red), G (green), and B (blue). Here 1a) denotes the first convolutional layer Conv1, 1b) the first wide residual block group Conv2, 1c) the second wide residual block group Conv3, 1d) the third wide residual block group Conv4, and (B) the FMs extracted by the WRN module. Table 1 gives the parameters of the WRN module structure.

Table 1

Here, B(3,3) denotes a residual block with two convolutional layers whose kernels are 3 × 3, M denotes the size of the input image or of the FMs, k denotes the widening factor for the number of FMs, and N denotes the number of residual blocks.

For example, as shown in Fig. 6, the WRN module in the embodiment of the present invention is as follows. The convolutional layer 1a) has an input size of 240*240*3 (the last number is the channel count, here and below) and an output size of 240*240*16. The first wide residual block group 1b) contains four wide residual blocks: the first has an input size of 240*240*16 and an output size of 240*240*48, and the other three have input and output sizes of 240*240*48. The second wide residual block group 1c) contains four wide residual blocks: the first has an input size of 240*240*48 and an output size of 120*120*96, and the other three have input and output sizes of 120*120*96. The third wide residual block group 1d) contains four wide residual blocks: the first has an input size of 120*120*96 and an output size of 60*60*192, and the other three have input and output sizes of 60*60*192. The forms of the wide residual blocks in the WRN module are shown in Fig. 7, where (a) denotes the basic residual block, (b) denotes wide residual block 1, and (c) denotes wide residual block 2. The convolutional part of (a), (b), and (c) has the structure BN-conv3*3-ReLU, where ReLU (Rectified Linear Unit) is the activation function, and the number of FMs output by the convolutional layers in (b) and (c) is the number output by the convolutional layers in (a) multiplied by the widening factor. In the first wide residual block group Conv2 of the WRN module, all four wide residual blocks have the form shown in Fig. 7(b). In the second and third wide residual block groups, the first wide residual block has the form shown in Fig. 7(c) and the remaining wide residual blocks have the form shown in Fig. 7(b).
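The feature-map sizes listed above can be reproduced with a short shape calculation. This is only a sketch: the function name `wrn_shapes` is hypothetical, and the widening factor k = 3 with base widths 16, 32, and 64 is an inference from the stated channel counts (48, 96, 192), not a value given explicitly in the text.

```python
def wrn_shapes(size=240, base_channels=16, k=3, blocks_per_group=4):
    """Return (layer name, (height, width, channels)) for each stage of the WRN module."""
    shapes = [("input", (size, size, 3)), ("Conv1", (size, size, base_channels))]
    for group_index, name in enumerate(["Conv2", "Conv3", "Conv4"]):
        # Assumed widths: 16*k, 32*k, 64*k for the three groups.
        channels = base_channels * (2 ** group_index) * k
        if group_index > 0:
            # The first block of the second and third groups halves the spatial size.
            size //= 2
        for block in range(1, blocks_per_group + 1):
            shapes.append((f"{name} block {block}", (size, size, channels)))
    return shapes


shapes = wrn_shapes()
for name, shape in shapes:
    print(name, shape)
```

With the defaults, the trace ends at 60*60*192, which is the input size of the three pooling paths of the PPNet module.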

Further, in one embodiment of the present invention, the PPNet module includes a first, a second, and a third pooling path. The pooling window size is 4*4 for the first pooling path, 2*2 for the second, and 1*1 for the third, and the pooling mode is average pooling.
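Average pooling with the three window sizes can be illustrated on a small single-channel feature map. The sketch below assumes the stride equals the window size (non-overlapping windows); the function name `average_pool` and the 4x4 sample map are illustrative only.

```python
def average_pool(feature_map, window):
    """Non-overlapping average pooling on a 2-D list; the stride equals the window size."""
    height, width = len(feature_map), len(feature_map[0])
    pooled = []
    for top in range(0, height - window + 1, window):
        row = []
        for left in range(0, width - window + 1, window):
            total = sum(
                feature_map[top + i][left + j]
                for i in range(window)
                for j in range(window)
            )
            row.append(total / (window * window))
        pooled.append(row)
    return pooled


# Hypothetical 4x4 feature map with values 0..15.
fm = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
pool4 = average_pool(fm, 4)  # one value summarizing the whole map
pool2 = average_pool(fm, 2)  # 2x2 grid of local averages
pool1 = average_pool(fm, 1)  # identical to the input
```

The larger the window, the coarser and more global the summary; a 1*1 window leaves the map unchanged.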

As shown in Fig. 5, (2) denotes the PPNet module, which includes three pooling paths. Specifically, as shown in Fig. 8, the PPNet module (2) includes the first pooling path pool1-conv-deconv1-conv-deconv2, the second pooling path pool2-conv-deconv3, and the third pooling path pool3-conv, and (C) denotes the fusion of the FMs output by the WRN module with the FMs output by PPNet. Here 2a) denotes the FMs output by the third pooling path of the PPNet module, 2b) the FMs output by the second pooling path, and 2c) the FMs output by the first pooling path.

For example, as shown in Fig. 8, the PPNet module is as follows. A BN layer precedes the three pooling paths, which have the forms pool1-conv-deconv1-conv-deconv2 (first pooling path), pool2-conv-deconv3 (second pooling path), and pool3-conv (third pooling path). The input size of all three paths is 60*60*192. The pooling window of pool1 is 4*4 with a stride of 4, and its output size is 15*15*192; the pooling window of pool2 is 2*2 with a stride of 2, and its output size is 30*30*192; the pooling window of pool3 is 1*1 with a stride of 1, and its output size is 60*60*192. pool1, pool2, and pool3 all use average pooling. conv denotes a convolution block composed of two convolutional layers with 3*3 kernels, whose input and output sizes are equal. The input size of deconv1 is 15*15*128 and its output size is 30*30*128; the input size of deconv2 is 30*30*128 and its output size is 60*60*128, so the output size of the first pooling path is 60*60*128. The input size of deconv3 is 30*30*128 and its output size is 60*60*128, so the output size of the second pooling path is 60*60*128. The output size of the third pooling path is 60*60*128. The activation function of all convolutional layers is the ReLU function, whose form is given by formula (1):

$$f(y_i) = \max(0, y_i) \tag{1}$$

where $y_i$ denotes the input of the function.
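The spatial sizes along the three pooling paths can be checked with a short calculation. The sketch below assumes that each pooling step divides the spatial size by its window and that each deconvolution doubles it, which matches the sizes stated above; the function name `ppnet_path_sizes` is hypothetical.

```python
def ppnet_path_sizes(input_size=60, windows=(4, 2, 1)):
    """For each pooling window, list the spatial size after pooling and after each deconvolution."""
    result = {}
    for window in windows:
        sizes = [input_size // window]  # size right after average pooling
        while sizes[-1] < input_size:
            # Assumption: every deconvolution layer upsamples by a factor of 2.
            sizes.append(sizes[-1] * 2)
        result[window] = sizes
    return result


paths = ppnet_path_sizes()
# The number of deconvolutions on each path is the list length minus one.
deconv_counts = {window: len(sizes) - 1 for window, sizes in paths.items()}
```

Under these assumptions the 4*4 path needs two deconvolutions, the 2*2 path needs one, and the 1*1 path needs none, so all three paths return to 60*60 and can be fused with the WRN output.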

Further, in one embodiment of the present invention, obtaining the wide residual pyramid pooling network (WRN-PPNet) model further comprises: performing deconvolution on the pooling paths in the PPNet module, wherein two deconvolution operations are performed on the first pooling path, one deconvolution operation is performed on the second pooling path, the third pooling path has two convolutional layers, and each deconvolution layer is preceded by two convolutional layers with a kernel size of 3*3.

As shown in Fig. 5, (3) denotes the last part of the WRN-PPNet model. Specifically, as shown in Fig. 9, the last part of the WRN-PPNet model consists of conv-deconv-conv-deconv-conv-conv1, where conv denotes a convolution block composed of two convolutional layers with 3*3 kernels, deconv denotes a deconvolution layer, 3a) denotes the FMs output by the deconvolution layer up1, 3b) denotes the concatenation of the features output by the deconvolution layer up2 with the model's original input, and conv1 denotes a convolutional layer with a 1*1 kernel, whose output FM represents the segmentation of the target object in the input image.

For example, as shown in Fig. 9, the last part of the WRN-PPNet model is as follows. conv denotes a convolution block composed of two convolutional layers with 3*3 kernels. The input size of up1 is 60*60*64 and its output size is 120*120*64; the input size of up2 is 120*120*64 and its output size is 240*240*64. 3b) denotes the concatenation of the FMs output by up2 with the original input. The input size of conv1 is 240*240*64 and its output size is 240*240*1; the output of conv1 is the segmentation result of the model.

Further, in one embodiment of the present invention, the method further includes: introducing the original input image into the WRN-PPNet model a second time, by concatenation, before the model performs segmentation, and using two convolutional layers with 3*3 kernels to obtain an optimized combination of the features generated by the model and the hyper-localized features of the original input image.
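The concatenation step can be sketched as joining two tensors along the channel axis. The example below uses tiny nested lists in height x width x channel layout; the function name `concat_channels` and the sample values are illustrative assumptions.

```python
def concat_channels(a, b):
    """Concatenate two H x W x C nested lists along the channel axis."""
    if len(a) != len(b) or len(a[0]) != len(b[0]):
        raise ValueError("spatial sizes must match before concatenation")
    return [
        [pixel_a + pixel_b for pixel_a, pixel_b in zip(row_a, row_b)]
        for row_a, row_b in zip(a, b)
    ]


# Hypothetical 2x2 feature maps with 2 channels and a 2x2 RGB input image.
features = [[[0.5, 0.1], [0.2, 0.9]], [[0.3, 0.3], [0.7, 0.4]]]
image = [[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [[0.0, 0.0, 1.0], [1.0, 1.0, 1.0]]]
merged = concat_channels(features, image)  # 2x2 with 2 + 3 = 5 channels
```

At the sizes given in the text, concatenating 240*240*64 FMs with the 240*240*3 input would give 67 channels, which the following two 3*3 convolutional layers would then reduce to the 64 channels that conv1 receives; that channel arithmetic is an inference from the stated sizes.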

It should be noted that the WRN-PPNet model is trained on a single NVIDIA Titan X (Pascal) GPU (graphics processing unit) and achieves good results. Unlike traditional image processing methods, it does not require the cumbersome procedure of manually extracting image features before segmenting; instead, it extracts features directly and segments the image end to end. Moreover, apart from standardization, no other preprocessing of the image is required.

In step S104, the training images are preprocessed so that their pixel mean is 0 and their variance is 1, and so that pixels of the object to be segmented are labeled 1 and all remaining pixels are labeled 0; the variety and quantity of the training images are then increased by data augmentation to obtain a training image set.

As shown in Fig. 2, this corresponds to step A4: standardize the training images so that their pixel mean is 0 and their variance is 1, and enlarge the training data by data augmentation. For example, the training images are flipped horizontally, flipped vertically, translated, rotated, scaled, changed in brightness, and elastically distorted, and the same transformations are applied to the corresponding label images. The augmentation methods are listed in Table 2. Finally, the transformed training images and the original training images together form the training set.

Table 2

No.	Method
1	Horizontal flip with 50% probability
2	Vertical flip with 50% probability
3	Rotation by ±20°
4	Translation by 10% in the horizontal and vertical directions
5	Scaling by ±10%
6	Brightness change
7	Elastic distortion

For example, the embodiment of the present invention enlarges the image data of the training set by data augmentation. This includes randomly selecting 50% of the images for horizontal flipping, randomly selecting 50% of the images for vertical flipping, and randomly selecting images for rotation, translation by 10% in the horizontal and vertical directions, enlargement by 10%, reduction by 10%, brightness change, and elastic distortion. The transformed images are then added to the training set to increase the variety and quantity of its images.
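The two flip operations, applied identically to an image and its label mask, can be sketched as follows. The function names and the use of a seeded `random.Random` instance are illustrative assumptions; rotation, translation, scaling, brightness change, and elastic distortion are omitted because they need an image library.

```python
import random


def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def flip_vertical(image):
    """Reverse the order of the rows."""
    return image[::-1]


def random_flips(image, label, rng):
    """Apply each flip with 50% probability, using the same transform for image and label."""
    if rng.random() < 0.5:
        image, label = flip_horizontal(image), flip_horizontal(label)
    if rng.random() < 0.5:
        image, label = flip_vertical(image), flip_vertical(label)
    return image, label


rng = random.Random(0)  # seeded only to make the sketch repeatable
image = [[1, 2], [3, 4]]
label = [[1, 2], [3, 4]]  # same values as the image, so alignment is easy to check
aug_image, aug_label = random_flips(image, label, rng)
```

Applying the geometric transform to the label together with the image keeps every pixel aligned with its class label, which is what the text requires of the label images.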

In step S105, model training is carried out according to the WRN-PPNet model and the training image set to generate the WRN-PPNet segmentation model.

It can be understood that the embodiment of the present invention can carry out model training according to the WRN-PPNet model and the augmented data set to generate the WRN-PPNet segmentation model. Specifically, as shown in Fig. 2, step A5 performs model training and generates the WRN-PPNet segmentation model. Step A5 can include four sub-steps, A51, A52, A53 and A54, as follows:

Step A51: First, initialize the model parameters. The weights of the convolutional layers are initialized from a random normal distribution, and the bias vectors are initialized to all zeros.
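The initialization in step A51 can be sketched as follows. The function name `init_conv_layer`, the weight layout (kernel height, kernel width, input channels, output channels), the channel counts and the standard deviation 0.01 are assumptions made for illustration; the source specifies only a random normal distribution for weights and zeros for biases.

```python
import numpy as np

def init_conv_layer(kernel_size, in_channels, out_channels, rng, stddev=0.01):
    """Return (weights, bias) for one convolutional layer."""
    # Weights drawn from a zero-mean normal distribution (stddev is assumed).
    weights = rng.normal(
        0.0, stddev, size=(kernel_size, kernel_size, in_channels, out_channels)
    )
    # Bias vector initialized to all zeros.
    bias = np.zeros(out_channels)
    return weights, bias

rng = np.random.default_rng(0)
weights, bias = init_conv_layer(3, 16, 32, rng)  # a 3*3 convolutional layer
```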

Step A52: Adjust the weights of the model. During training, the model is optimized with Adam (Adaptive Moment Estimation). Table 3 shows the iterative procedure of the Adam algorithm.

Table 3

Step A53: Set the termination condition for training.

The EarlyStopping method is used during model training in the embodiment of the present invention: when the validation-set accuracy no longer improves, or when the number of training epochs reaches the set maximum, the training process terminates automatically. EarlyStopping means that, during model training, once the set termination condition is met, training terminates automatically regardless of whether the maximum number of training epochs has been reached.

Further, step A53 uses EarlyStopping to control the training process: training is terminated when the validation-set accuracy no longer improves or when the number of training epochs reaches the maximum. The validation set used during training accounts for 20% of the total training set.
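The termination rule of step A53 can be sketched as a small loop. This is an illustration under assumptions: the function name `train_with_early_stopping` and the `patience` parameter (the number of non-improving epochs tolerated before stopping) are not from the source, and a list of per-epoch validation accuracies stands in for real training.

```python
def train_with_early_stopping(val_accuracies, max_epochs, patience=1):
    """Return (epochs_run, best_accuracy) under the EarlyStopping rule."""
    best = float("-inf")
    epochs_without_improvement = 0
    epochs_run = 0
    for accuracy in val_accuracies:
        if epochs_run >= max_epochs:
            break  # the maximum number of training epochs has been reached
        epochs_run += 1
        if accuracy > best:
            best = accuracy
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # validation accuracy no longer improves
    return epochs_run, best

# Hypothetical validation accuracies, one value per epoch.
epochs_run, best = train_with_early_stopping(
    [0.5, 0.6, 0.7, 0.7, 0.8], max_epochs=10
)
```

With `patience=1` the loop stops at the first epoch that fails to improve on the best accuracy so far, so later improvements are never seen; a larger patience trades training time for tolerance of plateaus.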

It should be noted that the embodiment of the present invention can train the model with the Adam algorithm, which uses first-order and second-order moment estimates of the gradient to dynamically adjust the learning rate of each parameter. The advantage of the algorithm is that, after bias correction, the learning rate in each iteration falls within a definite range, which makes the parameter updates more stable. In addition, grid search is used to determine model parameters such as the number of convolution kernels and the activation function, which effectively reduces the difficulty of selecting parameters when optimizing the model, and EarlyStopping is used so that training terminates promptly once model performance no longer improves.
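The Adam update described above can be sketched as follows. This follows the commonly published form of Adam and is not a reproduction of Table 3; the function name `adam_step`, the hyperparameter defaults (`lr`, `beta1`, `beta2`, `eps`) and the toy quadratic objective are assumptions chosen for illustration.

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """Perform one Adam update and return (theta, m, v)."""
    m = beta1 * m + (1 - beta1) * grad        # first-order moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-order moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction of the first moment
    v_hat = v / (1 - beta2 ** t)              # bias correction of the second moment
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy problem: minimize f(theta) = sum(theta ** 2), whose gradient is 2 * theta.
theta = np.array([1.0, -2.0])
m = np.zeros_like(theta)
v = np.zeros_like(theta)
for t in range(1, 2001):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.01)
```

Because the corrected first moment is divided by the square root of the corrected second moment, the very first step has a magnitude of about `lr` whatever the scale of the gradient, which illustrates the bounded step size the text refers to.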

Step A54: Save the trained model.

After training ends, the trained model is saved, which includes saving the model weights (file type .npz).
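Saving weights to an .npz file can be sketched with NumPy's `np.savez` and `np.load`. The parameter names `conv1_w` and `conv1_b`, their shapes and the file name `wrn_ppnet_weights.npz` are hypothetical; the source states only the file type.

```python
import os
import tempfile
import numpy as np

# Hypothetical trained parameters, keyed by layer name.
weights = {"conv1_w": np.ones((3, 3, 1, 8)), "conv1_b": np.zeros(8)}

# Write all arrays into one .npz archive (the file name is an assumption).
path = os.path.join(tempfile.mkdtemp(), "wrn_ppnet_weights.npz")
np.savez(path, **weights)

# Reload the archive to restore the parameters by name.
with np.load(path) as archive:
    restored = {name: archive[name] for name in archive.files}
```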

In step S106, the image segmentation result is obtained from the image to be segmented by the WRN-PPNet segmentation model.

It can be understood that the embodiment of the present invention can obtain the image segmentation result through the WRN-PPNet model. As shown in Fig. 2, step A6 outputs the segmentation result of the target object. It should be noted that all or part of the steps of the method of the embodiment of the present invention can be completed by hardware instructed by a program; the program can be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiment.

In summary, the image segmentation method based on WRN-PPNet of the embodiment of the present invention can include: inputting the image data to be segmented; standardizing the image data to be segmented; training the WRN-PPNet model, which specifically includes designing the WRN-PPNet model, preprocessing the training data, and carrying out model training using the preprocessed training data and the designed WRN-PPNet model; and outputting the image segmentation result, which specifically includes feeding the image to be segmented into the trained WRN-PPNet model so that the image data under test is correctly segmented. Here, the designed WRN-PPNet model includes a WRN module and a PPNet module; preprocessing standardizes the training data and uses data augmentation to increase the variety and number of training images, which form the training set; and the model training process includes initializing the model weights, adjusting the model weights and setting the termination condition for training.

The image segmentation method based on the wide residual pyramid pooling network proposed according to embodiments of the present invention can obtain an image segmentation model based on WRN-PPNet through deep residual network theory and deep learning network model training methods. It can therefore perform the segmentation task end to end and segment images fully automatically, and it is not limited by the category of the image to be segmented. It has strong applicability, good model performance and robustness, which makes segmentation more convenient and effective, simpler to operate and better performing, thereby effectively improving the accuracy and convenience of image segmentation.

Next, the image segmentation system based on the wide residual pyramid pooling network proposed according to embodiments of the present invention is described with reference to the accompanying drawings.

Figure 10 is a schematic structural diagram of the image segmentation system based on the wide residual pyramid pooling network according to one embodiment of the present invention.

As shown in Figure 10, the image segmentation system 10 based on the wide residual pyramid pooling network includes: an input module 100, a first preprocessing module 200, an acquisition module 300, a second preprocessing module 400, a training module 500 and a segmentation module 600.

The input module 100 is used to input the image to be segmented. The first preprocessing module 200 is used to standardize the image to be segmented so that its pixel mean is 0 and its variance is 1. The acquisition module 300 is used to obtain the WRN-PPNet model, where the WRN-PPNet model includes a WRN module and a PPNet module, and the features extracted by the WRN module are fused with the features extracted by the PPNet module. The second preprocessing module 400 is used to preprocess the training images so that the pixel mean of each training image is 0 and the variance is 1, and the pixel label of the object to be segmented is 1 and the label of the remaining pixels is 0, and to increase the variety and number of training images by a data augmentation method to obtain the training image set. The training module 500 is used to carry out model training according to the WRN-PPNet model and the training image set to generate the WRN-PPNet segmentation model. The segmentation module 600 is used to obtain the image segmentation result from the image to be segmented by the WRN-PPNet segmentation model. The system of the embodiment of the present invention can segment images fully automatically based on WRN-PPNet and thus segment the target object; it is not limited by the category of the image to be segmented, is highly adaptable and has good model performance, thereby effectively improving the accuracy and convenience of image segmentation.

Further, in one embodiment of the present invention, the WRN module includes a first wide residual block group, a second wide residual block group and a third wide residual block group, each of which includes four wide residual blocks. Each wide residual block includes two convolutional layers with a kernel size of 3*3, and each convolutional layer is preceded by a batch normalization (BN) layer. In the first wide residual block group, the size of the feature maps (FMs) output by each wide residual block equals the size of its input FMs. In the second and third wide residual block groups, the size of the FMs output by the first wide residual block is half the size of its input FMs, and the size of the FMs output by each remaining wide residual block equals the size of its input FMs.
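The size rule for the three wide residual block groups can be traced with a short calculation. This is a sketch under assumptions: the function name `wrn_feature_map_sizes` and the 64-pixel input are hypothetical, and "half the size" is interpreted here as halving the spatial side length of the feature maps.

```python
def wrn_feature_map_sizes(input_size, groups=3, blocks_per_group=4):
    """Return the output feature-map side length of every wide residual block."""
    sizes = []
    size = input_size
    for group in range(groups):
        for block in range(blocks_per_group):
            # Only the first block of the second and third groups halves the size.
            if group > 0 and block == 0:
                size //= 2
            sizes.append(size)
    return sizes

# Hypothetical feature maps of side 64 entering the first group.
sizes = wrn_feature_map_sizes(64)
```

Under these assumptions the twelve blocks produce four outputs at the input size, four at half that size and four at a quarter of it.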

Further, in one embodiment of the present invention, deconvolution operations are performed on the pooling paths in the PPNet module: two deconvolution operations are performed on the first pooling path and one deconvolution operation on the second pooling path, while the third pooling path has two convolutional layers. Each deconvolution layer is preceded by two convolutional layers, and the convolution kernel size is 3*3.
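How the three pooling paths can return to a common resolution is sketched below, using the average pooling windows of 4*4, 2*2 and 1*1 stated in the claims. Several points are assumptions: the pooling stride is taken equal to the window, each deconvolution is taken to double the spatial size, nearest-neighbour upsampling stands in for the learned deconvolution, and the 3*3 convolutional layers are omitted. The function names are hypothetical.

```python
import numpy as np

def average_pool(fm, window):
    """Non-overlapping average pooling of a 2-D feature map (stride == window)."""
    h, w = fm.shape
    return fm.reshape(h // window, window, w // window, window).mean(axis=(1, 3))

def upsample2x(fm):
    """Nearest-neighbour 2x upsampling, a fixed stand-in for a learned deconvolution."""
    return fm.repeat(2, axis=0).repeat(2, axis=1)

fm = np.arange(64, dtype=float).reshape(8, 8)  # hypothetical WRN feature map

path1 = upsample2x(upsample2x(average_pool(fm, 4)))  # 4*4 window, two deconvolutions
path2 = upsample2x(average_pool(fm, 2))              # 2*2 window, one deconvolution
path3 = average_pool(fm, 1)                          # 1*1 window, convolutions only
```

Under these assumptions the number of deconvolutions on each path matches the reduction caused by its pooling window, so all three paths end at the same spatial size and can be fused.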

Further, in one embodiment of the present invention, before the model performs segmentation, the original input image is introduced into the WRN-PPNet model a second time by concatenation, and two convolutional layers with 3*3 kernels are used to obtain an optimized combination of the model-generated features and the highly localized features of the original input image.
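The concatenation step can be sketched as stacking the original input image onto the model-generated feature maps along the channel axis. The array shapes, the channel-last layout and the channel count of 32 are assumptions, and the two 3*3 convolutional layers that follow are not shown.

```python
import numpy as np

features = np.zeros((64, 64, 32))  # hypothetical model-generated feature maps (H, W, C)
image = np.ones((64, 64, 1))       # hypothetical standardized original input image

# Concatenate along the channel axis so each pixel carries both kinds of feature.
combined = np.concatenate([features, image], axis=-1)
```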

It should be noted that the foregoing explanation of the embodiment of the image segmentation method based on the wide residual pyramid pooling network also applies to the image segmentation system of this embodiment, and details are not repeated here.

The image segmentation system based on the wide residual pyramid pooling network proposed according to embodiments of the present invention can obtain an image segmentation model based on WRN-PPNet through deep residual network theory and deep learning network model training methods. It can therefore perform the segmentation task end to end and segment images fully automatically, and it is not limited by the category of the image to be segmented. It has strong applicability, good model performance and robustness, which makes segmentation more convenient and effective, simpler to operate and better performing, thereby effectively improving the accuracy and convenience of image segmentation.

In the description of the present invention, it should be understood that the orientations or positional relationships indicated by terms such as "center", "longitudinal", "transverse", "length", "width", "thickness", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", "clockwise", "counterclockwise", "axial", "radial" and "circumferential" are based on the orientations or positional relationships shown in the drawings. They are used only to facilitate and simplify the description of the present invention, and do not indicate or imply that the device or element referred to must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be construed as limiting the present invention.

In addition, the terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present invention, "multiple" means at least two, for example two or three, unless otherwise specifically defined.

In the present invention, unless otherwise clearly defined and limited, terms such as "installed", "connected", "coupled" and "fixed" should be interpreted broadly. For example, a connection may be fixed, detachable or integral; it may be mechanical or electrical; it may be direct or indirect through an intermediary; and it may be an internal connection between two elements or an interaction between two elements, unless otherwise clearly limited. Those of ordinary skill in the art can understand the specific meaning of the above terms in the present invention according to the specific circumstances.

In the present invention, unless otherwise clearly defined and limited, a first feature being "on" or "under" a second feature may mean that the first and second features are in direct contact, or that they are in indirect contact through an intermediary. Moreover, a first feature being "on", "above" or "over" a second feature may mean that the first feature is directly above or obliquely above the second feature, or merely that the first feature is at a higher level than the second feature. A first feature being "under", "below" or "beneath" a second feature may mean that the first feature is directly below or obliquely below the second feature, or merely that the first feature is at a lower level than the second feature.

In the description of this specification, a description referring to the terms "one embodiment", "some embodiments", "example", "specific example" or "some examples" means that the specific features, structures, materials or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present invention. In this specification, such schematic expressions do not necessarily refer to the same embodiment or example. Moreover, the specific features, structures, materials or characteristics described may be combined in a suitable manner in any one or more embodiments or examples. In addition, provided they do not conflict with each other, those skilled in the art may combine the different embodiments or examples described in this specification and the features of those embodiments or examples.

Although embodiments of the present invention have been shown and described above, it can be understood that the above embodiments are exemplary and cannot be construed as limiting the present invention; those of ordinary skill in the art may change, modify, replace and vary the above embodiments within the scope of the present invention.

Claims

1. An image segmentation method based on a wide residual pyramid pooling network, characterized by comprising the following steps:

inputting an image to be segmented;

standardizing the image to be segmented so that the pixel mean of the image to be segmented is 0 and the variance is 1;

obtaining a WRN-PPNet model, wherein the WRN-PPNet model includes a WRN module and a PPNet module, and the features extracted by the WRN module are fused with the features extracted by the PPNet module;

preprocessing training images so that the pixel mean of each training image is 0 and the variance is 1, and the pixel label of the object to be segmented is 1 and the label of the remaining pixels is 0, and increasing the variety and number of the training images by a data augmentation method to obtain a training image set;

carrying out model training according to the WRN-PPNet model and the training image set to generate a WRN-PPNet segmentation model; and

obtaining an image segmentation result from the image to be segmented by the WRN-PPNet segmentation model.

2. The image segmentation method based on a wide residual pyramid pooling network according to claim 1, characterized in that the WRN module includes a first wide residual block group, a second wide residual block group and a third wide residual block group, each of which includes four wide residual blocks; each wide residual block includes two convolutional layers with a kernel size of 3*3, and each convolutional layer is preceded by a batch normalization (BN) layer; in the first wide residual block group, the size of the feature maps (FMs) output by each wide residual block equals the size of its input FMs; and in the second and third wide residual block groups, the size of the FMs output by the first wide residual block is half the size of its input FMs, and the size of the FMs output by each remaining wide residual block equals the size of its input FMs.

3. The image segmentation method based on a wide residual pyramid pooling network according to claim 1, characterized in that the PPNet module includes a first pooling path, a second pooling path and a third pooling path; the pooling window size of the first pooling path is 4*4, the pooling window size of the second pooling path is 2*2, and the pooling window size of the third pooling path is 1*1; and the pooling mode is average pooling.

4. The image segmentation method based on a wide residual pyramid pooling network according to claim 3, characterized in that obtaining the WRN-PPNet model further comprises:

performing deconvolution operations on the pooling paths of the PPNet module, wherein two deconvolution operations are performed on the first pooling path, one deconvolution operation is performed on the second pooling path, the third pooling path has two convolutional layers, each deconvolution layer is preceded by two convolutional layers, and the convolution kernel size is 3*3.

5. The image segmentation method based on a wide residual pyramid pooling network according to claim 4, characterized by further comprising:

before the model performs segmentation, introducing the original input image into the WRN-PPNet model a second time by concatenation, and using two convolutional layers with 3*3 kernels to obtain an optimized combination of the model-generated features and the highly localized features of the original input image.

6. An image segmentation system based on a wide residual pyramid pooling network, characterized by comprising:

an input module for inputting an image to be segmented;

a first preprocessing module for standardizing the image to be segmented so that the pixel mean of the image to be segmented is 0 and the variance is 1;

an acquisition module for obtaining a WRN-PPNet model, wherein the WRN-PPNet model includes a WRN module and a PPNet module, and the features extracted by the WRN module are fused with the features extracted by the PPNet module;

a second preprocessing module for preprocessing training images so that the pixel mean of each training image is 0 and the variance is 1, and the pixel label of the object to be segmented is 1 and the label of the remaining pixels is 0, and for increasing the variety and number of the training images by a data augmentation method to obtain a training image set;

a training module for carrying out model training according to the WRN-PPNet model and the training image set to generate a WRN-PPNet segmentation model; and

a segmentation module for obtaining an image segmentation result from the image to be segmented by the WRN-PPNet segmentation model.

7. The image segmentation system based on a wide residual pyramid pooling network according to claim 6, characterized in that the WRN module includes a first wide residual block group, a second wide residual block group and a third wide residual block group, each of which includes four wide residual blocks; each wide residual block includes two convolutional layers with a kernel size of 3*3, and each convolutional layer is preceded by a batch normalization (BN) layer; in the first wide residual block group, the size of the feature maps (FMs) output by each wide residual block equals the size of its input FMs; and in the second and third wide residual block groups, the size of the FMs output by the first wide residual block is half the size of its input FMs, and the size of the FMs output by each remaining wide residual block equals the size of its input FMs.

8. The image segmentation system based on a wide residual pyramid pooling network according to claim 6, characterized in that the PPNet module includes a first pooling path, a second pooling path and a third pooling path; the pooling window size of the first pooling path is 4*4, the pooling window size of the second pooling path is 2*2, and the pooling window size of the third pooling path is 1*1; and the pooling mode is average pooling.

9. The image segmentation system based on a wide residual pyramid pooling network according to claim 6, characterized in that deconvolution operations are performed on the pooling paths in the PPNet module, wherein two deconvolution operations are performed on the first pooling path, one deconvolution operation is performed on the second pooling path, the third pooling path has two convolutional layers, each deconvolution layer is preceded by two convolutional layers, and the convolution kernel size is 3*3.

10. The image segmentation system based on a wide residual pyramid pooling network according to claim 6, characterized in that, before the model performs segmentation, the original input image is introduced into the WRN-PPNet model a second time by concatenation, and two convolutional layers with 3*3 kernels are used to obtain an optimized combination of the model-generated features and the highly localized features of the original input image.