WO2016172889A1 - Image segmentation method and apparatus - Google Patents

Image segmentation method and apparatus

Info

Publication number
WO2016172889A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
super
segmentation
pixel
super pixel
Prior art date
Application number
PCT/CN2015/077859
Other languages
English (en)
French (fr)
Inventor
赵瑞
欧阳万里
李鸿升
王晓刚
黎伟
刘健庄
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to PCT/CN2015/077859 (WO2016172889A1)
Priority to CN201580078960.2A (CN107533760B)
Publication of WO2016172889A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V 30/10 Character recognition
    • G06V 30/19 Recognition using electronic means
    • G06V 30/192 Recognition using electronic means using simultaneous comparisons or correlations of the image signals with a plurality of references
    • G06V 30/194 References adjustable by an adaptive method, e.g. learning

Definitions

  • the present invention relates to the field of image processing, and in particular, to an image segmentation method and apparatus.
  • Image segmentation technology is one of the key technologies in the field of image processing; it is a crucial preprocessing step in image recognition and computer vision, and the basis for image recognition, image analysis, and image understanding.
  • image segmentation is the process of dividing an image into several specific regions with unique properties and extracting objects of interest.
  • current image segmentation technology mainly designs a feature by hand as the basis of image segmentation and then segments based on that feature, for example: threshold-based segmentation methods, edge-based segmentation methods, or region-based segmentation methods. Since the above image segmentation techniques all require a hand-designed feature and then segment based on that feature, and hand-designed features often have certain limitations, the segmentation effect may be poor.
  • for example, the segmentation effect on one type of image may be very good, while the segmentation effect on another, very different type of image is poor.
  • hand-designed features are also prone to errors, resulting in poor image segmentation. It can be seen that the segmentation effect of current image segmentation is poor.
  • Embodiments of the present invention provide an image segmentation method and apparatus, which can improve the segmentation effect of image segmentation.
  • an embodiment of the present invention provides an image segmentation method, including:
  • the method further includes: expanding the superpixel image to generate an extended image including the superpixel image;
  • the cutting of an image of a preset size on the superpixel image centered on each superpixel to obtain the image block corresponding to each superpixel includes:
  • cutting an image of a preset size on the extended image centered on each superpixel, to obtain the image blocks corresponding to the respective superpixels.
  • it is preset that m segmentation class labels are to be obtained, where m is a natural number greater than or equal to 2;
  • the processing of the image blocks corresponding to the respective superpixels using the neural network to obtain the segmentation class labels of the respective superpixels includes: operating on the image blocks using the neural network to obtain a classification vector of each superpixel; identifying the segmentation class label corresponding to each classification vector among the m segmentation class labels; and,
  • for any one of the superpixels, using the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel;
  • the identifying of the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels includes:
  • for any one of the superpixels, the connection values of the classification vector of that superpixel with the m segmentation class labels are calculated by the following formula:
  • y_j = Σ_{i=1}^{n} α_{i,j} · x_i
  • where y_j is the connection value of the superpixel's classification vector with the j-th segmentation class label (j = 1, …, m), x_i denotes the i-th dimension of the classification vector of the target superpixel, n is the dimensionality of the classification vector of the target superpixel (an integer greater than 1), and α_{i,j} is a preset parameter used for identifying the segmentation class labels;
  • the largest connection value is then selected from the m connection values of that superpixel, and the segmentation class label corresponding to the largest connection value is used as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
  • the cutting of an image of a preset size on the superpixel image to obtain the image block corresponding to each superpixel includes: cutting an image of a first preset size on the superpixel image centered on each superpixel, to obtain a first image block corresponding to each superpixel, and cutting an image of a second preset size on the superpixel image centered on each superpixel, to obtain a second image block corresponding to each superpixel;
  • the operating on the image blocks of each superpixel to obtain the classification vector of each superpixel includes:
  • for any one of the superpixels, separately operating on the corresponding first image block and second image block of that superpixel using a neural network, to obtain a first classification vector and a second classification vector of that superpixel, and synthesizing the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
  • the segmenting of the superpixel image according to a preset second segmentation rule to obtain a segmented image including at least two regions includes:
  • dividing superpixels in the superpixel image whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and dividing superpixels in the superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.
  • the method further includes: setting the color of the superpixels in the superpixel image divided into the foreground region to a preset foreground color corresponding to the attention segmentation class label, and setting the color of the superpixels divided into the background region to a background color corresponding to the attention segmentation class label;
  • the neural network includes:
  • a deep neural network or a non-deep neural network.
  • an embodiment of the present invention provides an image segmentation apparatus, including: a first segmentation unit, a cutting unit, a classifying unit, and a second segmentation unit, wherein:
  • the first dividing unit is configured to divide the image to be divided into a plurality of super pixels according to a preset first dividing rule to obtain a super pixel image
  • the cutting unit is configured to cut an image of a preset size on the super pixel image centering on each super pixel divided by the first dividing unit to obtain an image block corresponding to each super pixel;
  • the classifying unit is configured to process, by using a neural network, an image block corresponding to each super pixel obtained by the cutting unit, to obtain a segmentation class corresponding to each super pixel;
  • the second segmentation unit is configured to segment the superpixel image of the first segmentation unit according to a preset second segmentation rule, to obtain a segmented image including at least two regions, where the second segmentation rule means dividing the superpixels with the same segmentation class label into the same region.
  • the device further includes:
  • an expansion unit, configured to expand the superpixel image of the first segmentation unit, to generate an extended image including the superpixel image;
  • the cutting unit is configured to cut an image of a preset size on the extended image expanded by the expansion unit, centered on each superpixel divided by the first segmentation unit, to obtain the image block corresponding to each superpixel.
  • with reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, it is preset that m segmentation class labels are to be obtained, where m is a natural number greater than or equal to 2;
  • the classification unit includes:
  • an operation unit, configured to operate on the image blocks corresponding to the respective superpixels obtained by the cutting unit using the neural network, to obtain a classification vector of each superpixel;
  • an identifying unit, configured to identify the segmentation class label corresponding to the classification vector of each superpixel obtained by the operation unit among the m segmentation class labels;
  • a classifying subunit, configured to, for any one of the superpixels, use the segmentation class label corresponding to the classification vector of that superpixel identified by the identifying unit among the m segmentation class labels as the segmentation class label of that superpixel.
  • the identifying unit includes:
  • a calculation unit, configured to calculate, for any one of the superpixels, the connection values of the classification vector of that superpixel obtained by the operation unit with the m segmentation class labels by the following formula:
  • y_j = Σ_{i=1}^{n} α_{i,j} · x_i
  • where y_j is the connection value of the superpixel's classification vector with the j-th segmentation class label (j = 1, …, m), x_i denotes the i-th dimension of the classification vector of the target superpixel, n is the dimensionality of the classification vector of the target superpixel (an integer greater than 1), and α_{i,j} is a preset parameter used for identifying the segmentation class labels;
  • a selection unit, configured to select the largest connection value from the m connection values of that superpixel obtained by the calculation unit, and to use the segmentation class label corresponding to the largest connection value as
  • the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
  • the cutting unit is configured to cut an image of a first preset size on the superpixel image
  • centered on each superpixel divided by the first segmentation unit, to obtain a first image block corresponding to each superpixel, and to cut an image of a second preset size on the superpixel image centered on each superpixel, to obtain a second image block corresponding to each superpixel;
  • the operation unit is configured to, for any one of the superpixels, separately operate on the corresponding first image block and second image block of that superpixel cut by the cutting unit using the neural network, to obtain a first classification vector and a second classification vector of that superpixel, and to synthesize the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
  • the second segmentation unit is configured to, according to the second segmentation rule, divide superpixels in the superpixel image of the first segmentation unit whose segmentation class label belongs to the preset segmentation class labels that the user needs to focus on into a foreground region,
  • and to divide superpixels in the superpixel image of the first segmentation unit whose segmentation class label does not belong to those segmentation class labels into a background region.
  • the device further includes:
  • a setting unit, configured to set the color of the superpixels in the superpixel image divided into the foreground region by the second segmentation unit to a preset foreground color corresponding to the attention segmentation class label, and to set
  • the color of the superpixels in the superpixel image divided into the background region by the second segmentation unit to a background color corresponding to the attention segmentation class label.
  • the neural network includes:
  • a deep neural network or a non-deep neural network.
  • an embodiment of the present invention provides an image segmentation apparatus, including: a processor, a network interface, a memory, and a communication bus, wherein the communication bus is configured to implement connection communication between the processor, the network interface, and the memory.
  • the processor executes a program stored in the memory for implementing the following method:
  • the program executed by the processor further includes: expanding the superpixel image to generate an extended image including the superpixel image;
  • an image of a preset size is then cut on the extended image centered on each superpixel, to obtain the image blocks corresponding to the respective superpixels.
  • it is preset that m segmentation class labels are to be obtained, where m is a natural number greater than or equal to 2;
  • the processing performed by the processor on the image blocks corresponding to the respective superpixels using a neural network to obtain the segmentation class labels of the superpixels includes: operating on the image blocks to obtain a classification vector of each superpixel; identifying the segmentation class label corresponding to each classification vector among the m segmentation class labels; and,
  • for any one of the superpixels, using the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
  • in a third possible implementation of the third aspect, the program for identifying the segmentation class label corresponding to the classification vectors of the respective superpixels among the m segmentation class labels includes:
  • for any one of the superpixels, the connection values of the classification vector of that superpixel with the m segmentation class labels are calculated by the following formula:
  • y_j = Σ_{i=1}^{n} α_{i,j} · x_i
  • where y_j is the connection value of the superpixel's classification vector with the j-th segmentation class label (j = 1, …, m), x_i denotes the i-th dimension of the classification vector of the target superpixel, n is the dimensionality of the classification vector of the target superpixel (an integer greater than 1), and α_{i,j} is a preset parameter used for identifying the segmentation class labels;
  • the largest connection value is then selected from the m connection values of that superpixel, and the segmentation class label corresponding to it is taken as the segmentation class label of that superpixel.
  • the program executed by the processor for cutting an image of a preset size on the superpixel image to obtain the image block corresponding to each superpixel includes: cutting images of a first preset size and a second preset size on the superpixel image centered on each superpixel, to obtain a first image block and a second image block corresponding to each superpixel;
  • for any one of the superpixels, the corresponding first image block and second image block of that superpixel are separately operated on using a neural network, to obtain a first classification vector and a second classification vector of that superpixel, and the first classification vector and the second classification vector are synthesized to obtain the classification vector of that superpixel.
  • the process performed by the processor of segmenting the superpixel image according to a preset second segmentation rule to obtain a segmented image including at least two regions includes:
  • dividing superpixels in the superpixel image whose segmentation class label belongs to the preset segmentation class labels that the user needs to focus on into a foreground region, and dividing superpixels in the superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.
  • the program executed by the processor further includes: setting the color of the superpixels divided into the foreground region to a preset foreground color corresponding to the attention segmentation class label, and setting the color of the superpixels divided into the background region to a background color corresponding to the attention segmentation class label;
  • the neural network includes:
  • a deep neural network or a non-deep neural network.
  • in the embodiments of the present invention, the image to be segmented is divided into a plurality of superpixels according to a preset first segmentation rule; an image of a preset size is cut on the superpixel image centered on each superpixel, to obtain the image block corresponding to each superpixel; the image blocks corresponding to the respective superpixels are processed using a neural network, to obtain the segmentation class label corresponding to each superpixel; and the superpixel image is segmented according to a preset second segmentation rule, to obtain a segmented image including at least two regions, where the second segmentation rule means dividing superpixels with the same segmentation class label into the same region.
  • FIG. 1 is a schematic flowchart of an image segmentation method according to an embodiment of the present invention.
  • FIG. 2 is a schematic flowchart of another image segmentation method according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of super pixel segmentation according to an embodiment of the present invention.
  • FIG. 4 is a schematic diagram of image block cutting according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of classification using deep neural networks according to an embodiment of the present invention.
  • FIG. 6 is a schematic structural diagram of a deep neural network according to an embodiment of the present invention.
  • FIG. 7 is a schematic diagram of aggregation (pooling) in a deep neural network according to an embodiment of the present invention.
  • FIG. 8 is a schematic diagram of color of a segmentation class label according to an embodiment of the present invention.
  • FIG. 9 is a schematic diagram of image segmentation according to an embodiment of the present invention.
  • FIG. 10 is a schematic diagram of experimental data provided by an embodiment of the present invention.
  • FIG. 11 is a schematic structural diagram of an image segmentation apparatus according to an embodiment of the present invention.
  • FIG. 12 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention.
  • FIG. 13 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention.
  • FIG. 14 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention.
  • FIG. 15 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention.
  • FIG. 1 is a schematic flowchart of an image segmentation method according to an embodiment of the present invention. As shown in FIG. 1 , the method includes the following steps:
  • the foregoing step may be to divide the image to be segmented into a plurality of superpixels, where
  • a superpixel may refer to a small region composed of a series of adjacent pixels with similar features such as color, brightness, and texture;
  • these small regions retain effective information for further image segmentation and generally do not destroy the physical boundary information in the image.
  • the first preset segmentation rule may be a segmentation rule of a segmentation method based on graph theory, or may be a segmentation rule of a segmentation method based on gradient descent.
  • the step may be to cut an image of a preset size on the super pixel image centering on a certain pixel point in the super pixel.
  • the image block corresponding to each super pixel may be one or more image blocks.
  • the image block corresponding to each super pixel is processed by using a neural network to obtain a segmentation class label corresponding to each super pixel.
  • the above segmentation class label can be understood as a region identifier for image segmentation; that is, superpixels with the same segmentation class label are divided into the same region during image segmentation.
  • processing the image block corresponding to each superpixel using a neural network may be understood as processing the image block of each superpixel using a neural network model, where the neural network model may be pre-acquired, for example, obtained by pre-training.
  • the superpixel image is segmented according to a preset second segmentation rule to obtain a segmentation image including at least two regions; wherein the second segmentation rule is to divide the superpixels with the same segmentation class into the same Area.
  • the preset second segmentation rule may be used to segment the superpixel image, for example, dividing superpixels with the same segmentation class label into the same region, so that the above image to be segmented is divided into a plurality of regions.
  • the above method can be applied to any smart device with image processing functions, such as a tablet computer, a mobile phone, an e-reader, a remote controller, a personal computer (PC), a notebook computer, an in-vehicle device, a network television, or a wearable device.
  • in this embodiment, the image to be segmented is divided into a plurality of superpixels according to a preset first segmentation rule; an image of a preset size is cut on the superpixel image centered on each superpixel, to obtain the image block corresponding to each superpixel; the image blocks are processed using a neural network to obtain the segmentation class label of each superpixel; and the superpixel image is segmented according to a preset second segmentation rule, where
  • the rule means dividing superpixels with the same segmentation class label into the same region, so that a segmented image including at least two regions is obtained.
  • FIG. 2 is a schematic flowchart of another image segmentation method according to an embodiment of the present invention. As shown in FIG. 2, the method includes the following steps:
  • step 201 may use the segmentation rule of the graph theory-based segmentation method to divide the image to be segmented into a plurality of super pixels, or step 201 may segment the image to be segmented using a segmentation rule based on a gradient descent segmentation method.
  • step 201 may use the SLIC algorithm in the gradient descent-based segmentation method to segment the image to be segmented into a plurality of superpixels.
  • the SLIC algorithm performs superpixel segmentation based on similarity of color and spatial distance, and segmentation with this algorithm
  • produces superpixels of uniform size and shape; for example, in the superpixel diagram shown in FIG. 3, the superpixels contain, in order from the upper left corner to the lower right corner, 64, 256, and 1024 pixels each.
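  • As an illustrative sketch only (the patent does not prescribe an implementation), the SLIC segmentation of step 201 could be run with scikit-image; the file name and parameter values here are hypothetical:

```python
# Hedged sketch: SLIC superpixel segmentation with scikit-image, not the
# patent's implementation; n_segments and compactness are hypothetical values.
from skimage import io
from skimage.segmentation import slic

image = io.imread("to_segment.png")  # hypothetical input image (H x W x 3)
# SLIC clusters pixels by color similarity and spatial distance, yielding
# superpixels of roughly uniform size and shape.
labels = slic(image, n_segments=256, compactness=10.0, start_label=0)
print(labels.shape, labels.max() + 1)  # per-pixel superpixel labels, count
```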
  • the expansion may use the above superpixel image as a reference position and expand around it, for example by padding a fixed color value around the superpixel image;
  • step 202 may also expand in other ways based on the superpixel image;
  • in FIG. 4, 401 denotes the superpixel image divided into superpixels, and
  • 402 is the expanded extended image;
  • the extended image in step 202 may be N times the size of the superpixel image, for example N = 3, where N times here may mean that both the length and the width are N times those of the superpixel image.
  • the preset size may be set to a multiple of a super pixel image, where the multiple may be not only an integer multiple, but also a fractional multiple, such as 1.1 times, 1.2 times, or 1 time.
  • the image block cut in step 203 may be a local image block 403, where a local image block means that the cut image block includes only part of the superpixel image;
  • the image block cut in step 203 may also be a global image block 404, which means that the cut image block includes the whole superpixel image;
  • each superpixel can thus have a plurality of corresponding image blocks cut for it, for example, a local image block and a global image block;
  • the extended image expanded in step 202 ensures that an image of the preset size cut on the extended image centered on any superpixel lies entirely within the extended image;
  • for example, if the extended image expanded in step 202 is 3 times the superpixel image, then when a global image block is cut centered on any superpixel of the superpixel image, the cut global image block lies within the extended image, that is, the cut global image block does not exceed the extended image's range.
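  • A minimal numpy sketch of the expansion and centered cutting just described, assuming a 3x expansion with a fixed fill value; the helper names and the choice of "center" pixel are assumptions, since the patent only says the cut is centered on a pixel point in the superpixel:

```python
# Minimal sketch, assuming 3x expansion with a fixed fill value; the patent
# does not fix the padding color or the exact "center" pixel of a superpixel.
import numpy as np

def expand(img, n=3, fill=0):
    """Pad so the extended image is n times the superpixel image in H and W."""
    h, w = img.shape[:2]
    ph, pw = h * (n - 1) // 2, w * (n - 1) // 2
    return np.pad(img, ((ph, ph), (pw, pw), (0, 0)), constant_values=fill)

def cut_block(extended, center, size, orig_hw, n=3):
    """Cut a size x size block centered on `center` (original coordinates)."""
    h, w = orig_hw
    cy = center[0] + h * (n - 1) // 2  # shift into extended coordinates
    cx = center[1] + w * (n - 1) // 2
    return extended[cy - size // 2:cy + (size + 1) // 2,
                    cx - size // 2:cx + (size + 1) // 2]

# e.g. a local block around the superpixel and a global block covering the
# whole superpixel image:
# ext = expand(img)
# local = cut_block(ext, c, 64, (h, w)); glob = cut_block(ext, c, max(h, w), (h, w))
```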
  • the image block corresponding to each super pixel is processed by using a neural network to obtain a segmentation class label corresponding to each super pixel.
  • step 204 may be to identify, for each superpixel, the corresponding segmentation class label among the m segmentation class labels;
  • for example, step 204 can include: operating on the image blocks corresponding to the respective superpixels using the neural network to obtain a classification vector of each superpixel; identifying the segmentation class label corresponding to each classification vector among the m segmentation class labels; and,
  • for any one of the superpixels, using the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
  • identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels may be implemented by passing the classification vector through a fully-connected layer of the deep neural network.
  • for example, the step of identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels may include:
  • for any one of the superpixels, the connection values of the classification vector of that superpixel with the m segmentation class labels are calculated by the following formula:
  • y_j = Σ_{i=1}^{n} α_{i,j} · x_i
  • where y_j is the connection value of the superpixel's classification vector with the j-th segmentation class label (j = 1, …, m), x_i denotes the i-th dimension of the classification vector of the target superpixel, n is the dimensionality of the classification vector of the target superpixel (an integer greater than 1), and α_{i,j} is a preset parameter used for identifying the segmentation class labels;
  • the largest connection value is then selected from the m connection values, and the segmentation class label corresponding to it is taken as the segmentation class label of that superpixel.
  • the above parameter ⁇ i,j can be learned through a large number of training samples.
  • in this way, the segmentation class label of each superpixel can be obtained.
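  • A minimal numpy sketch of the connection-value computation and label selection just described; alpha stands for the learned parameters α_{i,j}, and the vector sizes are hypothetical:

```python
# Minimal sketch of the formula above: y_j = sum_i alpha[i, j] * x[i], then
# pick the segmentation class label with the largest connection value.
import numpy as np

def segmentation_label(x, alpha):
    y = alpha.T @ x           # m connection values, one per class label
    return int(np.argmax(y))  # label whose connection value is largest

x = np.random.rand(4096)         # a superpixel's classification vector (n = 4096)
alpha = np.random.rand(4096, 2)  # learned parameters for m = 2 class labels
print(segmentation_label(x, alpha))
```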
  • the step of cutting an image of a preset size on the superpixel image to obtain the image block corresponding to each superpixel may be: cutting an image of a first preset size on the superpixel image centered on each superpixel, to obtain a first image block corresponding to each superpixel, and cutting an image of a second preset size centered on each superpixel, to obtain a second image block corresponding to each superpixel;
  • the step of operating on the image blocks of the respective superpixels using the neural network to obtain the classification vectors of the respective superpixels may include:
  • for any one of the superpixels, separately operating on the corresponding first image block and second image block of that superpixel using a neural network, to obtain a first classification vector and a second classification vector of that superpixel, and synthesizing the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
  • the corresponding first image block and second image block of a superpixel may be the local image block and the global image block introduced above;
  • processing both the local image block and the global image block in the neural network better reflects the local and global features of the superpixels in the superpixel image, thereby improving image segmentation.
  • the same or different neural networks can be used
  • for the processing in this embodiment.
  • both the local image block and the global image block can be processed using the same deep neural network, and the resulting classification vectors are then synthesized.
  • the superpixel classification vector obtained in this way is richer, so the image segmentation effect can be improved.
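  • The synthesis step itself is not pinned down by the text; one plausible reading, consistent with the fully-connected combination in FIG. 5, is to concatenate the two vectors before the final fully-connected layer (an assumption, not the patent's stated rule):

```python
# Sketch of one plausible synthesis of the local and global classification
# vectors: concatenation before the final fully-connected layer (assumption).
import numpy as np

v_local = np.random.rand(4096)   # first classification vector (local block)
v_global = np.random.rand(4096)  # second classification vector (global block)
v = np.concatenate([v_local, v_global])  # combined classification vector
print(v.shape)  # (8192,)
```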
  • the foregoing neural network may be a non-deep neural network, where a non-deep neural network may be understood as a single-layer neural network, such as a BP neural network, a Hebb neural network, or a DL neural network;
  • the above neural network may also be a deep neural network, where a deep neural network may be understood as a multilayer neural network, for example,
  • a Clarifai deep neural network, an AlexNet deep neural network, an NIN deep neural network, an OverFeat deep neural network, or a GoogLeNet deep neural network, etc., without limitation.
  • the Clarifai deep neural network includes 5 convolutional layers and 2 fully-connected layers.
  • in this case, step 204 is as shown in FIG. 5, where FIG. 5 only draws
  • the convolutional layers and fully-connected layers that carry parameters, and the parameters in FIG. 5 can all be learned from a large number of training samples.
  • if the image block of a superpixel includes the local image block and the global image block, the local image block and the global image block may be processed separately, and the resulting classification vectors are then synthesized to output the result, where the output result in this case may be the segmentation class label of each superpixel.
  • the step 204 is described in detail with parameters in the example shown in FIG. 5.
  • the first convolutional layer convolves the image block with a number of parameter templates; assuming the global image block and the local image block are linearly rescaled to 227 × 227 and both are 3-channel color images, the input to the first convolutional layer is a 227 × 227
  • × 3 matrix; the first convolutional layer convolves the input 227 × 227 × 3 matrix with 96 parameter templates of size 7 × 7 × 3, and the 96 × 7 × 7 × 3 parameters are unknown and can be obtained through training on a large number of samples.
  • the translation step in the x and y directions during convolution is 2 pixels, so each convolution operation yields a 111 × 111 matrix, and the results of the 96 convolution operations are stitched together, so that
  • a matrix of 111 × 111 × 96 is obtained.
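  • The quoted output sizes follow from the usual valid-convolution arithmetic, output = (input - kernel) / stride + 1; a quick check of the figures in the text:

```python
# Quick check of the convolution output sizes quoted in the text.
def conv_out(size, kernel, stride):
    return (size - kernel) // stride + 1

print(conv_out(227, 7, 2))  # 111 -> the 111 x 111 maps of layer 1 (96 templates)
print(conv_out(55, 3, 2))   # 27  -> the 27 x 27 maps of layer 2 (256 templates)
```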
  • the layer-1 rectified linear unit function may replace the values less than 0 in the above 111 × 111 × 96 matrix with 0.
  • the layer-1 aggregation layer may refer to mapping the values of a certain area in the matrix to a single value according to
  • a certain rule, for example, taking the maximum value.
  • FIG. 7 shows a diagram of mapping each 2 × 2 region of a 4 × 4 matrix to a single value by taking the maximum, finally obtaining a 2 × 2 matrix.
  • in FIG. 7, the height of a vertical bar indicates the value of the element at that position; if no vertical bar is drawn, the value of the element at that position is 0.
  • the first-layer aggregation layer in FIG. 6 can aggregate the previously obtained 111 × 111 × 96 matrix into a 55 × 55 × 96 matrix (ignoring the edge data of the matrix) according to the rule given in FIG. 7.
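  • A numpy sketch of FIG. 7's max-aggregation rule, mapping each 2 × 2 region of a 4 × 4 matrix to its maximum:

```python
# Sketch of FIG. 7's aggregation rule: each 2 x 2 region is mapped to its
# maximum value, so a 4 x 4 matrix becomes a 2 x 2 matrix.
import numpy as np

m = np.arange(16).reshape(4, 4)
pooled = m.reshape(2, 2, 2, 2).max(axis=(1, 3))  # max over each 2 x 2 block
print(pooled)  # [[ 5  7]
               #  [13 15]]
```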
  • a 55 ⁇ 55 ⁇ 96 matrix is used as the input of the second layer convolution layer, and the second layer convolution layer convolves the input 55 ⁇ 55 ⁇ 96 matrix with 256 3 ⁇ 3 ⁇ 96 parameter templates.
  • 256 ⁇ 3 ⁇ 3 ⁇ 96 parameters are unknown and can be obtained through a large number of sample training.
  • the translation step size in the x and y directions is 2 pixels, so that each convolution operation will get a 27 ⁇ 27 matrix, 256 convolution operations. The result is stitched together to get a 27 ⁇ 27 ⁇ 256 matrix.
  • the fully-connected layers in FIG. 6 mean that every node of the previous layer and every node of the current layer are pairwise connected, each connection corresponding to an unknown parameter, and the unknown parameters can be obtained by training on a large number of samples.
  • the layer-6 full connection means that the 6 × 6 × 256 nodes of layer 5 and the 4096 nodes of layer 6 are pairwise connected, which is expressed by the formula:
  • y_j = Σ_{i=1}^{n} α_{i,j} · x_i
  • where x_i denotes a node of the previous layer,
  • y_j denotes a node of this layer (j = 1, …, m),
  • α_{i,j} denotes the unknown parameters,
  • n is the number of nodes of layer 5, and
  • m is the number of nodes of layer 6.
  • the global image block and the local image block in FIG. 5 each pass through the neural network up to layer 7 to obtain a 4096-dimensional vector, and the two 4096-dimensional vectors are fully connected to the final output layer according to the above formula.
  • the final output layer contains 2 nodes, where the final output layer here can be understood as holding the segmentation class labels introduced above; that is, the final output includes 2 segmentation class labels.
  • the input superpixel is classified to the node whose final value is the largest.
  • the foregoing method may further include the following steps:
  • obtaining the above deep neural network model through deep learning.
  • specifically, a model that includes a large number of unknown parameters is preset, and an initial value is assigned to each unknown parameter, each initial value being randomly generated by a computer; then training is performed on a large number of training samples, which are manually segmented samples, that is, the segmentation class label corresponding to each image block is known.
  • the training process constantly adjusts the values of those unknown parameters so that all the image blocks can be classified correctly after passing through the deep neural network; finally, when the classification error is smallest, the values of these unknown parameters are determined and training is complete, thereby generating the above deep neural network.
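  • A hedged sketch of this training idea, using PyTorch as a stand-in since the patent names no framework; the toy model, optimizer, and learning rate are all illustrative assumptions:

```python
# Hedged sketch of the training loop described above; the model, optimizer,
# and hyperparameters are illustrative assumptions, not the patent's design.
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4096, 256), nn.ReLU(), nn.Linear(256, 2))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # random init is the default
loss_fn = nn.CrossEntropyLoss()

def train_step(features, labels):
    """One update: adjust the unknown parameters to reduce classification error
    on manually segmented training samples (image blocks with known labels)."""
    optimizer.zero_grad()
    loss = loss_fn(model(features), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```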
  • the deep neural network may also be a deep neural network that has been trained, such as receiving a deep neural network sent by other devices.
  • according to the second segmentation rule, superpixels in the superpixel image whose segmentation class label belongs to the preset segmentation class labels that the user needs to focus on are divided into a foreground region, and superpixels in the superpixel image whose segmentation class label does not belong to
  • those preset segmentation class labels are divided into a background region.
  • the attention segmentation class label may be one or more.
  • if there is one attention segmentation class label, step 205 may divide the superpixels corresponding to that segmentation class label into the foreground region and all remaining superpixels of the superpixel image into the background region.
  • if there are multiple attention segmentation class labels, the superpixels belonging to the multiple segmentation class labels are divided into foreground regions, and all remaining superpixels of the superpixel image are divided into the background region.
  • the foreground region here includes a plurality of regions, each composed of superpixels with the same segmentation class label.
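  • A small sketch of this second segmentation rule under stated assumptions (a per-pixel map of superpixel ids and per-superpixel class labels; the helper name is hypothetical):

```python
# Sketch of the second segmentation rule: superpixels whose segmentation class
# label is in the user's attention set form the foreground, the rest background.
import numpy as np

def foreground_mask(superpixels, class_labels, attention):
    """superpixels: H x W map of superpixel ids; class_labels[s]: segmentation
    class label of superpixel s; attention: set of labels of interest."""
    fg_ids = [s for s, c in enumerate(class_labels) if c in attention]
    return np.isin(superpixels, fg_ids)  # True where the pixel is foreground
```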
  • the foregoing method may further include the following steps:
  • the foreground colors of different attention segmentation class labels may be different, while the background colors corresponding to all attention segmentation class labels are the same.
  • the foreground color corresponding to each segmentation class label may be as shown in FIG. 8, where different segmentation class labels correspond to different colors.
  • for example, if the image to be segmented mainly includes a sky background, a building, and plants, segmenting it with the above method can divide the sky background, the building, and the plants into different regions; if the segmentation class label of the building's superpixels is set as the attention segmentation class label, the building's region can be divided into the foreground and the sky background and plants into the background; when the color corresponding to the segmentation class label of the building's superpixels is white, a segmented image as shown in FIG. 9 can be generated.
  • FIG. 10 is a schematic diagram of experimental data provided by an embodiment of the present invention.
  • in FIG. 10, the leftmost column is the original image and
  • the second column is the manually labeled ground truth (GT).
  • the third column shows the segmentation results of the image segmentation technique provided by the embodiments of the present invention, and the following three columns are the segmentation results of the DRFI, GBMR, and HS image segmentation techniques, respectively. It can be seen from the figure that the segmentation results of the image segmentation technique provided by the embodiment of the present invention are closer to the manually labeled ground truth.
  • the embodiment of the present invention also experimentally compared the provided image segmentation technique with IS, GBVS, SF, GC, CEOS, PCAS, GBMR, HS, and DRFI, the best-performing image segmentation techniques at present, on the public image segmentation datasets ASD, SED1, SED2, ECSSD, and PASCAL-S.
  • Table 1 shows the F-measure score F_β of the image segmentation technique provided by the embodiments of the present invention and of the other image segmentation techniques on the five public datasets, where a higher F_β score indicates a better segmentation effect; F_β is defined as follows:
  • F_β = (1 + β²) · Precision · Recall / (β² · Precision + Recall)
                    ASD     SED1    SED2    ECSSD   PASCAL-S
    IS              0.5943  0.5540  0.5682  0.4731  0.4901
    GBVS            0.6499  0.7125  0.5862  0.5528  0.5929
    SF              0.8879  0.7533  0.7961  0.5448  0.5740
    GC              0.8811  0.8066  0.7728  0.5821  0.6184
    CEOS            0.9020  0.7935  0.6198  0.6465  0.6557
    PCAS            0.8613  0.7586  0.7791  0.5800  0.6332
    GBMR            0.9100  0.9062  0.7974  0.6570  0.7055
    HS              0.9307  0.8744  0.8150  0.6391  0.6819
    DRFI            0.9448  0.9018  0.8725  0.6909  0.7447
    This invention  0.9548  0.9295  0.8903  0.7322  0.7930
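  • For reference, a sketch of the F_β computation used in Table 1; β² = 0.3 is the customary choice in salient-object benchmarks and an assumption here, since the patent does not state its value:

```python
# Sketch of the F-beta score of Table 1; beta2 = 0.3 is a common convention
# in this literature and an assumption here, not stated by the patent.
def f_beta(precision, recall, beta2=0.3):
    return (1 + beta2) * precision * recall / (beta2 * precision + recall)

print(round(f_beta(0.95, 0.90), 4))  # e.g. 0.938 for P = 0.95, R = 0.90
```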
  • the following device embodiments of the present invention are used to perform the methods of Embodiments 1 and 2 of the present invention; for ease of description,
  • only the parts related to the embodiments of the present invention are shown; for specific technical details that are not disclosed, please refer to Embodiment 1 and Embodiment 2 of the present invention.
  • FIG. 11 is a schematic structural diagram of an image segmentation apparatus according to an embodiment of the present invention. As shown in FIG. 11, the apparatus includes a first segmentation unit 111, a cutting unit 112, a classifying unit 113, and a second segmentation unit 114, where:
  • the first dividing unit 111 is configured to divide the image to be divided into a plurality of super pixels according to a preset first dividing rule to obtain a super pixel image.
  • the first segmentation unit 111 may divide the image to be segmented into a plurality of superpixels, where a superpixel may refer to a small region composed of a series of adjacent pixels with similar color, brightness, and texture; these small regions retain effective information for further image segmentation and generally do not destroy physical boundary information in the image.
  • the first preset segmentation rule may be a segmentation rule of a segmentation method based on graph theory, or may be a segmentation rule of a segmentation method based on gradient descent.
  • the cutting unit 112 is configured to cut an image of a preset size on the super pixel image centering on each super pixel divided by the first dividing unit 111 to obtain an image block corresponding to each super pixel.
  • the cutting unit 112 may cut an image of a preset size on the super pixel image centering on a pixel point in the super pixel.
  • the image block corresponding to each super pixel may be one or more image blocks.
  • the classifying unit 113 is configured to process the image blocks corresponding to the respective super pixels obtained by the cutting unit 112 by using the neural network to obtain the segmentation class labels corresponding to the respective super pixels.
  • the above segmentation class label can be understood as a region identifier for image segmentation; that is, superpixels with the same segmentation class label are divided into the same region during image segmentation.
  • the processing of the image block corresponding to each super pixel by using the neural network may be understood as processing the image block of each super pixel by using a neural network model, wherein the neural network model may be pre-acquired, for example: pre-training The neural network model is obtained.
  • the second segmentation unit 114 is configured to segment the superpixel image segmented by the first segmentation unit 111 according to a preset second segmentation rule, to obtain a segmented image including at least two regions, where the second segmentation rule means dividing the superpixels with the same segmentation class label into the same region.
  • the preset second segmentation rule may be used to segment the superpixel image, for example, the superpixel of the same segmentation class is segmented into the same region, so that the segment to be segmented may be The image is divided into multiple regions.
  • the foregoing apparatus can be applied to any smart device having an image processing function, such as a tablet computer, a mobile phone, an e-reader, a remote controller, a PC, a notebook computer, an in-vehicle device, a network television, or a wearable device.
  • in the foregoing apparatus, the image to be segmented is divided into a plurality of superpixels according to a preset first segmentation rule; an image of a preset size is cut on the superpixel image centered on each superpixel, to obtain the image block corresponding to each superpixel; the image blocks corresponding to the respective superpixels are processed using a neural network, to obtain the segmentation class label corresponding to each superpixel; and the superpixel image is segmented according to a preset second segmentation rule, to obtain a segmented image including at least two regions, where the second segmentation rule means dividing superpixels with the same segmentation class label into the same region.
  • FIG. 12 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention. As shown in FIG. 12, the apparatus includes a first segmentation unit 121, an expansion unit 122, a cutting unit 123, a classification unit 124, and a second segmentation unit 125, where:
  • the first dividing unit 121 is configured to divide the image to be divided into a plurality of super pixels according to a preset first dividing rule to obtain a super pixel image.
  • the first segmentation unit 121 may divide the image to be segmented into a plurality of superpixels using a segmentation rule of a graph-theory-based segmentation method, or the first segmentation unit 121 may use a segmentation rule of a gradient-descent-based segmentation method
  • to divide the above image to be segmented into a plurality of superpixels.
  • for example, the first segmentation unit 121 may use the SLIC algorithm of the gradient-descent-based segmentation methods to segment the image into a plurality of superpixels; this algorithm performs superpixel segmentation based on similarity of color and spatial distance, and segmentation with it produces superpixels of uniform size and shape.
  • for example, in the superpixel diagram shown in FIG. 3, the superpixels contain, in order from the upper left corner to the lower right corner, 64, 256, and 1024 pixels each.
  • the expansion unit 122 is configured to expand the superpixel image divided by the first segmentation unit 121, to generate an extended image including the superpixel image.
  • the expansion may use the above superpixel image as a reference position and expand around it, for example by padding a fixed color value around the superpixel image.
  • the expansion unit 122 may also expand in other ways based on the superpixel image.
  • 401 represents a super pixel image divided into super pixels
  • 402 is an expanded extended image.
  • the extended image produced by the expansion unit 122 is N times the size of the superpixel image, for example N = 3, where N times here may mean that both the length and the width are N times those of the superpixel image.
  • the cutting unit 123 is configured to cut an image of a preset size on the extended image expanded by the expansion unit 121 centering on each super pixel divided by the first dividing unit 121 to obtain the respective super pixels. Corresponding image block.
  • the preset size may be set to a multiple of a super pixel image, where the multiple may be not only an integer multiple, but also a fractional multiple, such as 1.1 times, 1.2 times, or 1 time.
  • the image block cut by the cutting unit 123 may be a local image block 403, where a local image block means that the cut image block includes only part of the superpixel image.
  • the image block cut by the cutting unit 123 may also be a global image block 404, which means that the cut image block includes the whole superpixel image.
  • each superpixel may thus have a plurality of image blocks cut for it, for example, a local image block and a global image block.
  • the extended image produced by the expansion unit 122 ensures that an image of the preset size cut on the extended image centered on any superpixel lies entirely within the extended image; for example, if the extended image expanded by the expansion unit 122 is 3 times the superpixel image,
  • the global image block cut centered on any superpixel lies within the extended image, that is, the cut global image block does not exceed the extended image's range.
  • the classifying unit 124 is configured to process, by using the neural network, the image blocks corresponding to the respective super pixels obtained by the cutting unit 123 to obtain the segmentation class labels corresponding to the respective super pixels.
  • the classification unit 124 may identify the segmentation class label of each superpixel among the m segmentation class labels;
  • for example, the classification unit 124 can include:
  • an operation unit 1241, configured to operate on the image blocks of the respective superpixels obtained by the cutting unit 123 using the neural network, to obtain a classification vector of each superpixel;
  • an identifying unit 1242, configured to identify the segmentation class label corresponding to the classification vector of each superpixel obtained by the operation unit 1241 among the m segmentation class labels;
  • a classifying subunit 1243, configured to, for any one of the superpixels, use the segmentation class label corresponding to the classification vector of that superpixel identified by the identifying unit 1242 among the m segmentation class labels as the segmentation class label of that superpixel.
  • identifying the segmentation class label corresponding to a superpixel's classification vector among the m segmentation class labels may be implemented by passing the classification vector through a fully-connected layer of the deep neural network.
  • the identifying unit 1242 may include:
  • the calculating unit 12421 is configured to, for any one of the superpixels, calculate by the following
  • formula the connection values of the classification vector of that superpixel obtained by the operation unit 1241 with the m segmentation class labels:
  • y_j = Σ_{i=1}^{n} α_{i,j} · x_i
  • where y_j is the connection value of the superpixel's classification vector with the j-th segmentation class label (j = 1, …, m), x_i denotes the i-th dimension of the classification vector of the target superpixel, n is the dimensionality of the classification vector of the target superpixel (an integer greater than 1), and α_{i,j} is a preset parameter used for identifying the segmentation class labels;
  • the selecting unit 12422 is configured to select the largest connection value from the m connection values of that superpixel obtained by the calculating unit 12421, and to use the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to
  • the classification vector of that superpixel among the m segmentation class labels.
  • the above parameter ⁇ i,j can be learned through a large number of training samples.
  • in this way, the segmentation class label of each superpixel can be obtained.
  • the cutting unit 123 may be configured to cut an image of a first preset size on the superpixel image centered on each superpixel divided by the first segmentation unit 121, to obtain a first image block corresponding to each superpixel, and to cut an image of a second preset size on the superpixel image centered on each superpixel divided by the first segmentation unit 121, to obtain a second image block corresponding to each superpixel;
  • the operation unit 1241 may be configured to, for any one of the superpixels, separately operate on the corresponding first image block and second image block of that superpixel cut by the cutting unit 123 using the neural network, to obtain a first classification vector and a second classification vector of that superpixel, and to synthesize the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
  • the corresponding first image block and second image block of a superpixel may be the local image block and the global image block introduced above;
  • processing both the local image block and the global image block in the neural network better reflects the local and global features of the superpixels in the superpixel image, thereby improving image segmentation.
  • both the local image block and the global image block may be processed using the same deep neural network, and the resulting classification vectors are then synthesized; the superpixel classification vector obtained in this way is richer, which can improve the image segmentation effect.
  • the foregoing neural network may be a non-deep neural network, where a non-deep neural network may be understood as a single-layer neural network, such as a BP neural network, a Hebb neural network, or a DL neural network;
  • the above neural network may also be a deep neural network, where a deep neural network may be understood as a multilayer neural network, for example,
  • a Clarifai deep neural network, an AlexNet deep neural network, an NIN deep neural network, an OverFeat deep neural network, or a GoogLeNet deep neural network, etc., without limitation.
  • the second segmentation unit 125 is configured to, according to the second segmentation rule, divide superpixels in the superpixel image divided by the first segmentation unit 121 whose segmentation class label belongs to the segmentation class labels that the user needs to focus on into a foreground region,
  • and to divide superpixels in the superpixel image divided by the first segmentation unit 121 whose segmentation class label does not belong to those segmentation class labels into a background region.
  • the attention segmentation class label may be one or more.
  • if there is one attention segmentation class label, the second segmentation unit 125 may divide the superpixels whose segmentation class label belongs to the attention segmentation class label into
  • the foreground region and all remaining superpixels of the superpixel image into the background region; if there are multiple attention segmentation class labels, the superpixels belonging to the multiple segmentation class labels are divided into foreground regions, and all remaining superpixels of the image to be segmented are divided into the background region. It should be noted here that since there are multiple attention segmentation class labels, the foreground here includes multiple regions, each composed of superpixels with the same segmentation class label.
  • the foregoing apparatus may further include:
  • the setting unit 126 is configured to set the color of the superpixels in the superpixel image divided into the foreground region by the second segmentation unit 125 to a preset foreground color corresponding to the attention segmentation class label, and to set
  • the color of the superpixels in the superpixel image divided into the background region by the second segmentation unit 125 to a background color corresponding to the attention segmentation class label.
  • the foreground colors of different attention segmentation class labels may be different, while the background colors corresponding to all attention segmentation class labels are the same.
  • FIG. 15 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention.
  • the apparatus includes a processor 151, a network interface 152, a memory 153, and a communication bus 154.
  • the communication bus 154 is used to implement connection communication between the processor 151, the network interface 152 and the memory 153, and the processor 151 executes a program stored in the memory 153 for implementing the following method:
  • the processor 151 performs: dividing the image to be segmented into a plurality of superpixels according to a preset first segmentation rule, to obtain a superpixel image; cutting an image of a preset size on the superpixel image centered on each superpixel, to obtain the image block corresponding to each superpixel; processing the image blocks corresponding to the respective superpixels using a neural network, to obtain the segmentation class label of each superpixel; and segmenting the superpixel image according to a preset second segmentation rule, to obtain a segmented image including at least two regions.
  • the program executed by the processor may further include: expanding the superpixel image to generate an extended image including the superpixel image;
  • the program executed by the processor 151 for cutting an image of a preset size on the superpixel image to obtain the image block corresponding to each superpixel may include:
  • An image of a preset size is cut on the extended image centering on the respective super pixels to obtain image blocks corresponding to the respective super pixels.
  • the program executed by the processor 151 for processing the image blocks corresponding to the respective superpixels using the neural network to obtain the segmentation class labels of the superpixels may include: operating on the image blocks using the neural network to obtain a classification vector of each superpixel; identifying the segmentation class label corresponding to each classification vector among the m segmentation class labels; and,
  • for any one of the superpixels, using the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
  • the program executed by the processor 151 for identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels may include:
  • for any one of the superpixels, the connection values of the classification vector of that superpixel with the m segmentation class labels are calculated by the following formula:
  • y_j = Σ_{i=1}^{n} α_{i,j} · x_i
  • where y_j is the connection value of the superpixel's classification vector with the j-th segmentation class label (j = 1, …, m), x_i denotes the i-th dimension of the classification vector of the target superpixel, n is the dimensionality of the classification vector of the target superpixel (an integer greater than 1), and α_{i,j} is a preset parameter used for identifying the segmentation class labels;
  • the largest connection value is then selected from the m connection values of that superpixel, and the segmentation class label corresponding to it is taken as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
  • the program executed by the processor 151 for cutting an image of a preset size on the superpixel image to obtain the image block corresponding to each superpixel may include: cutting images of a first preset size and a second preset size on the superpixel image centered on each superpixel, to obtain a first image block and a second image block corresponding to each superpixel;
  • the program executed by the processor 151 for operating on the image blocks of the respective superpixels using the neural network to obtain the classification vectors of the respective superpixels may include:
  • for any one of the superpixels, separately operating on the corresponding first image block and second image block of that superpixel using a neural network, to obtain a first classification vector and a second classification vector of that superpixel, and synthesizing the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
  • the process performed by the processor 151 of segmenting the superpixel image according to the preset second segmentation rule to obtain a segmented image including at least two regions may include:
  • dividing superpixels in the superpixel image whose segmentation class label belongs to the preset segmentation class labels that the user needs to focus on into a foreground region, and dividing superpixels in the superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.
  • the program executed by the processor 151 may further include: setting the color of the superpixels divided into the foreground region to a preset foreground color corresponding to the attention segmentation class label, and setting the color of the superpixels divided into the background region to a background color corresponding to the attention segmentation class label;
  • the neural network may include:
  • a deep neural network or a non-deep neural network.
  • in the foregoing apparatus, the image to be segmented is divided into a plurality of superpixels according to a preset first segmentation rule; an image of a preset size is cut on the superpixel image centered on each superpixel, to obtain the image block corresponding to each superpixel; the image blocks corresponding to the respective superpixels are processed using a neural network, to obtain the segmentation class label corresponding to each superpixel; and the superpixel image is segmented according to a preset second segmentation rule, to obtain a segmented image including at least two regions, where the second segmentation rule means dividing superpixels with the same segmentation class label into the same region.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

An image segmentation method and apparatus. The method may include: segmenting an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image; cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel; processing the image blocks corresponding to the respective superpixels by using a neural network to obtain segmentation class labels corresponding to the respective superpixels; and segmenting the superpixel image according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region. The method and apparatus can improve the segmentation effect of image segmentation.

Description

Image segmentation method and apparatus

Technical field

The present invention relates to the field of image processing, and in particular, to an image segmentation method and apparatus.
Background art

Image segmentation is one of the key technologies in the field of image processing; it is a crucial preprocessing step in image recognition and computer vision, and the basis of image recognition, image analysis and image understanding. Image segmentation is the process of dividing an image into a number of specific regions with distinctive properties and extracting objects of interest. Current image segmentation techniques mainly design a feature manually as the basis for segmentation and then segment the image on that basis, for example threshold-based, edge-based or region-based segmentation methods. Since all of these techniques require a manually designed feature and segment on the basis of that feature, and manually designed features often have certain limitations, the segmentation effect may be poor: segmentation may work well for one class of images but poorly for a very different class. In addition, manually designed features are error-prone, which also degrades the segmentation effect. It can be seen that the segmentation effect of current image segmentation is poor.
Summary of the invention

Embodiments of the present invention provide an image segmentation method and apparatus, which can improve the segmentation effect of image segmentation.
In a first aspect, an embodiment of the present invention provides an image segmentation method, including:

segmenting an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image;

cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel;

processing the image blocks corresponding to the respective superpixels by using a neural network to obtain segmentation class labels corresponding to the respective superpixels;

segmenting the superpixel image according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region.
In a first possible implementation of the first aspect, after the image to be segmented is segmented into a number of superpixels according to the preset first segmentation rule and before an image of a specific scale is cut on the superpixel image centered on each superpixel, the method further includes:

expanding the superpixel image to generate an expanded image that includes the superpixel image;

and the cutting, centered on each superpixel, of an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel includes:

cutting, centered on each superpixel, an image of the preset scale on the expanded image to obtain the image block corresponding to each superpixel.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, m segmentation class labels are preset to be obtained, where m is a natural number greater than or equal to 2;

and the processing of the image blocks corresponding to the respective superpixels by using the neural network to obtain the segmentation class labels of the respective superpixels includes:

computing the image blocks corresponding to the respective superpixels by using the neural network to obtain classification vectors of the respective superpixels;

identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels;

for any one of the superpixels, taking the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the identifying of the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels includes:

for any one of the superpixels, calculating the connection values of the classification vector of that superpixel with the m segmentation class labels by the following formula:

$y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$

where $y_j$ is the connection value of the superpixel's classification vector with the j-th segmentation class label, $x_i$ denotes the i-th dimension of the classification vector of the target superpixel, $n$ is the number of dimensions of the classification vector of the target superpixel, $n$ is an integer greater than 1, and $\alpha_{i,j}$ is a preset parameter for identifying segmentation class labels;

selecting the largest connection value among the m connection values of that superpixel, and taking the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
With reference to the second or third possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the cutting, centered on each superpixel, of an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel includes:

cutting, centered on each superpixel, an image of a first preset scale on the superpixel image to obtain a first image block corresponding to each superpixel;

cutting, centered on each superpixel, an image of a second preset scale on the superpixel image to obtain a second image block corresponding to each superpixel;

and the computing of the image blocks of the respective superpixels by using the neural network to obtain the classification vectors of the respective superpixels includes:

for any one of the superpixels, computing the first image block and the second image block corresponding to that superpixel separately by using the neural network to obtain a first classification vector and a second classification vector of that superpixel, and combining the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
With reference to the first aspect or any one of the first to fourth possible implementations of the first aspect, in a fifth possible implementation of the first aspect, the segmenting of the superpixel image according to the preset second segmentation rule to obtain a segmented image including at least two regions includes:

segmenting, according to the second segmentation rule, superpixels in the superpixel image whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and segmenting superpixels in the superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.
With reference to the fifth or third possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the method further includes:

setting the color of the superpixels segmented into the foreground region in the superpixel image to a preset foreground color corresponding to the segmentation class label of interest, and setting the color of the superpixels segmented into the background region in the superpixel image to a background color corresponding to the segmentation class label of interest.
With reference to the first aspect or any one of the first to sixth possible implementations of the first aspect, in a seventh possible implementation of the first aspect, the neural network includes:

a deep neural network or a non-deep neural network.
In a second aspect, an embodiment of the present invention provides an image segmentation apparatus, including a first segmentation unit, a cutting unit, a classification unit and a second segmentation unit, where:

the first segmentation unit is configured to segment an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image;

the cutting unit is configured to cut, centered on each superpixel segmented by the first segmentation unit, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel;

the classification unit is configured to process, by using a neural network, the image blocks corresponding to the respective superpixels obtained by the cutting unit to obtain segmentation class labels corresponding to the respective superpixels;

the second segmentation unit is configured to segment the superpixel image obtained by the first segmentation unit according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region.
In a first possible implementation of the second aspect, the apparatus further includes:

an expansion unit, configured to expand the superpixel image obtained by the first segmentation unit to generate an expanded image that includes the superpixel image;

and the cutting unit is configured to cut, centered on each superpixel segmented by the first segmentation unit, an image of the preset scale on the expanded image generated by the expansion unit to obtain the image block corresponding to each superpixel.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, m segmentation class labels are preset to be obtained, where m is a natural number greater than or equal to 2;

and the classification unit includes:

a computing unit, configured to compute, by using the neural network, the image blocks corresponding to the respective superpixels obtained by the cutting unit to obtain classification vectors of the respective superpixels;

an identification unit, configured to identify the segmentation class label corresponding to the classification vector of each superpixel obtained by the computing unit among the m segmentation class labels;

a classification subunit, configured to take, for any one of the superpixels, the segmentation class label identified by the identification unit as corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the identification unit includes:

a calculation unit, configured to calculate, for any one of the superpixels, the connection values of the classification vector of that superpixel obtained by the computing unit with the m segmentation class labels by the following formula:

$y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$

where $y_j$ is the connection value of the superpixel's classification vector with the j-th segmentation class label, $x_i$ denotes the i-th dimension of the classification vector of the target superpixel, $n$ is the number of dimensions of the classification vector of the target superpixel, $n$ is an integer greater than 1, and $\alpha_{i,j}$ is a preset parameter for identifying segmentation class labels;

a selection unit, configured to select the largest connection value among the m connection values of that superpixel obtained by the calculation unit, and take the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
With reference to the second or third possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the cutting unit is configured to cut, centered on each superpixel segmented by the first segmentation unit, an image of a first preset scale on the superpixel image to obtain a first image block corresponding to each superpixel, and to cut, centered on each superpixel, an image of a second preset scale on the superpixel image to obtain a second image block corresponding to each superpixel;

and the computing unit is configured to compute, for any one of the superpixels, the first image block and the second image block of that superpixel cut by the cutting unit separately by using the neural network to obtain a first classification vector and a second classification vector of that superpixel, and to combine the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
With reference to any one of the first to fourth possible implementations of the second aspect, in a fifth possible implementation of the second aspect, the second segmentation unit is configured to segment, according to the second segmentation rule, superpixels in the superpixel image obtained by the first segmentation unit whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and to segment superpixels in that superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.
With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the apparatus further includes:

a setting unit, configured to set the color of the superpixels segmented into the foreground region by the second segmentation unit in the superpixel image to a preset foreground color corresponding to the segmentation class label of interest, and to set the color of the superpixels segmented into the background region by the second segmentation unit to a background color corresponding to the segmentation class label of interest.
With reference to the second aspect or any one of the first to sixth possible implementations of the second aspect, in a seventh possible implementation of the second aspect, the neural network includes:

a deep neural network or a non-deep neural network.
In a third aspect, an embodiment of the present invention provides an image segmentation apparatus, including a processor, a network interface, a memory and a communication bus, where the communication bus is configured to implement connection and communication among the processor, the network interface and the memory, and the processor executes a program stored in the memory to implement the following method:

segmenting an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image;

cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel;

processing the image blocks corresponding to the respective superpixels by using a neural network to obtain segmentation class labels corresponding to the respective superpixels;

segmenting the superpixel image according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region.
In a first possible implementation of the third aspect, after the processor segments the image to be segmented into a number of superpixels according to the preset first segmentation rule and before it cuts an image of a specific scale on the superpixel image centered on each superpixel to obtain the image blocks of the respective superpixels, the program executed by the processor further includes:

expanding the superpixel image to generate an expanded image that includes the superpixel image;

and the program executed by the processor for cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel includes:

cutting, centered on each superpixel, an image of the preset scale on the expanded image to obtain the image block corresponding to each superpixel.
With reference to the third aspect or the first possible implementation of the third aspect, in a second possible implementation of the third aspect, m segmentation class labels are preset to be obtained, where m is a natural number greater than or equal to 2;

and the program executed by the processor for processing the image blocks corresponding to the respective superpixels by using the neural network to obtain the segmentation class labels of the respective superpixels includes:

computing the image blocks corresponding to the respective superpixels by using the neural network to obtain classification vectors of the respective superpixels;

identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels;

for any one of the superpixels, taking the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
With reference to the second possible implementation of the third aspect, in a third possible implementation of the third aspect, the program executed by the processor for identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels includes:

for any one of the superpixels, calculating the connection values of the classification vector of that superpixel with the m segmentation class labels by the following formula:

$y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$

where $y_j$ is the connection value of the superpixel's classification vector with the j-th segmentation class label, $x_i$ denotes the i-th dimension of the classification vector of the target superpixel, $n$ is the number of dimensions of the classification vector of the target superpixel, $n$ is an integer greater than 1, and $\alpha_{i,j}$ is a preset parameter for identifying segmentation class labels;

selecting the largest connection value among the m connection values of that superpixel, and taking the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
With reference to the second or third possible implementation of the third aspect, in a fourth possible implementation of the third aspect, the program executed by the processor for cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel includes:

cutting, centered on each superpixel, an image of a first preset scale on the superpixel image to obtain a first image block corresponding to each superpixel;

cutting, centered on each superpixel, an image of a second preset scale on the superpixel image to obtain a second image block corresponding to each superpixel;

and the program executed by the processor for computing the image blocks of the respective superpixels by using the neural network to obtain the classification vectors of the respective superpixels includes:

for any one of the superpixels, computing the first image block and the second image block corresponding to that superpixel separately by using the neural network to obtain a first classification vector and a second classification vector of that superpixel, and combining the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
With reference to any one of the first to fourth possible implementations of the third aspect, in a fifth possible implementation of the third aspect, the program executed by the processor for segmenting the superpixel image according to the preset second segmentation rule to obtain a segmented image including at least two regions includes:

segmenting, according to the second segmentation rule, superpixels in the superpixel image whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and segmenting superpixels in the superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.
With reference to the fifth possible implementation of the third aspect, in a sixth possible implementation of the third aspect, the program executed by the processor further includes:

setting the color of the superpixels segmented into the foreground region in the superpixel image to a preset foreground color corresponding to the segmentation class label of interest, and setting the color of the superpixels segmented into the background region in the superpixel image to a background color corresponding to the segmentation class label of interest.
With reference to the third aspect or any one of the first to sixth possible implementations of the third aspect, in a seventh possible implementation of the third aspect, the neural network includes:

a deep neural network or a non-deep neural network.
In the above technical solutions, the image to be segmented is segmented into a number of superpixels according to a preset first segmentation rule; an image of a preset scale is cut on the superpixel image centered on each superpixel to obtain an image block corresponding to each superpixel; the image blocks corresponding to the respective superpixels are processed by using a neural network to obtain segmentation class labels corresponding to the respective superpixels; and the superpixel image is segmented according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region. Compared with the manually designed features used in the prior art, the above technical solutions can avoid the limitations of manually designed features and their proneness to error, and can therefore improve the segmentation effect of image segmentation.
Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art may further obtain other drawings from these drawings without creative effort.
Fig. 1 is a schematic flowchart of an image segmentation method according to an embodiment of the present invention;

Fig. 2 is a schematic flowchart of another image segmentation method according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of superpixel segmentation according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of image block cutting according to an embodiment of the present invention;

Fig. 5 is a schematic diagram of classification using a deep neural network according to an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of a deep neural network according to an embodiment of the present invention;

Fig. 7 is a schematic diagram of pooling in a deep neural network according to an embodiment of the present invention;

Fig. 8 is a schematic diagram of colors of segmentation class labels according to an embodiment of the present invention;

Fig. 9 is a schematic diagram of image segmentation according to an embodiment of the present invention;

Fig. 10 is a schematic diagram of experimental data according to an embodiment of the present invention;

Fig. 11 is a schematic structural diagram of an image segmentation apparatus according to an embodiment of the present invention;

Fig. 12 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention;

Fig. 13 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention;

Fig. 14 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention;

Fig. 15 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention.
Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
Referring to Fig. 1, Fig. 1 is a schematic flowchart of an image segmentation method according to an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps:

101. Segment an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image.

In this embodiment, this step may divide the image to be segmented into a number of superpixels, where a superpixel may refer to a small region composed of a series of adjacent pixels with similar color, brightness, texture and other features. These small regions retain information that is effective for further image segmentation and generally do not destroy the physical boundary information in the image. In addition, the first preset segmentation rule may be the segmentation rule of a graph-theory-based segmentation method, or the segmentation rule of a gradient-descent-based segmentation method.
102. Cut, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel.

This step may cut an image of a preset scale on the superpixel image centered on a certain pixel of the superpixel. In addition, each superpixel may correspond to one or more image blocks.
103. Process the image blocks corresponding to the respective superpixels by using a neural network to obtain segmentation class labels corresponding to the respective superpixels.

The segmentation class label can be understood as a region identifier for image segmentation: during segmentation, superpixels with the same segmentation class label are segmented into the same region. Processing the image blocks of the respective superpixels with a neural network can be understood as processing them with a neural network model, which may be obtained in advance, for example by training.
104. Segment the superpixel image according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region.

Once the segmentation class label of each superpixel is determined, the superpixel image can be segmented according to the preset second segmentation rule, for example superpixels with the same segmentation class label are segmented into the same region, so that the image to be segmented is segmented into multiple regions.
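For illustration only (this sketch is not part of the original disclosure), the second segmentation rule reduces to grouping pixels by the class label of their superpixel; the array and dictionary names below are assumptions for the example:

```python
import numpy as np

def merge_by_class_label(sp_map: np.ndarray, sp_labels: dict) -> np.ndarray:
    """sp_map: H x W array of superpixel ids; sp_labels: id -> class label.

    Every pixel inherits the segmentation class label of its superpixel, so
    superpixels with the same label end up in the same region.
    """
    region_map = np.zeros_like(sp_map)
    for sp_id, class_label in sp_labels.items():
        region_map[sp_map == sp_id] = class_label
    return region_map

# Example: superpixels 0 and 2 share class label 1, so they form one region.
sp_map = np.array([[0, 0, 1],
                   [2, 2, 1]])
regions = merge_by_class_label(sp_map, {0: 1, 1: 2, 2: 1})
# regions -> [[1, 1, 2],
#             [1, 1, 2]]
```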
In this embodiment, the method can be applied to any intelligent device with an image processing function, for example a tablet computer, a mobile phone, an e-reader, a remote control, a personal computer (PC), a notebook computer, an in-vehicle device, a network television or a wearable device.

In this embodiment, the image to be segmented is segmented into a number of superpixels according to a preset first segmentation rule; an image of a preset scale is cut on the superpixel image centered on each superpixel to obtain an image block corresponding to each superpixel; the image blocks corresponding to the respective superpixels are processed by using a neural network to obtain segmentation class labels corresponding to the respective superpixels; and the superpixel image is segmented according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region. Compared with the manually designed features used in the prior art, this solution can avoid the limitations of manually designed features and their proneness to error, and can therefore improve the segmentation effect of image segmentation.
Referring to Fig. 2, Fig. 2 is a schematic flowchart of another image segmentation method according to an embodiment of the present invention. As shown in Fig. 2, the method includes the following steps:

201. Segment an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image.

In this embodiment, step 201 may segment the image to be segmented into a number of superpixels using the segmentation rule of a graph-theory-based segmentation method, or using the segmentation rule of a gradient-descent-based segmentation method. For example, step 201 may use the SLIC algorithm, a gradient-descent-based segmentation method, to segment the image into a number of superpixels; this algorithm performs superpixel segmentation based on similarity of color and distance, and segmentation with it produces superpixels of uniform size and regular shape. An example is the superpixel diagram of Fig. 3, in which, from the top-left to the bottom-right panel, the superpixels contain 64, 256 and 1024 pixels respectively.
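For reference only, the SLIC-based variant of step 201 could be reproduced with scikit-image roughly as follows; the file name and parameter values are illustrative assumptions, not values from the disclosure:

```python
from skimage import io
from skimage.segmentation import slic

image = io.imread("input.jpg")      # H x W x 3 RGB image to be segmented
sp_map = slic(image,
              n_segments=256,       # target number of superpixels
              compactness=10.0)     # balances color similarity vs. distance
# sp_map is an H x W integer array; pixels sharing an id form one superpixel
# that is spatially adjacent and similar in color, as described above.
```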
202. Expand the superpixel image to generate an expanded image that includes the superpixel image.

In this embodiment, the expansion may take the superpixel image as a reference position and expand it, for example by padding the four sides of the superpixel image with an image of a fixed color value, such as a fixed mean value or a fixed gray value.

In this embodiment, step 202 may also expand with the superpixel image at the center. For example, as shown in Fig. 4, 401 denotes the superpixel image after segmentation into superpixels and 402 the expanded image; the expanded image of step 202 is N times the size of the superpixel image, for example N = 3, where N times may mean that both the length and the width are N times those of the superpixel image.
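A minimal sketch of the expansion of step 202 (assuming the "fixed mean value" variant and the centered N = 3 layout from the example above; not part of the original text):

```python
import numpy as np

def expand_image(img: np.ndarray, n: int = 3) -> np.ndarray:
    """Pad a 3-channel image on all four sides with its mean color so that
    the result is n times as large in each dimension, with the superpixel
    image at the center."""
    h, w, c = img.shape
    pad_h, pad_w = (n - 1) * h // 2, (n - 1) * w // 2
    out = np.empty((h + 2 * pad_h, w + 2 * pad_w, c), dtype=img.dtype)
    out[:] = img.mean(axis=(0, 1)).astype(img.dtype)    # fixed fill value
    out[pad_h:pad_h + h, pad_w:pad_w + w] = img         # original at center
    return out
```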
203. Cut, centered on each superpixel, an image of a preset scale on the expanded image to obtain an image block corresponding to each superpixel.

In this embodiment, the preset scale may be set as a multiple of the superpixel image, where the multiple need not be an integer but may also be fractional, for example 1.1, 1.2 or 1. As shown in Fig. 4, the image block cut in step 203 may be a local image block 403, meaning a cut block that includes only a local part of the superpixel image, or a global image block 404, meaning a cut block that includes the entire superpixel image. Of course, in this embodiment, multiple corresponding image blocks may be cut for each superpixel, for example a local image block and a global image block.

It should also be noted that the expanded image of step 202 may be such that an image of the preset scale cut on the expanded image centered on any superpixel lies entirely within the expanded image. For example, when the expanded image of step 202 is 3 times the superpixel image, a global image block cut centered on any superpixel of the superpixel image lies within the expanded image, i.e. the cut global image block never exceeds the range of the expanded image.
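The cutting of step 203 might look as follows; the centroid computation and the local/global patch sizes are assumptions consistent with the 3x expansion above (for a roughly square image, a global patch of twice the image height always contains the whole superpixel image and always stays inside the expanded image):

```python
import numpy as np

def superpixel_center(sp_map: np.ndarray, sp_id: int):
    """Centroid (row, col) of one superpixel in the original image."""
    ys, xs = np.nonzero(sp_map == sp_id)
    return int(ys.mean()), int(xs.mean())

def cut_patch(expanded: np.ndarray, cy: int, cx: int, size: int):
    """Cut a size x size patch from the expanded image centered on (cy, cx)."""
    half = size // 2
    return expanded[cy - half:cy - half + size, cx - half:cx - half + size]

# With the centered 3x expansion, original pixel (r, c) sits at (r + h, c + w):
# h, w = image.shape[:2]
# cy, cx = superpixel_center(sp_map, sp_id)
# local_patch  = cut_patch(expanded, cy + h, cx + w, size=h // 2)   # local
# global_patch = cut_patch(expanded, cy + h, cx + w, size=2 * h)    # global
```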
204. Process the image blocks corresponding to the respective superpixels by using a neural network to obtain segmentation class labels corresponding to the respective superpixels.

In this embodiment, m segmentation class labels are preset to be obtained, where m is a natural number greater than or equal to 2, so that step 204 identifies, for each superpixel, the segmentation class label corresponding to it among the m segmentation class labels. For example, step 204 may include:

computing the image blocks corresponding to the respective superpixels by using the neural network to obtain classification vectors of the respective superpixels;

identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels;

for any one of the superpixels, taking the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.

In this implementation, the classification vector may specifically be passed through a fully connected layer of a deep neural network to identify the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels.
In this embodiment, the step of identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels may include:

for any one of the superpixels, calculating the connection values of the classification vector of that superpixel with the m segmentation class labels by the following formula:

$y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$

where $y_j$ is the connection value of the superpixel's classification vector with the j-th segmentation class label, $x_i$ denotes the i-th dimension of the classification vector of the target superpixel, $n$ is the number of dimensions of the classification vector of the target superpixel, $n$ is an integer greater than 1, and $\alpha_{i,j}$ is a preset parameter for identifying segmentation class labels;

selecting the largest connection value among the m connection values of that superpixel, and taking the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.

The parameters $\alpha_{i,j}$ may be learned from a large number of training samples.

In this way, the segmentation class label of each superpixel can be obtained.
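In Python (names and dimensions are illustrative; 4096 matches the 7th-layer output described later), the connection values and the selection of the largest one reduce to a matrix product and an argmax:

```python
import numpy as np

def classify(x: np.ndarray, alpha: np.ndarray) -> int:
    """x: (n,) classification vector; alpha: (n, m) trained parameters.

    y[j] = sum_i alpha[i, j] * x[i] is the connection value with the j-th
    segmentation class label; the label with the largest value is returned.
    """
    y = alpha.T @ x
    return int(np.argmax(y))

rng = np.random.default_rng(0)
x = rng.random(4096)             # classification vector of one superpixel
alpha = rng.random((4096, 2))    # m = 2 segmentation class labels
label = classify(x, alpha)
```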
In this embodiment, the step of cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel may include:

cutting, centered on each superpixel, an image of a first preset scale on the superpixel image to obtain a first image block corresponding to each superpixel;

cutting, centered on each superpixel, an image of a second preset scale on the superpixel image to obtain a second image block corresponding to each superpixel;

and the step of computing the image blocks of the respective superpixels by using the neural network to obtain the classification vectors of the respective superpixels may include:

for any one of the superpixels, computing the first image block and the second image block corresponding to that superpixel separately by using the neural network to obtain a first classification vector and a second classification vector of that superpixel, and combining the first classification vector and the second classification vector into the classification vector of that superpixel.

The first and second image blocks of the superpixel may be the local and global image blocks introduced above; using local and global image blocks allows the neural network to better capture the local and global features of the superpixel in the superpixel image during processing, which improves the segmentation effect.
In addition, image blocks of different scales may be processed with the same or different neural networks in this embodiment. For example, the local and global image blocks may both be processed with the same deep neural network, and the resulting classification vectors combined afterwards. The resulting superpixel classification vector is richer, which can improve the segmentation effect.

In this embodiment, the neural network may be a non-deep neural network, which can be understood as a single-layer neural network, for example a BP, Hebb or DL neural network; or it may be a deep neural network, which can be understood as a multi-layer neural network, for example a Clarifai, AlexNet, NIN, OverFest or GoogLeNet deep neural network. No limitation is imposed here.
The Clarifai deep neural network is described in detail below:

The Clarifai deep neural network includes 5 convolutional layers and 2 fully connected layers. When the Clarifai deep neural network is used to perform deep learning on the image blocks of the respective superpixels, step 204 is as shown in Fig. 5, where only the convolutional and fully connected layers with parameters are drawn, and all parameters in Fig. 5 can be learned from a large number of training samples. When the image blocks of a superpixel include a local image block and a global image block, the local and global image blocks may be learned separately and the resulting classification vectors combined to produce the output, where the output may be the segmentation class label of each superpixel.

In addition, step 204 is described in detail with the parameters of the example shown in Fig. 5; refer to Fig. 6. In Fig. 6, the first convolutional layer convolves the image block with multiple parameter templates. Assuming the sizes of the global and local image blocks are both linearly rescaled to 227 × 227 and that both are 3-channel color images, the input of the first convolutional layer is a 227 × 227 × 3 matrix. The first convolutional layer convolves this input with 96 parameter templates of size 7 × 7 × 3; these 96 × 7 × 7 × 3 parameters are all unknown and can be learned from a large number of training samples. In addition, to speed up the computation, the convolution uses a stride of 2 pixels in the x and y directions, so each convolution yields a 111 × 111 matrix, and concatenating the results of the 96 convolutions yields a 111 × 111 × 96 matrix.
The first-layer rectified linear unit function replaces the values smaller than 0 in the above 111 × 111 × 96 matrix with 0. In this embodiment, the rectified linear unit function may be expressed as relu(x) = max(x, 0).

The first pooling layer maps the values in a certain region of a matrix to a single value according to a certain rule (for example taking the maximum). As shown in Fig. 7, every 2 × 2 block of a 4 × 4 matrix is mapped to one value by the maximum rule, finally yielding a 2 × 2 matrix; the height of each vertical bar represents the magnitude of the element at that position, and positions without a bar have value 0. Following the rule of Fig. 7, the first pooling layer in Fig. 6 pools the previously obtained 111 × 111 × 96 matrix into a 55 × 55 × 96 matrix (ignoring the edge data of the matrix). The 55 × 55 × 96 matrix is then the input of the second convolutional layer, which convolves it with 256 parameter templates of size 3 × 3 × 96; these 256 × 3 × 3 × 96 parameters are all unknown and can be learned from a large number of training samples. To speed up the computation, the second convolutional layer also uses a stride of 2 pixels in the x and y directions, so each convolution yields a 27 × 27 matrix, and concatenating the results of the 256 convolutions yields a 27 × 27 × 256 matrix.

It should be noted that the subsequent third, fourth and fifth layers are similar to the above process, except that the third and fourth layers do not pool, and the fifth layer yields a 6 × 6 × 256 matrix after pooling. The details are not repeated here.
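For the rectified linear unit and the pooling rule of Fig. 7, a direct numpy transcription (illustrative only) is:

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)     # relu(x) = max(x, 0), as defined above

def max_pool_2x2(m: np.ndarray) -> np.ndarray:
    """Map every 2x2 block to its maximum, e.g. 4x4 -> 2x2 as in Fig. 7."""
    h, w = m.shape
    return m.reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

m = np.array([[1, 3, 2, 0],
              [4, 2, 0, 1],
              [0, 1, 5, 2],
              [2, 0, 3, 6]])
max_pool_2x2(m)   # -> [[4, 2],
                  #     [2, 6]]
```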
A fully connected layer in Fig. 6 connects every node of the previous layer with every node of the current layer, each connection corresponding to one unknown parameter that can be learned from a large number of training samples. For example, the full connection of the sixth layer connects the 6 × 6 × 256 nodes of the fifth layer pairwise with the 4096 nodes of the sixth layer, expressed by the formula:

$y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$

where $x_i$ denotes a node of the previous layer, $y_j$ a node of the current layer, $\alpha_{i,j}$ an unknown parameter, $n$ the number of nodes of the fifth layer and $m$ the number of nodes of the sixth layer.

In this way, the global image block and the local image block in Fig. 5 each pass through the neural network up to the seventh layer, yielding one 4096-dimensional vector each, and these two 4096-dimensional vectors are fully connected to the final output according to the above formula. For a 2-class classification problem, i.e. when m is 2, the final output layer contains 2 nodes; this final output layer can be understood in terms of the segmentation class labels introduced above, i.e. the final output contains 2 segmentation class labels. The input superpixel is finally classified to the node whose value is the largest.
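The final full connection of the two streams can be sketched as follows (the weight shapes are assumptions matching the 4096-dimensional seventh-layer outputs and the 2-class example above):

```python
import numpy as np

rng = np.random.default_rng(0)
v_global = rng.random(4096)          # 7th-layer output for the global block
v_local = rng.random(4096)           # 7th-layer output for the local block
w_out = rng.random((2 * 4096, 2))    # trained weights to m = 2 output nodes

x = np.concatenate([v_global, v_local])  # combined classification vector
y = w_out.T @ x                          # value of each output node
label = int(np.argmax(y))                # superpixel classified to that node
```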
It should be noted that all of the above unknown parameters can be learned from a large number of training samples.

In this embodiment, the method may further include the following step:

obtaining the above deep neural network model by deep learning.

For example, a model including a large number of unknown parameters is preset, and each unknown parameter is assigned an initial value generated randomly by a computer. The model is then trained with a large number of training samples that have been segmented manually, that is, the segmentation class label corresponding to each image block is known. Training consists of continually adjusting the values of the unknown parameters so that as many image blocks as possible are classified correctly after passing through the deep neural network; finally, when the classification error is minimal, the values of the unknown parameters are fixed, training is complete, and the above deep neural network is generated.

Of course, in this embodiment, the deep neural network may also be an already trained deep neural network, for example a deep neural network received from another device.
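As a highly simplified illustration of the training idea (only the output layer is updated here, whereas the actual training adjusts all unknown parameters; the softmax/cross-entropy choice is an assumption, not stated in the disclosure):

```python
import numpy as np

def softmax(y):
    e = np.exp(y - y.max())
    return e / e.sum()

def sgd_step(w, x, true_label, lr=0.01):
    """One gradient-descent update that nudges w (shape (n, m)) so that the
    labelled image block x (shape (n,)) is more likely classified correctly."""
    p = softmax(w.T @ x)            # predicted probability per class label
    p[true_label] -= 1.0            # gradient of cross-entropy w.r.t. scores
    return w - lr * np.outer(x, p)  # updated parameters
```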
205. Segment, according to the second segmentation rule, superpixels in the superpixel image whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and segment superpixels in the superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.

In this embodiment, there may be one or more segmentation class labels of interest. For example, when there is one, step 205 segments the superpixels whose segmentation class label is that label of interest into the foreground region and all remaining superpixels of the superpixel image into the background region. When there are several, the superpixels belonging to these labels are segmented into the foreground region and all remaining superpixels of the superpixel image into the background region. It should be noted that with several labels of interest the foreground region contains multiple regions, each composed of superpixels with the same segmentation class label.
In this embodiment, the method may further include the following step:

setting the color of the superpixels segmented into the foreground region in the superpixel image to a preset foreground color corresponding to the segmentation class label of interest, and setting the color of the superpixels segmented into the background region to a background color corresponding to the segmentation class label of interest.

In this implementation, when there are several segmentation class labels of interest, different labels of interest may correspond to different foreground colors, while the background color corresponding to all labels of interest is the same.

The foreground color corresponding to each segmentation class label may be as shown in Fig. 8, with different segmentation class labels corresponding to different colors. For example, the image to be segmented shown in Fig. 9 mainly includes a sky background, buildings and plants; segmenting it with the steps of this method separates the sky background, the buildings and the plants into different regions. If the segmentation class label of the building superpixels is taken as the label of interest, the building region is segmented as foreground and the sky background and plants as background; and when the color corresponding to the building label is white, the segmented image shown in Fig. 9 is generated.
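The coloring step can be sketched as below (white foreground on black background matches the Fig. 9 example; the function and argument names are illustrative):

```python
import numpy as np

def color_segmentation(sp_map, sp_labels, focus_label,
                       fg=(255, 255, 255), bg=(0, 0, 0)):
    """Paint superpixels whose class label is the label of interest with the
    foreground color and every other superpixel with the background color."""
    out = np.zeros(sp_map.shape + (3,), dtype=np.uint8)
    out[:] = bg
    for sp_id, label in sp_labels.items():
        if label == focus_label:
            out[sp_map == sp_id] = fg
    return out
```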
This embodiment adds a number of optional implementations on the basis of the embodiment shown in Fig. 1, all of which can improve the segmentation effect of image segmentation.
Referring to Fig. 10, Fig. 10 is a schematic diagram of experimental data according to an embodiment of the present invention. As shown in Fig. 10, the leftmost column is the original image, the second column the manually annotated ground truth (GT), the third column the segmentation result of the image segmentation technique provided by the embodiment of the present invention, and the last three columns the segmentation results of the DRFI, GBMR and HS image segmentation techniques respectively. It can be seen from the figure that the result of the technique provided by the embodiment of the present invention is closer to the manually annotated ground truth.
In addition, the image segmentation technique provided by the embodiments of the present invention was compared experimentally on the public image segmentation benchmarks ASD, SED1, SED2, ECSSD and PASCAL-S with IS, GBVS, SF, GC, CEOS, PCAS, GBMR, HS and DRFI, the most representative current image segmentation techniques. Table 1 gives the F-measure scores $F_\beta$ of the technique provided by the embodiments of the present invention and the other techniques on these 5 public benchmarks, where a higher $F_\beta$ score indicates a better segmentation effect and $F_\beta$ is given by the following formula:

$F_\beta = \frac{(1+\beta^2) \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\beta^2 \cdot \mathrm{Precision} + \mathrm{Recall}}$

where Precision is the number of correctly classified pixels divided by the total number of pixels, Recall is the number of pixels correctly classified as foreground divided by the total number of foreground pixels, and $\beta^2 = 0.3$.
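Using the definitions just given, the score of Table 1 can be computed as follows; the masks are assumed to be boolean arrays of the same shape (not part of the original text):

```python
import numpy as np

def f_beta(pred: np.ndarray, gt: np.ndarray, beta2: float = 0.3) -> float:
    precision = (pred == gt).mean()   # correctly classified / all pixels
    recall = np.logical_and(pred, gt).sum() / max(gt.sum(), 1)
    denom = beta2 * precision + recall
    return (1 + beta2) * precision * recall / denom if denom else 0.0
```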
Table 1:

Method              ASD     SED1    SED2    ECSSD   PASCAL-S
IS                  0.5943  0.5540  0.5682  0.4731  0.4901
GBVS                0.6499  0.7125  0.5862  0.5528  0.5929
SF                  0.8879  0.7533  0.7961  0.5448  0.5740
GC                  0.8811  0.8066  0.7728  0.5821  0.6184
CEOS                0.9020  0.7935  0.6198  0.6465  0.6557
PCAS                0.8613  0.7586  0.7791  0.5800  0.6332
GBMR                0.9100  0.9062  0.7974  0.6570  0.7055
HS                  0.9307  0.8744  0.8150  0.6391  0.6819
DRFI                0.9448  0.9018  0.8725  0.6909  0.7447
Present invention   0.9548  0.9295  0.8903  0.7322  0.7930
From the above experimental data it is clear that the segmentation effect of the image segmentation technique provided by the embodiments of the present invention is superior to that of the most representative current image segmentation techniques.
The following are apparatus embodiments of the present invention, which are used to perform the methods implemented in method embodiments one and two of the present invention. For ease of description, only the parts relevant to the embodiments of the present invention are shown; for technical details not disclosed, refer to embodiments one and two of the present invention.
Referring to Fig. 11, Fig. 11 is a schematic structural diagram of an image segmentation apparatus according to an embodiment of the present invention. As shown in Fig. 11, the apparatus includes a first segmentation unit 111, a cutting unit 112, a classification unit 113 and a second segmentation unit 114, where:

the first segmentation unit 111 is configured to segment an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image.

In this embodiment, the first segmentation unit 111 may divide the image to be segmented into a number of superpixels, where a superpixel may refer to a small region composed of a series of adjacent pixels with similar color, brightness, texture and other features; these small regions retain information effective for further image segmentation and generally do not destroy the physical boundary information in the image. In addition, the first preset segmentation rule may be the segmentation rule of a graph-theory-based segmentation method, or the segmentation rule of a gradient-descent-based segmentation method.

The cutting unit 112 is configured to cut, centered on each superpixel segmented by the first segmentation unit 111, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel.

The cutting unit 112 may cut an image of a preset scale on the superpixel image centered on a certain pixel of the superpixel. In addition, each superpixel may correspond to one or more image blocks.
The classification unit 113 is configured to process, by using a neural network, the image blocks corresponding to the respective superpixels obtained by the cutting unit 112 to obtain segmentation class labels corresponding to the respective superpixels.

The segmentation class label can be understood as a region identifier for image segmentation: during segmentation, superpixels with the same segmentation class label are segmented into the same region. Processing the image blocks with a neural network can be understood as processing them with a neural network model, which may be obtained in advance, for example by training.

The second segmentation unit 114 is configured to segment the superpixel image obtained by the first segmentation unit 111 according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region.

Once the segmentation class label of each superpixel is determined, the superpixel image can be segmented according to the preset second segmentation rule, for example superpixels with the same segmentation class label are segmented into the same region, so that the image to be segmented is segmented into multiple regions.

In this embodiment, the apparatus can be applied to any intelligent device with an image processing function, for example a tablet computer, a mobile phone, an e-reader, a remote control, a PC, a notebook computer, an in-vehicle device, a network television or a wearable device.

In this embodiment, the image to be segmented is segmented into a number of superpixels according to a preset first segmentation rule; an image of a preset scale is cut on the superpixel image centered on each superpixel to obtain an image block corresponding to each superpixel; the image blocks corresponding to the respective superpixels are processed by using a neural network to obtain segmentation class labels corresponding to the respective superpixels; and the superpixel image is segmented according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region. Compared with the manually designed features used in the prior art, this solution can avoid the limitations of manually designed features and their proneness to error, and can therefore improve the segmentation effect of image segmentation.
Referring to Fig. 12, Fig. 12 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention. As shown in Fig. 12, the apparatus includes a first segmentation unit 121, an expansion unit 122, a cutting unit 123, a classification unit 124 and a second segmentation unit 125, where:

the first segmentation unit 121 is configured to segment an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image.

In this embodiment, the first segmentation unit 121 may segment the image to be segmented into a number of superpixels using the segmentation rule of a graph-theory-based segmentation method, or using the segmentation rule of a gradient-descent-based segmentation method. For example, the first segmentation unit 121 may use the SLIC algorithm, a gradient-descent-based segmentation method, which performs superpixel segmentation based on similarity of color and distance and produces superpixels of uniform size and regular shape. An example is the superpixel diagram of Fig. 3, in which, from the top-left to the bottom-right panel, the superpixels contain 64, 256 and 1024 pixels respectively.
The expansion unit 122 is configured to expand the superpixel image obtained by the first segmentation unit 121 to generate an expanded image that includes the superpixel image.

In this embodiment, the expansion may take the superpixel image as a reference position and expand it, for example by padding the four sides of the superpixel image with an image of a fixed color value, such as a fixed mean value or a fixed gray value.

In this embodiment, the expansion unit 122 may also expand with the superpixel image at the center. For example, as shown in Fig. 4, 401 denotes the superpixel image after segmentation into superpixels and 402 the expanded image; the expanded image is N times the size of the superpixel image, for example N = 3, where N times may mean that both the length and the width are N times those of the superpixel image.

The cutting unit 123 is configured to cut, centered on each superpixel segmented by the first segmentation unit 121, an image of a preset scale on the expanded image generated by the expansion unit 122 to obtain an image block corresponding to each superpixel.

In this embodiment, the preset scale may be set as a multiple of the superpixel image, where the multiple need not be an integer but may also be fractional, for example 1.1, 1.2 or 1. As shown in Fig. 4, the image block cut by the cutting unit 123 may be a local image block 403, meaning a cut block that includes only a local part of the superpixel image, or a global image block 404, meaning a cut block that includes the entire superpixel image. Of course, in this embodiment, multiple image blocks may be cut for each superpixel, for example a local image block and a global image block.

It should also be noted that the expanded image generated by the expansion unit 122 may be such that an image of the preset scale cut on the expanded image centered on any superpixel lies entirely within the expanded image. For example, when the expanded image is 3 times the superpixel image, a global image block cut centered on any superpixel of the superpixel image lies within the expanded image, i.e. the cut global image block never exceeds the range of the expanded image.
The classification unit 124 is configured to process, by using a neural network, the image blocks corresponding to the respective superpixels obtained by the cutting unit 123 to obtain segmentation class labels corresponding to the respective superpixels.

In this embodiment, m segmentation class labels may be preset to be obtained, where m is a natural number greater than or equal to 2, so that the classification unit 124 identifies, for each superpixel, the segmentation class label to which it belongs among the m segmentation class labels. For example, the classification unit 124 may include:

a computing unit 1241, configured to compute, by using the neural network, the image blocks of the respective superpixels obtained by the cutting unit 123 to obtain classification vectors of the respective superpixels;

an identification unit 1242, configured to identify the segmentation class label corresponding to the classification vector of each superpixel obtained by the computing unit 1241 among the m segmentation class labels;

a classification subunit 1243, configured to take, for any one of the superpixels, the segmentation class label identified by the identification unit 1242 as corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
In this implementation, the classification vector may specifically be passed through a fully connected layer of a deep neural network to identify the segmentation class label to which the classification vector of each superpixel belongs among the m segmentation class labels.

In this embodiment, as shown in Fig. 13, the identification unit 1242 may include:

a calculation unit 12421, configured to calculate, for any one of the superpixels, the connection values of the classification vector of that superpixel obtained by the computing unit 1241 with the m segmentation class labels by the following formula:

$y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$

where $y_j$ is the connection value of the superpixel's classification vector with the j-th segmentation class label, $x_i$ denotes the i-th dimension of the classification vector of the target superpixel, $n$ is the number of dimensions of the classification vector of the target superpixel, $n$ is an integer greater than 1, and $\alpha_{i,j}$ is a preset parameter for identifying segmentation class labels;

a selection unit 12422, configured to select the largest connection value among the m connection values of that superpixel obtained by the calculation unit 12421, and take the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.

The parameters $\alpha_{i,j}$ may be learned from a large number of training samples.

In this way, the segmentation class label of each superpixel can be obtained.
In this embodiment, the cutting unit 123 may be configured to cut, centered on each superpixel segmented by the first segmentation unit 121, an image of a first preset scale on the superpixel image to obtain a first image block corresponding to each superpixel, and to cut, centered on each superpixel, an image of a second preset scale on the superpixel image to obtain a second image block corresponding to each superpixel;

the computing unit 1241 may be configured to compute, for any one of the superpixels, the first image block and the second image block of that superpixel cut by the cutting unit 123 separately by using the neural network to obtain a first classification vector and a second classification vector of that superpixel, and to combine the first classification vector and the second classification vector into the classification vector of that superpixel.

The first and second image blocks of the superpixel may be the local and global image blocks introduced above; using local and global image blocks allows the neural network to better capture the local and global features of the superpixel in the superpixel image during processing, improving the segmentation effect.

In addition, image blocks of different scales may be processed with the same or different neural networks in this embodiment. For example, the local and global image blocks may both be processed with the same deep neural network, and the resulting classification vectors combined afterwards. The resulting superpixel classification vector is richer, which can improve the segmentation effect.

In this embodiment, the neural network may be a non-deep neural network, which can be understood as a single-layer neural network, for example a BP, Hebb or DL neural network; or it may be a deep neural network, which can be understood as a multi-layer neural network, for example a Clarifai, AlexNet, NIN, OverFest or GoogLeNet deep neural network. No limitation is imposed here.
The second segmentation unit 125 is configured to segment, according to the second segmentation rule, superpixels in the superpixel image segmented by the first segmentation unit 121 whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and to segment superpixels in that superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.

In this embodiment, there may be one or more segmentation class labels of interest. For example, when there is one, the second segmentation unit 125 segments the superpixels whose segmentation class label is that label of interest into the foreground region and all remaining superpixels of the superpixel image into the background region. When there are several, the superpixels belonging to these labels are segmented into the foreground region and all remaining superpixels of the image to be segmented into the background region. It should be noted that with several labels of interest the foreground contains multiple regions, each composed of superpixels with the same segmentation class label.
In this embodiment, as shown in Fig. 14, the apparatus may further include:

a setting unit 126, configured to set the color of the superpixels segmented into the foreground region by the second segmentation unit 125 in the superpixel image to a preset foreground color corresponding to the segmentation class label of interest, and to set the color of the superpixels segmented into the background region by the second segmentation unit 125 to a background color corresponding to the segmentation class label of interest.

In this implementation, when there are several segmentation class labels of interest, different labels of interest may correspond to different foreground colors, while the background color corresponding to all labels of interest is the same.

This embodiment adds a number of optional implementations on the basis of the embodiment shown in Fig. 11, all of which can improve the segmentation effect of image segmentation.
Referring to Fig. 15, Fig. 15 is a schematic structural diagram of another image segmentation apparatus according to an embodiment of the present invention. As shown in Fig. 15, the apparatus includes a processor 151, a network interface 152, a memory 153 and a communication bus 154, where the communication bus 154 is configured to implement connection and communication among the processor 151, the network interface 152 and the memory 153, and the processor 151 executes a program stored in the memory 153 to implement the following method:

segmenting an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image;

cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel;

processing the image blocks corresponding to the respective superpixels by using a neural network to obtain segmentation class labels corresponding to the respective superpixels;

segmenting the superpixel image according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region.
In this embodiment, after the processor 151 segments the image to be segmented into a number of superpixels according to the preset first segmentation rule and before it cuts an image of a specific scale centered on each superpixel to obtain the image blocks of the respective superpixels, the program executed by the processor may further include:

expanding the superpixel image to generate an expanded image that includes the superpixel image;

and the program executed by the processor 151 for cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel may include:

cutting, centered on each superpixel, an image of the preset scale on the expanded image to obtain the image block corresponding to each superpixel.
In this embodiment, m segmentation class labels are preset to be obtained, where m is a natural number greater than or equal to 2;

and the program executed by the processor 151 for processing the image blocks corresponding to the respective superpixels by using the neural network to obtain the segmentation class labels of the respective superpixels may include:

computing the image blocks corresponding to the respective superpixels by using the neural network to obtain classification vectors of the respective superpixels;

identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels;

for any one of the superpixels, taking the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
In this embodiment, the program executed by the processor 151 for identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels may include:

for any one of the superpixels, calculating the connection values of the classification vector of that superpixel with the m segmentation class labels by the following formula:

$y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$

where $y_j$ is the connection value of the superpixel's classification vector with the j-th segmentation class label, $x_i$ denotes the i-th dimension of the classification vector of the target superpixel, $n$ is the number of dimensions of the classification vector of the target superpixel, $n$ is an integer greater than 1, and $\alpha_{i,j}$ is a preset parameter for identifying segmentation class labels;

selecting the largest connection value among the m connection values of that superpixel, and taking the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
In this embodiment, the program executed by the processor 151 for cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel may include:

cutting, centered on each superpixel, an image of a first preset scale on the superpixel image to obtain a first image block corresponding to each superpixel;

cutting, centered on each superpixel, an image of a second preset scale on the superpixel image to obtain a second image block corresponding to each superpixel;

and the program executed by the processor 151 for computing the image blocks of the respective superpixels by using the neural network to obtain the classification vectors of the respective superpixels may include:

for any one of the superpixels, computing the first image block and the second image block corresponding to that superpixel separately by using the neural network to obtain a first classification vector and a second classification vector of that superpixel, and combining the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
In this embodiment, the program executed by the processor 151 for segmenting the superpixel image according to the preset second segmentation rule to obtain a segmented image including at least two regions may include:

segmenting, according to the second segmentation rule, superpixels in the superpixel image whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and segmenting superpixels in the superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.

In this embodiment, the program executed by the processor 151 may further include:

setting the color of the superpixels segmented into the foreground region in the superpixel image to a preset foreground color corresponding to the segmentation class label of interest, and setting the color of the superpixels segmented into the background region in the superpixel image to a background color corresponding to the segmentation class label of interest.
In this embodiment, the neural network may include:

a deep neural network or a non-deep neural network.

In this embodiment, the image to be segmented is segmented into a number of superpixels according to a preset first segmentation rule; an image of a preset scale is cut on the superpixel image centered on each superpixel to obtain an image block corresponding to each superpixel; the image blocks corresponding to the respective superpixels are processed by using a neural network to obtain segmentation class labels corresponding to the respective superpixels; and the superpixel image is segmented according to a preset second segmentation rule to obtain a segmented image including at least two regions, where the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region. Compared with the manually designed features used in the prior art, this solution can avoid the limitations of manually designed features and their proneness to error, and can therefore improve the segmentation effect of image segmentation.
A person of ordinary skill in the art can understand that all or part of the processes of the methods of the above embodiments can be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium, and when executed may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM) or the like.
What is disclosed above is merely preferred embodiments of the present invention, which certainly cannot be used to limit the scope of rights of the present invention; therefore, equivalent changes made according to the claims of the present invention still fall within the scope covered by the present invention.

Claims (24)

  1. An image segmentation method, comprising:
    segmenting an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image;
    cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel;
    processing the image blocks corresponding to the respective superpixels by using a neural network to obtain segmentation class labels corresponding to the respective superpixels;
    segmenting the superpixel image according to a preset second segmentation rule to obtain a segmented image comprising at least two regions, wherein the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region.
  2. The method according to claim 1, wherein after the segmenting of the image to be segmented into a number of superpixels according to the preset first segmentation rule and before the cutting, centered on each superpixel, of an image of a specific scale on the superpixel image, the method further comprises:
    expanding the superpixel image to generate an expanded image comprising the superpixel image;
    and the cutting, centered on each superpixel, of an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel comprises:
    cutting, centered on each superpixel, an image of the preset scale on the expanded image to obtain the image block corresponding to each superpixel.
  3. The method according to claim 1 or 2, wherein m segmentation class labels are preset to be obtained, m being a natural number greater than or equal to 2;
    and the processing of the image blocks corresponding to the respective superpixels by using the neural network to obtain the segmentation class labels of the respective superpixels comprises:
    computing the image blocks corresponding to the respective superpixels by using the neural network to obtain classification vectors of the respective superpixels;
    identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels;
    for any one of the superpixels, taking the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
  4. The method according to claim 3, wherein the identifying of the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels comprises:
    for any one of the superpixels, calculating the connection values of the classification vector of that superpixel with the m segmentation class labels by the following formula:
    $y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$
    wherein $y_j$ is the connection value of the superpixel's classification vector with the j-th segmentation class label, $x_i$ denotes the i-th dimension of the classification vector of the target superpixel, $n$ is the number of dimensions of the classification vector of the target superpixel, $n$ is an integer greater than 1, and $\alpha_{i,j}$ is a preset parameter for identifying segmentation class labels;
    selecting the largest connection value among the m connection values of that superpixel, and taking the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
  5. The method according to claim 3 or 4, wherein the cutting, centered on each superpixel, of an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel comprises:
    cutting, centered on each superpixel, an image of a first preset scale on the superpixel image to obtain a first image block corresponding to each superpixel;
    cutting, centered on each superpixel, an image of a second preset scale on the superpixel image to obtain a second image block corresponding to each superpixel;
    and the computing of the image blocks corresponding to the respective superpixels by using the neural network to obtain the classification vectors of the respective superpixels comprises:
    for any one of the superpixels, computing the first image block and the second image block corresponding to that superpixel separately by using the neural network to obtain a first classification vector and a second classification vector of that superpixel, and combining the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
  6. The method according to any one of claims 1 to 5, wherein the segmenting of the superpixel image according to the preset second segmentation rule to obtain a segmented image comprising at least two regions comprises:
    segmenting, according to the second segmentation rule, superpixels in the superpixel image whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and segmenting superpixels in the superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.
  7. The method according to claim 6, further comprising:
    setting the color of the superpixels segmented into the foreground region in the superpixel image to a preset foreground color corresponding to the segmentation class label of interest, and setting the color of the superpixels segmented into the background region in the superpixel image to a background color corresponding to the segmentation class label of interest.
  8. The method according to any one of claims 1 to 7, wherein the neural network comprises:
    a deep neural network or a non-deep neural network.
  9. An image segmentation apparatus, comprising a first segmentation unit, a cutting unit, a classification unit and a second segmentation unit, wherein:
    the first segmentation unit is configured to segment an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image;
    the cutting unit is configured to cut, centered on each superpixel segmented by the first segmentation unit, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel;
    the classification unit is configured to process, by using a neural network, the image blocks corresponding to the respective superpixels obtained by the cutting unit to obtain segmentation class labels corresponding to the respective superpixels;
    the second segmentation unit is configured to segment the superpixel image segmented by the first segmentation unit according to a preset second segmentation rule to obtain a segmented image comprising at least two regions, wherein the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region.
  10. The apparatus according to claim 9, further comprising:
    an expansion unit, configured to expand the superpixel image obtained by the first segmentation unit to generate an expanded image comprising the superpixel image;
    the cutting unit being configured to cut, centered on each superpixel segmented by the first segmentation unit, an image of the preset scale on the expanded image generated by the expansion unit to obtain the image block corresponding to each superpixel.
  11. The apparatus according to claim 9 or 10, wherein m segmentation class labels are preset to be obtained, m being a natural number greater than or equal to 2;
    and the classification unit comprises:
    a computing unit, configured to compute, by using the neural network, the image blocks corresponding to the respective superpixels obtained by the cutting unit to obtain classification vectors of the respective superpixels;
    an identification unit, configured to identify the segmentation class label corresponding to the classification vector of each superpixel obtained by the computing unit among the m segmentation class labels;
    a classification subunit, configured to take, for any one of the superpixels, the segmentation class label identified by the identification unit as corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
  12. The apparatus according to claim 11, wherein the identification unit comprises:
    a calculation unit, configured to calculate, for any one of the superpixels, the connection values of the classification vector of that superpixel obtained by the computing unit with the m segmentation class labels by the following formula:
    $y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$
    wherein $y_j$ is the connection value of the superpixel's classification vector with the j-th segmentation class label, $x_i$ denotes the i-th dimension of the classification vector of the target superpixel, $n$ is the number of dimensions of the classification vector of the target superpixel, $n$ is an integer greater than 1, and $\alpha_{i,j}$ is a preset parameter for identifying segmentation class labels;
    a selection unit, configured to select the largest connection value among the m connection values of that superpixel obtained by the calculation unit, and take the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
  13. The apparatus according to claim 11 or 12, wherein the cutting unit is configured to cut, centered on each superpixel segmented by the first segmentation unit, an image of a first preset scale on the superpixel image to obtain a first image block corresponding to each superpixel, and to cut, centered on each superpixel segmented by the first segmentation unit, an image of a second preset scale on the superpixel image to obtain a second image block corresponding to each superpixel;
    and the computing unit is configured to compute, for any one of the superpixels, the first image block and the second image block of that superpixel cut by the cutting unit separately by using the neural network to obtain a first classification vector and a second classification vector of that superpixel, and to combine the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
  14. The apparatus according to any one of claims 9 to 13, wherein the second segmentation unit is configured to segment, according to the second segmentation rule, superpixels in the superpixel image segmented by the first segmentation unit whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and to segment superpixels in the superpixel image segmented by the first segmentation unit whose segmentation class label does not belong to those preset segmentation class labels into a background region.
  15. The apparatus according to claim 14, further comprising:
    a setting unit, configured to set the color of the superpixels segmented into the foreground region by the second segmentation unit in the superpixel image to a preset foreground color corresponding to the segmentation class label of interest, and to set the color of the superpixels segmented into the background region by the second segmentation unit in the superpixel image to a background color corresponding to the segmentation class label of interest.
  16. The apparatus according to any one of claims 9 to 15, wherein the neural network comprises:
    a deep neural network or a non-deep neural network.
  17. An image segmentation apparatus, comprising a processor, a network interface, a memory and a communication bus, wherein the communication bus is configured to implement connection and communication among the processor, the network interface and the memory, and the processor executes a program stored in the memory to implement the following method:
    segmenting an image to be segmented into a number of superpixels according to a preset first segmentation rule to obtain a superpixel image;
    cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel;
    processing the image blocks corresponding to the respective superpixels by using a neural network to obtain segmentation class labels corresponding to the respective superpixels;
    segmenting the superpixel image according to a preset second segmentation rule to obtain a segmented image comprising at least two regions, wherein the second segmentation rule means segmenting superpixels with the same segmentation class label into the same region.
  18. The apparatus according to claim 17, wherein after the processor segments the image to be segmented into a number of superpixels according to the preset first segmentation rule and before it cuts an image of a specific scale on the superpixel image centered on each superpixel to obtain the image blocks of the respective superpixels, the program executed by the processor further comprises:
    expanding the superpixel image to generate an expanded image comprising the superpixel image;
    and the program executed by the processor for cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel comprises:
    cutting, centered on each superpixel, an image of the preset scale on the expanded image to obtain the image block corresponding to each superpixel.
  19. The apparatus according to claim 17 or 18, wherein m segmentation class labels are preset to be obtained, m being a natural number greater than or equal to 2;
    and the program executed by the processor for processing the image blocks corresponding to the respective superpixels by using the neural network to obtain the segmentation class labels of the respective superpixels comprises:
    computing the image blocks corresponding to the respective superpixels by using the neural network to obtain classification vectors of the respective superpixels;
    identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels;
    for any one of the superpixels, taking the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels as the segmentation class label of that superpixel.
  20. The apparatus according to claim 19, wherein the program executed by the processor for identifying the segmentation class label corresponding to the classification vector of each superpixel among the m segmentation class labels comprises:
    for any one of the superpixels, calculating the connection values of the classification vector of that superpixel with the m segmentation class labels by the following formula:
    $y_j = \sum_{i=1}^{n} \alpha_{i,j} x_i$
    wherein $y_j$ is the connection value of the superpixel's classification vector with the j-th segmentation class label, $x_i$ denotes the i-th dimension of the classification vector of the target superpixel, $n$ is the number of dimensions of the classification vector of the target superpixel, $n$ is an integer greater than 1, and $\alpha_{i,j}$ is a preset parameter for identifying segmentation class labels;
    selecting the largest connection value among the m connection values of that superpixel, and taking the segmentation class label corresponding to the largest connection value as the segmentation class label corresponding to the classification vector of that superpixel among the m segmentation class labels.
  21. The apparatus according to claim 19 or 20, wherein the program executed by the processor for cutting, centered on each superpixel, an image of a preset scale on the superpixel image to obtain an image block corresponding to each superpixel comprises:
    cutting, centered on each superpixel, an image of a first preset scale on the superpixel image to obtain a first image block corresponding to each superpixel;
    cutting, centered on each superpixel, an image of a second preset scale on the superpixel image to obtain a second image block corresponding to each superpixel;
    and the program executed by the processor for computing the image blocks of the respective superpixels by using the neural network to obtain the classification vectors of the respective superpixels comprises:
    for any one of the superpixels, computing the first image block and the second image block corresponding to that superpixel separately by using the neural network to obtain a first classification vector and a second classification vector of that superpixel, and combining the first classification vector and the second classification vector to obtain the classification vector of that superpixel.
  22. The apparatus according to any one of claims 17 to 21, wherein the program executed by the processor for segmenting the superpixel image according to the preset second segmentation rule to obtain a segmented image comprising at least two regions comprises:
    segmenting, according to the second segmentation rule, superpixels in the superpixel image whose segmentation class label belongs to preset segmentation class labels that the user needs to focus on into a foreground region, and segmenting superpixels in the superpixel image whose segmentation class label does not belong to those preset segmentation class labels into a background region.
  23. The apparatus according to claim 22, wherein the program executed by the processor further comprises:
    setting the color of the superpixels segmented into the foreground region in the superpixel image to a preset foreground color corresponding to the segmentation class label of interest, and setting the color of the superpixels segmented into the background region in the superpixel image to a background color corresponding to the segmentation class label of interest.
  24. The apparatus according to any one of claims 17 to 23, wherein the neural network comprises:
    a deep neural network or a non-deep neural network.
PCT/CN2015/077859 2015-04-29 2015-04-29 Image segmentation method and apparatus WO2016172889A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2015/077859 WO2016172889A1 (zh) 2015-04-29 2015-04-29 Image segmentation method and apparatus
CN201580078960.2A CN107533760B (zh) 2015-04-29 2015-04-29 Image segmentation method and apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/077859 WO2016172889A1 (zh) 2015-04-29 2015-04-29 Image segmentation method and apparatus

Publications (1)

Publication Number Publication Date
WO2016172889A1 true WO2016172889A1 (zh) 2016-11-03

Family

ID=57197932

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/077859 WO2016172889A1 (zh) 2015-04-29 2015-04-29 Image segmentation method and apparatus

Country Status (2)

Country Link
CN (1) CN107533760B (zh)
WO (1) WO2016172889A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN106780582A (zh) * 2016-12-16 2017-05-31 Xidian University Image saliency detection method based on fusion of texture and color features
  • CN111860465A (zh) * 2020-08-10 2020-10-30 Huaqiao University Superpixel-based remote sensing image extraction method, apparatus, device and storage medium
  • CN117058393A (zh) * 2023-08-30 2023-11-14 Nantong University Superpixel three-way evidence DPC method for segmentation of hard exudates in fundus images

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN110889857A (zh) * 2019-11-15 2020-03-17 Beijing University of Posts and Telecommunications Mobile Web real-time video frame segmentation method and system
  • CN112217958B (zh) * 2020-09-15 2022-04-22 Shaanxi University of Science and Technology Method for preprocessing digital watermark carrier images independently of the device color space

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN101169867A (zh) * 2007-12-04 2008-04-30 Beijing Vimicro Co., Ltd. Image segmentation method, image processing device and system
  • US20100014755A1 (en) * 2008-07-21 2010-01-21 Charles Lee Wilson System and method for grid-based image segmentation and matching
  • US20120275703A1 (en) * 2011-04-27 2012-11-01 Xutao Lv Superpixel segmentation methods and systems
  • CN103164858A (zh) * 2013-03-20 2013-06-19 Zhejiang University Segmentation and tracking method for overlapping crowds based on superpixels and graph models
  • CN104050677A (zh) * 2014-06-30 2014-09-17 Nanjing University of Science and Technology Hyperspectral image segmentation method based on a multilayer neural network

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN103353986B (zh) * 2013-05-30 2015-10-28 Shandong University Brain MR image segmentation method based on superpixel fuzzy clustering
  • CN103679719A (zh) * 2013-12-06 2014-03-26 Hohai University Image segmentation method
  • CN103984958B (zh) * 2014-05-07 2017-11-07 Shenzhen University Cervical cancer cell segmentation method and system
  • CN104537676B (zh) * 2015-01-12 2017-03-22 Nanjing University Progressive image segmentation method based on online learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
  • CN106780582A (zh) * 2016-12-16 2017-05-31 Xidian University Image saliency detection method based on fusion of texture and color features
  • CN106780582B (zh) * 2016-12-16 2019-08-13 Xidian University Image saliency detection method based on fusion of texture and color features
  • CN111860465A (zh) * 2020-08-10 2020-10-30 Huaqiao University Superpixel-based remote sensing image extraction method, apparatus, device and storage medium
  • CN117058393A (zh) * 2023-08-30 2023-11-14 Nantong University Superpixel three-way evidence DPC method for segmentation of hard exudates in fundus images

Also Published As

Publication number Publication date
CN107533760B (zh) 2021-03-23
CN107533760A (zh) 2018-01-02


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15890269

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15890269

Country of ref document: EP

Kind code of ref document: A1