WO2022110969A1 - Unsupervised image segmentation method, electronic device and storage medium - Google Patents

Unsupervised image segmentation method, electronic device and storage medium

Info

Publication number
WO2022110969A1
WO2022110969A1 (PCT/CN2021/117515, CN2021117515W)
Authority
WO
WIPO (PCT)
Prior art keywords
superpixel
node
foreground
background
nodes
Prior art date
Application number
PCT/CN2021/117515
Other languages
English (en)
French (fr)
Inventor
周悦芝
黄权伟
Original Assignee
清华大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 清华大学
Priority to US17/696,668, published as US11928825B2
Publication of WO2022110969A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/12Edge-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformation in the plane of the image
    • G06T3/40Scaling the whole image or part thereof
    • G06T3/4053Super resolution, i.e. output image resolution higher than sensor resolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/162Segmentation; Edge detection involving graph-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/26Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/28Quantising the image, e.g. histogram thresholding for discrimination between background and foreground patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/42Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation
    • G06V10/422Global feature extraction by analysis of the whole pattern, e.g. using frequency domain transformations or autocorrelation for representing the structure of the pattern or shape of an object therefor
    • G06V10/426Graphical representations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G06V10/507Summing image-intensity values; Histogram projection analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/762Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks
    • G06V10/7635Arrangements for image or video recognition or understanding using pattern recognition or machine learning using clustering, e.g. of similar faces in social networks based on graphs, e.g. graph cuts or spectral clustering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20072Graph-based image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20112Image segmentation details
    • G06T2207/20132Image cropping
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/03Recognition of patterns in medical or anatomical images

Definitions

  • the invention belongs to the field of computer application technology and image segmentation, and particularly relates to an unsupervised image segmentation method, an electronic device and a storage medium.
  • Image segmentation belongs to the field of computer vision and is widely used in autonomous driving, face recognition, and image recognition.
  • obtaining the segmentation results of medical targets such as tumors quickly and accurately is helpful for surgical operators to carry out pre-operative planning, intra-operative guidance and post-operative evaluation.
  • accurate image segmentation can divide an image into multiple regions, where each region has a uniform color (or texture) and the boundaries between each other are simple and accurate.
  • the present disclosure proposes an unsupervised image segmentation method.
  • the present application proposes an unsupervised image segmentation method, which includes: performing superpixel segmentation on an image to be processed containing a target object to obtain multiple superpixel sets, each superpixel set corresponding to its own superpixel node; and generating an undirected graph according to the superpixel nodes.
  • the undirected graph includes a first edge connecting two adjacent superpixel nodes, a foreground edge connecting each superpixel node and a virtual foreground node, and a background edge connecting each superpixel node and a virtual background node; based on the first label set corresponding to the multiple superpixel nodes, the foreground superpixel nodes and the background superpixel nodes in the undirected graph are determined; a foreground superpixel node refers to a superpixel node belonging to the foreground, and a background superpixel node refers to a superpixel node belonging to the background.
  • based on the foreground superpixel nodes and the background superpixel nodes, a minimization objective function is generated; based on the minimization objective function, the undirected graph is segmented to obtain a foreground part and a background part, and a second label set is generated; image segmentation is performed on the image to be processed based on the comparison result between the first label set and the second label set.
  • the method further includes determining a weight of the first edge based on the grayscale values of the superpixel nodes.
  • the method further includes: taking the average of the grayscale values of the foreground superpixel nodes as the grayscale value of the virtual foreground node; and taking the average of the grayscale values of the background superpixel nodes as the grayscale value of the virtual background node.
  • the method further includes: obtaining a foreground weighted difference based on the normalized expected round-trip time matrix between two superpixel nodes, the grayscale value of the superpixel node, and the grayscale values of the foreground superpixel nodes; and obtaining a background weighted difference based on the normalized expected round-trip time matrix between two superpixel nodes, the grayscale value of the superpixel node, and the grayscale values of the background superpixel nodes.
  • determining the foreground superpixel nodes and the background superpixel nodes based on the first label set includes: arranging the grayscale values corresponding to the multiple superpixel nodes in descending order, determining a preset number of top-ranked superpixel nodes as foreground superpixel nodes, and determining the remaining superpixel nodes as background superpixel nodes.
  • performing superpixel segmentation on the to-be-processed image containing the target object includes: cropping the to-be-processed image to obtain a cropped image, and performing superpixel segmentation on the cropped image.
  • the cropped image refers to a rectangular area obtained by extending the smallest rectangle containing the target object by a preset distance.
  • the method further includes: performing image segmentation on the image to be processed based on a second label set that is the same as the first label set; or, based on a second label set different from the first label set, determining updated foreground superpixel nodes and updated background superpixel nodes in the undirected graph, constructing an updated objective function based on the updated foreground and background superpixel nodes, and segmenting the undirected graph based on the updated objective function to generate an updated label set, until the updated label set is the same as the previous label set.
  • the weight e_ij of the first edge is represented by the following equation:
  • e_ij = G(|g_i − g_j|) + l_ij
  • where i denotes the superpixel node i; j denotes another superpixel node j connected to node i; G(x) denotes the Gaussian function; l_ij denotes the length of the common edge between the superpixel sets corresponding to nodes i and j; g_i denotes the grayscale value of superpixel node i; and g_j denotes the grayscale value of superpixel node j.
  • the weight e_si of the foreground edge is represented by the following equation:
  • e_si = (g_i − g_t)² + c_i1
  • where g_t denotes the grayscale value of the virtual background node, and c_i1 is the background weighted difference between superpixel node i and the background superpixel nodes.
  • the weight e_it of the background edge is represented by the following equation:
  • e_it = (g_i − g_s)² + c_i0
  • where g_s denotes the grayscale value of the virtual foreground node, and c_i0 denotes the foreground weighted difference between superpixel node i and the foreground superpixel nodes.
  • the foreground weighted difference c_i0 is represented by the following equation:
  • c_i0 = Σ_{j: a_j=0} G(T_ij)·(g_i − g_j)²
  • where a_j = 0 indicates that superpixel node j belongs to the foreground (i.e., is a foreground superpixel node); G(x) denotes the Gaussian function; and T_ij denotes the element of the normalized expected round-trip time matrix T between superpixel nodes i and j.
  • the background weighted difference c_i1 is expressed by the following equation, with the sum taken over the superpixel nodes j with a_j = 1 (the background superpixel nodes):
  • c_i1 = Σ_{j: a_j=1} G(T_ij)·(g_i − g_j)²
  • the elements T_ij of the normalized expected round-trip time matrix T are represented by the following equation:
  • T_ij = E_i(T_j) + E_j(T_i)
  • where E_i(T_j) = (Z_jj − Z_ij)/π_j denotes the expected time from superpixel node i to a different superpixel node j; E_i(T_i) = 1/π_i denotes the expected time when i and j are the same node; Z_jj and Z_ij are elements of the intermediate matrix Z = (I − P + W)⁻¹; I is the identity matrix; P is the transition matrix between superpixel nodes; W is a constructed matrix whose every row vector equals π; and π_i = d_i/Σ_j d_j, where d_i denotes the degree of superpixel node i.
  • the minimization objective function is represented by the following equation:
  • E(A) = Σ_{i∈V_s} [a_i·e_si + (1 − a_i)·e_it] + μ·Σ_{(i,j)∈E_1} e_ij·|a_i − a_j|
  • where a_i denotes the label of superpixel node i and a_j denotes the label of superpixel node j (a_i = 0 for a foreground superpixel node, a_i = 1 for a background superpixel node); e_si denotes the weight of the foreground edge of superpixel node i; e_it denotes the weight of the background edge of superpixel node i; e_ij denotes the weight of the first edge connecting superpixel nodes i and j; V_s denotes the set of superpixel nodes; and μ denotes a weight factor.
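For illustration, the objective above can be evaluated directly for a candidate labeling. The following is a minimal Python sketch under the reading of the equation given here; the array names (`a`, `e_s`, `e_t`, `edges`) are hypothetical inputs, not names from the patent:

```python
import numpy as np

def energy(a, e_s, e_t, edges, mu=0.01):
    """Evaluate E(A) = sum_i [a_i*e_si + (1-a_i)*e_it] + mu * sum_{(i,j)} e_ij*|a_i - a_j|.

    a:     (K,) array of labels, 0 = foreground, 1 = background
    e_s:   (K,) foreground-edge weights e_si
    e_t:   (K,) background-edge weights e_it
    edges: dict {(i, j): e_ij} over adjacent superpixel pairs (the set E_1)
    """
    data = float(np.sum(a * e_s + (1 - a) * e_t))           # t-link terms
    smooth = sum(w * abs(int(a[i]) - int(a[j])) for (i, j), w in edges.items())
    return data + mu * smooth                                # mu balances the two parts
```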
  • the present application also proposes an electronic device including a processor and a memory.
  • Memory is used to store instructions executable by the processor.
  • the processor is configured to perform the unsupervised image segmentation method as described above.
  • the present application also proposes a storage medium on which a computer program is stored.
  • the processor is adapted to perform the unsupervised image segmentation method as described above.
  • the present disclosure can automatically and accurately segment images in the absence of labeled data, and obtain more accurate image segmentation results. Compared with other unsupervised methods, the method of the present disclosure can accurately segment the desired target in a shorter time and with less manual intervention.
  • FIG. 1 is a flowchart of an unsupervised image segmentation method according to an embodiment of the present disclosure.
  • FIG. 2 is a process flow diagram of an image segmentation method according to an embodiment of the present disclosure.
  • FIG. 3A is a schematic diagram of an original image according to an embodiment of the present disclosure.
  • FIG. 3B is a schematic diagram of a minimum rectangular area including a target object according to an embodiment of the present disclosure.
  • FIG. 3C is an example diagram of a cropped image according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a superpixel segmentation result according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of an undirected graph according to an embodiment of the present disclosure.
  • FIG. 6A is a schematic diagram of a foreground portion obtained by segmenting an undirected graph according to an embodiment of the present disclosure.
  • FIG. 6B is a schematic diagram of a background portion obtained by segmenting an undirected graph according to an embodiment of the present disclosure.
  • FIG. 7 is a schematic diagram of a finally obtained segmentation result according to an embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an electronic device according to an embodiment of the present disclosure.
  • the objects to be segmented in medical images are not as clear as those in natural images, because most medical imaging techniques are computed imaging and are therefore indirect image representations. For example, CT uses a series of X-rays followed by complex mathematical calculations to obtain the final image representation. Therefore, compared with natural images that contain three channels of image information, most medical images are single-channel, so medical images themselves provide very little information, which is not conducive to subsequent accurate segmentation. In addition, in medical images the contrast between the target tissue and the background is small, especially when the noise near the boundary is large, making it challenging even for an experienced physician to perform the segmentation manually.
  • the medical field also has some very special properties.
  • on the one hand, much of the data involves ethics and other issues, which makes the data itself inconvenient to label and even harder to process with deep networks that require large amounts of labeled data; on the other hand, even if a deep segmentation network can obtain a good segmentation, deep networks are generally end-to-end and their results are relatively fixed, so users cannot manually improve or adjust the segmentation results; in addition, since current deep networks are almost entirely uninterpretable, their use in some special fields is also limited.
  • unsupervised learning does not require any annotated data; it automatically analyzes a large amount of data to discover the regularities within it.
  • in the unsupervised case, image segmentation is used to predict more general labels such as "foreground" and "background", especially in the absence of any prior information.
  • there are many well-known unsupervised segmentation techniques, including normalized cuts, Markov random field-based methods, mean shift, and active contour methods, which aim to automatically segment the object instances in each image without any supervisory information.
  • however, many unsupervised methods are time-consuming and require a lot of post-processing to achieve good results, which generally requires more manual intervention and higher cost.
  • the active contour model is a widely used unsupervised segmentation method, which obtains the final segmentation result by iteratively solving for the energy minimum.
  • the Chan-Vese model is a region-based active contour model and an instance of the Mumford-Shah model; since it does not depend on the gradient information of object boundaries, it is widely used in the field of medical images, especially for the segmentation of the brain, heart, lungs, and blood vessels.
  • using a level set function can make the problem easier to solve, but it also requires keeping the level set function a signed distance function at all times during the solving process.
  • a distance regularization method is proposed in the DRLSE model, which makes it unnecessary to re-initialize the level set function in the iterative process.
  • the LBF model introduces a local binary energy term to obtain more accurate segmentation by extracting local information of the image.
  • these methods cannot well balance the relationship between the distance between pixels and the gray level, and are not suitable for segmenting targets with uneven intensity.
  • moreover, they use gradient descent to solve iteratively, which takes a long time and cannot meet the demands of segmenting large amounts of data.
  • the present disclosure proposes an unsupervised image segmentation method, apparatus, electronic device and storage medium.
  • the above method, apparatus, electronic device and storage medium will be described in further detail below with reference to the accompanying drawings.
  • FIG. 1 shows a flowchart of an unsupervised image segmentation method according to an embodiment of the present disclosure.
  • the unsupervised image segmentation method can be implemented by a mobile terminal, such as a smart phone, a tablet computer, a laptop computer, or a wearable device.
  • the execution subject of the above unsupervised image segmentation method is a processor.
  • the unsupervised image segmentation method includes the following steps.
  • in step S110, the processor performs superpixel segmentation on the to-be-processed image containing the target object to obtain multiple superpixel sets, each superpixel set corresponding to its own superpixel node.
  • in step S120, the processor generates an undirected graph according to the superpixel nodes.
  • the undirected graph includes a first edge connecting two adjacent superpixel nodes, a foreground edge connecting each superpixel node and a virtual foreground node, and a background edge connecting each superpixel node and a virtual background node.
  • in step S130, the processor determines the foreground superpixel nodes and the background superpixel nodes in the undirected graph based on the first label set corresponding to the multiple superpixel nodes.
  • a foreground superpixel node refers to a superpixel node that belongs to the foreground.
  • a background superpixel node refers to a superpixel node that belongs to the background.
  • in step S140, the processor generates a minimization objective function based on the foreground superpixel nodes and the background superpixel nodes.
  • in step S150, the processor segments the undirected graph based on the minimization objective function to obtain a foreground part and a background part, and generates a second label set.
  • in step S160, the processor performs image segmentation on the image to be processed based on the comparison result between the first label set and the second label set.
  • constructing the minimization objective function includes constructing the minimization objective function based on the weight of the first edge, the weight of the foreground edge, and the weight of the background edge of each superpixel node.
  • the method further includes determining a weight of the first edge based on the grayscale values of the superpixel nodes.
  • the method further includes: determining the weight of the foreground edge based on the grayscale value of the virtual background node, the grayscale value of the superpixel node, and the background weighted difference; and determining the weight of the background edge based on the grayscale value of the virtual foreground node, the grayscale value of the superpixel node, and the foreground weighted difference.
  • the method further includes: taking the average of the grayscale values of the foreground superpixel nodes as the grayscale value of the virtual foreground node; and taking the average of the grayscale values of the background superpixel nodes as the grayscale value of the virtual background node.
  • the method further includes: obtaining a foreground weighted difference based on the normalized expected round-trip time matrix between two superpixel nodes, the grayscale value of the superpixel node, and the grayscale values of the foreground superpixel nodes; and obtaining a background weighted difference based on the normalized expected round-trip time matrix between two superpixel nodes, the grayscale value of the superpixel node, and the grayscale values of the background superpixel nodes.
  • determining the foreground superpixel nodes and the background superpixel nodes based on the first label set includes: arranging the grayscale values corresponding to the multiple superpixel nodes in descending order, determining a preset number of top-ranked superpixel nodes as foreground superpixel nodes, and determining the remaining superpixel nodes as background superpixel nodes.
  • performing superpixel segmentation on the to-be-processed image containing the target object includes: cropping the to-be-processed image to obtain a cropped image, and performing superpixel segmentation on the cropped image.
  • the cropped image refers to a rectangular area obtained by extending the smallest rectangle containing the target object by a preset distance.
  • the method further includes: performing image segmentation on the image to be processed based on a second label set that is the same as the first label set; or, based on a second label set different from the first label set, determining updated foreground superpixel nodes and updated background superpixel nodes in the undirected graph, constructing an updated objective function based on the updated foreground and background superpixel nodes, and segmenting the undirected graph based on the updated objective function to generate an updated label set, until the updated label set is the same as the previous label set.
  • This application converts pixel-level segmentation into superpixel-level segmentation.
  • by introducing superpixels, not only can superpixels be used to obtain precise edges, but the number of basic units for subsequent processing can also be reduced, lowering the processing time.
  • This application preprocesses the images to be segmented.
  • the cropped image is obtained by cropping out the approximate area to be segmented, which reduces the influence of a large amount of background unrelated to the target object on the subsequent segmentation, making the final segmentation result more accurate.
  • the present application is particularly suitable for segmenting images with uneven intensity, especially when the edges of the target to be segmented are not clear. Therefore, the present application can be applied to the segmentation of organs, bones, and lesions such as tumors in medical images, and the method has no special requirements on the imaging modality, so it can be applied to medical images such as CT and MRI.
  • the unsupervised image segmentation method can be unsupervised image segmentation based on the Chan-Vese model, including: cropping of the approximate region; superpixel segmentation; establishment of the undirected graph; maximum flow segmentation; and judging whether the segmentation is unchanged. If the segmentation does not change, the process ends; if the segmentation changes, the process returns to the establishment of the undirected graph.
  • the original image to be segmented is obtained, and the original image is cropped to obtain a cropped image containing the target object.
  • any image can be used as the original image, for example, the original image output by the image sensor or the processed image obtained by processing the original image.
  • for most original images to be segmented, since the target object exists in one or more local areas of the original image, a cropping operation can be used to extract the approximate area to be segmented.
  • the image obtained after cropping (called the cropped image) needs to retain the complete target object, which can reduce subsequent interference from background unrelated to the target object, thereby improving the final segmentation accuracy.
  • the original image is shown in Figure 3A.
  • the original image is a CT (Computed Tomography) slice of the human lumbar spine.
  • the target object is shown in Figure 3B (ie, the vertebral body inside the CT slice).
  • the approximate area where the target object is located is extracted for subsequent processing, such as segmenting the target object.
  • the approximate area obtained by the cropping operation is a rectangular area obtained by extending the smallest rectangle including the target object by about 10 pixels in four directions (up, down, left, and right).
  • the final cropped image (i.e., the cropped image) is shown in Figure 3C.
  • the size of the image to be processed is reduced from 512×512 to 103×78, and the proportion of background area unrelated to the target is reduced by 97%. This improves the efficiency of subsequent processing of the region where the target object is located and improves the segmentation result.
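As a concrete illustration of this cropping step, the following is a minimal Python sketch. It assumes the rough target location is available as a binary mask (`target_mask` is a hypothetical input; the patent only requires some rectangle that encloses the target object):

```python
import numpy as np

def crop_with_margin(image: np.ndarray, target_mask: np.ndarray, margin: int = 10) -> np.ndarray:
    """Extend the smallest rectangle containing the target by `margin` pixels
    in all four directions, clipped to the image borders, and crop."""
    ys, xs = np.nonzero(target_mask)
    y0 = max(int(ys.min()) - margin, 0)
    y1 = min(int(ys.max()) + margin + 1, image.shape[0])
    x0 = max(int(xs.min()) - margin, 0)
    x1 = min(int(xs.max()) + margin + 1, image.shape[1])
    return image[y0:y1, x0:x1]
```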
  • an existing Simple Linear Iterative Clustering (SLIC) segmentation algorithm can be used to perform superpixel segmentation on the cropped image, resulting in a series of superpixel sets.
  • SLIC Simple Linear Iterative Clustering
  • pixels within each superpixel set are spatially adjacent and similar in features such as color and texture. Different superpixel sets do not overlap each other. Superpixels can provide accurate segmentation edges and, compared with methods that use pixels as the basic processing unit, can reduce the number of processing units and the subsequent processing time.
  • for the SLIC algorithm, the desired number of superpixel sets (K) needs to be given.
  • each superpixel set is expected to eventually contain about 50 pixels, so K ≈ ⌈N/50⌉, where ⌈·⌉ indicates rounding up and N is the total number of pixels in the cropped image.
  • for example, if the size of the cropped image is 103×78, the value of K can be (103×78)÷50≈160.
  • each grid (or subregion) in Figure 4 represents a superpixel set. Since the number of superpixel sets finally obtained by the SLIC algorithm is not strictly equal to the expected number K (because the superpixel centers initialized by the algorithm are uniformly sampled, i.e., the distance between each initialized superpixel center and the adjacent superpixel centers is the same, and the algorithm may later involve merging of superpixels), in the subsequent processing K denotes the number of superpixel sets actually obtained through superpixel segmentation.
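A minimal sketch of this step using the SLIC implementation in scikit-image (the patent names SLIC but no particular library; `compactness` is a tuning parameter not specified in the text):

```python
import numpy as np
from skimage.segmentation import slic  # pip install scikit-image

def superpixel_labels(cropped: np.ndarray, pixels_per_superpixel: int = 50) -> np.ndarray:
    """Run SLIC with K = ceil(N / 50) on a single-channel cropped image."""
    K = int(np.ceil(cropped.size / pixels_per_superpixel))
    # channel_axis=None marks the image as single-channel (skimage >= 0.19).
    labels = slic(cropped, n_segments=K, compactness=0.1,
                  start_label=0, channel_axis=None)
    # As noted above, labels.max() + 1 may differ slightly from the requested K.
    return labels
```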
  • the present application converts pixel-level segmentation into superpixel-level segmentation. Since the boundaries between superpixels are accurate and superpixels are spatially coherent, the resulting edges are more accurate than those obtained by segmenting directly with pixels. In addition, using superpixels can significantly reduce the number of subsequent processing units and greatly reduce processing time.
  • the undirected graph is denoted G = (V, E), where V denotes the set of all nodes in the undirected graph and E denotes the set of all edges.
  • the undirected graph contains (K+2) nodes in total, the K superpixel nodes correspond to K superpixel sets respectively, and the two additional nodes are the virtual foreground node s and the virtual background node t.
  • the set of the K superpixel nodes is denoted as V_s.
  • if the superpixel set corresponding to one superpixel node is adjacent, in the superpixel segmentation result, to the superpixel set corresponding to another node, an edge connecting the two nodes (also called a first edge) can be established in the undirected graph. For example, for nodes i, j ∈ V_s, an edge is formed between nodes i and j if their corresponding superpixel sets are adjacent in the cropped image.
  • the set of edges between superpixel nodes corresponding to adjacent superpixel sets is denoted as E_1. The length l_ij of the common edge of each pair of adjacent superpixel sets is determined, where the length of the common edge is defined as the average of the numbers of adjacent pixels in the two adjacent superpixel sets on the cropped image.
  • for example, if the superpixel set corresponding to node i is adjacent to the superpixel set corresponding to node j, m pixels of the superpixel set corresponding to node i are adjacent to the superpixel set corresponding to node j, and n pixels of the superpixel set corresponding to node j are adjacent to the superpixel set corresponding to node i, then l_ij = (m + n)/2.
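The adjacency relation and the common-edge lengths l_ij can be read directly off the superpixel label map. A sketch under the definition above (4-connectivity is assumed, since the patent does not state the neighborhood system):

```python
import numpy as np
from collections import defaultdict

def common_edge_lengths(labels: np.ndarray) -> dict:
    """Return {(i, j): l_ij} for i < j, with l_ij = (m + n) / 2, where m is the
    number of pixels of superpixel i having a 4-neighbor in superpixel j and
    n is the symmetric count for superpixel j."""
    H, W = labels.shape
    touching = defaultdict(set)  # (i, j) -> pixels of i adjacent to j
    for y in range(H):
        for x in range(W):
            i = int(labels[y, x])
            for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < H and 0 <= nx < W:
                    j = int(labels[ny, nx])
                    if j != i:
                        touching[(i, j)].add((y, x))
    return {(i, j): (len(p) + len(touching[(j, i)])) / 2.0
            for (i, j), p in touching.items() if i < j}
```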
  • each node belonging to V_s is connected to the virtual foreground node s and the virtual background node t, respectively. The set of all edges connected to the virtual foreground node s (also called foreground edges) and edges connected to the virtual background node t (also called background edges) is denoted as E_2, where the virtual foreground node s and the virtual background node t are not connected to each other.
  • each superpixel node is initially connected to a virtual foreground node and a virtual background node.
  • at initialization, each edge is assigned a value; during subsequent processing (such as <maximum flow segmentation>), some edges in the set E_2 will be cut. The corresponding segmentation result is therefore obtained according to which edges are cut.
  • the weight of each edge in E_1 does not change as subsequent segmentation results change, because it only needs to be calculated once.
  • the weight reflects the similarity between adjacent superpixel nodes, and the larger the weight value, the more similar the two nodes are.
  • all superpixel nodes are initialized to reflect whether the node belongs to the foreground (eg, target object) or the background.
  • all superpixel nodes are sorted in descending order according to their grayscale values.
  • the first 30% of the nodes after sorting are initialized to belong to the foreground, that is, the node label of these nodes is set to 0, and the other superpixel nodes are initialized to belong to the background, that is, the node label of these nodes is set to 1.
  • A_0 = (a_1, a_2, …, a_K) denotes the initialization result of all superpixel nodes, and A_0 is used as the current label set of all superpixel nodes.
  • the weight of each edge in E2 reflects the similarity between the corresponding superpixel node and the virtual foreground node or virtual background node. The larger the weight value is, the more similar the corresponding superpixel node is to the virtual foreground node or the virtual background node.
  • c_i1 denotes the weighted difference between superpixel node i and the superpixel nodes currently belonging to the background, where the weights are the Gaussian function values corresponding to the expected round-trip times between nodes obtained in subsequent processing (this is also called the background weighted difference).
  • using a Markov chain, the transition matrix P_{K×K} between adjacent superpixel nodes is obtained, from which the background weighted difference c_i1 and the foreground weighted difference c_i0 are obtained. Each element of the transition matrix is p_ij = w_ij/d_i, where w_ij = G(|g_i − g_j|) and d_i denotes the degree of superpixel node i.
  • the value of d_i equals the sum of w_ij over all superpixel nodes j connected to superpixel node i.
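A sketch of the transition matrix construction. The Gaussian G(x) is taken here as exp(−x²/2σ²) with an assumed σ, since the patent does not fix its parameters:

```python
import numpy as np

def transition_matrix(gray: np.ndarray, adjacent: np.ndarray, sigma: float = 1.0):
    """gray: (K,) mean gray value per superpixel node; adjacent: (K, K) boolean
    adjacency. Returns (P, d) with p_ij = w_ij / d_i and w_ij = G(|g_i - g_j|)."""
    diff = np.abs(gray[:, None] - gray[None, :])
    w = np.exp(-(diff ** 2) / (2.0 * sigma ** 2)) * adjacent  # zero for non-neighbors
    d = w.sum(axis=1)                                         # degree d_i = sum_j w_ij
    return w / d[:, None], d
```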
  • the expected round-trip time takes into account the distance between nodes and the information of the gray value. The smaller the value of the expected round-trip time is, the more similar the two nodes are.
  • the weighted difference c_i1 and the weighted difference c_i0 can be obtained from the normalized expected time matrix T, as expressed by the following formulas:
  • c_i1 = Σ_{j: a_j=1} G(T_ij)·(g_i − g_j)²
  • c_i0 = Σ_{j: a_j=0} G(T_ij)·(g_i − g_j)²
  • the weighted difference c_i1 denotes the weighted difference between node i and the superpixel nodes currently belonging to the background; the weighted difference c_i0 denotes the weighted difference between node i and the superpixel nodes currently belonging to the foreground.
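Following the definitions above, the normalized expected round-trip time matrix can be computed as in this sketch (the row normalization is one reading of "normalized"; the patent does not spell the normalization out):

```python
import numpy as np

def round_trip_matrix(P: np.ndarray, d: np.ndarray) -> np.ndarray:
    """T_ij = E_i(T_j) + E_j(T_i), with E_i(T_j) = (Z_jj - Z_ij) / pi_j and
    Z = (I - P + W)^(-1), where every row of W equals pi and pi_i = d_i / sum(d)."""
    K = P.shape[0]
    pi = d / d.sum()                              # stationary distribution
    W = np.tile(pi, (K, 1))                       # rows all equal to pi
    Z = np.linalg.inv(np.eye(K) - P + W)          # intermediate (fundamental) matrix
    E = (np.diag(Z)[None, :] - Z) / pi[None, :]   # E[i, j] = (Z_jj - Z_ij) / pi_j
    T = E + E.T                                   # round-trip time; diagonal is 0
    return T / T.sum(axis=1, keepdims=True)       # row-wise normalization (assumed)
```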
  • the minimization objective function is set as follows:
  • E(A) = Σ_{i∈V_s} [a_i·e_si + (1 − a_i)·e_it] + μ·Σ_{(i,j)∈E_1} e_ij·|a_i − a_j|
  • where μ is a weight factor; v_i denotes superpixel node i and v_j denotes superpixel node j, where v_i and i can be used interchangeably, and v_j and j can be used interchangeably.
  • the value of μ ranges from 0.001 to 0.1; for example, μ is 0.01, which is used to adjust the weight relationship between the first two terms and the last term in the objective function.
  • the minimum of the above objective function can be solved by cutting the undirected graph using the existing maximum flow segmentation.
  • the foreground part and the background part are obtained by segmenting the undirected graph from <the establishment of the undirected graph> by maximum flow segmentation, wherein the superpixel nodes in the foreground part are connected to the virtual foreground node s, the superpixel nodes in the background part are connected to the virtual background node t, and the foreground part and the background part are not connected to each other.
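A single graph cut can be carried out with an off-the-shelf max-flow library. The sketch below uses PyMaxflow, which is an assumption: the patent only says "the existing maximum flow segmentation" without naming an implementation:

```python
import numpy as np
import maxflow  # pip install PyMaxflow

def cut_graph(e_s: np.ndarray, e_t: np.ndarray, edges: dict, mu: float = 0.01) -> np.ndarray:
    """e_s[i]: weight of the foreground edge of node i; e_t[i]: weight of the
    background edge; edges: {(i, j): e_ij} over E_1. Returns labels a_i
    (0 = foreground / source side, 1 = background / sink side)."""
    K = len(e_s)
    g = maxflow.Graph[float]()
    nodes = g.add_nodes(K)
    for i in range(K):
        g.add_tedge(nodes[i], e_s[i], e_t[i])            # t-links to s and t
    for (i, j), w in edges.items():
        g.add_edge(nodes[i], nodes[j], mu * w, mu * w)   # first edges, scaled by mu
    g.maxflow()
    # get_segment returns 0 for the source (s) side and 1 for the sink (t) side
    return np.array([g.get_segment(nodes[i]) for i in range(K)])
```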
  • A_1 is used to denote the updated label set composed of all a_i.
  • FIG. 6A is the foreground part connected to the virtual foreground node s
  • FIG. 6B is the background part connected to the virtual background node t.
  • obtaining the final segmentation result includes: according to the label of each superpixel node in the updated segmentation result A_1, marking all pixels in the superpixel set of the corresponding superpixel node with that node's label a_i, to obtain the final pixel-level segmentation result.
  • the finally obtained segmentation result is shown in FIG. 7 , and the white area surrounded by the black outline in the figure is the final segmentation result.
  • the original image is cropped, and the cropped image is subjected to superpixel segmentation.
  • according to the superpixel segmentation result, an undirected graph is established, and the energy terms of the Chan-Vese model are represented by the weights of the edges.
  • by using a Markov chain, the relationship between the distance between superpixels and their gray levels can be considered simultaneously in the process of assigning edge weights.
  • the corresponding segmentation results can be obtained by repeatedly using the maximum flow segmentation and updating the weights of the edges.
  • the present application can automatically and accurately segment images in the medical field lacking labeled data, thereby providing accurate information for segmenting objects for subsequent medical research.
  • the present application can accurately segment the desired target in a shorter time and with less manual intervention.
  • the present application also proposes an electronic device including a processor and a memory.
  • the memory is used to store instructions executable by the processor.
  • the processor is configured to perform the unsupervised image segmentation method as described above.
  • the present application also proposes a storage medium on which a computer program is stored.
  • the processor is adapted to perform the unsupervised image segmentation method as described above.
  • Fig. 8 is a block diagram of an electronic device according to an exemplary embodiment.
  • device 900 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, fitness device, personal digital assistant, and the like.
  • the device 900 may include one or more of the following components: a processing component 902, a memory 904, a power supply component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, And the communication component 916 .
  • the processing component 902 generally controls the overall operation of the device 900, such as operations associated with display, phone calls, data communications, camera operations, and recording operations.
  • the processing component 902 may include one or more processors 920 to execute instructions to perform all or some of the steps of the methods described above. Additionally, processing component 902 may include one or more modules to facilitate interaction between processing component 902 and other components. For example, processing component 902 may include a multimedia module to facilitate interaction between multimedia component 908 and processing component 902.
  • Memory 904 is configured to store various types of data to support operation at device 900 . Examples of such data include instructions for any application or method operating on device 900, contact data, phonebook data, messages, pictures, videos, and the like. Memory 904 may be implemented by any type of volatile or non-volatile storage device or combination thereof, such as static random access memory (SRAM), electrically erasable programmable read only memory (EEPROM), erasable Programmable Read Only Memory (EPROM), Programmable Read Only Memory (PROM), Read Only Memory (ROM), Magnetic Memory, Flash Memory, Magnetic or Optical Disk.
  • SRAM static random access memory
  • EEPROM electrically erasable programmable read only memory
  • EPROM erasable Programmable Read Only Memory
  • PROM Programmable Read Only Memory
  • ROM Read Only Memory
  • Magnetic Memory
  • Flash Memory
  • Magnetic or Optical Disk
  • Power supply assembly 906 provides power to various components of device 900 .
  • Power components 906 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power to device 900 .
  • Multimedia component 908 includes a screen that provides an output interface between the device 900 and the user.
  • the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user.
  • the touch panel includes one or more touch sensors to sense touch, swipe, and gestures on the touch panel. The touch sensor may not only sense the boundaries of a touch or swipe action, but also detect the duration and pressure associated with the touch or swipe action.
  • the multimedia component 908 includes a front-facing camera and/or a rear-facing camera. When the device 900 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front and rear cameras can be a fixed optical lens system or have focal length and optical zoom capability.
  • Audio component 910 is configured to output and/or input audio signals.
  • audio component 910 includes a microphone (MIC) that is configured to receive external audio signals when device 900 is in operating modes, such as call mode, recording mode, and voice recognition mode. The received audio signal may be further stored in memory 904 or transmitted via communication component 916 .
  • audio component 910 also includes a speaker for outputting audio signals.
  • the I/O interface 912 provides an interface between the processing component 902 and a peripheral interface module, which may be a keyboard, a click wheel, a button, or the like. These buttons may include, but are not limited to: home button, volume buttons, start button, and lock button.
  • Sensor assembly 914 includes one or more sensors for providing status assessments of various aspects of device 900 .
  • the sensor assembly 914 can detect the open/closed state of the device 900, the relative positioning of components, such as the display and keypad of the device 900, and the sensor assembly 914 can also detect a change in the position of the device 900 or a component of the device 900 , the presence or absence of user contact with the device 900 , the orientation or acceleration/deceleration of the device 900 and the temperature change of the device 900 .
  • Sensor assembly 914 may also include a proximity sensor configured to detect the presence of nearby objects in the absence of any physical contact.
  • Sensor assembly 914 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications.
  • the sensor assembly 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
  • Communication component 916 is configured to facilitate wired or wireless communication between device 900 and other devices.
  • Device 900 may access wireless networks based on communication standards, such as WiFi, 2G or 3G, 4G or 5G, or a combination thereof.
  • the communication component 916 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel.
  • the communication component 916 also includes a near field communication (NFC) module to facilitate short-range communication.
  • the NFC module may be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.
  • RFID radio frequency identification
  • IrDA infrared data association
  • UWB ultra-wideband
  • Bluetooth Bluetooth
  • device 900 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
  • ASICs application specific integrated circuits
  • DSPs digital signal processors
  • DSPDs digital signal processing devices
  • PLDs programmable logic devices
  • FPGA field programmable gate array
  • controller, microcontroller, microprocessor or other electronic components for performing the above method
  • non-transitory computer-readable storage medium including instructions, such as memory 904 including instructions, executable by processor 920 of device 900 to perform the method described above.
  • the non-transitory computer-readable storage medium may be ROM, random access memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The present invention proposes an unsupervised image segmentation method, belonging to the field of computer application technology and precision medicine. Superpixel segmentation is performed on an original image, and an undirected graph is generated. Based on a first label set, foreground superpixel nodes and background superpixel nodes are determined. Based on the foreground superpixel nodes and the background superpixel nodes, a minimization objective function is generated. The undirected graph is segmented based on the minimization objective function to obtain a foreground part and a background part, and a second label set is generated. Image segmentation is performed on the to-be-processed image based on the comparison result between the first label set and the second label set. The present disclosure can automatically and accurately segment images in the absence of labeled data and obtain more accurate image segmentation results. Compared with other unsupervised methods, the method of the present disclosure can accurately segment the desired target in a shorter time and with less manual intervention.

Description

UNSUPERVISED IMAGE SEGMENTATION METHOD, ELECTRONIC DEVICE AND STORAGE MEDIUM
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is based on and claims priority to Chinese Patent Application No. 202011353104.2, filed on November 27, 2020, the entire contents of which are incorporated herein by reference.
TECHNICAL FIELD
The present invention belongs to the field of computer application technology and image segmentation, and particularly relates to an unsupervised image segmentation method, an electronic device, and a storage medium.
BACKGROUND
Image segmentation belongs to the field of computer vision and is widely used in fields such as autonomous driving, face recognition, and image recognition. With the emergence and rapid development of precision medicine, obtaining segmentation results of medical targets such as tumors quickly and accurately helps surgical operators carry out pre-operative planning, intra-operative guidance, and post-operative evaluation. In general, accurate image segmentation can divide an image into multiple regions, where each region has a uniform color (or texture) and the boundaries between regions are simple and accurate.
SUMMARY
The present disclosure proposes an unsupervised image segmentation method.
The present application proposes an unsupervised image segmentation method, including: performing superpixel segmentation on a to-be-processed image containing a target object to obtain multiple superpixel sets, each superpixel set corresponding to its own superpixel node; and generating an undirected graph according to the superpixel nodes. The undirected graph includes a first edge connecting two adjacent superpixel nodes, a foreground edge connecting each superpixel node and a virtual foreground node, and a background edge connecting each superpixel node and a virtual background node. The method further includes: determining foreground superpixel nodes and background superpixel nodes in the undirected graph based on a first label set corresponding to the multiple superpixel nodes, where a foreground superpixel node refers to a superpixel node belonging to the foreground and a background superpixel node refers to a superpixel node belonging to the background; generating a minimization objective function based on the foreground superpixel nodes and the background superpixel nodes; segmenting the undirected graph based on the minimization objective function to obtain a foreground part and a background part, and generating a second label set; and performing image segmentation on the to-be-processed image based on a comparison result between the first label set and the second label set.
In some examples, constructing the minimization objective function includes: constructing the minimization objective function based on the weight of the first edge, the weight of the foreground edge, and the weight of the background edge of each superpixel node.
In some examples, the method further includes: determining the weight of the first edge based on the grayscale values of the superpixel nodes.
In some examples, the method further includes: determining the weight of the foreground edge based on the grayscale value of the virtual background node, the grayscale value of the superpixel node, and the background weighted difference; and determining the weight of the background edge based on the grayscale value of the virtual foreground node, the grayscale value of the superpixel node, and the foreground weighted difference.
In some examples, the method further includes: taking the average of the grayscale values of the foreground superpixel nodes as the grayscale value of the virtual foreground node; and taking the average of the grayscale values of the background superpixel nodes as the grayscale value of the virtual background node.
In some examples, the method further includes: obtaining the foreground weighted difference based on the normalized expected round-trip time matrix between two superpixel nodes, the grayscale value of the superpixel node, and the grayscale values of the foreground superpixel nodes; and obtaining the background weighted difference based on the normalized expected round-trip time matrix between two superpixel nodes, the grayscale value of the superpixel node, and the grayscale values of the background superpixel nodes.
In some examples, determining the foreground superpixel nodes and the background superpixel nodes based on the first label set includes: arranging the grayscale values corresponding to the multiple superpixel nodes in descending order, determining a preset number of top-ranked superpixel nodes as the foreground superpixel nodes, and determining the remaining superpixel nodes as the background superpixel nodes.
In some examples, performing superpixel segmentation on the to-be-processed image containing the target object includes: cropping the to-be-processed image to obtain a cropped image, and performing superpixel segmentation on the cropped image. The cropped image refers to a rectangular area obtained by extending the smallest rectangle containing the target object outward by a preset distance.
In some examples, the method further includes: performing image segmentation on the to-be-processed image based on a second label set that is the same as the first label set; or, based on a second label set different from the first label set, determining updated foreground superpixel nodes and updated background superpixel nodes in the undirected graph; constructing an updated objective function based on the updated foreground superpixel nodes and the updated background superpixel nodes; and segmenting the undirected graph based on the updated objective function to generate an updated label set, until the updated label set is the same as the previous label set.
In some examples, for a superpixel node i, the weight e_ij of the first edge is represented by the following equation:
e_ij = G(|g_i − g_j|) + l_ij
where i denotes the superpixel node i; j denotes another superpixel node j connected to the superpixel node i; G(x) denotes the Gaussian function; l_ij denotes the length of the common edge between the superpixel sets corresponding to superpixel nodes i and j; g_i denotes the grayscale value of superpixel node i; and g_j denotes the grayscale value of superpixel node j.
In some examples, for a superpixel node i, the weight e_si of the foreground edge is represented by the following equation:
e_si = (g_i − g_t)² + c_i1
where i denotes the superpixel node i; g_i denotes the grayscale value of superpixel node i; g_t denotes the grayscale value of the virtual background node; and c_i1 is the background weighted difference between the superpixel node i and the background superpixel nodes.
For the superpixel node i, the weight e_it of the background edge is represented by the following equation:
e_it = (g_i − g_s)² + c_i0
where g_s denotes the grayscale value of the virtual foreground node, and c_i0 denotes the foreground weighted difference between the superpixel node i and the foreground superpixel nodes.
In some examples, for a superpixel node i, the foreground weighted difference c_i0 is represented by the following equation:
c_i0 = Σ_{j: a_j=0} G(T_ij)·(g_i − g_j)²
where a_j = 0 indicates that the superpixel node j belongs to the foreground (i.e., is a foreground superpixel node); G(x) denotes the Gaussian function; T_ij denotes the element of the normalized expected time matrix T between superpixel nodes i and j; g_i denotes the grayscale value of superpixel node i; and g_j denotes the grayscale value of superpixel node j.
For the superpixel node i, the background weighted difference c_i1 is represented by the following equation:
c_i1 = Σ_{j: a_j=1} G(T_ij)·(g_i − g_j)²
where a_j = 1 indicates that the superpixel node j belongs to the background (i.e., is a background superpixel node); G(x) denotes the Gaussian function; T_ij denotes the element of the normalized expected time matrix T between superpixel nodes i and j; g_i denotes the grayscale value of superpixel node i; and g_j denotes the grayscale value of superpixel node j.
In some examples, the elements T_ij of the normalized expected round-trip time matrix T are represented by the following equation:
T_ij = E_i(T_j) + E_j(T_i)
where E_i(T_j) denotes the expected time from superpixel node i to superpixel node j when i and j are different, E_i(T_j) = (Z_jj − Z_ij)/π_j; E_i(T_i) denotes the expected time when i and j are the same, E_i(T_i) = 1/π_i; Z_jj and Z_ij are elements of the intermediate matrix Z, Z = (I − P + W)⁻¹; I is the identity matrix; P is the transition matrix between superpixel nodes i and j; W is a constructed matrix in which every row vector equals π; π_i and π_j are elements used to construct the matrix W, π_i = d_i/Σ_j d_j; d_i denotes the degree of superpixel node i, and d_j denotes the degree of superpixel node j.
In some examples, for a superpixel node i, the minimization objective function is represented by the following equation:
E(A) = Σ_{i∈V_s} [a_i·e_si + (1 − a_i)·e_it] + μ·Σ_{(i,j)∈E_1} e_ij·|a_i − a_j|
where a_i denotes the label of superpixel node i and a_j denotes the label of superpixel node j, with a_i = 1 indicating that superpixel node i belongs to the background (a background superpixel node) and a_i = 0 indicating that superpixel node i belongs to the foreground (a foreground superpixel node); e_si denotes the weight of the foreground edge of superpixel node i; e_it denotes the weight of the background edge of superpixel node i; e_ij denotes the weight of the first edge connecting superpixel node i and superpixel node j; V_s denotes the set of superpixel nodes; and μ denotes a weight factor.
The present application also proposes an electronic device, including a processor and a memory. The memory is used to store instructions executable by the processor. When the instructions are executed by the processor, the processor is configured to perform the unsupervised image segmentation method described above.
The present application also proposes a storage medium on which a computer program is stored. When the computer program is executed by a processor, the processor is configured to perform the unsupervised image segmentation method described above.
The technical solutions provided by the embodiments of the present disclosure may include the following beneficial effects:
The present disclosure can automatically and accurately segment images in the absence of labeled data and obtain more accurate image segmentation results. Compared with other unsupervised methods, the method of the present disclosure can accurately segment the desired target in a shorter time and with less manual intervention.
It should be understood that the above general description and the following detailed description are only exemplary and explanatory, and do not limit the present disclosure.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a flowchart of an unsupervised image segmentation method according to an embodiment of the present disclosure.
FIG. 2 is a process flow diagram of an image segmentation method according to an embodiment of the present disclosure.
FIG. 3A is a schematic diagram of an original image according to an embodiment of the present disclosure.
FIG. 3B is a schematic diagram of a minimum rectangular area including a target object according to an embodiment of the present disclosure.
FIG. 3C is an example diagram of a cropped image according to an embodiment of the present disclosure.
FIG. 4 is a schematic diagram of a superpixel segmentation result according to an embodiment of the present disclosure.
FIG. 5 is a schematic diagram of an undirected graph according to an embodiment of the present disclosure.
FIG. 6A is a schematic diagram of a foreground portion obtained by segmenting an undirected graph according to an embodiment of the present disclosure.
FIG. 6B is a schematic diagram of a background portion obtained by segmenting an undirected graph according to an embodiment of the present disclosure.
FIG. 7 is a schematic diagram of a finally obtained segmentation result according to an embodiment of the present disclosure.
FIG. 8 is a block diagram of an electronic device according to an embodiment of the present disclosure.
DETAILED DESCRIPTION
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the present disclosure are included to facilitate understanding; they should be considered merely exemplary. Therefore, those of ordinary skill in the art should recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted from the following description for clarity and conciseness.
At present, image segmentation is challenging in the medical field for at least the following three reasons:
(1) The objects to be segmented in medical images are not as clear as those in natural images, because most medical imaging techniques are computed imaging and are therefore indirect image representations. For example, CT uses a series of X-rays followed by complex mathematical calculations to obtain the final image representation. Therefore, compared with natural images that contain three channels of image information, most medical images are single-channel, so medical images themselves provide very little information, which is not conducive to subsequent accurate segmentation. In addition, in medical images the contrast between the target tissue and the background is small, especially when the noise near the boundary is large, making it challenging even for an experienced physician to perform the segmentation manually.
(2) In the last decade, with the popularity of deep learning and its excellent performance on many tasks, many researchers have devoted themselves to research on deep segmentation algorithms for medical images. For example, the convolutional network U-Net, which consists of an encoder and a corresponding decoder, has shown very good performance on many biomedical images. However, the current success of deep segmentation networks is built on having a large amount of labeled training data, and such training data generally need to be annotated by very experienced professionals (such as physicians), which raises the time cost. Many medical subfields therefore lack sufficient training data, so deep segmentation networks cannot yet be widely applied in the medical field.
(3) The medical field also has some very special properties. On the one hand, much of the data involves ethics and other issues, which makes the data itself inconvenient to label and even harder to process with deep networks that require large amounts of labeled data. On the other hand, even if a deep segmentation network can obtain a good segmentation, deep networks are generally end-to-end and their results are relatively fixed, so users cannot manually improve or adjust the segmentation results. In addition, since current deep networks are almost entirely uninterpretable, their use in some special fields is also limited.
Because it is difficult to obtain enough supervised samples in the medical field, and because of the complexity of medical imaging, many current supervised segmentation methods have difficulty making significant progress on medical images, which cannot meet today's demand for precision medicine. In addition, even with a large amount of labeled data, deep learning methods may produce incorrect segmentations for some images, which is unacceptable in practical applications because the results cannot be corrected.
On the other hand, unsupervised learning does not require any annotated data; it automatically analyzes a large amount of data to discover the regularities within it. In the unsupervised case, image segmentation is used to predict more general labels, such as "foreground" and "background", especially in the absence of any prior information. There are many well-known unsupervised segmentation techniques, including normalized cuts, Markov random field-based methods, mean shift, and active contour methods, which aim to automatically segment the object instances in each image without any supervisory information. However, because the distribution of objects in medical images is very uneven, many unsupervised methods are time-consuming and require a lot of post-processing to achieve good results, which generally requires more manual intervention and higher cost.
The active contour model is a widely used unsupervised segmentation method that obtains the final segmentation result by iteratively solving for the energy minimum. The Chan-Vese model is a region-based active contour model and an instance of the Mumford-Shah model; since it does not depend on the gradient information of object boundaries, it is widely used in the field of medical images, especially for the segmentation of the brain, heart, lungs, and blood vessels. Solving for the energy minimum of the Chan-Vese model can be made simpler by using a level set function, but this also requires keeping the level set function a signed distance function at all times during the solving process. To make the model perform better in practical applications, the DRLSE model proposes a distance regularization method that makes it unnecessary to re-initialize the level set function during iteration. The LBF model introduces a local binary energy term to obtain more accurate segmentation by extracting local information of the image. However, none of these methods can well balance the relationship between the distance between pixels and the gray level, so they are not suitable for segmenting targets with uneven intensity; moreover, they use gradient descent flow to solve iteratively, which takes a long time and cannot meet the demands of segmenting large amounts of data.
In view of this, the present disclosure proposes an unsupervised image segmentation method, apparatus, electronic device, and storage medium. The above method, apparatus, electronic device, and storage medium are described in further detail below with reference to the accompanying drawings.
FIG. 1 shows a flowchart of an unsupervised image segmentation method according to an embodiment of the present disclosure. The unsupervised image segmentation method may be implemented by a mobile terminal, such as a smartphone, a tablet computer, a laptop computer, or a wearable device. In some examples, the execution subject of the above unsupervised image segmentation method is a processor.
As shown in FIG. 1, the unsupervised image segmentation method includes the following steps.
In step S110, the processor performs superpixel segmentation on the to-be-processed image containing the target object to obtain multiple superpixel sets, each superpixel set corresponding to its own superpixel node.
In step S120, the processor generates an undirected graph according to the superpixel nodes. The undirected graph includes a first edge connecting two adjacent superpixel nodes, a foreground edge connecting each superpixel node and a virtual foreground node, and a background edge connecting each superpixel node and a virtual background node.
In step S130, the processor determines the foreground superpixel nodes and the background superpixel nodes in the undirected graph based on the first label set corresponding to the multiple superpixel nodes. A foreground superpixel node refers to a superpixel node belonging to the foreground, and a background superpixel node refers to a superpixel node belonging to the background.
In step S140, the processor generates a minimization objective function based on the foreground superpixel nodes and the background superpixel nodes.
In step S150, the processor segments the undirected graph based on the minimization objective function to obtain a foreground part and a background part, and generates a second label set.
In step S160, the processor performs image segmentation on the to-be-processed image based on the comparison result between the first label set and the second label set.
In some examples, constructing the minimization objective function includes: constructing the minimization objective function based on the weight of the first edge, the weight of the foreground edge, and the weight of the background edge of each superpixel node.
In some examples, the method further includes: determining the weight of the first edge based on the grayscale values of the superpixel nodes.
In some examples, the method further includes: determining the weight of the foreground edge based on the grayscale value of the virtual background node, the grayscale value of the superpixel node, and the background weighted difference; and determining the weight of the background edge based on the grayscale value of the virtual foreground node, the grayscale value of the superpixel node, and the foreground weighted difference.
In some examples, the method further includes: taking the average of the grayscale values of the foreground superpixel nodes as the grayscale value of the virtual foreground node; and taking the average of the grayscale values of the background superpixel nodes as the grayscale value of the virtual background node.
In some examples, the method further includes: obtaining the foreground weighted difference based on the normalized expected round-trip time matrix between two superpixel nodes, the grayscale value of the superpixel node, and the grayscale values of the foreground superpixel nodes; and obtaining the background weighted difference based on the normalized expected round-trip time matrix between two superpixel nodes, the grayscale value of the superpixel node, and the grayscale values of the background superpixel nodes.
In some examples, determining the foreground superpixel nodes and the background superpixel nodes based on the first label set includes: arranging the grayscale values corresponding to the multiple superpixel nodes in descending order, determining a preset number of top-ranked superpixel nodes as the foreground superpixel nodes, and determining the remaining superpixel nodes as the background superpixel nodes.
In some examples, performing superpixel segmentation on the to-be-processed image containing the target object includes: cropping the to-be-processed image to obtain a cropped image, and performing superpixel segmentation on the cropped image. The cropped image refers to a rectangular area obtained by extending the smallest rectangle containing the target object outward by a preset distance.
In some examples, the method further includes: performing image segmentation on the to-be-processed image based on a second label set that is the same as the first label set; or, based on a second label set different from the first label set, determining updated foreground superpixel nodes and updated background superpixel nodes in the undirected graph; constructing an updated objective function based on the updated foreground superpixel nodes and the updated background superpixel nodes; and segmenting the undirected graph based on the updated objective function to generate an updated label set, until the updated label set is the same as the previous label set.
The present application improves on the Chan-Vese model and, compared with other unsupervised segmentation methods, mainly differs in the following points:
(1) The present application converts pixel-level segmentation into superpixel-level segmentation. By introducing superpixels, not only can superpixels be used to obtain precise edges, but the number of basic units for subsequent processing can also be reduced, lowering the processing time.
(2) The present application converts the energy terms of the Chan-Vese model into the edge weights of an undirected graph, so that the energy function of the Chan-Vese model can be solved directly by the max-flow algorithm and the final segmentation result can be obtained in only a few iterations, whereas other existing methods that solve iteratively with gradient descent flow require a very large number of iterations.
(3) The present application preprocesses the image to be segmented. The approximate region to be segmented is cropped out to obtain a cropped image, which reduces the influence of a large amount of background unrelated to the target object on the subsequent segmentation and makes the final segmentation result more accurate.
The present application uses superpixels as the segmentation units, assigns values to the edges of the undirected graph through a Markov chain based on the distance between superpixels and the similarity of their gray values, and finally solves the energy function of the Chan-Vese model through the max-flow algorithm to obtain the segmentation result; the method is simple, intuitive, and efficient.
The present application is particularly suitable for segmenting images with uneven intensity, especially when the edges of the target to be segmented are not clear. Therefore, the present application can be applied to the segmentation of organs, bones, and lesions such as tumors in medical images, and the method has no special requirements on the imaging modality of the images, so it can be applied to medical images such as CT and MRI.
As shown in FIG. 2, the unsupervised image segmentation method may be unsupervised image segmentation based on the Chan-Vese model, including: cropping of the approximate region; superpixel segmentation; establishment of the undirected graph; max-flow segmentation; and judging whether the segmentation is unchanged. If the segmentation is unchanged, the process ends; if the segmentation changes, the process returns to the establishment of the undirected graph.
The method flow in FIG. 2 is described in detail below.
<Cropping of the approximate region>
The original image to be segmented is obtained, and the original image is cropped to obtain a cropped image containing the target object.
It should be noted that, in the present application, there is no restriction on the original image. That is, any image can be used as the original image, for example, an initial image output by an image sensor or a processed image obtained by processing the initial image.
For most original images to be segmented, since the target object exists in one or more local areas of the original image, a cropping operation can be used to extract the approximate region to be segmented. The image obtained after cropping (called the cropped image) needs to retain the complete target object, which can reduce subsequent interference from the background unrelated to the target object, thereby improving the final segmentation accuracy.
For example, the original image is shown in FIG. 3A. The original image is a CT (Computed Tomography) slice of the human lumbar spine. The target object is shown in FIG. 3B (i.e., the vertebral body inside the CT slice). Through the cropping operation, the approximate region where the target object is located is extracted for subsequent processing, such as segmenting the target object.
In the present application, the approximate region obtained by the cropping operation is a rectangular area obtained by extending the smallest rectangle containing the target object by about 10 pixels in each of the four directions (up, down, left, and right). For example, the final cropped image (i.e., the cropped image) is shown in FIG. 3C.
After the approximate region is extracted, the size of the image to be processed is reduced from 512×512 to 103×78, and the proportion of background area unrelated to the target is reduced by 97%. This improves the efficiency of the subsequent processing of the region where the target object is located and improves the segmentation result.
<Superpixel segmentation>
Superpixel segmentation is performed on the cropped image to obtain the superpixel segmentation result corresponding to the cropped image.
In some examples, the existing Simple Linear Iterative Clustering (SLIC) segmentation algorithm can be used to perform superpixel segmentation on the cropped image, resulting in a series of superpixel sets.
The pixels within each superpixel set are spatially adjacent and similar in features such as color and texture. Different superpixel sets do not overlap each other. Superpixels can provide accurate segmentation edges and, compared with methods that use pixels as the basic processing unit, can reduce the number of processing units and the subsequent processing time.
For the SLIC algorithm, the desired number of superpixel sets (K) needs to be given. In the present disclosure, each superpixel set is expected to eventually contain about 50 pixels, so K ≈ ⌈N/50⌉, where ⌈·⌉ indicates rounding up and N is the total number of pixels in the cropped image. For example, if the size of the cropped image is 103×78, the value of K can be (103×78)÷50≈160.
Through the SLIC algorithm, the present application can obtain the superpixel segmentation result shown in FIG. 4. Each grid (or subregion) in FIG. 4 represents a superpixel set. Since the number of superpixel sets finally obtained by the SLIC algorithm is not strictly equal to the expected number K (because the superpixel centers initialized by the algorithm are uniformly sampled, that is, the distance between each initialized superpixel center and the adjacent superpixel centers is the same, and the algorithm may later involve merging of superpixels), in the subsequent processing K denotes the number of superpixel sets actually obtained through superpixel segmentation.
The present application converts pixel-level segmentation into superpixel-level segmentation. Since the boundaries between superpixels are accurate and superpixels are spatially coherent, the resulting edges are more accurate than those obtained by segmenting directly with pixels. In addition, using superpixels can also significantly reduce the number of subsequent processing units and greatly reduce the processing time.
<Establishment of the undirected graph>
(1) An undirected graph is established according to the superpixel segmentation result.
In the present application, the undirected graph is denoted by G = (V, E), where V denotes the set of all nodes in the undirected graph and E denotes the set of all edges in the undirected graph. A schematic diagram of the undirected graph established by the present application is shown in FIG. 5.
The undirected graph contains (K+2) nodes in total: the K superpixel nodes correspond to the K superpixel sets, and the two additional nodes are the virtual foreground node s and the virtual background node t. The set of the K superpixel nodes is denoted by V_s.
If the superpixel set corresponding to one superpixel node is adjacent, in the superpixel segmentation result, to the superpixel set corresponding to another node, an edge connecting the two nodes (also called a first edge) can be established in the undirected graph. For example, for nodes i, j ∈ V_s, if their corresponding superpixel sets are adjacent in the cropped image, an edge is formed between nodes i and j. The set of edges between the superpixel nodes corresponding to all pairs of adjacent superpixel sets is denoted by E_1. The length l_ij of the common edge of each pair of adjacent superpixel sets is determined, where the length of the common edge is defined as the average of the numbers of adjacent pixels in the two adjacent superpixel sets on the cropped image. For example, if the superpixel set corresponding to node i is adjacent to the superpixel set corresponding to node j, m pixels of the superpixel set corresponding to node i are adjacent to the superpixel set corresponding to node j, and n pixels of the superpixel set corresponding to node j are adjacent to the superpixel set corresponding to node i, then l_ij = (m+n)/2.
Each node belonging to V_s is connected to the virtual foreground node s and the virtual background node t, respectively. The set of all edges connected to the virtual foreground node s (also called foreground edges) and edges connected to the virtual background node t (also called background edges) is denoted by E_2, where the virtual foreground node s and the virtual background node t are not connected to each other.
In the present application, each superpixel node is initially connected to both the virtual foreground node and the virtual background node. At initialization, every edge is assigned a value; in subsequent processing (for example, <Max-flow segmentation>), some edges in the set E_2 will be cut, and the corresponding segmentation result is obtained according to which edges are cut.
(2) Each edge is assigned a value according to the relationship between the nodes in the undirected graph, including the following steps.
(2-1) For each edge belonging to E_1, the weight of the edge (also called the weight of the first edge) is determined.
In the present application, for example, the weight of each edge belonging to E_1 is: e_ij = G(|g_i − g_j|) + l_ij, where i and j denote the two superpixel nodes connected by the edge, G(x) denotes the Gaussian function, l_ij denotes the length over which nodes i and j are adjacent (i.e., the length of their common edge), and g_i denotes the grayscale value of superpixel node i; for example, the value of g_i equals the average of the grayscale values of all pixels in the superpixel set corresponding to node i.
The weight of each edge in E_1 does not change as the subsequent segmentation results change, because it only needs to be calculated once. The weight reflects the similarity between adjacent superpixel nodes: the larger the weight, the more similar the two nodes.
(2-2) Whether each superpixel node belongs to the foreground or the background is determined.
In the present application, all superpixel nodes are initialized to reflect whether each node belongs to the foreground (e.g., the target object) or the background. In some examples, the label of a superpixel node is used to indicate whether the superpixel node belongs to the foreground or the background. For example, when the label a_i of superpixel node i is 0, the superpixel node i belongs to the foreground (also called a foreground superpixel node); when the label a_i of superpixel node i is 1, the superpixel node i belongs to the background (also called a background superpixel node).
In the present application, all superpixel nodes are sorted in descending order of their grayscale values. The top 30% of the sorted nodes are initialized as belonging to the foreground, that is, the node labels of these nodes are set to 0; all other superpixel nodes are initialized as belonging to the background, that is, the node labels of these nodes are set to 1. A_0 = (a_1, a_2, …, a_K) denotes the initialization result of all superpixel nodes, and A_0 is used as the current label set of all superpixel nodes.
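A sketch of this initialization; the 30% ratio follows the text, and `gray` (the per-node mean gray value) is a hypothetical array name:

```python
import numpy as np

def initialize_labels(gray: np.ndarray, foreground_fraction: float = 0.30) -> np.ndarray:
    """A_0: the top 30% of superpixel nodes by gray value get label 0 (foreground),
    the rest get label 1 (background)."""
    K = gray.shape[0]
    a = np.ones(K, dtype=int)                      # default: background, a_i = 1
    n_fg = max(1, int(round(K * foreground_fraction)))
    brightest = np.argsort(gray)[::-1][:n_fg]      # descending sort by gray value
    a[brightest] = 0                               # foreground, a_i = 0
    return a
```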
(2-3) For each edge belonging to E_2, the weight of the edge is determined.
In the present application, the weight of each edge in E_2 reflects the similarity between the corresponding superpixel node and the virtual foreground node or the virtual background node. The larger the weight, the more similar the corresponding superpixel node is to the virtual foreground node or the virtual background node.
In some examples, for each edge connected to the virtual foreground node s, the current weight of the edge (also called the weight of the foreground edge) is: e_si = (g_i − g_t)² + c_i1, where the subscript i denotes superpixel node i; g_i denotes the grayscale value of superpixel node i, for example, the value of g_i equals the average of the grayscale values of all pixels in the superpixel set corresponding to node i; g_t denotes the current grayscale value of the virtual background node, for example, the value of g_t equals the average of the grayscale values of all superpixel nodes currently belonging to the background (i.e., all superpixel nodes with a_i = 1); and c_i1 denotes the weighted difference between superpixel node i and the superpixel nodes currently belonging to the background, where the weights are the Gaussian function values corresponding to the expected round-trip times between nodes obtained in subsequent processing (e.g., step (2-4)); this is also called the background weighted difference.
In some examples, for each edge connected to the virtual background node t, the current weight of the edge (also called the weight of the background edge) is: e_it = (g_i − g_s)² + c_i0, where g_s denotes the current grayscale value of the virtual foreground node, for example, the value of g_s equals the average of the grayscale values of all superpixel nodes currently belonging to the foreground (i.e., all superpixel nodes with a_i = 0); and c_i0 denotes the weighted difference between superpixel node i and the superpixel nodes currently belonging to the foreground (also called the foreground weighted difference).
(2-4) For each superpixel node belonging to V_s, the corresponding weighted differences are obtained.
For example, for each superpixel node belonging to V_s, a Markov chain is used to obtain the transition matrix P_{K×K} between adjacent superpixel nodes, from which the background weighted difference c_i1 and the foreground weighted difference c_i0 are obtained. Each element p_ij of the transition matrix P_{K×K} is: p_ij = w_ij/d_i, where w_ij = G(|g_i − g_j|) and d_i denotes the degree of superpixel node i; for example, the value of d_i equals the sum of w_ij over all superpixel nodes j connected to superpixel node i.
According to the properties of Markov chains, the Markov chain converges to the distribution π_i = d_i/Σ_j d_j, where π_i denotes the stationary distribution probability of superpixel node i. Based on the matrix Z = (I − P + W)⁻¹, the expected round-trip times between superpixel nodes can be obtained, where every row vector of W_{K×K} equals π and I_{K×K} is the identity matrix. The expected round-trip time takes into account both the distance between nodes and their grayscale values: the smaller the expected round-trip time, the more similar the two nodes.
When node i and node j are different superpixel nodes, the expected time from node i to node j is E_i(T_j) = (Z_jj − Z_ij)/π_j; when node i and node j are the same superpixel node, the expected time of node i is E_i(T_i) = 1/π_i. Therefore, the expected round-trip time between two nodes can be expressed as T_ij = E_i(T_j) + E_j(T_i), and the expected time matrix formed by the elements T_ij is normalized by rows to obtain the normalized expected time matrix T. The weighted difference c_i1 and the weighted difference c_i0 can be obtained from the normalized expected time matrix T, as expressed by the following formulas:
c_i1 = Σ_{j: a_j=1} G(T_ij)·(g_i − g_j)²
c_i0 = Σ_{j: a_j=0} G(T_ij)·(g_i − g_j)²
where a_j = 0 indicates that superpixel node j belongs to the foreground and a_j = 1 indicates that superpixel node j belongs to the background. The weighted difference c_i1 denotes the weighted difference between node i and the superpixel nodes currently belonging to the background; the weighted difference c_i0 denotes the weighted difference between node i and the superpixel nodes belonging to the foreground.
(3) Based on the weight of each edge in the undirected graph, the updated label set of all superpixel nodes is obtained.
The minimization objective function is set as follows:
E(A) = Σ_{i∈V_s} [a_i·e_si + (1 − a_i)·e_it] + μ·Σ_{(i,j)∈E_1} e_ij·|a_i − a_j|
where μ is a weight factor, v_i denotes superpixel node i, and v_j denotes superpixel node j; v_i and i can be used interchangeably, and v_j and j can be used interchangeably. In the present application, the value of μ ranges from 0.001 to 0.1; for example, μ is 0.01, which is used to adjust the weight relationship between the first two terms and the last term in the objective function.
The existing max-flow segmentation can be used to cut the undirected graph in order to solve for the minimum of the above objective function. Through max-flow segmentation, the undirected graph from <Establishment of the undirected graph> is segmented into a foreground part and a background part, where the superpixel nodes in the foreground part are connected to the virtual foreground node s, the superpixel nodes in the background part are connected to the virtual background node t, and the foreground part and the background part are not connected to each other.
According to the segmentation result of the max-flow segmentation, the superpixel nodes connected to the virtual foreground node s are classified as belonging to the foreground, that is, the labels of these superpixel nodes are updated to a_i = 0, and the superpixel nodes connected to the virtual background node t are classified as belonging to the background, that is, the labels of these superpixel nodes are updated to a_i = 1. After the labels of all superpixel nodes are updated, A_1 denotes the updated label set composed of all a_i.
For example, the result of segmenting the undirected graph into the foreground part and the background part by max-flow segmentation in the present application is shown in FIG. 6. FIG. 6A shows the foreground part connected to the virtual foreground node s, and FIG. 6B shows the background part connected to the virtual background node t.
<Whether the segmentation is unchanged>
It is judged whether the updated segmentation result (A_1) is the same as the current segmentation result (A_0). If they are the same, the final segmentation result is obtained; if they are different, the updated segmentation result (A_1) is taken as the new current label set, that is, A_0 = A_1, and the process returns to step (2-3) of determining the weight of each edge belonging to E_2, until the updated segmentation result is the same as the previous segmentation result, and the final segmentation result is obtained.
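Putting the pieces together, the iteration in FIG. 2 can be sketched as below. It reuses the helper sketches shown earlier (`initialize_labels`, `cut_graph`) and the reconstructed expressions for e_si, e_it, c_i0 and c_i1; σ = 1 in G(x) is an assumption:

```python
import numpy as np

def segment_superpixels(gray, edges, T, mu=0.01, max_iters=50):
    """gray: (K,) node gray values; edges: {(i, j): e_ij} over E_1;
    T: (K, K) normalized expected round-trip time matrix."""
    a = initialize_labels(gray)                          # A_0
    GT = np.exp(-(T ** 2) / 2.0)                         # G(T_ij), sigma = 1 assumed
    sq = (gray[:, None] - gray[None, :]) ** 2            # (g_i - g_j)^2
    for _ in range(max_iters):
        fg, bg = (a == 0), (a == 1)
        g_s, g_t = gray[fg].mean(), gray[bg].mean()      # virtual node gray values
        c0 = (GT * sq)[:, fg].sum(axis=1)                # foreground weighted difference
        c1 = (GT * sq)[:, bg].sum(axis=1)                # background weighted difference
        e_s = (gray - g_t) ** 2 + c1                     # foreground edge weights e_si
        e_t = (gray - g_s) ** 2 + c0                     # background edge weights e_it
        a_new = cut_graph(e_s, e_t, edges, mu)           # max-flow cut -> A_1
        if np.array_equal(a_new, a):                     # segmentation unchanged: stop
            break
        a = a_new                                        # A_0 <- A_1 and repeat
    return a
```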
In the present application, obtaining the final segmentation result includes: according to the label of each superpixel node in the updated segmentation result A_1, marking all pixels in the superpixel set of the corresponding superpixel node with that superpixel node's label a_i, to obtain the final pixel-level segmentation result.
All pixels marked as belonging to the foreground constitute the final segmentation result. One can subsequently choose whether to make slight adjustments on the basis of this result to obtain a more precise segmentation.
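The final pixel-level mask then follows by broadcasting each node label back over its superpixel, as in this one-line sketch:

```python
import numpy as np

def pixel_level_mask(superpixel_labels: np.ndarray, a: np.ndarray) -> np.ndarray:
    """superpixel_labels: (H, W) label map from SLIC; a: (K,) final node labels.
    Fancy indexing assigns every pixel its node's label; a_i == 0 marks foreground."""
    return a[superpixel_labels] == 0
```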
In the present application, the finally obtained segmentation result is shown in FIG. 7; the white area surrounded by the black outline in the figure is the final segmentation result.
In the present disclosure, the original image is cropped, and superpixel segmentation is performed on the cropped image. An undirected graph is established according to the superpixel segmentation result, and the energy terms of the Chan-Vese model are represented by the edge weights. By using a Markov chain, the relationship between the distance between superpixels and their gray levels can be considered simultaneously in the process of assigning edge weights. Finally, the corresponding segmentation result can be obtained by repeatedly applying max-flow segmentation and updating the edge weights. The present application can automatically and accurately segment images in the medical field, where labeled data is lacking, thereby providing accurate information about the segmented objects for subsequent medical research. The present application can accurately segment the desired target in a shorter time and with less manual intervention.
The present application also proposes an electronic device, including a processor and a memory. The memory is used to store instructions executable by the processor. When the instructions are executed by the processor, the processor is configured to perform the unsupervised image segmentation method described above.
The present application also proposes a storage medium on which a computer program is stored. When the computer program is executed by a processor, the processor is configured to perform the unsupervised image segmentation method described above.
Fig. 8 is a block diagram of an electronic device according to an exemplary embodiment. For example, the device 900 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or the like.
Referring to Fig. 8, the device 900 may include one or more of the following components: a processing component 902, a memory 904, a power component 906, a multimedia component 908, an audio component 910, an input/output (I/O) interface 912, a sensor component 914, and a communication component 916.
The processing component 902 generally controls the overall operation of the device 900, such as operations associated with display, telephone calls, data communication, camera operation, and recording. The processing component 902 may include one or more processors 920 to execute instructions, so as to perform all or part of the steps of the above method. In addition, the processing component 902 may include one or more modules to facilitate interaction between the processing component 902 and other components. For example, the processing component 902 may include a multimedia module to facilitate interaction between the multimedia component 908 and the processing component 902.
The memory 904 is configured to store various types of data to support the operation of the device 900. Examples of such data include instructions for any application or method operating on the device 900, contact data, phone book data, messages, pictures, videos, and so on. The memory 904 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.
The power component 906 provides power to the various components of the device 900. The power component 906 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the device 900.
The multimedia component 908 includes a screen providing an output interface between the device 900 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, it may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensors may not only sense the boundary of a touch or swipe action but also detect the duration and pressure associated with the touch or swipe operation. In some embodiments, the multimedia component 908 includes a front camera and/or a rear camera. When the device 900 is in an operation mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front and rear cameras may be a fixed optical lens system or have focal length and optical zoom capability.
The audio component 910 is configured to output and/or input audio signals. For example, the audio component 910 includes a microphone (MIC) configured to receive external audio signals when the device 900 is in an operation mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signals may be further stored in the memory 904 or transmitted via the communication component 916. In some embodiments, the audio component 910 further includes a speaker for outputting audio signals.
The I/O interface 912 provides an interface between the processing component 902 and peripheral interface modules such as a keyboard, a click wheel, or buttons. These buttons may include, but are not limited to, a home button, volume buttons, a start button, and a lock button.
The sensor component 914 includes one or more sensors for providing status assessments of various aspects of the device 900. For example, the sensor component 914 may detect the open/closed state of the device 900 and the relative positioning of components, such as the display and keypad of the device 900; the sensor component 914 may also detect a change in position of the device 900 or one of its components, the presence or absence of user contact with the device 900, the orientation or acceleration/deceleration of the device 900, and temperature changes of the device 900. The sensor component 914 may also include a proximity sensor configured to detect the presence of nearby objects without any physical contact, and a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 914 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 916 is configured to facilitate wired or wireless communication between the device 900 and other devices. The device 900 may access a wireless network based on a communication standard, such as WiFi, 2G, 3G, 4G, or 5G, or a combination thereof. In one exemplary embodiment, the communication component 916 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 916 further includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 900 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the above method.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 904 including instructions executable by the processor 920 of the device 900 to perform the above method. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, or the like.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the disclosure disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the disclosure indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (17)

  1. An unsupervised image segmentation method, applied to a terminal device including a processor, comprising:
    performing superpixel segmentation on a to-be-processed image containing a target object to obtain a plurality of superpixel sets, each superpixel set corresponding to a respective superpixel node;
    generating an undirected graph from the superpixel nodes, the undirected graph comprising a first edge connecting two adjacent superpixel nodes, a foreground edge connecting each superpixel node to a virtual foreground node, and a background edge connecting each superpixel node to a virtual background node;
    determining foreground superpixel nodes and background superpixel nodes in the undirected graph based on a first label set corresponding to the plurality of superpixel nodes, wherein a foreground superpixel node is a superpixel node belonging to the foreground and a background superpixel node is a superpixel node belonging to the background;
    generating a minimization objective function based on the foreground superpixel nodes and the background superpixel nodes;
    segmenting the undirected graph based on the minimization objective function to obtain a foreground part and a background part, and generating a second label set; and
    performing image segmentation on the to-be-processed image based on a comparison result of the first label set and the second label set.
  2. The method according to claim 1, wherein constructing the minimization objective function comprises:
    for each superpixel node, determining the weight of the first edge, the weight of the foreground edge, and the weight of the background edge; and
    constructing the minimization objective function based on the first-edge weights, foreground-edge weights, and background-edge weights corresponding to the plurality of superpixel nodes.
  3. The method according to claim 2, further comprising:
    determining the weight of the first edge between the superpixel node and another superpixel node connected to it, based on the gray value of the superpixel node and the gray value of the other superpixel node.
  4. The method according to claim 3, further comprising:
    obtaining the gray value of each pixel in the superpixel set corresponding to the superpixel node; and
    obtaining the average of the gray values and taking the average as the gray value of the superpixel node.
  5. The method according to any one of claims 2 to 4, further comprising:
    determining the weight of the foreground edge based on the gray value of the virtual background node, the gray value of the superpixel node, and the background weighted difference; and
    determining the weight of the background edge based on the gray value of the virtual foreground node, the gray value of the superpixel node, and the foreground weighted difference.
  6. The method according to claim 5, further comprising:
    obtaining the gray values of the foreground superpixel nodes and taking their average as the gray value of the virtual foreground node; and
    obtaining the gray values of the background superpixel nodes and taking their average as the gray value of the virtual background node.
  7. The method according to claim 5 or 6, further comprising:
    obtaining the foreground weighted difference based on the normalized expected commute time matrix between two superpixel nodes, the gray value of the superpixel node, and the gray values of the foreground superpixel nodes; and
    obtaining the background weighted difference based on the normalized expected commute time matrix between two superpixel nodes, the gray value of the superpixel node, and the gray values of the background superpixel nodes.
  8. The method according to any one of claims 1 to 7, wherein determining the foreground superpixel nodes and the background superpixel nodes based on the first label set comprises:
    sorting the gray values corresponding to the plurality of superpixel nodes in descending order, determining a preset number of top-ranked superpixel nodes as foreground superpixel nodes, and determining the remaining superpixel nodes as background superpixel nodes.
  9. The method according to any one of claims 1 to 8, wherein performing superpixel segmentation on the to-be-processed image containing the target object comprises:
    cropping the to-be-processed image to obtain a cropped image, the cropped image being a rectangular region obtained by expanding the smallest rectangle containing the target object outward by a preset distance; and
    performing superpixel segmentation on the cropped image.
  10. The method according to any one of claims 1 to 9, further comprising:
    performing image segmentation on the to-be-processed image based on a second label set identical to the first label set; and
    based on a second label set different from the first label set, determining updated foreground superpixel nodes and updated background superpixel nodes in the undirected graph; constructing an updated objective function based on the updated foreground superpixel nodes and the updated background superpixel nodes; and segmenting the undirected graph based on the updated objective function to generate an updated label set, until the updated label set is identical to the preceding label set.
  11. The method according to claim 2 or 3, wherein, for superpixel node i, the weight e_ij of the first edge is expressed by the following equation:
    e_ij = G(|g_i − g_j|) + l_ij
    where i denotes superpixel node i; j denotes another superpixel node j connected to superpixel node i; G(x) denotes a Gaussian function; l_ij denotes the length of the common boundary between the superpixel sets corresponding to superpixel nodes i and j; g_i denotes the gray value of superpixel node i; and g_j denotes the gray value of superpixel node j.
  12. The method according to any one of claims 2 to 5 and 11, wherein, for superpixel node i, the weight e_si of the foreground edge is expressed by the following equation:
    e_si = (g_i − g_t)² + c_i1
    where i denotes superpixel node i; g_i denotes the gray value of superpixel node i; g_t denotes the gray value of the virtual background node; and c_i1 is the background weighted difference between superpixel node i and the background superpixel nodes;
    for superpixel node i, the weight e_it of the background edge is expressed by the following equation:
    e_it = (g_i − g_s)² + c_i0
    where g_s denotes the gray value of the virtual foreground node and c_i0 denotes the foreground weighted difference between superpixel node i and the foreground superpixel nodes.
  13. The method according to any one of claims 2 to 7 and 11 to 12, wherein, for superpixel node i, the foreground weighted difference c_i0 is expressed by the following equation:
    c_i0 = Σ_{j: a_j=0} G(T_ij) · (g_i − g_j)²
    where a_j = 0 indicates that superpixel node j belongs to the foreground (i.e., is a foreground superpixel node); G(x) denotes a Gaussian function; T_ij denotes an element of the normalized expected time matrix T between superpixel nodes i and j; g_i denotes the gray value of superpixel node i; and g_j denotes the gray value of superpixel node j;
    for superpixel node i, the background weighted difference c_i1 is expressed by the following equation:
    c_i1 = Σ_{j: a_j=1} G(T_ij) · (g_i − g_j)²
    where a_j = 1 indicates that superpixel node j belongs to the background (i.e., is a background superpixel node); G(x) denotes a Gaussian function; T_ij denotes an element of the normalized expected time matrix T between superpixel nodes i and j; g_i denotes the gray value of superpixel node i; and g_j denotes the gray value of superpixel node j.
  14. The method according to claim 13, wherein the element T_ij of the normalized expected commute time matrix T is expressed by the following equation:
    T_ij = E_i(T_j) + E_j(T_i)
    where E_i(T_j) denotes the expected time from superpixel node i to superpixel node j when nodes i and j are different, E_i(T_j) = (Z_jj − Z_ij)/π_j; E_i(T_i) denotes the expected time when superpixel nodes i and j are the same, E_i(T_i) = 1/π_i; Z_jj and Z_ij are elements of the intermediate matrix Z, Z = (I − P + W)^{-1}; I is the identity matrix; P is the transition matrix between superpixel nodes i and j; W is a constructed matrix in which every row vector equals π; π_i and π_j are the elements of π (the rows of the constructed matrix W), π_i = d_i / Σ_j d_j; d_i denotes the degree of superpixel node i and d_j denotes the degree of superpixel node j.
  15. The method according to any one of claims 1 to 14, wherein, for superpixel node i, the minimization objective function is expressed by the following equation:
    E(A) = Σ_{v_i∈V_s} a_i · e_si + Σ_{v_i∈V_s} (1 − a_i) · e_it + μ · Σ_{(v_i,v_j)∈E_1} e_ij · |a_i − a_j|
    where a_i denotes the label of superpixel node i and a_j denotes the label of superpixel node j, with a_i = 1 indicating that superpixel node i belongs to the background (a background superpixel node) and a_i = 0 indicating that it belongs to the foreground (a foreground superpixel node); e_si denotes the weight of the foreground edge of superpixel node i; e_it denotes the weight of the background edge of superpixel node i; e_ij denotes the weight of the first edge connecting superpixel nodes i and j; V_s denotes the set of superpixel nodes; μ denotes a weight factor; v_i denotes superpixel node i; and v_j denotes superpixel node j.
  16. An electronic device, comprising:
    a processor; and
    a memory for storing instructions executable by the processor;
    wherein, when the instructions are executed by the processor, the processor is configured to perform the unsupervised image segmentation method according to any one of claims 1 to 15.
  17. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed by a processor, the processor is configured to perform the unsupervised image segmentation method according to any one of claims 1 to 15.
PCT/CN2021/117515 2020-11-27 2021-09-09 Unsupervised image segmentation method, electronic device, and storage medium WO2022110969A1 (zh)

Priority Applications (1)

US 17/696,668 — priority date 2020-11-27, filed 2022-03-16 — US11928825B2 (en): Unsupervised image segmentation method and electronic device

Applications Claiming Priority (2)

CN202011353104.2A — filed 2020-11-27 — CN112419344B (zh): Unsupervised image segmentation method based on the Chan-Vese model
CN202011353104.2 — priority date 2020-11-27

Related Child Applications (1)

US 17/696,668 (continuation) — priority date 2020-11-27, filed 2022-03-16 — US11928825B2 (en): Unsupervised image segmentation method and electronic device

Publications (1)

WO2022110969A1 (zh)



Also Published As

Publication number Publication date
CN112419344B (zh) 2022-04-08
CN112419344A (zh) 2021-02-26
US20220207752A1 (en) 2022-06-30
US11928825B2 (en) 2024-03-12


Legal Events

2022-06-02 — 121 (EP): the EPO has been informed by WIPO that EP was designated in this application — ref document number 21896473, country EP, kind code A1
NENP: non-entry into the national phase — ref country code DE
122 (EP): PCT application non-entry in European phase — ref document number 21896473, country EP, kind code A1