WO2016127883A1

WO2016127883A1 - Image area detection method and device

Info

Publication number: WO2016127883A1
Application number: PCT/CN2016/073274
Authority: WO
Inventors: 石克阳
Original assignee: 阿里巴巴集团控股有限公司; 石克阳
Priority date: 2015-02-12
Filing date: 2016-02-03
Publication date: 2016-08-18
Also published as: CN105989594B; CN105989594A

Abstract

An image area detection method and device. The method comprises: calculating color features and gradient features of pixels in an image to be processed, and constructing mixed feature vectors of the image to be processed (S1); clustering the mixed feature vectors to obtain a cluster after clustering (S2); calculating a clustering probability of the cluster according to predetermined rules, and calculating a pixel probability of pixels in the cluster according to the clustering probability (S3); and detecting the image to be processed according to the pixel probability to acquire a target area (S4). The method and the device can effectively cope with various complex situations in an actual image scene, realize accurate and effective separation of the main area in a commodity image, and improve extraction accuracy.

Description

Image area detecting method and device

Technical neighborhood

The present application belongs to a computer information processing neighborhood, and particularly relates to an image area detecting method and apparatus.

Background technique

With the development of the era of Internet consumption, websites that provide online product search and online shopping, such as Amoy, Taobao, and Tmall Mall, often provide a large number of images of products when displaying product information, so that consumers can make intuitive choices. Commodity images are more important information as a website for online search and shopping, and have a great impact on the transaction of goods.

In the online product information display, usually the product image can better reflect the intuitive characteristics of the product. The main body area (or foreground area, such as windbreaker, casual pants, leather shoes, mobile phone, sofa stool) in the product is usually in the product image. The largest and most important part of the information. For example, when displaying and placing an advertisement, it is usually necessary to consider whether the main body of the product is centered in an image, whether it occupies a prescribed ratio in the image displayed by the image, and whether the main body region is prominent with respect to the background. In the actual application, most of the product images are uploaded by the seller's merchants and uploaded on the website display window. The seller's merchants often do not have the professional shooting and image editing ability, and can not highlight the product features. Therefore, in some application scenarios, the business platform service party usually needs to analyze the image provided by the seller merchant, obtain the main body of the product, adjust the display angle of the product, the background matching, the placement position, the size of the main product, etc., so as to have the best display effect. Images so that consumers can more accurately get the items they are interested in, or be attracted by the merchant's items. Therefore, users of business platform servants or terminal applications often need to accurately and efficiently separate the product body area from the background area from the product image.

At present, the commonly used separation technology between the main body area and the background area mainly includes the image saliency area detection technology based on the color quantization feature in the academic world. Such techniques are typically processed only by relying on color features and can only process simple product images. The image of the product in the platform e-commerce website such as Taobao and Tmall can be uploaded by the seller, and the quality of the image is uneven and the complexity is very high. For example, when the color of the subject and the background are similar, it is easy to mix the two when using color modeling, which is difficult to distinguish and cannot effectively extract the main body area. Similarly, when the background complexity is high, that is, the color distribution of the non-subject area is complex, the use of color feature-based methods tends to model the background and foreground as too many blocks, resulting in the inability to accurately separate the foreground and background.

At present, the commodity image subject recognition technology in the prior art cannot accurately and effectively detect and separate the subject region when facing complex images with similar colors of the subject and the background region or high complexity of the background region. In the prior art, especially for complex image area detection, a more efficient and accurate detection method is needed.

Summary of the invention

The purpose of the present application is to provide an image region detecting method and device, which can effectively cope with various complicated situations in an actual image scene, realize accurate and effective separation of the main body region in the complex image, and improve extraction precision.

An image area detecting method and apparatus provided by the present application is implemented as follows:

An image area detecting method, the method comprising:

Calculating a color feature and a gradient feature of the pixel of the image to be processed, and constructing a mixed feature vector of the image to be processed;

Clustering the mixed feature vectors to obtain clusters after clustering;

Calculating a clustering probability of the cluster according to a predetermined rule, and calculating a pixel probability of the pixel point in the cluster based on the clustering probability;

The image to be processed is detected based on the pixel probability to acquire a target area.

An image area detecting device, the device comprising:

a feature calculation module, configured to calculate a color feature and a gradient feature of the pixel of the image to be processed, and construct a mixed feature vector of the image to be processed;

a clustering module, configured to cluster the mixed feature vectors to obtain clusters after clustering;

a clustering probability module, configured to calculate a clustering probability of the cluster according to a predetermined rule;

a pixel probability module, configured to calculate a pixel probability of a pixel point in the cluster based on the clustering probability;

And a detecting module, configured to detect the image to be processed based on the pixel probability, and acquire a target area.

An image area detecting device, the device being configured to include:

a first processing unit, configured to acquire a to-be-processed image of the user/client, calculate a color feature and a gradient feature of the pixel of the image to be processed, and construct a mixed feature vector of the image to be processed;

a second processing unit, configured to cluster the mixed feature vector to obtain clustered clusters; and further configured to calculate a clustering probability of the cluster according to a predetermined rule, and calculate the clustering probability based on the clustering probability The pixel probability of a pixel in the cluster;

And an output unit, configured to acquire a target area of the to-be-processed image based on the pixel probability, and store or display the acquired target area in a specified location.

The image area detecting method and apparatus provided by the present application adopts a unique mixed feature vector for each pixel in the image. The mixed feature vector includes a gradient feature in addition to the color feature of the pixel point. When calculating the pixel point, the information around the pixel point is simultaneously considered, and the feature value of the pixel point can be more accurately established, so that when the feature space is mixed The distance between the foreground and the background regions is similar to that of the mixed feature vectors of the two points. The distance is greatly increased, and the foreground and background areas can be effectively distinguished, and the accuracy of the current area detection is improved. Similarly, in a complex background image, the hybrid feature vector described in the present application can combine the color feature and the gradient feature to describe the pixel of the foreground and the pixel of the background into two different clusters, in the Euclidean distance. It is easy to separate the two when calculating. In the present application, the hybrid features are clustered, and the clustering probability that the clusters belong to the body region after clustering is calculated, and the pixel probability of each pixel in the cluster belongs to the body region is calculated based on the clustering probability, as described in the present application. The calculated saliency as a probability of belonging to the subject area can effectively and accurately detect the subject area in the image to be processed. In the present application, the clustering and other clustering distances and the sum of the sums are used as the clustering saliency to express the probability that the cluster belongs to the body region, which is more in line with the actual user's perception of the product subject in the image, so that the processing result More precise and effective.

DRAWINGS

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings to be used in the embodiments or the prior art description will be briefly described below. Obviously, the drawings in the following description are only It is a part of the embodiments described in the present application. For those of ordinary skill in the art, other drawings can be obtained according to the drawings without any inventive labor.

1 is a schematic flow chart of an embodiment of an image area detecting method according to the present application;

2 is a schematic diagram of a neighborhood window extraction of an image boundary point to be processed according to the present application;

3 is a schematic diagram of performing body region extraction using an image region detecting method according to the present application;

4 is a schematic diagram of performing body region extraction using an image region detecting method according to the present application;

FIG. 5 is a schematic structural diagram of a module of an image area detecting apparatus according to the present application; FIG.

6 is a schematic structural diagram of a module of an embodiment of a feature calculation module according to the present application;

7 is a block diagram showing the structure of an embodiment of a color feature module according to the present application;

FIG. 8 is a schematic structural diagram of a module of an embodiment of a pixel probability calculation module according to the present application.

detailed description

The technical solutions in the embodiments of the present application are clearly and completely described in the following, in order to better understand the technical solutions in the present application. The embodiments are only a part of the embodiments of the present application, and not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by a person of ordinary skill in the art without creative efforts shall fall within the scope of protection of the present application.

The product image uploaded by the seller merchant may include one or more subjects. For example, in order to save the window resource of the product display, the seller merchant may merge the multiple images into one image and upload the image as an item. An image area detecting method described in the present application may be applied to an image including one or more product bodies, where the image includes a plurality of In the main body, the image to be processed may be divided into a plurality of sub-pictures, each of which may include a single body, and then processed for each of the sub-pictures using the subject area extraction method described in the present application. Specifically, the method for dividing a to-be-processed image including a plurality of subjects may adopt an image segmentation method described in Patent No. CN102567952A, entitled "An Image Segmentation Method and System". After processing by the above method, the product image including the plurality of subjects may be divided into a plurality of sub-pictures including a single body.

The image processing method described in the present application will be described in detail below by taking an article image including a single subject or a sub-picture divided by the above image as an example. 1 is a flowchart of a method for detecting an image region according to an embodiment of the present invention. As shown in FIG. 1, the method may include:

S1: Calculating a color feature and a gradient feature of the pixel of the image to be processed, and constructing a mixed feature vector of the image to be processed.

As described above, the image to be processed described in this embodiment may be a single product image including one main body, or may be a sub-picture including a single main body after being divided by the image. After acquiring the image to be processed, the mixed features of the pixel points of the image to be processed may be constructed based on the feature values including the color and the gradient to form a mixed feature vector. In the actual image information processing, feature extraction of each pixel point can usually be performed by using local features. For example, for a certain pixel point P, a neighborhood window W(p) can be selected, and the neighborhood window is selected. W(p) can be a square region of N*N centered at point P. The value of the N may be reasonably selected according to the accuracy or speed of the image information processing, for example, an odd number such as 3, 5, 7, or 9 may be used according to the image size or the number of pixels included. In this embodiment, the value of N may be 5, and a 5*5 square neighborhood window region centered on the P point may be taken each time the color feature or the gradient feature in the mixed feature of the pixel is calculated.

The hybrid feature of the image to be processed constructed in this embodiment may include a color feature and a gradient feature of the pixel, and the color feature and the gradient feature may be combined in a predetermined format to form a high-dimensional mixed feature vector. In a specific implementation process, the process of calculating the color feature of the pixel of the image to be processed may include:

S101: If the image to be processed is not data in the Lab format, convert the data format of the image to be processed into a Lab format;

S102: extract a pixel point of the neighborhood window W(p) in the image to be processed centering on the pixel to be processed, and divide the three channels L, a, and b of the pixel in the neighborhood window W(p) For K groups, form a color feature vector of 3*K dimensions;

S103: accumulating color values of each of the L, a, and b channels in each of the neighboring window W(p) in a dimension corresponding to the color feature vector to form the neighborhood window. The color characteristics of the pixel to be processed.

The image color feature extraction of the image to be processed may generally include uniformly quantizing the three channels L, a, and b into K packets, and ensuring that each packet of each channel has the same length as possible.

Generally, the image to be processed may be image information of an RGB channel color model, and the color model of the Lab channel generally refers to a color model based on a person's perception of color and independent of light and equipment, and is more in line with human visual perception. Therefore, in this embodiment, the body region in the image detected from the Lab space is used, which is more in line with the human perception result, so that the processing result of the body region extraction is more accurate.

In this embodiment, the image to be processed can be converted from an RGB channel to a Lab channel. Typically the RGB channel consists of three variable color vector vectors (R, G, B) as follows:

R: red, an integer from 0 to 255, with a change value of 256;

G: green, an integer from 0 to 255, with a change value of 256;

B: Blue, an integer from 0 to 255, with a change value of 256.

The Lab channel can include three variables as shown below:

L: brightness, an integer from 0 to 100, a change value of 100;

A: from green to red, an integer from -128 to 127, with a value of 256;

B: an integer from -28 to 127 from blue to yellow, with a value of 256.

When the image to be processed is converted from the RGB channel to the Lab channel, the conversion may be performed by using a given algorithm, or may be converted by using a software tool such as Photoshop, and will not be discussed in detail herein. Then, the pixel points in the image to be processed of the Lab channel may be extracted by using a preset neighborhood window W(p), and the three channels of L, a, and b of the pixel in the neighborhood window W(p) are respectively The uniform quantization is K bins. Further, the values of the pixel points quantized by the three channels L, a, and b can be spliced together to form a 3*K-dimensional color feature vector. For example, the formed 3*K-dimensional color feature vector can be expressed as {L1, L2, ... LK, a1, a2, ... aK, b1, b2, ... bK}. The value of K described in this embodiment can be customized to indicate a description of the color space of the image to be processed. In the present application, if the value of the K is too large, the image to be processed is divided into finer colors in the color space, the color feature is more accurately expressed, and the corresponding calculation time is increased; Small, then the overall division degree of the image to be processed in the color space is low, and the dimension of the color feature vector is small, which can improve the data processing speed. After many experiments, the present application provides a range of values of K. The specific value of the K may be: 6 ≤ K ≤ 16, which can ensure that the color feature vector can be accurate, effective, and suitable within the above range of values. Represents the color characteristics of the image to be processed. In this embodiment, the value of K may be a value of 6, that is, an 18-dimensional mixed feature vector of a pixel to be processed in the neighborhood window may be constructed. Finally, according to the color values of the three channels L, a, b of each pixel in the neighborhood window W(p), it is added to the corresponding dimension of the color feature vector. For example, in a neighborhood window of 5*5 and 25 pixels, the Lab color values of the 25 pixels jointly construct an 18-dimensional color feature vector. Specifically, each of the 25 pixels has a set of Lab color values, and the L channel is taken as an example. If the value of the L channel of the first pixel is 10, it can be mapped to the L channel. One of the corresponding six bins (grouping), for example, divided into L1. The L pixel value of the second pixel is 98, which can be divided into L6. By analogy, all 25 pixels in the neighborhood window W(p) are traversed once, and the color values in the corresponding bins are grouped to obtain a pixel to be processed in the neighborhood window W(p). The distribution vector of the total L, a, b color features.

After calculating the color feature vector of the pixel to be processed in the current neighborhood window, one pixel can be shifted once in a certain direction, and then the pixel of the neighborhood window is extracted again according to the above manner, and the new neighborhood window is calculated to be processed. The color feature vector of the pixel. The color feature vectors of all the pixels in the image to be processed are sequentially calculated, and the color features of the pixels of the image to be processed are acquired.

It should be noted that the pixel to be processed in the neighborhood window described in the present application is usually the center point of the set square neighborhood window. For a non-boundary point pixel in the image to be processed, a square neighborhood window may be extracted at one time. For a boundary point or a pixel point that is not close to the boundary point and cannot be extracted by the square neighborhood window, the extraction specification preset according to the neighborhood window may be used, that is, the pixel point of the boundary point or the pixel point close to the boundary may be centered. The calculation is performed on the pixel points that the neighborhood window actually covers in the image to be processed. FIG. 2 is a schematic diagram of a neighborhood window extraction of an image boundary point to be processed according to the present application. As shown in FIG. 2, for example, the set neighborhood window extraction rule is a 5*5 square region, and for a non-corner point of a certain boundary point, the pixel point P1 of the boundary point is extracted as a neighborhood window center. The pixel size specification is 5*3. Correspondingly, for the corner point P2 of the image to be processed, the extracted pixel point has a specification of 3*3.

The blending features described in this application can include gradient features of the image to be processed. In this embodiment, the HoG feature may be used to perform gradient feature extraction to form a gradient feature of M-dimensionality of each pixel in the image to be processed. Generally, the meaning of the gradient may include the difference between each pixel in the image and the adjacent pixel, which can be used to detect an area where the color is not obvious after being constructed as a gradient feature. In this embodiment, the image to be processed can be converted from an RGB color channel to a grayscale image, thus simplifying the complexity of the gradient feature. Specifically, in the implementation manner, the HoG feature may be used to perform gradient feature extraction, and obtain a gradient direction and a gradient value of the pixel in the preset neighborhood window W(p), and then the neighbor window W(p) may be included. The total gradient direction value of all pixel point gradient directions is divided into M bins, for example, the total gradient direction of 180 degrees is divided into 12 bins, and each bin represents a range of 15 degrees. Finally, according to the gradient value of each pixel in the neighborhood window W(p), a linear interpolation method is used to accumulate the corresponding bin (packet) to form an M-dimensional of the pixel to be processed in the neighborhood window. The gradient feature vector, such as the 12-dimensional gradient feature vector in this embodiment, can be expressed as {g1, g2, ... g12}. For example, if the gradient direction of a point in the neighborhood window W(p) of the pixel to be processed is 44 degrees and the gradient value is 10, the bin direction to which the gradient direction is 44 degrees belongs to g3, and the color feature calculation is performed. In a similar manner, a gradient value of 44 degrees can be accumulated in the group g3 to which the value belongs. The gradient direction and the gradient value of all the pixel points of the neighborhood window are traversed, and the gradient feature of the pixel to be processed of the neighborhood window is calculated. Similarly, after calculating a neighborhood window, one pixel can be shifted, and the gradient feature of the next pixel to be processed is continuously calculated. The gradient features of all the pixels of the image to be processed are calculated in sequence, and the calculation manner of the above-mentioned color features may be referred to, and details are not described herein.

After calculating the color feature and the gradient feature of the pixel of the image to be processed, the mixed feature vector of the image to be processed may be constructed. Specifically, the constructing the mixed feature vector of the image to be processed may include splicing and combining the K-dimensional color feature and the M-dimensional gradient feature of each pixel of the image to be processed to form a (K+M) dimension for the pixel. Mixed feature vector. For example, in this embodiment, the 18-dimensional color feature and the 12-dimensional gradient feature value can be sequentially combined and stitched. The first 18-dimensional data is a color feature, and the latter 12-dimensional data is a gradient feature, which can be expressed as {L1, L2,... L6, a1, a2, ... a6, b1, b2, ... b6, g1, g2, ... g12}. Of course, if the size of the image to be processed is [W, H], where W is the width of the image to be processed, and H is the height of the image to be processed, and the unit is a pixel point, then the method can be constructed by the above method. The mixed feature vector of the W*H (K+M) dimension of the image to be processed.

When calculating the color feature and the gradient feature of the pixel in the present application, considering the calculation of the information of the pixel around each pixel to be processed, the feature value of the pixel can be more accurately established, so that the foreground and the background region are similar when the feature space is mixed. The distance between the mixed feature vectors of the two points is greatly increased compared with the distance using only the color features, and the foreground and background regions can be effectively distinguished, and the accuracy of the detection of the main body region is improved.

S2: Clustering the mixed feature vectors to obtain clusters after clustering.

The product image of the image size [W, H] to be processed in the foregoing may generate a mixed feature vector of W*H (K+M) dimensions. In order to improve computational efficiency in the present application, these feature vectors can be clustered. The clustering algorithm used in this embodiment may be a Kmeans clustering algorithm. The specific operation process of the Kmeans clustering algorithm may mainly include:

S201: randomly select L mixed feature vectors from the mixed feature vectors of the W*H (K+M) dimensions as an initial cluster center. In a specific embodiment, the value range of the L may be tested to select a suitable value. Generally, the L value is too large, which may result in a long calculation time. If the L is too small, the feature space may not be divided finely.

S202: Traverse all W*K mixed feature vectors, and calculate the distance between each mixed feature vector and the current cluster center respectively. The distances described in this embodiment are Euclidean distances. For example, two mixed feature vectors are p and q, respectively, where q is a randomly selected current cluster center, then the mixed feature vector and the current cluster center. The Euclidean distance D(p,q) between q can be:

D(p,q)=||(p ₁ -q ₁ ) ² +(p ₂ -q ₂ ) ² ......+(p _(K+M) -q _(K+M) ) ² ||

S203: Calculate, for each mixed feature vector, a distance from the selected L initial cluster centers, where the mixed feature vector belongs to a cluster with the smallest distance from the L initial cluster centers. After a round of calculation and classification, the mixed feature vector can be reasonably divided into the clusters of the L initial cluster centers closest to each other.

S204: Update the cluster center of each cluster. After each pixel in the image to be processed is divided into corresponding clusters, To update the cluster center of each cluster. The specific update calculation method in this embodiment may include calculating an average value of each mixed feature vector in each cluster in each dimension, and then using the calculated average value of each dimension as the cluster new Cluster center.

The above-mentioned S201-S204 is a process of one-time clustering. In the present application, the steps of dividing the clustering and updating the clustering center for each pixel point may be repeatedly performed by clustering until the clustering center of the clustering is no longer A larger amount of movement is performed (the amplitude threshold of the movement can be set according to requirements) or the number of times the cluster calculation reaches the preset calculation requirement. For example, in this embodiment, the number of clustering of the mixed feature vector may be set to 1000 times, or the distance between the cluster center of the new cluster and the cluster center of the cluster is less than 0.5. If the cluster center is the old_C and the new cluster center is New_C, the stop condition of the cluster calculation can be set to D (Old_C, New_C) < 0.5.

In this embodiment, the mixed feature vectors are clustered to form L clusters, and the calculation of the mixed features of the plurality of pixel points in the image to be processed can be reduced to the calculation of L clusters, thereby improving subsequent image area detection. The further calculation rate improves the overall image information processing efficiency.

S3: Calculate a clustering probability of the cluster according to a predetermined rule, and calculate a pixel probability of the pixel point in the cluster based on the clustering probability.

After processing in the foregoing steps, the to-be-processed image is clustered into L clusters in the (K+M)-dimensional mixed feature vector space described in the present application, wherein each cluster of the L clusters The pixels within are similar in the feature space. In the present application, a clustering probability that each cluster belongs to a body region may be calculated in units of each of the clusters, and then all pixel points in the cluster are calculated based on the clustering probability of the clustering. The pixel probability of the body area. In this embodiment, the degree of saliency of each cluster in the entire image to be processed may be used to describe the probability that each cluster belongs to the body region. Specifically, calculating the clustering probability of the cluster according to a predetermined rule may include:

Calculating a distance and D(Ci) of each cluster Ci in the L clusters from other clusters, using a ratio of the clusters and D(Ci) to the sum of the distance sums of all clusters The clustering probability of the cluster Ci.

In this embodiment, it is assumed that clustering centers of the L clusters obtained after clustering are respectively C1, C2, ..., CL, and the degree of saliency of the cluster in this embodiment may be adopted by distance from all other clusters. And the ratio of the sum to the sum. Then, for any cluster Ci, 1≤i≤L, the embodiment provides a method for calculating the distance sum of each cluster and other clusters in the cluster, specifically the Ci of the cluster and other The cluster distance and D(Ci) can be calculated by the following formula (1):

In the above formula, L is the number of clusters, as set in the embodiment 120, ||c _i , c _j || is the mixed feature vector of the cluster center of the current cluster Ci and the clustering of other clusters The Euclidean distance of the center blending feature vector. Generally, the larger the difference of the mixed feature vectors between the two clusters, the larger the Euclidean distance between the two cluster centers. If the distance between a cluster and other clusters is larger overall, it can be said that the higher the difference between the cluster and other clusters, the more likely it is to approach the body region of the image to be processed, and the corresponding calculation is obtained. The sum of the distances from other clusters is also larger. In the method for calculating the distance sum in the present embodiment, a factor Wj is added, and the Wj may be a weight set according to a pixel point included in the current cluster Ci. In the embodiment, generally, the more the number of pixels included in the cluster, the greater the contribution of the corresponding saliency value. Therefore, the Wj can be set according to the pixel points included in the cluster. For example, it may be set to the number of pixels included in the cluster, or the ratio of the number of pixels included in the current cluster to the total number of pixels of the image to be processed, etc., and may be specifically set according to requirements. In this way, when the distance of the cluster is calculated, the weight Wj of the cluster is added, and the number of pixels included in the cluster is counted, which is more in line with the actual image body area in some application scenarios. Features, in this type of application scenario, the calculation result of extracting the main map area can be more accurate.

After obtaining the saliency of each cluster in the image to be processed, the clustering probability that each cluster belongs to the body region may be further calculated according to the saliency. In this embodiment, the ratio of the cluster and D(Ci) to the sum of the distance sums of all the clusters may be used as the clustering probability that the cluster Ci belongs to the body region, and the specific formula may be (2) Calculated:

The above middle ∑ 1 _≤ _j _{≤ L} D(c _j ) is the sum of the calculated cluster sums of all the clusters, and the distance of the current cluster and the ratio of the sum may be used as the current cluster The clustering probability of the subject area. Since the mixed feature vector values in the clustered clusters are relatively close, in one embodiment of the present application, it can be considered that the pixel probability of the pixel points belonging to the body region in the cluster is equivalent to the cluster of the cluster belonging to the body region. Probability, such that a probability value for each pixel can be derived from the probability of the cluster. Therefore, in an embodiment of the present application, the calculating a pixel probability of a pixel point in the cluster based on the clustering probability may include:

S301: The pixel probability of the pixel in the cluster may be a clustering probability of the cluster to which the pixel belongs.

In other embodiments of the present application, the pixels in the cluster may be distributed in other regions of the image to be processed. In the present application, the extracted body region has a compact feature, and the extracted body region is more accurate. Calculate again The pixel probability of each pixel in each cluster belonging to the body area. Here, the present application may set a second neighborhood window W(p)', and may extract the pixel points of the second neighborhood window W(p)' centering on the pixel point P by referring to the manner of calculating the color feature. The probability of a certain pixel point q in the second neighborhood window W(p)' is the clustering probability of the cluster to which the pixel point q belongs, which is represented by P(q), and in another embodiment Calculating the probability that the pixel points in the cluster belong to the body region based on the probability of the clustering may include:

S302: extract a pixel point of the first neighborhood window W(p)' centering on the pixel to be obtained p, and calculate a pixel probability Sal(p) of the pixel to be obtained p belonging to the body region by using the following formula:

In the above formula, P(q) is the clustering probability of the cluster to which the pixel point q in the first neighborhood window W(p)' belongs belongs to the body region, and t is the cluster to which the pixel point p to be sought belongs. The number of pixels, σ is a set of smoothing parameters, which can indicate the size of the currently calculated pixel point p is affected by the surrounding pixels. If the value of σ is large, it can be said that the calculation result of the pixel point p is easily affected by the surrounding pixel points, and vice versa. The σ value can be set according to the experience or the estimation of the result. Generally, for the image sold by the website product, the σ value may be small, for example, the specific value in the embodiment may be 0.17. If the image is in a natural scene (usually a non-commodity image), the value of σ may be too large, for example, may be 0.25.

The setting of the first neighborhood window W(p)' described above may be the same as the neighborhood window set in the foregoing color feature extraction, for example, a square neighborhood window of 5*5 may be set. In this way, when calculating the pixel probability of the pixel in the image to be processed, the pixel of the first neighborhood window W(p)′, such as 5*5, may be calculated centering on the pixel to be obtained, and the method is traversed. The probability of all pixel points in a neighborhood window W(p)' can be calculated as the pixel probability that the pixel point p to be sought belongs to the body region.

The pixel point according to the above S302 belongs to the body region probability calculation method, and the pixel probability that each pixel in the image to be processed belongs to the body region can be calculated, and the probability value adopts the first neighborhood window W ( The probability value of the pixel point in p)' is smoothed and calculated, which can improve the accuracy of the final extraction result.

S4: The image to be processed is detected based on the pixel probability, and the target area is acquired.

After calculating the pixel probability that each pixel of the image to be processed belongs to the body region, the body region and the background region may be separated, and the target region in the image to be processed is extracted and acquired. The target area described in the present application may be a body area (foreground area) in the image to be processed, and in other embodiments, the target area may also be For the background area, that is, the background area of the image to be processed can be detected. In an embodiment of the present application, the detecting the image to be processed based on the pixel probability to obtain the target area may include:

S401: A pixel point that meets a pixel probability value of a pixel in the image to be processed according to a determination threshold PV is used as a target area of the image to be processed.

Specifically, for example, in the implementation process of detecting the body region, for example, a determination threshold PV of the pixel point probability, such as 0.85, may be set in advance, and then the pixel point of the pixel probability of the pixel to be processed may be greater than 0.85. Coming out as the body area of the image to be processed. If the value of the predetermined judgment threshold is too small, the pixel points of the non-subject area are extracted. If the value is too large, the integrity of the extracted image of the body area is reduced. This embodiment provides a determination. The value range of the threshold, the specific predetermined threshold value PV may be: 0.8≤PV≤0.95. The pixel probability of the pixel point in the above S401 is preferably a probability value obtained by smoothing the probability value of the pixel point in the first neighborhood window W(p)'.

Certainly, in the embodiment for detecting the background area, a value that satisfies the determination threshold PV determined as the background area may be set, and the specific determination may be performed according to the actual scenario application.

The present application further provides another preferred embodiment. In the another embodiment, the detecting, by the pixel probability of the pixel, the target to be processed may be:

S4021: The pixel point in the image to be processed that belongs to the body region and whose probability value is greater than the first threshold PF is used as the seed pixel point;

S4022: Calculate a Euclidean distance from a pixel in a surrounding second neighborhood window centering on the seed pixel point;

S4023: The pixel point whose Euclidean distance is less than the second threshold is used as a new seed pixel point;

S2044: traverse the Euclidean distance of all the seed pixel points and the pixel points in the surrounding second neighborhood window and make a determination, and use the calculated seed pixel point as the target area of the to-be-processed image.

In this embodiment, the pixel probability that the pixel belongs to the body region is preferably a clustering probability of the cluster to which the pixel belongs. In addition, the first threshold PF and the second threshold and the third neighborhood window may be set according to actual data processing requirements, for example, the first valve PF value may also be set to 0.85 or selected as a clustering probability. The value of the higher value, the second threshold can be set to 0.5. If the threshold value of the first threshold PF is too small, the pixel value of the non-subject area is extracted too much, and if the value is too large, the integrity of the extracted image of the body area is reduced. For example, the value range of the first threshold PF is set, and the value of the first threshold PF may be: 0.8≤PF≤0.95. The third neighborhood window described in this embodiment is generally a 3*3 eight-contiguous window centered on the seed pixel point, and then the Euclidean distance calculation can be performed according to the 30-dimensional feature mixed feature vector described in the present application. . If the distance satisfies the second threshold requirement, a pixel point satisfying the second threshold requirement around the seed may be used as a new seed pixel point, and a new seed pixel point that meets the second threshold requirement may be considered to belong to the body area. . when However, during processing, a pixel point that does not satisfy the third neighborhood window may be set as a background area. It should be noted that the body area described in the present application is generally connected. In other application scenarios, a pixel point that has not been judged by the second threshold may be set as a background area. In this embodiment, a pixel point having a larger probability value may be used as a seed pixel point, and then the surrounding neighboring points are continuously traversed and a judgment is made to finally obtain a body region.

Certainly, after the pixel probability of the pixel is used in the present application, the manner of acquiring the target area may include, but is not limited to, the embodiments described in the present application, and other processing methods that do not require creative labor based on the method described in this application. Still within the scope of the application described herein, the body region obtained by separating the body region from the background region is performed, for example, using a geodesic distance algorithm.

The image region detecting method provided by the present application constructs a mixed feature vector including pixel color features and gradient features, which can more accurately establish feature values of pixel points, can effectively distinguish foreground and background regions, and improve the subject. The accuracy of regional extraction. Similarly, in a complex background image, the hybrid feature vector described in the present application can combine the color feature and the gradient feature to describe the pixel of the foreground and the pixel of the background into two different clusters, in the Euclidean distance. It is easy to separate the two when calculating. In the present application, the mixed features are clustered, and the clustering and other clustering distances and the sum of the sums are used as the clustering saliency to express the probability that the cluster belongs to the main body, which is more realistic. The user perceives the situation of the product body in the image, making the processing result more accurate and effective. In practical applications, the accuracy of extracting the main body region of the image to be processed by using the main body region extraction method of the present application reaches 89.62%, and the recall rate reaches 88.83%, which solves the problem in the prior art when facing a complex image. The problem of low regional extraction accuracy.

FIG. 3 and FIG. 4 are schematic diagrams of extracting a main body region by using an image region detecting method according to the present application, and FIG. 3 and FIG. 4 are respectively to-be-processed images, existing algorithm extraction results, and extraction by the present invention from left to right. result. As shown in FIG. 3, an image with a very similar color between the foreground and the background area is selected. It can be seen from FIG. 3 that the existing algorithm cannot detect the highlighted portion of the garment when processing such an image because The color here is very close to the white of the background. The mixed feature vector of (K+M) dimension of the present application can effectively distinguish similar foreground and background regions. FIG. 4 is a case where the background is complicated. It can be seen from FIG. 4 that the existing algorithm is difficult to accurately extract the subject on the image with high complexity. The method of the present application uses the cluster to acquire the cluster to calculate the pixel points. The pixel probability of the main body region can effectively solve the problem of image subject extraction on the background not only in color but also in structure, which greatly improves the detection accuracy.

Based on an image region detecting method described in the present application, the present application further provides an image region detecting device. FIG. 5 is a schematic structural diagram of a module of an image area detecting apparatus according to the present application. As shown in FIG. 5, the apparatus may include:

The feature calculation module 101 is configured to calculate a color feature and a gradient feature of the pixel of the image to be processed, and construct a mixed feature vector of the image to be processed;

The clustering module 102 can be configured to cluster the mixed feature vectors to obtain clusters after clustering;

The clustering probability module 103 may be configured to calculate a clustering probability of the cluster according to a predetermined rule;

a pixel probability module 104, configured to calculate a pixel probability of a pixel point in the cluster based on a probability of the clustering;

The detecting module 105 is configured to detect the image to be processed based on the pixel probability to acquire a target area.

In a specific implementation process, the feature calculation module 101 may be divided into multiple sub-modules to perform processing of respective processes. FIG. 6 is a schematic diagram of a module structure of an embodiment of a feature calculation module 101 according to the present application. As shown in FIG. 6, the feature calculation module 101 may be configured to include:

The color feature module 1011 is configured to calculate a color feature of the pixel of the image to be processed;

The gradient feature module 1012 can be configured to calculate a gradient feature of the pixel of the image to be processed;

The blending feature module 1013 can be configured to combine the color features and the gradient features to form a mixed feature vector of the image to be processed.

FIG. 7 is a schematic diagram of a module structure of an embodiment of a feature calculation module 1011 according to the present application. As shown in FIG. 7, the color feature module 1011 may include:

The Lab conversion module 111 can be configured to convert the image to be processed into data in a Lab format;

The color feature vector module 112 may be configured to extract pixel points of the neighborhood window in the image to be processed centering on the pixel to be processed, and divide the three channels L, a, and b of the pixel in the neighborhood window into two K groups, forming a color feature vector of 3*K dimensions;

The feature calculation module 113 may be configured to add color values of each of the L, a, and b channels of each pixel in the neighborhood window to a dimension corresponding to the color feature vector to form the neighbor. The color characteristics of the pixels to be processed in the domain window.

Through the above module processing, the color characteristics of the image to be processed can be obtained. The present application provides a value range of K for the device, and the specific value of the K may be: 6 ≤ K ≤ 16, and the color feature vector extracted by the device of the present application can be ensured within the above range. Effectively and appropriately express the color characteristics of the image to be processed.

The clustering probability module 103 in the foregoing device calculates the probability that the cluster belongs to the body region, and may specifically include:

a distance and calculation module, which can be used to calculate a distance between each cluster and other clusters in the cluster;

The clustering probability calculation module may be configured to calculate a clustering probability of the cluster according to the cluster and the sum of the distance sums of all clusters.

In a preferred embodiment of the image area detecting apparatus of the present application, the distance calculating module calculates the distance and specificity of each cluster in the cluster from other clusters, and may include:

The distance and D(Ci) of each cluster in the cluster from other clusters are calculated using the following formula:

In the above formula, L is the number of clusters, ||c _i , c _j || is the Euclidean distance of the mixed feature vector of the cluster center of the current cluster Ci and the clustered feature vector of the cluster cluster of other clusters, Wj is a weight set according to the pixel points included in the current cluster Ci.

8 is a block diagram of an embodiment of a pixel probability module 104 of the present application. As shown in FIG. 8, the pixel probability module 104 may include at least one of the following:

The first probability module 1041 may be configured to use the clustering probability of the cluster to which the pixel belongs to be the pixel probability of the pixel;

The second probability module 1042 may be configured to extract a pixel point of the first neighborhood window W(p)′ centering on the pixel point p to be obtained, and calculate a pixel probability Sal(p) of the pixel point p to be obtained by using the following formula: :

In the above formula, P(q) is the probability that the cluster to which the pixel point q in the first neighborhood window W(p)' belongs belongs to the body region, and t is the pixel point in the cluster to which the pixel point p to be sought belongs The number of σ is a smoothing parameter set.

The extraction module 105 may extract a body region of the image to be processed by using different extraction methods set in advance. Specifically, at least one of the following modules may be included:

a first extraction module, configured to use, as a target area of the to-be-processed image, a pixel point in which a pixel probability value of a pixel in the image to be processed meets a determination threshold PV requirement;

The second extraction module may be configured to use, as a seed pixel point, a PF pixel point whose probability value of the pixel in the image to be processed belongs to the body region is greater than the first threshold; and may also be used to calculate and center the pixel pixel a Euclidean distance of a pixel in the surrounding second neighborhood window; and may also be used to use the pixel point whose Euclidean distance is less than the second threshold as a new seed pixel point; and may also be used to traverse all of the seed pixel points and surrounding areas Determining the Euclidean distance of the pixel in the second neighborhood window and making a judgment, and using the calculated seed pixel as the target area of the image to be processed.

In the image area detecting device described above, the value of the determination threshold PV may be: 0.8 ≤ PV ≤ 0.95;

and / or,

The value of the first threshold PF may be: 0.8≤PF≤0.95.

The determination threshold PV or the value range of the first threshold PF provided in this embodiment can effectively ensure the correctness and validity of the extraction of the main body region, and improve the accuracy of the image region detection with high image complexity.

An image area detecting device according to the present application can be used in a platform type e-commerce website to separate a body area and a background area in a complex and varied product image, and can effectively cope with various complicated situations in an actual image scene. It can accurately and effectively separate the main body area in complex images and improve the accuracy of image detection.

An image area detecting apparatus described in the present application can be used in a variety of terminal devices, such as a mapping application of a user mobile client, or a client or server dedicated to image body or background area extraction. Generally, after performing image detection and acquiring a target area, the image detecting apparatus may save or display the image of the acquired target area to the user for further processing. The present application provides an image area detecting apparatus, which can be applied to process an image of a user or a client, perform image detection, and acquire a target area. Specifically, the device may be configured to include:

The first processing unit may be configured to acquire a to-be-processed image of the user/client, calculate a color feature and a gradient feature of the pixel of the image to be processed, and construct a mixed feature vector of the image to be processed;

a second processing unit, configured to perform clustering on the hybrid feature vector to obtain clustered clusters; and may be further configured to calculate a clustering probability of the cluster according to a predetermined rule, and based on the clustering probability Calculating a pixel probability of a pixel point in the cluster;

The output unit may be configured to acquire a target area of the image to be processed based on the pixel probability, and store or display the acquired target area at a specified location.

The image de-detection device provided in this embodiment can effectively and accurately extract the target area of the to-be-processed picture in the client or the server, and can improve the accuracy of the client-side picture processing user experience or the client/server image information processing.

Although the description of the different image format conversion, clustering method, calculation of a given formula, and the like is mentioned in the present application, the present application is not limited to a format conversion, a clustering method, or the present application, which must be completely standard. The case of the fixed formula. The above description of the various embodiments in the present application is only an application in some embodiments of the present application. The slightly modified processing method may also implement the foregoing embodiments of the present application. Program. Of course, the same application can still be implemented without any inventive variation of the processing method steps described in the above embodiments of the present application, and details are not described herein again.

The unit or module illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. For the convenience of description, the above devices are described as being separately divided into various modules by function. Of course, the functions of the modules may be implemented in the same software or software and/or hardware when implementing the present application, or the modules implementing the same functions may be implemented by multiple sub-modules or a combination of sub-units.

The neighboring technical staff also knows that in addition to implementing the controller in a purely computer readable program code, the controller can be logically programmed by means of logic gates, switches, ASICs, programmable logic The form of the controller and embedded microcontroller, etc. to achieve the same function. Therefore, such a controller can be considered as a hardware component, and a device for internally implementing it for implementing various functions can also be regarded as a structure within a hardware component. Or even a device for implementing various functions can be considered as a software module that can be both a method of implementation and a structure within a hardware component.

The application can be described in the general context of computer-executable instructions executed by a computer, such as a program module. Generally, program modules include routines, programs, objects, components, data structures, classes, and the like that perform particular tasks or implement particular abstract data types. The present application can also be practiced in distributed computing environments where tasks are performed by remote processing devices that are connected through a communication network. In a distributed computing environment, program modules can be located in both local and remote computer storage media including storage devices.

It can be seen from the description of the above embodiments that the technicians in the neighborhood can clearly understand that the present application can be implemented by means of software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product in essence or in the form of a software product, which may be stored in a storage medium such as a ROM/RAM or a disk. , an optical disk, etc., includes instructions for causing a computer device (which may be a personal computer, mobile terminal, server, or network device, etc.) to perform the methods described in various embodiments of the present application or portions of the embodiments.

The various embodiments in the specification are described in a progressive manner, and the same or similar parts between the various embodiments may be referred to each other, and each embodiment focuses on differences from other embodiments. This application can be used in a variety of general purpose or special purpose computer system environments or configurations. For example: personal computer, server computer, handheld or portable device, tablet device, multiprocessor system, microprocessor based system, programmable electronic device, network PC, small computer, mainframe computer, including any of the above systems or The distributed computing environment of the device, and so on.

While the present invention has been described by the embodiments of the present invention, it will be understood by those skilled in the claims .

Claims

An image area detecting method, the method comprising:

Calculating a color feature and a gradient feature of the pixel of the image to be processed, and constructing a mixed feature vector of the image to be processed;

Clustering the mixed feature vectors to obtain clusters after clustering;

Calculating a clustering probability of the cluster according to a predetermined rule, and calculating a pixel probability of the pixel point in the cluster based on the clustering probability;

The image to be processed is detected based on the pixel probability to acquire a target area.
The image area detecting method according to claim 1, wherein the calculating the color characteristics of the pixel points of the image to be processed comprises:

If the image to be processed is not data in the Lab format, converting the data format of the image to be processed into a Lab format;

Extracting pixel points of the neighborhood window in the image to be processed centering on the pixel to be processed, and dividing the three channels L, a, and b of the pixel in the neighborhood window into K groups to form a 3*K dimension. Color feature vector;

Adding color values of the three channels of the L, a, and b in each of the neighboring windows to the dimension corresponding to the color feature vector to form a pixel to be processed in the neighborhood window. Color characteristics.
The image area detecting method according to claim 2, wherein the value of K is: 6 ≤ K ≤ 16.
The image region detecting method according to claim 1, wherein the calculating a clustering probability of the cluster according to a predetermined rule comprises:

A distance sum of each cluster in the cluster to other clusters is calculated, and a ratio of the cluster and the sum of the distance sums of all clusters is used as a clustering probability of the cluster.
The image region detecting method according to claim 4, wherein the calculating a distance between each cluster in the cluster and other clusters comprises:

The distance and D(Ci) of each cluster in the cluster from other clusters are calculated using the following formula:

In the above formula, L is the number of clusters, ||c i , c j || is the Euclidean distance of the mixed feature vector of the cluster center of the current cluster Ci and the clustered feature vector of the cluster cluster of other clusters, Wj is a weight set according to the pixel points included in the current cluster Ci.
The image region detecting method according to claim 1, wherein the calculating a pixel probability of a pixel point in the cluster based on the clustering probability comprises:

The pixel probability of the pixel in the cluster is the clustering probability of the cluster to which the pixel belongs.
The image region detecting method according to claim 1, wherein the calculating a pixel probability of a pixel point in the cluster based on the clustering probability comprises:

The pixel point of the first neighborhood window W(p)' is extracted centering on the pixel point p to be obtained, and the pixel probability Sal(p) of the pixel point p to be obtained is calculated by the following formula:

In the above formula, P(q) is the clustering probability of the cluster to which the pixel point q in the first neighborhood window W(p)' belongs, and t is the number of pixels in the cluster to which the pixel point p to be obtained belongs. The number, σ is the set smoothing parameter.
The image region detecting method according to claim 1, wherein the detecting the image to be processed based on the pixel probability to acquire the target region comprises:

Pixel points whose pixel probability values of pixel points in the image to be processed meet the judgment threshold PV requirement are used as the target area of the image to be processed;

or,

Pixel points whose probability values of pixels in the image to be processed are greater than the first threshold PF are used as seed pixel points;

Calculating an Euclidean distance from a pixel in a surrounding second neighborhood window centering on the seed pixel point;

Pixel points whose Euclidean distance is less than the second threshold are used as new seed pixel points;

And traversing the Euclidean distance of all the seed pixel points and the pixel points in the surrounding second neighborhood window and making a determination, and using the calculated seed pixel point as the target area of the to-be-processed image.
The method for detecting an image region according to claim 8, wherein the value of the determination threshold PV is: 0.8 ≤ PV ≤ 0.95;

or,

The value of the first threshold PF ranges from 0.8 ≤ PF ≤ 0.95.
An image area detecting device, characterized in that the device comprises:

a feature calculation module, configured to calculate a color feature and a gradient feature of the pixel of the image to be processed, and construct a mixed feature vector of the image to be processed;

a clustering module, configured to cluster the mixed feature vectors to obtain clusters after clustering;

a clustering probability module, configured to calculate a clustering probability of the cluster according to a predetermined rule;

a pixel probability module, configured to calculate a pixel probability of a pixel point in the cluster based on the clustering probability;

And a detecting module, configured to detect the image to be processed based on the pixel probability, and acquire a target area.
The image area detecting apparatus according to claim 10, wherein the feature calculating module comprises:

a color feature module, configured to calculate a color feature of the pixel of the image to be processed;

a gradient feature module, configured to calculate a gradient feature of the pixel of the image to be processed;

A hybrid feature module is configured to combine the color feature and the gradient feature to form a mixed feature vector of the image to be processed.
The image area detecting apparatus according to claim 11, wherein the color feature module comprises:

a Lab conversion module, configured to convert the image to be processed into data in a Lab format;

a color feature vector module, configured to extract pixel points of the neighborhood window in the image to be processed centering on the pixel to be processed, and divide the three channels L, a, and b of the pixel in the neighborhood window into K Grouping to form a color feature vector of 3*K dimensions;

a feature calculation module, configured to accumulate color values of each of the L, a, and b channels in each of the neighboring windows in a dimension corresponding to the color feature vector to form the neighborhood window The color characteristics of the pixel to be processed.
The image region detecting device according to claim 12, wherein the value range of K in the color feature vector module is: 6 ≤ K ≤ 16.
The image region detecting apparatus according to claim 10, wherein the clustering probability module comprises:

a distance and calculation module for calculating a distance between each cluster and the other clusters in the cluster;

a clustering probability calculation module, configured to calculate a clustering probability of the cluster according to the cluster and the sum of the distance sums of all clusters.
The image region detecting apparatus according to claim 14, wherein the distance calculating module calculates a distance between each cluster in the cluster and other clusters and includes:

The distance and D(Ci) of each cluster in the cluster from other clusters are calculated using the following formula:

In the above formula, L is the number of clusters, ||c i , c j || is the Euclidean distance of the mixed feature vector of the cluster center of the current cluster Ci and the clustered feature vector of the cluster cluster of other clusters, Wj is a weight set according to the pixel points included in the current cluster Ci.
An image region detecting apparatus according to claim 10, wherein said pixel probability module comprises at least one of the following:

a first probability module, configured to use a clustering probability of a cluster to which the pixel belongs is a pixel probability of the pixel;

a second probability module, configured to extract a pixel point of the first neighborhood window W(p)' centering on the pixel point p to be obtained, and calculate a pixel probability Sal(p) of the pixel point p to be obtained by using the following formula:

In the above formula, P(q) is the probability that the cluster to which the pixel point q in the first neighborhood window W(p)' belongs belongs to the body region, and t is the pixel point in the cluster to which the pixel point p to be sought belongs The number of σ is the set smoothing parameter.
An image region detecting apparatus according to claim 10, wherein said extraction module comprises at least one of the following:

a first extraction module, configured to use, as a target area of the to-be-processed image, a pixel point whose pixel probability value of the pixel in the image to be processed meets the requirement of the determination threshold PV;

a second extraction module, configured to use, as a seed pixel point, a pixel point whose probability value of the pixel in the image to be processed belongs to the body region is greater than the first threshold PF; The Euclidean distance of the pixel in the two neighboring window; the pixel point of the Euclidean distance less than the second threshold is also used as a new seed pixel; and is further used to traverse all the seed pixels and the surrounding second neighbor The Euclidean distance of the pixel in the domain window is determined and the calculated seed pixel is used as the target area of the image to be processed.
The image area detecting apparatus according to claim 17, wherein the value of the determination threshold PV is: 0.8 ≤ PV ≤ 0.95;

and / or,

The value of the first threshold PF ranges from 0.8 ≤ PF ≤ 0.95.
An image area detecting apparatus, wherein the apparatus is configured to include:

a first processing unit, configured to acquire a to-be-processed image of the user/client, calculate a color feature and a gradient feature of the pixel of the image to be processed, and construct a mixed feature vector of the image to be processed;

a second processing unit, configured to cluster the mixed feature vector to obtain clustered clusters; and further configured to calculate a clustering probability of the cluster according to a predetermined rule, and calculate the clustering probability based on the clustering probability The pixel probability of a pixel in the cluster;

And an output unit, configured to acquire a target area of the to-be-processed image based on the pixel probability, and store or display the acquired target area in a specified location.