WO2020211522A1 - Method and device for detecting a salient region of an image (图像的显著性区域的检测方法和装置) - Google Patents

Method and device for detecting a salient region of an image

Info

Publication number
WO2020211522A1
Authority
WO
WIPO (PCT)
Prior art keywords: feature, image, brightness, pixel, map
Prior art date
Application number: PCT/CN2020/076000
Other languages: English (en); French (fr)
Inventors: 张欢欢, 许景涛, 唐小军, 李慧
Original Assignee: 京东方科技集团股份有限公司 (BOE Technology Group Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by 京东方科技集团股份有限公司
Publication of WO2020211522A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 5/00: Image enhancement or restoration
    • G06T 5/50: Image enhancement or restoration by the use of more than one image, e.g. averaging, subtraction
    • G06T 7/00: Image analysis
    • G06T 7/10: Segmentation; Edge detection
    • G06T 7/11: Region-based segmentation
    • G06T 7/136: Segmentation; Edge detection involving thresholding
    • G06T 7/90: Determination of colour characteristics
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00: Arrangements for image or video recognition or understanding
    • G06V 10/40: Extraction of image or video features
    • G06V 10/46: Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V 10/462: Salient features, e.g. scale invariant feature transforms [SIFT]

Definitions

  • the present disclosure relates to the field of image processing technology, and in particular to a method and device for detecting a salient region of an image, a computer device, and a computer-readable storage medium.
  • the saliency of an image is an important visual feature of an image, and it reflects the degree to which the human eye attaches importance to certain areas of the image.
  • in image processing, it is often necessary to use a saliency detection algorithm to detect the image and obtain its salient region. Such detection is mainly used in mobile phone camera software, target detection software, and image compression software.
  • one way to obtain the saliency area of an image is to detect the saliency area of the image based on a pure mathematical calculation method.
  • the detection accuracy of this method is not high, and its results differ from human visual perception.
  • Another way to obtain the saliency area of the image is to detect the saliency area of the image based on the deep learning method.
  • this method depends on the selected training samples, places high demands on hardware, and performs poorly in real time. Therefore, how to improve the speed and accuracy of image salient region detection is a technical problem to be solved urgently.
  • the first aspect of the embodiments of the present disclosure provides a method for detecting a salient area of an image, including:
  • the salient region is obtained according to the saliency map.
  • the detection method further includes: binarizing the saliency map by using an adaptive threshold binarization algorithm to obtain a binary image,
  • wherein obtaining the salient region according to the saliency map includes: obtaining the salient region according to the binary image.
  • the acquiring the saliency region according to the binary image includes:
  • the connected domain is regarded as a saliency area.
  • performing connected domain labeling on the pixels of the binary image and merging pixels with the same connected domain label into one connected domain includes:
  • if the gray value of a pixel is 1 and it is unmarked, it is taken as a seed pixel, the connected domain label of the seed pixel is marked as N, and the 8 neighboring pixels of the seed pixel are traversed to determine whether the gray value of any unmarked pixel among the 8 neighboring pixels of the seed pixel is 1;
  • if the gray value of at least one unmarked pixel among the 8 neighboring pixels of the seed pixel is 1, the connected domain labels of the unmarked pixels with gray value 1 among those 8 neighboring pixels are marked as N, and all unmarked pixels with gray value 1 among the 8 neighboring pixels are taken as seed pixels;
  • the 8 neighboring pixels of the pixel with coordinates (x, y) are the pixels with coordinates (x-1, y-1), (x-1, y), (x-1, y+1), (x, y-1), (x, y+1), (x+1, y-1), (x+1, y), and (x+1, y+1);
  • x is the row index of the pixel in the binary image, and
  • y is the column index of the pixel in the binary image.
  • the extracting the brightness feature of the image and obtaining at least two first brightness feature maps by using the image pyramid and the first multi-scale calculation relational expression includes:
  • according to the three primary color components of each pixel in the image, the brightness feature corresponding to each pixel is calculated using I = (R + G + B)/3 (the standard RGB-to-HSI intensity conversion) to obtain the brightness feature intermediate map, where the three primary color components of each pixel include the red, green, and blue components, R is the red component, G is the green component, B is the blue component, and I is the brightness feature;
  • the extracting the hue feature of the image and using the image pyramid and the second multi-scale calculation relational expression to obtain at least two first hue feature maps includes:
  • according to the three primary color components of each pixel in the image, the hue feature corresponding to each pixel is calculated using the standard RGB-to-HSI hue conversion, H = θ when B ≤ G and H = 360° − θ otherwise, where θ = arccos{[(R − G) + (R − B)] / [2·sqrt((R − G)² + (R − B)(G − B))]}, to obtain the hue feature intermediate map, where the three primary color components of each pixel include the red, green, and blue components, R is the red component, G is the green component, B is the blue component, and H is the hue feature;
  • the at least two first brightness feature maps are respectively normalized, and the normalized at least two first brightness feature maps are fused into a second brightness feature map, including:
  • for each first brightness feature map, traversing the brightness features of the first brightness feature map to obtain its first brightness feature maximum I1max and first brightness feature minimum I1min, and normalizing the brightness features of the first brightness feature map to between 0 and P according to the formula I' = P(I − I1min)/(I1max − I1min), where I represents the brightness feature value of a pixel in the first brightness feature map;
  • the normalized at least two first brightness feature maps are fused into a second brightness feature map by weighted averaging.
  • the at least two first hue feature maps are respectively normalized, and the normalized at least two first hue feature maps are fused into a second hue feature map, including:
  • for each first hue feature map, traversing the hue features of the first hue feature map to obtain its first hue feature maximum H1max and first hue feature minimum H1min, and normalizing the hue features of the first hue feature map to between 0 and P according to the formula H' = P(H − H1min)/(H1max − H1min), where H represents the hue feature value of a pixel of the first hue feature map;
  • for each pixel having an 8-neighborhood, the 8-neighborhood hue feature maximum and the 8-neighborhood hue feature minimum are obtained;
  • all the normalized first hue feature maps are fused into a second hue feature map by weighted averaging.
  • a second aspect of the embodiments of the present disclosure provides a computer device, including a storage unit and a processing unit, wherein the storage unit stores a computer program that can run on the processing unit, and the processing unit implements the above-described method for detecting a salient region of an image when executing the computer program.
  • a third aspect of the embodiments of the present disclosure provides a computer-readable storage medium that stores a computer program, where the computer program is executed by a processor to realize the above-mentioned method for detecting a salient region of an image.
  • a fourth aspect of the embodiments of the present disclosure provides a device for detecting a salient region of an image, including:
  • the extraction module is configured to extract the brightness feature of the image and obtain at least two first brightness feature maps by using the image pyramid and the first multi-scale calculation relation, and is further configured to extract the hue feature of the image and obtain at least two first hue feature maps by using the image pyramid and the second multi-scale calculation relation;
  • the fusion module is configured to normalize the at least two first brightness feature maps respectively and fuse the normalized at least two first brightness feature maps into a second brightness feature map, is further configured to normalize the at least two first hue feature maps respectively and fuse the normalized at least two first hue feature maps into a second hue feature map, and is further configured to fuse the second brightness feature map and the second hue feature map into a saliency map; and
  • the acquiring module is configured to acquire the saliency area according to the saliency map.
  • the acquisition module is further configured to use an adaptive threshold binarization algorithm to perform binarization processing on the saliency map to obtain a binary image; and obtain the saliency region according to the binary image .
  • the acquisition module is further configured to mark pixels of the binary image with connected domains, merge all pixels with the same connected domain label into one connected domain; and use the connected domain as a saliency area.
  • the extraction module is further configured to:
  • according to the three primary color components of each pixel in the image, the brightness feature corresponding to each pixel is calculated using I = (R + G + B)/3 (the standard RGB-to-HSI intensity conversion) to obtain the brightness feature intermediate map, where the three primary color components of each pixel include the red, green, and blue components, R is the red component, G is the green component, B is the blue component, and I is the brightness feature;
  • the extraction module is further configured to:
  • calculate the hue feature corresponding to each pixel using the RGB-to-HSI hue conversion given above, obtaining the hue feature intermediate map,
  • where the three primary color components of each pixel include the red, green, and blue components, R is the red component, G is the green component, B is the blue component, and H is the hue feature;
  • FIG. 1 is a flowchart of a method for detecting a salient region of an image provided by an embodiment of the disclosure
  • FIG. 2 is a flowchart of another method for detecting a salient region of an image provided by an embodiment of the disclosure
  • FIG. 3 is a flowchart of another method for detecting a salient region of an image provided by an embodiment of the disclosure
  • FIG. 4 is a flowchart of another method for detecting a salient region of an image provided by an embodiment of the disclosure.
  • FIG. 5 is a flowchart of another method for detecting a salient region of an image provided by an embodiment of the disclosure.
  • FIG. 6 is a flowchart of another method for detecting a salient region of an image provided by an embodiment of the disclosure.
  • FIG. 7 is a flowchart of yet another method for detecting a salient region of an image provided by an embodiment of the disclosure.
  • FIG. 8 is a flowchart of yet another method for detecting a salient region of an image provided by an embodiment of the disclosure.
  • FIG. 9 is a schematic diagram of modules of an apparatus for detecting a saliency region of an image provided by an embodiment of the disclosure.
  • FIG. 10 is a block diagram of a computer device provided by an embodiment of the disclosure.
  • the embodiment of the present disclosure provides a method for detecting a salient region of an image, as shown in FIG. 1, including:
  • brightness refers to the perception of the brightness of the light source or object by the human eye.
  • the brightness feature can show the brightness of the color.
  • the brightness information of the color of each pixel of the image can be obtained, that is, the brightness feature of each pixel of the image can be obtained.
  • An effective and simple conceptual structure for representing images at multiple resolutions is the image pyramid.
  • the image pyramid was originally used for machine vision and image compression.
  • An image pyramid is a series of image collections arranged in a pyramid shape and gradually reduced in resolution.
  • the Gaussian pyramid is essentially a multi-scale representation of an image; that is, the same image is Gaussian-blurred and down-sampled multiple times to produce multiple images at different scales for subsequent processing.
  • the image from which the brightness feature is extracted is Gaussian-blurred multiple times based on the Gaussian pyramid and down-sampled, generating multiple images that carry brightness feature information at different scales; these are then operated on according to the first multi-scale calculation relation to obtain at least two first brightness feature maps.
  • the first multi-scale calculation relation is obtained, for example, according to the center-surround difference calculation method, and is used to calculate the contrast information in the multiple images with brightness feature information at different scales generated by the Gaussian pyramid.
  • the Gaussian pyramid corresponds to a series of Gaussian filters whose cut-off frequencies are spaced by a factor of 2 from one layer to the next.
  • the hue is the comprehensive effect of the various spectral components reflected by the object under sunlight on the human eye, that is, the type of color.
  • the color of the same hue refers to a series of colors with similar combination ratios of the three primary colors in the color components, which show a vivid color tendency in appearance. For example, vermilion, scarlet, and pink are all red hues.
  • hue features can be used to describe color attributes, such as yellow, orange, or red.
  • the color attribute information of each pixel of the image can be obtained, that is, the hue feature of each pixel of the image can be obtained.
  • the image pyramid is essentially the same as the image pyramid used to obtain the first brightness feature map, and is a multi-scale representation of the image.
  • the image from which the hue feature is extracted can likewise be Gaussian-blurred multiple times and down-sampled to generate multiple images carrying hue feature information at different scales, which are then operated on according to the second multi-scale calculation relation to obtain at least two first hue feature maps.
  • the second multi-scale calculation relational expression is also obtained according to the central peripheral difference calculation method, and is used to calculate the contrast information in multiple images with tone feature information at different scales generated by the Gaussian pyramid.
  • S30 Perform normalization processing on the at least two first brightness feature maps, respectively, and merge the at least two first brightness feature maps after the normalization process into a second brightness feature map.
  • the normalization process is to unify the brightness characteristics of each first brightness feature map to the same order of magnitude to provide higher-precision data information for subsequent processing.
  • S40 Perform normalization processing on at least two first tone feature maps, and merge the at least two first tone feature maps after the normalization processing into a second tone feature map.
  • the normalization process is to unify the tone characteristics of each first tone feature map to the same order of magnitude, and provide higher-precision data information for subsequent processing.
  • when two or more first hue feature maps are obtained in S20, an interpolation operation needs to be performed first: the smaller-scale first hue feature maps are enlarged so that they have the same scale as the larger-scale first hue feature maps, and the maps are then fused to form a second hue feature map.
  • the second brightness feature map and the second hue feature map can be fused by a weighted average method; that is, for the pixels at corresponding positions (same row, same column) in the two maps, the brightness feature and the hue feature are weighted and averaged in turn, so that the second brightness feature map and the second hue feature map are fused into a saliency map.
  • the weight value in the weighted average method can be set as required, which is not limited in the present disclosure.
  • for example, with equal weights, the brightness feature of the pixel in the first row and first column of the second brightness feature map is added to the hue feature of the pixel in the first row and first column of the second hue feature map, and the sum is divided by 2; the result is the value of the pixel in the first row and first column of the saliency map. The other pixels are handled by analogy, so that the two maps are fused into a saliency map.
  • the visual attention mechanism of the human visual system allows people to gradually exclude relatively unimportant information from complex scenes and to select important, necessary information as the target of attention, processing it with priority. As described above, the present disclosure is based on the principle of simulating the human visual attention mechanism: it excludes other information from the image and specifically extracts the brightness feature and the hue feature; the first brightness feature maps are normalized and then fused into the second brightness feature map, the first hue feature maps are normalized and then fused into the second hue feature map, and the second brightness feature map and the second hue feature map are then fused into a saliency map. This two-stage processing and fusion makes the obtained saliency map better approximate the targets that the human eye attends to.
  • the area in the saliency map that produces visual contrast and attracts attention is called the salient region.
  • the salient area can also be called the foreground, and the remaining areas are called the background.
  • the use of the saliency area can eliminate the interference of other background areas in the saliency map and directly approach the user's detection intention, which is beneficial to the improvement of detection performance.
  • the embodiments of the present disclosure provide a method for detecting the saliency region of an image.
  • the first brightness feature map is generated by using the image pyramid and the first multi-scale calculation relationship
  • the first hue feature map is generated using the image pyramid and the second multi-scale calculation relationship
  • the first brightness feature maps are normalized and then fused into the second brightness feature map
  • the first color feature map is normalized and then fused into the second color feature map
  • the second brightness feature map and the second color feature map are fused into a saliency map.
  • the saliency area can be extracted from the saliency map.
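To make the flow concrete, the following is a minimal sketch of steps S10–S60 in Python (assuming numpy; the helper names are hypothetical and are sketched after the corresponding steps below, and the equal fusion weights are one possible choice, since the disclosure leaves the weights configurable):

```python
import numpy as np

def detect_salient_regions(image_rgb: np.ndarray) -> np.ndarray:
    """Sketch of the saliency-region detection pipeline (S10-S60)."""
    I_mid, H_mid = intensity_and_hue(image_rgb)             # brightness/hue intermediate maps
    I_maps = center_surround_maps(gaussian_pyramid(I_mid))  # first brightness feature maps (S10)
    H_maps = center_surround_maps(gaussian_pyramid(H_mid))  # first hue feature maps (S20)
    I2 = fuse_normalized(I_maps)                            # second brightness feature map (S30)
    H2 = fuse_normalized(H_maps)                            # second hue feature map (S40)
    saliency = 0.5 * I2 + 0.5 * H2                          # weighted-average fusion (S50)
    binary = otsu_binarize(saliency)                        # adaptive threshold binarization
    return label_connected_domains(binary)                  # connected domains = salient regions (S60)
```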
  • the brightness feature of the image is extracted, and at least two first brightness feature maps are obtained by using the image pyramid and the first multi-scale calculation relationship, as shown in FIG. 2, including:
  • according to the three primary color components of each pixel in the image, the brightness feature corresponding to each pixel is calculated, and the brightness feature intermediate map is obtained.
  • the three primary color components of each pixel include red, green and blue components.
  • R is the red component
  • G is the green component
  • B is the blue component.
  • I is the brightness characteristic.
  • the RGB color space does not describe images in a way that matches human perception and cannot correctly express the true differences between colors as perceived by the human eye; brightness, hue, and saturation are used instead.
  • the HIS color space is more consistent with the human eye's perception of colors. Therefore, the RGB color space can be non-linearly transformed into the HIS color space, which can describe images more naturally and intuitively.
  • HIS color space uses H (Hue), I (Intensity), and S (Saturation) to represent hue, brightness and saturation.
  • from the conversion of the RGB color space into the HIS color space, the relationship between the red, green, and blue components and brightness is I = (R + G + B)/3 (the standard HSI intensity); given an image in RGB format, this formula can be used to calculate the brightness feature of each pixel.
  • the brightness characteristics of all pixels constitute the middle image of the brightness characteristics.
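As a minimal sketch of this conversion (assuming numpy, RGB values scaled to [0, 1], and the standard RGB-to-HSI formulas; the function name is illustrative):

```python
import numpy as np

def intensity_and_hue(rgb: np.ndarray):
    """RGB image (H x W x 3, floats in [0, 1]) -> (intensity map, hue map)."""
    R, G, B = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    I = (R + G + B) / 3.0                                    # HSI intensity
    num = 0.5 * ((R - G) + (R - B))
    den = np.sqrt((R - G) ** 2 + (R - B) * (G - B)) + 1e-12  # guard against division by zero
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    H = np.where(B <= G, theta, 360.0 - theta)               # HSI hue, in degrees
    return I, H
```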
  • S102 Input the brightness feature intermediate map into the image pyramid to obtain sub-brightness feature intermediate maps at M scales, where the image pyramid is a Gaussian pyramid with M layers, and M ≥ 7.
  • the brightness feature intermediate image is input into a Gaussian pyramid, and a series of Gaussian filters are used to filter and sample the brightness feature intermediate image.
  • the 0th layer is the original image of the luminance feature intermediate image, and the size remains unchanged.
  • the large-scale luminance feature intermediate image of the 0th layer is convolved with the Gaussian filter to obtain the small-scale luminance feature intermediate image of the first layer.
  • the other layers are followed by analogy.
  • the size of the Gaussian kernel of the Gaussian filter determines the degree of blurring of the image. The smaller the Gaussian kernel, the lighter the blurring is, and the larger the Gaussian kernel is, the more serious the blurring.
  • the size of the Gaussian kernel can be selected as required. For example, a Gaussian kernel of 5 by 5 pixels may be used to filter and sample the luminance feature intermediate image.
  • the more Gaussian filters included in the Gaussian pyramid, the more levels of the Gaussian pyramid, and the more intermediate images of brightness features of different scales are obtained.
  • the higher the Gaussian pyramid level the smaller the scale of the corresponding brightness feature middle image and the lower the resolution.
  • the number of levels of the Gaussian pyramid is determined by the size of the input brightness feature intermediate map: the larger the input map, the more Gaussian pyramid levels are set; the smaller the input map, the fewer levels are set.
  • the following provides a method for inputting the brightness feature intermediate image into a Gaussian pyramid to obtain the sub-luminance feature intermediate image of M scales to clearly describe the implementation process.
  • the level of the Gaussian pyramid is determined to be 9 layers, that is, the 0th to the 8th layer.
  • the 0th layer is the original image of the brightness feature middle image, and the size is unchanged.
  • for the first layer, the original brightness feature intermediate map of the 0th layer is filtered and sampled with the Gaussian filter so that its length and width are each halved and its area becomes 1/4, making its size 1/2 that of the original brightness feature intermediate map.
  • for the second layer, the half-size brightness feature intermediate map obtained at the first layer is filtered and sampled with the Gaussian filter in the same way, so that its length and width are again halved and its area becomes 1/4 of the first layer's, making its size 1/4 that of the original brightness feature intermediate map.
  • the process from the 3rd layer to the 8th layer is repeated in turn in the same way, yielding six further sub-brightness feature intermediate maps at 1/8, 1/16, 1/32, 1/64, 1/128, and 1/256 of the size of the original brightness feature intermediate map.
  • in total, sub-brightness feature intermediate maps at 9 scales are obtained: 1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128, and 1/256 times the input brightness feature intermediate map.
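A minimal sketch of this construction, assuming OpenCV (cv2.pyrDown blurs with a 5x5 Gaussian kernel and halves each dimension, matching the 5-by-5 example above; the function name is illustrative):

```python
import cv2

def gaussian_pyramid(img, levels=9):
    """Level 0 is the input map; each further level is Gaussian-blurred and
    down-sampled so width and height are halved (scales 1, 1/2, ..., 1/256)."""
    pyramid = [img]
    for _ in range(1, levels):
        pyramid.append(cv2.pyrDown(pyramid[-1]))
    return pyramid
```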
  • the first multi-scale calculation relation I(c,s) = |I(c) Θ I(s)| is called the center-surround difference calculation; it is designed according to the physiological structure of the human eye and is used to calculate the contrast information in the image I(c,s).
  • the human eye's receptive field reacts strongly to the features with high contrast in the input of visual information, such as the situation where the center is bright and the periphery is dark, which belongs to the visual information with large contrast.
  • the larger-scale sub-brightness feature intermediate map carries more detailed information, while the smaller-scale sub-brightness feature intermediate map, owing to the filtering and sampling operations, better reflects the local background information; therefore, a cross-scale subtraction is performed between a larger-scale and a smaller-scale sub-brightness feature intermediate map to obtain the contrast between the local center and the surrounding background.
  • the cross-scale subtraction operates as follows: the smaller-scale sub-brightness feature intermediate map, which represents the surrounding background, is linearly interpolated to the same size as the larger-scale map representing the central information, and a pixel-by-pixel subtraction, i.e., the center-surround difference, is then performed; this cross-scale operation is denoted by the symbol Θ.
  • 9 levels of Gaussian pyramids are used to obtain 9 sub-luminance feature intermediate images.
  • I(2), I(3), and I(4) are selected as the sub-brightness feature intermediate maps representing the central information, where c ∈ {2, 3, 4}; that is, the second, third, and fourth layers are selected.
  • for example, for I(2,5), the second layer is selected as the sub-brightness feature intermediate map representing the central information and the fifth layer as the one representing the surrounding background; the fifth-layer map is interpolated and enlarged until it has the same size as the second-layer map, and the brightness features of pixels at corresponding positions in the second- and fifth-layer maps are then subtracted in turn, yielding one first brightness feature map.
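A minimal sketch of the center-surround operation, assuming OpenCV and the nine-level pyramid above; with c ∈ {2, 3, 4} and δ ∈ {3, 4} it produces the six maps I(2,5), I(2,6), I(3,6), I(3,7), I(4,7), and I(4,8) (the function name is illustrative):

```python
import cv2

def center_surround_maps(pyramid, centers=(2, 3, 4), deltas=(3, 4)):
    """First feature maps F(c,s) = |F(c) - upsample(F(s))| with s = c + delta."""
    maps = []
    for c in centers:
        for delta in deltas:
            s = c + delta
            if s >= len(pyramid):
                continue
            h, w = pyramid[c].shape[:2]
            surround = cv2.resize(pyramid[s], (w, h),       # linear interpolation upscaling
                                  interpolation=cv2.INTER_LINEAR)
            maps.append(cv2.absdiff(pyramid[c], surround))  # center-surround difference
    return maps
```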
  • S30 normalizes the first brightness feature map, and fuses all the normalized first brightness feature maps into a second brightness feature map, as shown in FIG. 3, including:
  • the maximum value of the brightness feature is set to P.
  • for each first brightness feature map, traverse its brightness features to obtain the first brightness feature maximum I1max and the first brightness feature minimum I1min of that map, and normalize the brightness features of the first brightness feature map to between 0 and P according to the formula I' = P(I − I1min)/(I1max − I1min), where I represents the brightness feature value of a pixel in the first brightness feature map.
  • the brightness information of the color of each pixel of the image can be obtained, that is, the brightness feature value of each pixel of the image can be obtained.
  • each first brightness feature map is traversed to obtain the brightness feature value of each pixel, the first brightness feature maximum and the first brightness feature minimum of that map are found, and its brightness features are normalized to between 0 and P according to the above formula. On this basis, the six first brightness feature maps can be unified to the same order of magnitude, amplitude differences are eliminated, and accuracy is improved.
  • in the first brightness feature map, each pixel is surrounded by at most 8 neighboring pixels.
  • the brightness feature values of the 8 neighboring pixels are compared with each other, and the value with the largest brightness feature is taken as the 8-neighbor maximum value of the pixel, and the value with the smallest brightness feature is taken as the 8-neighbor minimum value of the pixel.
  • the pixel is also compared as one of the 8 neighborhood pixels of other pixels.
  • S107 Average the maximum value of all 8 neighborhoods and the minimum value of all 8 neighborhoods to obtain the brightness feature average value Q.
  • for the six first brightness feature maps whose brightness feature values were normalized to between 0 and P in S105, the brightness feature value of each pixel of each first brightness feature map is multiplied by (P − Q)² in turn.
  • in this way, the potential salient regions in each first brightness feature map are amplified, so that the brightness features at those locations stand out more from the background.
  • S109 Traverse the brightness features of the first brightness feature map to obtain the second brightness feature maximum I2max and the second brightness feature minimum I2min, and normalize the brightness features of the first brightness feature map to between 0 and 1 according to the formula I'' = (I − I2min)/(I2max − I2min).
  • for each first brightness feature map obtained in S108 by multiplying each pixel's brightness feature value by (P − Q)², the map is traversed to obtain the value of each pixel, the corresponding second brightness feature maximum and minimum are found, and the brightness features of each pixel in the first brightness feature map are normalized to between 0 and 1, further improving the accuracy of the six first brightness feature maps.
  • the six first brightness feature maps obtained in S109 are merged into a second brightness feature map through a weighted average method, which improves the accuracy of the potential saliency region.
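A minimal sketch of S105–S110, assuming numpy and SciPy (a 3x3 sliding window is used here as a stand-in for the 8-neighborhood extrema, equal fusion weights are one possible choice, and the function name is illustrative):

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def fuse_normalized(feature_maps, P=1.0):
    """Normalize each map to [0, P], amplify by (P - Q)^2 where Q averages the
    local neighborhood extrema, rescale to [0, 1], then average the maps."""
    processed = []
    for m in feature_maps:
        m = P * (m - m.min()) / (m.max() - m.min() + 1e-12)  # S105: scale to [0, P]
        local_max = maximum_filter(m, size=3)  # 3x3 maxima (approximates the 8-neighborhood)
        local_min = minimum_filter(m, size=3)  # 3x3 minima (approximates the 8-neighborhood)
        Q = 0.5 * (local_max.mean() + local_min.mean())      # S107: feature average Q
        m = m * (P - Q) ** 2                                 # S108: amplify salient spots
        m = (m - m.min()) / (m.max() - m.min() + 1e-12)      # S109: rescale to [0, 1]
        processed.append(m)
    return np.mean(processed, axis=0)                        # S110: weighted-average fusion
```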
  • the tonal feature of the image is extracted, and at least two first tonal feature maps are obtained by using the image pyramid and the second multi-scale calculation relation, as shown in FIG. 4, including:
  • the hue feature corresponding to each pixel is calculated, and the hue feature intermediate image is obtained.
  • the three primary color components of each pixel include red, green and blue components, R is the red component, G is the green component, and B is the blue component , H is the hue feature.
  • from the conversion of the RGB color space into the HIS color space, the relationship between the red, green, and blue components and hue is H = θ when B ≤ G and H = 360° − θ otherwise, where θ = arccos{[(R − G) + (R − B)] / [2·sqrt((R − G)² + (R − B)(G − B))]} (the standard HSI hue); given an image in RGB format, this formula can be used to calculate the hue feature of each pixel.
  • the hue characteristics of all pixels constitute an intermediate image of hue characteristics.
  • the tone feature intermediate image is input into a Gaussian pyramid, and a series of Gaussian filters are used to filter and sample the tone feature intermediate image.
  • the 0th layer is the original image of the tone feature intermediate image, and the size remains unchanged.
  • the large-scale tone feature intermediate image of the 0th layer is convolved with the Gaussian filter to obtain the small-scale tone feature intermediate image of the first layer.
  • the other layers are followed by analogy.
  • the size of the Gaussian kernel of the Gaussian filter determines the degree of blurring of the image. The smaller the Gaussian kernel, the lighter the blurring is, and the larger the Gaussian kernel is, the more serious the blurring.
  • the size of the Gaussian kernel can be selected as required. For example, a 5 by 5 pixel Gaussian kernel can be used to filter and sample the tone feature intermediate image.
  • the more Gaussian filters included in the Gaussian pyramid, the more levels of the Gaussian pyramid, and the more tonal feature intermediate images of different scales are obtained.
  • the higher the Gaussian pyramid level the smaller the scale of the corresponding tonal feature intermediate image and the lower the resolution.
  • the number of levels of the Gaussian pyramid is determined by the size of the input hue feature intermediate map: the larger the input map, the more Gaussian pyramid levels are set; the smaller the input map, the fewer levels are set.
  • the following provides a method of inputting the tone feature intermediate image into a Gaussian pyramid to obtain the sub-tone feature intermediate image of M scales to clearly describe the implementation process.
  • the level of the Gaussian pyramid is 9 layers, that is, the 0th to the 8th layer.
  • the 0th layer is the original image of the tone feature intermediate image, and the size remains unchanged.
  • for the first layer, the original hue feature intermediate map of the 0th layer is filtered and sampled with the Gaussian filter so that its length and width are each halved and its area becomes 1/4, making its size 1/2 that of the original hue feature intermediate map.
  • for the second layer, the half-size hue feature intermediate map obtained at the first layer is filtered and sampled with the Gaussian filter in the same way, making its size 1/4 that of the original hue feature intermediate map.
  • the process from the 3rd layer to the 8th layer is repeated in turn in the same way, yielding maps at 1/8, 1/16, 1/32, 1/64, 1/128, and 1/256 of the original size.
  • in total, sub-hue feature intermediate maps at 9 scales are obtained: 1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128, and 1/256 times the input hue feature intermediate map.
  • the second multi-scale calculation relation H(c,s) = |H(c) Θ H(s)| is likewise called the center-surround difference calculation; it is designed according to the physiological structure of the human eye and is used to calculate the contrast information in the image H(c,s).
  • the human eye's receptive field reacts strongly to features with high contrast in the input of visual information, such as the case where the center is green and the periphery is red. This also belongs to visual information with greater contrast.
  • the larger-scale sub-hue feature intermediate map carries more detailed information, while the smaller-scale sub-hue feature intermediate map, owing to the filtering and sampling operations, better reflects the local background information; therefore, a cross-scale subtraction is performed between a larger-scale and a smaller-scale sub-hue feature intermediate map to obtain the contrast between the local center and the surrounding background.
  • the cross-scale subtraction operates as follows: the smaller-scale sub-hue feature intermediate map representing the surrounding background is linearly interpolated to the same size as the larger-scale map representing the central information, and a pixel-by-pixel subtraction, i.e., the center-surround difference, is then performed; this cross-scale operation is denoted by the symbol Θ.
  • a 9-layer Gaussian pyramid can likewise be used to obtain 9 sub-hue feature intermediate maps.
  • for example, for H(2,5), the second layer is selected as the sub-hue feature intermediate map representing the central information and the fifth layer as the one representing the surrounding background; the fifth-layer map is interpolated and enlarged until it has the same size as the second-layer map, and the hue features of pixels at corresponding positions in the second- and fifth-layer maps are then subtracted in turn, yielding one first hue feature map.
  • S40 normalizes the first tone feature map, and merges all the normalized first tone feature maps into a second tone feature map, as shown in FIG. 5, including:
  • the color attribute information of each pixel of the image can be obtained, that is, the tone feature value of each pixel of the image can be obtained.
  • each first hue feature map is traversed to obtain the hue feature value of each pixel, and the first hue feature maximum and the first hue feature minimum of that map are found.
  • the hue features of the first hue feature map are normalized to between 0 and P according to the formula H' = P(H − H1min)/(H1max − H1min). On this basis, the six first hue feature maps can be unified to the same order of magnitude, amplitude differences are eliminated, and accuracy is improved.
  • the hue feature values of these 8 neighboring pixels are compared with each other; the largest hue feature value is taken as the 8-neighborhood maximum of the pixel, and the smallest as its 8-neighborhood minimum. Each pixel is likewise compared as one of the 8 neighbors of other pixels.
  • S207 Average the maximum value of all 8 neighborhoods and the minimum value of all 8 neighborhoods to obtain the hue feature average value Q.
  • for the six first hue feature maps whose hue feature values were normalized to between 0 and P, the hue feature value of each pixel of each first hue feature map is multiplied by (P − Q)² in turn.
  • in this way, the potential salient regions in each first hue feature map are amplified, so that the hue features at those locations stand out more from the background.
  • each first hue feature map is then traversed separately to obtain the value of each pixel, and the hue features of each pixel of the first hue feature map are normalized to between 0 and 1, further improving the accuracy of the six first hue feature maps.
  • the six first hue feature maps obtained in S209 are merged into a second hue feature map through a weighted average method, which improves the accuracy of the potential saliency region.
  • the method for detecting the saliency area of the image further includes:
  • the adaptive threshold binarization method can be the maximum between-class variance method, i.e., the Otsu method (OTSU). The maximum between-class variance method is used to binarize the saliency map.
  • the saliency map is divided into two parts: the background and the foreground.
  • the foreground is the part to be segmented according to the threshold, and the boundary value between the foreground and the background is the required threshold.
  • multiple candidate values are pre-set for the threshold; the candidates are traversed, and the between-class variance between the corresponding background and foreground is calculated for each value. The value at which the between-class variance is largest is the required threshold and gives the best segmentation, and the foreground segmented with this threshold is the salient region.
  • for example, let the segmentation threshold between the foreground and the background in the saliency map be T; the proportion of foreground pixels in the entire saliency map is recorded as W1 and their average gray level as μ1, and the proportion of background pixels is recorded as W2 and their average gray level as μ2.
  • the total average gray level of the saliency map is recorded as ⁇ ; the variance between classes is recorded as g.
  • the size of the saliency map is L × N.
  • the number of pixels in the saliency map whose gray values are less than the threshold T is recorded as N1, and
  • the number of pixels whose gray values are greater than the threshold T is recorded as N2, so that
  • N1 + N2 = L × N and
  • W1 + W2 = 1.
  • the between-class variance then satisfies g = W1 × W2 × (μ1 − μ2)² (formula (3), the standard Otsu between-class variance, with total average gray level μ = W1·μ1 + W2·μ2). Multiple candidate values are pre-set for the threshold T, the maximum of the between-class variance g in formula (3) is found by traversal, and the corresponding T value is the optimal threshold T that is sought.
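A minimal exhaustive-search sketch of this thresholding, assuming numpy and a saliency map rescaled to 8-bit gray levels (the function name is illustrative; OpenCV's cv2.threshold with the THRESH_OTSU flag is an equivalent shortcut):

```python
import numpy as np

def otsu_binarize(saliency: np.ndarray) -> np.ndarray:
    """Return a 0/1 image: foreground (salient) = 1, background = 0."""
    gray = np.round(255 * (saliency - saliency.min()) /
                    (saliency.max() - saliency.min() + 1e-12)).astype(np.uint8)
    best_t, best_g = 0, -1.0
    for t in range(1, 256):                      # traverse candidate thresholds T
        fg, bg = gray >= t, gray < t
        w1, w2 = fg.mean(), bg.mean()            # foreground/background proportions W1, W2
        if w1 == 0.0 or w2 == 0.0:
            continue
        mu1, mu2 = gray[fg].mean(), gray[bg].mean()
        g = w1 * w2 * (mu1 - mu2) ** 2           # between-class variance, formula (3)
        if g > best_g:
            best_g, best_t = g, t
    return (gray >= best_t).astype(np.uint8)
```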
  • S60 obtains the saliency area according to the saliency map, including:
  • by adopting the adaptive threshold binarization method to binarize the saliency map, the difference between the foreground and the background in the saliency map becomes more obvious, and the salient region can be obtained more quickly and accurately.
  • the saliency region is acquired according to the binary image obtained after the saliency map is binarized in S61, as shown in FIG. 7, including:
  • S610 Perform connected domain labeling on pixels of the binary image, and merge all pixels with the same connected domain label into one connected domain.
  • the connected domain is labeled for the pixels of the binary image, and the pixels with the same connected domain label are regarded as a connected domain.
  • in this way, the background pixels in the binary image can be removed later and only the foreground pixels are segmented, which reduces computation and improves speed.
  • the pixels with the same connected domain label are a connected domain, and a connected domain is regarded as a saliency area.
  • the binary image is divided into multiple connected components, and multiple connected components are extracted as multiple saliency regions.
  • pixels of the binary image are labeled with connected regions, and pixels with the same connected region label are merged into one connected region, as shown in FIG. 8, including:
  • the gray value of the pixel in the binary image is 1 or 0.
  • the initial label value starts from 2.
  • S611 Traverse the pixels of the binary image line by line.
  • S614 If it is 1 and not marked, take the pixel as a seed pixel, mark the connected domain label of the seed pixel as N (N ≥ 2), and traverse the 8 neighboring pixels of the seed pixel.
  • S615 Determine whether the gray value of the unmarked pixel in the 8 neighborhood pixels of the seed pixel is 1.
  • the 8 neighboring pixels of the pixel with coordinates (x, y) are the pixels with coordinates (x-1, y-1), (x-1, y), (x-1, y+1), (x, y-1), (x, y+1), (x+1, y-1), (x+1, y), and (x+1, y+1);
  • x is the row index of the pixel in the binary image, and
  • y is the column index of the pixel in the binary image.
  • if the gray value of at least one unmarked pixel among the 8 neighboring pixels of the seed pixel is 1, the connected domain labels of the unmarked pixels with gray value 1 among those 8 neighbors are marked as N, all such traversed pixels are taken as new seed pixels, and the traversal continues in a loop with the label value unchanged.
  • the label value remains unchanged, that is, the label value is still N.
  • the following provides a method for labeling the connected domains of the pixels of the binary image, and merging the pixels with the same connected domain label into one connected domain to clearly describe the implementation process.
  • the method of labeling the connected domains of the pixels of the binary image and merging the pixels with the same connected domain label into one connected domain includes:
  • the first step traverse the pixels of the binary image line by line.
  • Step 2 Determine whether the pixel is unmarked and the gray value is 1.
  • Step 3 When the pixel at coordinates (2, 3) is scanned and found unmarked with gray value 1, it is taken as the seed pixel, and the connected domain label of the seed pixel at (2, 3) is marked as 2.
  • Step 4 The 8 neighboring pixels of the seed pixel at (2, 3) are the pixels at (1, 2), (1, 3), (1, 4), (2, 2), (2, 4), (3, 2), (3, 3), and (3, 4); these 8 pixels are traversed.
  • Step 5 Determine whether there are unmarked pixels with gray value 1 among them.
  • Step 6 The pixels at coordinates (3, 2) and (3, 3) are found unmarked with gray value 1, while the gray values of the remaining pixels are all 0; the connected domain labels of (3, 2) and (3, 3) are therefore marked as 2, and both pixels are taken as seed pixels.
  • Step 7 In the sixth step, the pixel at (3, 3) was scanned before the pixel at (3, 2), so the pixel at (3, 3) is used as a seed pixel first (marked 2) and its 8 neighboring pixels are traversed to determine whether any unmarked pixel has gray value 1; then the pixel at (3, 2) is used as a seed pixel (marked 2) and its 8 neighboring pixels are traversed in the same way.
  • Step 8 In the seventh step, when the pixel at (3, 3) serves as the seed pixel, the pixels at (2, 3) and (3, 2) among its 8 neighbors have already been marked as 2 and the gray values of the remaining neighbors are all 0; when the pixel at (3, 2) serves as the seed pixel, the pixels at (2, 3) and (3, 3) among its 8 neighbors have already been marked as 2 and the gray values of the remaining neighbors are all 0. The labeling of this connected domain therefore ends, and the label value becomes 3.
  • the label values of pixels with coordinates (2,3), (3,2), and (3,3) are all 2, forming a connected domain, and the connected domain is extracted as a saliency region.
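A minimal seed-fill sketch of this labeling, assuming numpy and a 0/1 input image; labels start at 2 as in the walkthrough above, the function name is illustrative, and a queue replaces the looping re-traversal described in the steps:

```python
import numpy as np
from collections import deque

def label_connected_domains(binary: np.ndarray) -> np.ndarray:
    """8-connected labeling: background stays 0, domains get labels 2, 3, ..."""
    labels = np.zeros_like(binary, dtype=np.int32)
    rows, cols = binary.shape
    n = 2                                               # initial label value N
    for x in range(rows):                               # row-by-row traversal
        for y in range(cols):
            if binary[x, y] == 1 and labels[x, y] == 0:
                labels[x, y] = n                        # unmarked pixel with gray value 1: new seed
                queue = deque([(x, y)])
                while queue:
                    cx, cy = queue.popleft()
                    for dx in (-1, 0, 1):               # scan the 8-neighborhood
                        for dy in (-1, 0, 1):
                            nx, ny = cx + dx, cy + dy
                            if (0 <= nx < rows and 0 <= ny < cols
                                    and binary[nx, ny] == 1 and labels[nx, ny] == 0):
                                labels[nx, ny] = n      # same connected domain label
                                queue.append((nx, ny))
                n += 1                                  # one connected domain finished
    return labels
```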
  • the embodiment of the present disclosure also provides a detection device for a saliency area of an image, as shown in FIG. 9, including:
  • the extraction module 10 is configured to extract the brightness feature of the image, and obtain at least two first brightness feature maps by using the image pyramid and the first multi-scale calculation relation formula.
  • the extraction module 10 is further configured to extract the tone features of the image, and obtain at least two first tone feature maps by using the image pyramid and the second multi-scale calculation relationship.
  • the fusion module 20 is configured to perform normalization processing on the at least two first brightness feature maps respectively, and fuse the at least two first brightness feature maps after the normalization processing into a second brightness feature map.
  • the fusion module 20 is further configured to perform normalization processing on the at least two first tone feature maps respectively, and fuse the at least two first tone feature maps after the normalization processing into a second tone feature map .
  • the fusion module 20 is further configured to merge the second brightness feature map and the second hue feature map into a saliency map.
  • the acquiring module 30 is configured to acquire the saliency area according to the saliency map.
  • the detection device of the image saliency area is integrated in the server.
  • the brightness feature of the image is extracted by the extraction module 10, and at least two first brightness feature maps are obtained by using the image pyramid and the first multi-scale calculation relational expression, and
  • the extraction module 10 extracts the tone features of the image, and obtains at least two first tone feature maps by using the image pyramid and the second multi-scale calculation relationship;
  • the fusion module 20 normalizes the at least two first brightness feature maps respectively and fuses the normalized first brightness feature maps into a second brightness feature map; it likewise normalizes the at least two first hue feature maps respectively and fuses the normalized first hue feature maps into a second hue feature map; and the fusion module 20 also fuses the second brightness feature map and the second hue feature map into a saliency map.
  • the acquisition module acquires the salient region according to the saliency map. It can be seen that, because the saliency map fuses brightness feature and hue feature information, the embodiments of the present disclosure can quickly obtain targets that approximate what the human eye attends to; compared with prior-art methods that require training samples for each acquisition, this not only improves the accuracy of acquiring the salient region but also improves the acquisition speed.
  • the acquiring module 30 is further configured to use an adaptive threshold binarization algorithm to perform binarization processing on the saliency map to obtain a binary image.
  • the acquiring module 30 is further configured to mark the connected domains of the pixels of the binary image, merge all pixels with the same connected domain label into one connected domain, and use the connected domain as a saliency area.
  • the acquiring module 30 performs connected domain labeling on the pixels of the binary image, merges all pixels with the same connected domain label into one connected domain, and subsequently removes the background pixels in the binary image, segmenting only the foreground pixels, which reduces computation and increases speed.
  • the extraction module 10 is also configured to calculate, from the three primary color components of each pixel in the image, the brightness feature corresponding to each pixel (I = (R + G + B)/3), obtaining the brightness feature intermediate map.
  • the three primary color components of each pixel include red, green and blue components.
  • R is the red component
  • G is the green component
  • B is the blue component.
  • I is the brightness characteristic.
  • the image pyramid is a Gaussian pyramid with M layers, and M ≥ 7.
  • through the first multi-scale calculation relation I(c,s) = |I(c) Θ I(s)|, at least two first brightness feature maps are calculated, where I(c,s) represents a first brightness feature map, I(c) represents the sub-brightness feature intermediate map at the c-th scale, I(s) represents the sub-brightness feature intermediate map at the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M-1, and s = c + δ.
  • the extraction module 10 is also configured to calculate, from the three primary color components of each pixel in the image, the hue feature corresponding to each pixel (using the RGB-to-HSI hue conversion given above), obtaining the hue feature intermediate map, where the three primary color components of each pixel include the red, green, and blue components, R is the red component, G is the green component, B is the blue component, and H is the hue feature.
  • the hue feature intermediate map is input into the image pyramid to obtain sub-hue feature intermediate maps at M scales, where the image pyramid is a Gaussian pyramid with M layers, and M ≥ 7.
  • through the second multi-scale calculation relation H(c,s) = |H(c) Θ H(s)|, at least two first hue feature maps are calculated, where H(c,s) represents a first hue feature map, H(c) and H(s) represent the sub-hue feature intermediate maps at the c-th and s-th scales, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M-1, and s = c + δ.
  • the extraction module 10 specifically extracts brightness features and hue features, which better reflect the essential attributes of foreground objects in the image and allow the foreground objects to be extracted more completely as the salient region.
  • the extraction module 10 extracts the brightness features of the image and obtains at least two first brightness feature maps using the image pyramid and the first multi-scale calculation relation, and extracts the hue features of the image and obtains at least two first hue feature maps using the image pyramid and the second multi-scale calculation relation; the fusion module 20 normalizes the at least two first brightness feature maps respectively and fuses the normalized maps into a second brightness feature map, normalizes the at least two first hue feature maps respectively and fuses the normalized maps into a second hue feature map, and then fuses the second brightness feature map and the second hue feature map into a saliency map; the acquisition module acquires the salient region according to the saliency map.
  • An embodiment of the present disclosure also provides a computer device 1000, as shown in FIG. 10, including a storage unit 1010 and a processing unit 1020, where the storage unit 1010 stores a computer program that can be run on the processing unit 1020, as well as the labeling results.
  • the processing unit 1020 implements the above-mentioned method for detecting a salient region of an image when the computer program is executed.
  • Embodiments of the present disclosure also provide a computer-readable storage medium that stores a computer program, and when the computer program is executed by a processor, the above-mentioned method for detecting a salient region of an image is realized.
  • Computer-readable storage media include permanent/non-permanent, volatile/nonvolatile, removable/non-removable media, and information storage can be achieved by any method or technology.
  • the information can be computer-readable instructions, data structures, program modules, or other data.
  • Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical storage, Magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices or any other non-transmission media can be used to store information that can be accessed by computing devices. According to the definition in this article, computer-readable media does not include transitory media, such as modulated data signals and carrier waves.

Abstract

A method and device for detecting a salient region of an image. The method for detecting a salient region of an image includes: extracting brightness features of the image, and obtaining at least two first brightness feature maps by using an image pyramid and a first multi-scale calculation relation (S10); extracting hue features of the image, and obtaining at least two first hue feature maps by using the image pyramid and a second multi-scale calculation relation (S20); normalizing the at least two first brightness feature maps respectively, and fusing the normalized at least two first brightness feature maps into a second brightness feature map (S30); normalizing the at least two first hue feature maps respectively, and fusing the normalized at least two first hue feature maps into a second hue feature map (S40); fusing the second brightness feature map and the second hue feature map into a saliency map (S50); and acquiring the salient region according to the saliency map (S60).

Description

Method and device for detecting a salient region of an image
Cross-reference to related applications
This application claims priority to Chinese patent application CN201910301296.3, filed on April 15, 2019, the entire disclosure of which is incorporated herein by reference.
Technical field
The present disclosure relates to the field of image processing technology, and in particular to a method and a device for detecting a salient region of an image, a computer device, and a computer-readable storage medium.
Background
The saliency of an image is an important visual feature of the image and reflects how much weight the human eye gives to certain regions of the image. During image processing, a saliency detection algorithm often needs to be applied to an image to obtain the salient region of that image. It is mainly used in mobile phone camera software, object detection software, and image compression software.
At present, one way to obtain the salient region of an image is to detect the salient region based on a purely mathematical calculation method. The detection accuracy of this approach is limited and differs from human visual perception.
Another way to obtain the salient region of an image is to detect the salient region based on a deep learning method. However, this approach depends on the selected training samples, places high demands on the hardware, and performs poorly in real time. How to improve the speed and accuracy of detecting the salient region of an image is therefore a technical problem to be solved urgently.
Summary
A first aspect of the embodiments of the present disclosure provides a method for detecting a salient region of an image, including:
extracting brightness features of the image, and obtaining at least two first brightness feature maps by using an image pyramid and a first multi-scale calculation relation;
extracting hue features of the image, and obtaining at least two first hue feature maps by using the image pyramid and a second multi-scale calculation relation;
normalizing the at least two first brightness feature maps respectively, and fusing the normalized at least two first brightness feature maps into a second brightness feature map;
normalizing the at least two first hue feature maps respectively, and fusing the normalized at least two first hue feature maps into a second hue feature map;
fusing the second brightness feature map and the second hue feature map into a saliency map; and
acquiring the salient region according to the saliency map.
In an embodiment, the detection method further includes:
binarizing the saliency map with an adaptive-threshold binarization algorithm to obtain a binary image,
where acquiring the salient region according to the saliency map includes: acquiring the salient region according to the binary image.
In an embodiment, acquiring the salient region according to the binary image includes:
performing connected-domain labeling on the pixels of the binary image, and merging all pixels with the same connected-domain label into one connected domain; and
taking the connected domain as a salient region.
In an embodiment, performing connected-domain labeling on the pixels of the binary image and merging pixels with the same connected-domain label into one connected domain includes:
setting an initial label value to N, N ≥ 2;
traversing each pixel of the binary image row by row, and judging whether the pixel is unlabeled and has a gray value of 1;
if the gray value of the pixel is not 1, or is 1 but the pixel has already been labeled, continuing to traverse the pixels of the binary image row by row;
if the gray value of the pixel is 1 and the pixel is unlabeled, taking it as a seed pixel, labeling the connected-domain label of the seed pixel as N, traversing the 8-neighborhood pixels of the seed pixel, and judging whether any unlabeled pixel among the 8-neighborhood pixels of the seed pixel has a gray value of 1;
if at least one unlabeled pixel among the 8-neighborhood pixels of the seed pixel has a gray value of 1, labeling the connected-domain label of every unlabeled pixel with gray value 1 among the 8-neighborhood pixels as N, and taking all the unlabeled pixels with gray value 1 found among the 8-neighborhood pixels as seed pixels;
if none of the unlabeled pixels among the 8-neighborhood pixels of the seed pixel has a gray value of 1, finishing the labeling of one connected domain, adding 1 to N, and continuing to traverse the pixels of the binary image row by row;
where the 8-neighborhood pixels of a pixel with coordinates (x, y) are the pixels with coordinates (x−1, y−1), (x−1, y), (x−1, y+1), (x, y−1), (x, y+1), (x+1, y−1), (x+1, y), (x+1, y+1); x is the row index of the pixel in the binary image, and y is the column index of the pixel in the binary image.
In an embodiment, extracting the brightness features of the image and obtaining at least two first brightness feature maps by using the image pyramid and the first multi-scale calculation relation includes:
calculating, according to the three primary color components of each pixel in the image, the brightness feature of each pixel by using

$$I = \frac{R + G + B}{3}$$

to obtain a brightness feature intermediate map, where the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and I is the brightness feature;
inputting the brightness feature intermediate map into the image pyramid to obtain sub-brightness feature intermediate maps of M scales, where the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
calculating the at least two first brightness feature maps through the first multi-scale calculation relation I(c,s)=|I(c)ΘI(s)|, where I(c,s) denotes a first brightness feature map, I(c) denotes the sub-brightness feature intermediate map of the c-th scale, I(s) denotes the sub-brightness feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
In an embodiment, extracting the hue features of the image and obtaining at least two first hue feature maps by using the image pyramid and the second multi-scale calculation relation includes:
calculating, according to the three primary color components of each pixel in the image, the hue feature of each pixel by using

$$\theta = \arccos\!\left\{\frac{\tfrac{1}{2}\left[(R-G)+(R-B)\right]}{\left[(R-G)^{2}+(R-B)(G-B)\right]^{1/2}}\right\}$$

$$H = \begin{cases}\theta, & B \le G\\ 360^{\circ}-\theta, & B > G\end{cases}$$

to obtain a hue feature intermediate map, where the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and H is the hue feature;
inputting the hue feature intermediate map into the image pyramid to obtain sub-hue feature intermediate maps of M scales, where the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
calculating the at least two first hue feature maps through the second multi-scale calculation relation H(c,s)=|H(c)ΘH(s)|, where H(c,s) denotes a first hue feature map, H(c) denotes the sub-hue feature intermediate map of the c-th scale, H(s) denotes the sub-hue feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
In an embodiment, normalizing the at least two first brightness feature maps respectively and fusing the normalized at least two first brightness feature maps into a second brightness feature map includes:
setting a brightness feature maximum P;
for each first brightness feature map, traversing the brightness features of the first brightness feature map to obtain a first brightness feature maximum I1max and a first brightness feature minimum I1min of the map, and normalizing the brightness features of the first brightness feature map into the range 0 to P according to the formula

$$I \leftarrow \frac{I - I_{1\min}}{I_{1\max} - I_{1\min}} \times P$$

where I denotes the brightness feature value of a pixel of the first brightness feature map;
in the brightness-normalized first brightness feature map, for each pixel that has an 8-neighborhood, obtaining an 8-neighborhood brightness feature maximum and an 8-neighborhood brightness feature minimum from the brightness feature values of its 8-neighborhood pixels;
averaging all the 8-neighborhood brightness feature maxima and all the 8-neighborhood brightness feature minima to obtain a brightness feature average Q;
multiplying the brightness feature value of each pixel of the first brightness feature map by (P−Q)²;
traversing the brightness features of the first brightness feature map to obtain a second brightness feature maximum I2max and a second brightness feature minimum I2min, and normalizing the brightness features of the first brightness feature map into the range 0 to 1 according to the formula

$$I \leftarrow \frac{I - I_{2\min}}{I_{2\max} - I_{2\min}}$$

and
fusing the normalized at least two first brightness feature maps into a second brightness feature map by weighted averaging.
In an embodiment, normalizing the at least two first hue feature maps respectively and fusing the normalized at least two first hue feature maps into one second hue feature map includes:
setting a hue feature maximum P;
for each first hue feature map, traversing the hue features of the first hue feature map to obtain a first hue feature maximum H1max and a first hue feature minimum H1min of the map, and normalizing the hue features of the first hue feature map into the range 0 to P according to the formula

$$H \leftarrow \frac{H - H_{1\min}}{H_{1\max} - H_{1\min}} \times P$$

where H denotes the hue feature value of a pixel of the first hue feature map;
in the hue-normalized first hue feature map, for each pixel that has an 8-neighborhood, obtaining an 8-neighborhood hue feature maximum and an 8-neighborhood hue feature minimum from the hue feature values of its 8-neighborhood;
averaging all the 8-neighborhood hue feature maxima and all the 8-neighborhood hue feature minima to obtain a hue feature average Q;
multiplying the hue feature value of each pixel of the first hue feature map by (P−Q)²;
traversing the hue features of the first hue feature map to obtain a second hue feature maximum H2max and a second hue feature minimum H2min, and normalizing the hue features of the first hue feature map into the range 0 to 1 according to the formula

$$H \leftarrow \frac{H - H_{2\min}}{H_{2\max} - H_{2\min}}$$

and
fusing all the normalized at least two first hue feature maps into one second hue feature map by weighted averaging.
A second aspect of the embodiments of the present disclosure provides a computer device including a storage unit and a processing unit, where the storage unit stores a computer program that can be run on the processing unit, and the processing unit implements the above method for detecting a salient region of an image when executing the computer program.
A third aspect of the embodiments of the present disclosure provides a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements the above method for detecting a salient region of an image.
A fourth aspect of the embodiments of the present disclosure provides a device for detecting a salient region of an image, including:
an extraction module configured to extract brightness features of the image and obtain at least two first brightness feature maps by using an image pyramid and a first multi-scale calculation relation, the extraction module being further configured to extract hue features of the image and obtain at least two first hue feature maps by using the image pyramid and a second multi-scale calculation relation;
a fusion module configured to normalize the at least two first brightness feature maps respectively and fuse the normalized at least two first brightness feature maps into a second brightness feature map, the fusion module being further configured to normalize the at least two first hue feature maps respectively and fuse the normalized at least two first hue feature maps into a second hue feature map, and the fusion module being further configured to fuse the second brightness feature map and the second hue feature map into a saliency map; and
an acquisition module configured to acquire the salient region according to the saliency map.
In an embodiment, the acquisition module is further configured to binarize the saliency map with an adaptive-threshold binarization algorithm to obtain a binary image, and to acquire the salient region according to the binary image.
In an embodiment, the acquisition module is further configured to perform connected-domain labeling on the pixels of the binary image, merge all pixels with the same connected-domain label into one connected domain, and take the connected domain as a salient region.
In an embodiment, the extraction module is further configured to:
calculate, according to the three primary color components of each pixel in the image, the brightness feature of each pixel by using

$$I = \frac{R + G + B}{3}$$

to obtain a brightness feature intermediate map, where the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and I is the brightness feature;
input the brightness feature intermediate map into the image pyramid to obtain sub-brightness feature intermediate maps of M scales, where the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
calculate the at least two first brightness feature maps through the first multi-scale calculation relation I(c,s)=|I(c)ΘI(s)|, where I(c,s) denotes a first brightness feature map, I(c) denotes the sub-brightness feature intermediate map of the c-th scale, I(s) denotes the sub-brightness feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
In an embodiment, the extraction module is further configured to:
calculate, according to the three primary color components of each pixel in the image, the hue feature of each pixel by using

$$\theta = \arccos\!\left\{\frac{\tfrac{1}{2}\left[(R-G)+(R-B)\right]}{\left[(R-G)^{2}+(R-B)(G-B)\right]^{1/2}}\right\}$$

$$H = \begin{cases}\theta, & B \le G\\ 360^{\circ}-\theta, & B > G\end{cases}$$

to obtain a hue feature intermediate map, where the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and H is the hue feature;
input the hue feature intermediate map into the image pyramid to obtain sub-hue feature intermediate maps of M scales, where the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
calculate the at least two first hue feature maps through the second multi-scale calculation relation H(c,s)=|H(c)ΘH(s)|, where H(c,s) denotes a first hue feature map, H(c) denotes the sub-hue feature intermediate map of the c-th scale, H(s) denotes the sub-hue feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present disclosure or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present disclosure, and those of ordinary skill in the art can obtain other drawings from them without creative work.
FIG. 1 is a flowchart of a method for detecting a salient region of an image provided by an embodiment of the present disclosure;
FIG. 2 is a flowchart of another method for detecting a salient region of an image provided by an embodiment of the present disclosure;
FIG. 3 is a flowchart of still another method for detecting a salient region of an image provided by an embodiment of the present disclosure;
FIG. 4 is a flowchart of still another method for detecting a salient region of an image provided by an embodiment of the present disclosure;
FIG. 5 is a flowchart of still another method for detecting a salient region of an image provided by an embodiment of the present disclosure;
FIG. 6 is a flowchart of still another method for detecting a salient region of an image provided by an embodiment of the present disclosure;
FIG. 7 is a flowchart of still another method for detecting a salient region of an image provided by an embodiment of the present disclosure;
FIG. 8 is a flowchart of still another method for detecting a salient region of an image provided by an embodiment of the present disclosure;
FIG. 9 is a schematic module diagram of a device for detecting a salient region of an image provided by an embodiment of the present disclosure; and
FIG. 10 is a block diagram of a computer device provided by an embodiment of the present disclosure.
Detailed description
The technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the drawings of the embodiments of the present disclosure. The described embodiments are obviously only some, not all, of the embodiments of the present disclosure. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present disclosure without creative work fall within the protection scope of the present disclosure.
An embodiment of the present disclosure provides a method for detecting a salient region of an image, as shown in FIG. 1, including:
S10, extracting brightness features of the image, and obtaining at least two first brightness feature maps by using an image pyramid and a first multi-scale calculation relation.
In the embodiment, brightness refers to the human eye's sensation of how bright a light source or an object is. For an image, the brightness feature expresses how bright a color is. After brightness features are extracted from an arbitrary image, the brightness information of the color of every pixel of that image, that is, the brightness feature of every pixel, is obtained.
An effective and conceptually simple structure for representing an image at multiple resolutions is the image pyramid. Image pyramids were first used in machine vision and image compression; an image pyramid is a collection of images arranged in a pyramid shape with progressively decreasing resolution.
Taking a Gaussian pyramid as the image pyramid as an example, the theoretical basis of the Gaussian pyramid is scale-space theory. A Gaussian pyramid is essentially a multi-scale representation of an image: the same image is repeatedly Gaussian-blurred and downsampled to produce multiple images at different scales for subsequent processing. For example, an image from which brightness features have been extracted is repeatedly Gaussian-blurred and downsampled on the basis of the Gaussian pyramid to produce multiple images at different scales carrying brightness feature information, and at least two first brightness feature maps are then computed according to the first multi-scale calculation relation.
In this embodiment, the first multi-scale calculation relation is obtained, for example, from the center-surround difference operation and is used to calculate the contrast information in the multiple brightness-feature images of different scales produced by the Gaussian pyramid.
The Gaussian pyramid contains a series of Gaussian filters whose cut-off frequencies increase gradually by a factor of 2 from one level to the next.
S20, extracting hue features of the image, and obtaining at least two first hue feature maps by using the image pyramid and a second multi-scale calculation relation.
Hue is the combined effect on the human eye of the spectral components reflected by an object under daylight, that is, the category of a color. Colors of the same hue are a series of colors whose three primary color components are combined in similar proportions, showing a distinct color tendency in appearance; for example, vermilion, crimson, and pink all belong to the red hue. For an image, the hue feature can be used to describe the attribute of a color, such as yellow, orange, or red.
After hue features are extracted from an arbitrary image, the color attribute information of every pixel of that image, that is, the hue feature of every pixel, is obtained.
The image pyramid is essentially the same as the one used above to obtain the first brightness feature maps: a multi-scale representation of the image. For example, using the pyramid, an image from which hue features have been extracted can be repeatedly Gaussian-blurred and downsampled to produce multiple images at different scales carrying hue feature information, and at least two first hue feature maps are then computed according to the second multi-scale calculation relation.
The second multi-scale calculation relation is also obtained from the center-surround difference operation and is used to calculate the contrast information in the multiple hue-feature images of different scales produced by the Gaussian pyramid.
S30, normalizing the at least two first brightness feature maps respectively, and fusing the normalized at least two first brightness feature maps into a second brightness feature map.
Normalization brings the brightness features of every first brightness feature map to the same order of magnitude, providing higher-precision data for subsequent processing.
Those skilled in the art will understand that when two or more first brightness feature maps are obtained in S10, interpolation is performed first to enlarge the smaller-scale first brightness feature maps to the scale of the larger-scale first brightness feature maps, and the maps are then fused into one second brightness feature map.
S40, normalizing the at least two first hue feature maps respectively, and fusing the normalized at least two first hue feature maps into a second hue feature map.
Normalization brings the hue features of every first hue feature map to the same order of magnitude, providing higher-precision data for subsequent processing.
Similarly, when two or more first hue feature maps are obtained in S20, interpolation is performed first to enlarge the smaller-scale first hue feature maps to the scale of the larger-scale first hue feature maps, and the maps are then fused into a second hue feature map.
S50, fusing the second brightness feature map and the second hue feature map into a saliency map.
For example, the second brightness feature map and the second hue feature map can be fused by weighted averaging: the brightness feature and the hue feature of the pixels at corresponding positions (same row, same column) in the second brightness feature map and the second hue feature map are weighted-averaged in turn, so that the second brightness feature map and the second hue feature map are fused into one saliency map.
The weights in the weighted averaging can be set as needed, which is not limited by the present disclosure.
As an example, when the weights are set to 1, the brightness feature of the pixel in the first row and first column of the second brightness feature map is added directly to the hue feature of the pixel in the first row and first column of the second hue feature map, and the sum is divided by 2; the result is the information of the pixel in the first row and first column of the saliency map. The other pixels are treated in the same way, so that the two maps can be fused into one saliency map.
S60, acquiring the salient region according to the saliency map.
The visual attention mechanism of the human visual system lets people gradually exclude relatively unimportant information from a complex scene and select the important information that deserves attention as the target, which is then processed with priority. As described above, the present disclosure simulates this human visual attention mechanism: other information is excluded from the image, the two kinds of information, brightness features and hue features, are selectively extracted, the first brightness feature maps are normalized and fused into a second brightness feature map, the first hue feature maps are normalized and fused into a second hue feature map, and the second brightness feature map and the second hue feature map are fused into a saliency map. After these repeated rounds of processing and fusion, the resulting saliency map can more closely present the targets that the human eye attends to.
A region of the saliency map that produces visual contrast and attracts the attention of the human eye is called a salient region (salience region). The greater the visual contrast, the more easily the region attracts the attention of the human visual system; the salient region is also the region that best represents the content of the saliency map. The salient region can also be called the foreground, and the remaining regions are called the background. Using the salient region removes the interference of the other background regions of the saliency map and directly approaches the user's detection intent, which helps improve detection performance.
The embodiment of the present disclosure provides a method for detecting a salient region of an image. By selectively extracting brightness features and hue features, the essential attributes of the foreground objects in the image are better reflected, so that the foreground objects can be extracted more completely as the salient region. In the embodiment, the first brightness feature maps are generated by using the image pyramid and the first multi-scale calculation relation, the first hue feature maps are generated by using the image pyramid and the second multi-scale calculation relation, the first brightness feature maps are normalized and fused into a second brightness feature map, the first hue feature maps are normalized and fused into a second hue feature map, and the second brightness feature map and the second hue feature map are fused into a saliency map, so that the salient region can be extracted according to the saliency map. Extracting the salient region with the above method is faster, has a wider range of application, is closer to human visual perception, and gives a better extraction result.
In an embodiment, extracting the brightness features of the image in S10 and obtaining at least two first brightness feature maps by using the image pyramid and the first multi-scale calculation relation, as shown in FIG. 2, includes:
S101, calculating, according to the three primary color components of each pixel in the image, the brightness feature of each pixel by using

$$I = \frac{R + G + B}{3}$$

to obtain a brightness feature intermediate map, where the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and I is the brightness feature.
It should be noted that the most basic color space in image processing is the RGB color space, but the RGB color space is inconvenient for describing images and cannot correctly express the real differences between colors as perceived by the human eye. The HSI color space, which describes an image with the three parameters brightness (intensity), hue, and saturation, can on the one hand quantitatively characterize and describe the color and spectral properties of an image; on the other hand, the brightness, hue, and saturation parameters are independent of one another, and their physical meanings are clear and easy to interpret. The HSI color space is more consistent with human color perception, so the RGB color space can be converted into the HSI color space by a nonlinear transformation to describe the image more naturally and intuitively. In the HSI color space, H (Hue), I (Intensity), and S (Saturation) denote the hue, brightness, and saturation respectively.
From the conversion of the RGB color space into the HSI color space, the relation between the red, green, and blue components and the brightness is

$$I = \frac{R + G + B}{3}$$

so for an image given in RGB color format, the brightness feature of every pixel can be calculated with this formula.
The brightness features of all pixels constitute the brightness feature intermediate map.
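For illustration only, a minimal NumPy sketch of this computation (assuming an H × W × 3 array with channels in R, G, B order; the function name is illustrative, not from the disclosure):

```python
import numpy as np

def brightness_intermediate_map(rgb: np.ndarray) -> np.ndarray:
    """Per-pixel brightness feature I = (R + G + B) / 3 for an
    H x W x 3 RGB image; returns an H x W floating-point map."""
    rgb = rgb.astype(np.float64)
    return (rgb[..., 0] + rgb[..., 1] + rgb[..., 2]) / 3.0
```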
S102, inputting the brightness feature intermediate map into the image pyramid to obtain sub-brightness feature intermediate maps of M scales, where the image pyramid is an M-level Gaussian pyramid and M ≥ 7.
In the embodiment, the brightness feature intermediate map is input into the Gaussian pyramid, and a series of Gaussian filters is used to filter and sample it. Level 0 is the original brightness feature intermediate map with its size unchanged; the large-scale brightness feature intermediate map of level 0 is convolved with a Gaussian filter to obtain the small-scale brightness feature intermediate map of level 1, and so on for the other levels. The size of the Gaussian kernel of the Gaussian filter determines the degree of blurring of the image: the smaller the kernel, the lighter the blurring, and the larger the kernel, the heavier the blurring; the kernel size can be chosen as needed. For example, a 5 × 5 pixel Gaussian kernel can be used to filter and sample the brightness feature intermediate map.
On this basis, the more Gaussian filters the Gaussian pyramid contains, the more levels it has and the more brightness feature intermediate maps of different scales are obtained. Meanwhile, the higher the level of the Gaussian pyramid, the smaller the scale and the lower the resolution of the corresponding brightness feature intermediate map. The number of levels of the Gaussian pyramid is determined by the size of the input brightness feature intermediate map: the larger the input map, the more levels are set, and the smaller the input map, the fewer levels are set.
Based on the above description, a method of inputting the brightness feature intermediate map into the Gaussian pyramid to obtain sub-brightness feature intermediate maps of M scales is provided below to describe the implementation clearly.
As an example, the Gaussian pyramid is determined from the size of the input brightness feature intermediate map to have 9 levels, namely level 0 to level 8, where level 0 is the original brightness feature intermediate map with its size unchanged.
Level 1: the original level-0 brightness feature intermediate map is first doubled in size and then filtered and sampled with a Gaussian filter so that its length and width are each halved and its area becomes 1/4, making its size 1/2 of the original brightness feature intermediate map.
Level 2: the 1/2-size brightness feature intermediate map obtained at level 1 is first doubled in size and then filtered and sampled with a Gaussian filter so that its length and width are each halved and its area becomes 1/4, making its size 1/4 of the original brightness feature intermediate map.
The processes of level 3 to level 8 are carried out cyclically in the same manner, giving six sub-brightness feature intermediate maps of 1/8, 1/16, 1/32, 1/64, 1/128, and 1/256 of the original brightness feature intermediate map.
Finally, sub-brightness feature intermediate maps of 9 scales are obtained, namely 1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128, and 1/256 times the input brightness feature intermediate map.
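A minimal sketch of the pyramid construction (assuming OpenCV's cv2.pyrDown, which blurs with a 5 × 5 Gaussian-like kernel and halves each dimension, as an approximation of the filtering-and-sampling procedure described above; the input must be large enough for the requested number of levels):

```python
import cv2
import numpy as np

def gaussian_pyramid(feature_map: np.ndarray, levels: int = 9) -> list:
    """Build an M-level Gaussian pyramid: level 0 is the input feature
    map unchanged; each further level is Gaussian-blurred and
    downsampled by 2 in each dimension."""
    pyramid = [feature_map.astype(np.float32)]
    for _ in range(1, levels):
        pyramid.append(cv2.pyrDown(pyramid[-1]))
    return pyramid
```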
S103, calculating the at least two first brightness feature maps through the first multi-scale calculation relation I(c,s)=|I(c)ΘI(s)|, where I(c,s) denotes a first brightness feature map, I(c) denotes the sub-brightness feature intermediate map of the c-th scale, I(s) denotes the sub-brightness feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
In the embodiment, the operation of the first multi-scale calculation relation I(c,s)=|I(c)ΘI(s)| is called the center-surround difference operation. It is designed according to the physiological structure of the human eye and is used to calculate the contrast information in the image I(c,s). The receptive field of the human eye reacts strongly to high-contrast features in the visual input, for example a bright center with a dark surround, which is high-contrast visual information. In the Gaussian pyramid, the larger-scale sub-brightness feature intermediate maps contain more detail, while the smaller-scale sub-brightness feature intermediate maps, because of the filtering and sampling operations, better reflect the local background information. Therefore, a cross-scale subtraction between a larger-scale sub-brightness feature intermediate map and a smaller-scale sub-brightness feature intermediate map yields the contrast between the local center and the surrounding background information.
The algorithm of the cross-scale subtraction is as follows: the smaller-scale sub-brightness feature intermediate map representing the surrounding background information is first linearly interpolated so that it has the same size as the larger-scale brightness feature intermediate map representing the center information, and a pixel-by-pixel subtraction, the center-surround difference operation, is then performed; this cross-scale operation is denoted by the symbol Θ.
As an example, following the example in S102, 9 sub-brightness feature intermediate maps can be obtained with the 9-level Gaussian pyramid. I(2), I(3), and I(4) are selected as the sub-brightness feature intermediate maps representing the center information, with c ∈ {2, 3, 4}, that is, the 1/4, 1/8, and 1/16 sub-brightness feature intermediate maps obtained at levels 2, 3, and 4; I(5), I(6), I(7), and I(8) are selected as the sub-brightness feature intermediate maps representing the surrounding background information, with δ ∈ {3, 4}, so from s = c + δ it follows that s ∈ {5, 6, 7, 8}, that is, the 1/32, 1/64, 1/128, and 1/256 sub-brightness feature intermediate maps obtained at levels 5, 6, 7, and 8. Accordingly, six center-surround difference result maps are produced, namely the six first brightness feature maps I(2,5), I(2,6), I(3,6), I(3,7), I(4,7), and I(4,8).
Here, the process of obtaining I(2,5) is taken as an example: level 2 is selected as the sub-brightness feature intermediate map representing the center information and level 5 as the sub-brightness feature intermediate map representing the surrounding background information; the level-5 sub-brightness feature intermediate map is interpolated and enlarged so that its size matches that of the level-2 sub-brightness feature intermediate map, and the brightness features of the pixels at the same rows and columns of the level-2 and level-5 maps are then subtracted in turn, giving one first brightness feature map.
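A minimal sketch of the cross-scale subtraction (assuming the pyramid list from the previous sketch; cv2.resize performs the linear interpolation of the surround level, and the function name is illustrative):

```python
import cv2
import numpy as np

def center_surround(pyramid: list, c: int, s: int) -> np.ndarray:
    """First feature map I(c, s) = |I(c) (-) I(s)|: the surround level
    s is upsampled by linear interpolation to the size of the center
    level c, then subtracted pixel by pixel."""
    center = pyramid[c]
    h, w = center.shape[:2]
    surround = cv2.resize(pyramid[s], (w, h),
                          interpolation=cv2.INTER_LINEAR)
    return np.abs(center - surround)

# The six maps of the example: (c, s) in
# [(2, 5), (2, 6), (3, 6), (3, 7), (4, 7), (4, 8)]
```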
In an embodiment, normalizing the first brightness feature maps in S30 and fusing all the normalized first brightness feature maps into one second brightness feature map, as shown in FIG. 3, includes:
S104, setting the brightness feature maximum to P.
It can be understood that P takes a value between 0 and 255.
S105, for each first brightness feature map, traversing the brightness features of the first brightness feature map to obtain the first brightness feature maximum I1max and the first brightness feature minimum I1min of the map, and normalizing the brightness features of the first brightness feature map into the range 0 to P according to the formula

$$I \leftarrow \frac{I - I_{1\min}}{I_{1\max} - I_{1\min}} \times P$$

where I denotes the brightness feature value of a pixel of the first brightness feature map.
After the brightness features of each first brightness feature map have been traversed, the brightness information of the color of every pixel of that map, that is, the brightness feature value of every pixel, is obtained.
As an example, for each of the six first brightness feature maps obtained in the example of S103, the map is traversed to obtain the first brightness feature value of every pixel, the first brightness feature maximum and the first brightness feature minimum of the map are found, and the brightness features of the map are normalized into the range 0 to P according to the above formula. On this basis, the six first brightness feature maps are brought to the same order of magnitude, which eliminates amplitude differences and improves precision.
S106, in the brightness-normalized first brightness feature map, for each pixel that has an 8-neighborhood, obtaining the 8-neighborhood maximum and the 8-neighborhood minimum from the brightness feature values of its 8-neighborhood pixels.
It can be understood that in a first brightness feature map, each pixel is surrounded by at most 8 pixels in its neighborhood. The brightness feature values of these 8 neighborhood pixels are compared with one another; the largest brightness feature value is taken as the 8-neighborhood maximum of the pixel and the smallest as its 8-neighborhood minimum. The pixel itself is also compared as one of the 8-neighborhood pixels of other pixels.
S107, averaging all the 8-neighborhood maxima and all the 8-neighborhood minima to obtain the brightness feature average Q.
S108, multiplying the brightness feature value of each pixel of the first brightness feature map by (P−Q)².
For the six first brightness feature maps whose brightness feature values were normalized into the range 0 to P in the example of S105, the brightness feature value of each pixel of each map is multiplied by (P−Q)² in turn. This amplifies the potential salient regions in each first brightness feature map, making the brightness features at the positions of the potential salient regions stand out more against the background regions.
S109, traversing the brightness features of the first brightness feature map to obtain the second brightness feature maximum I2max and the second brightness feature minimum I2min, and normalizing the brightness features of the first brightness feature map into the range 0 to 1 according to the formula

$$I \leftarrow \frac{I - I_{2\min}}{I_{2\max} - I_{2\min}}$$

For each first brightness feature map obtained in S108 after the brightness feature value of each of its pixels was multiplied by (P−Q)², the map is traversed to obtain the second brightness feature value of every pixel, and the second brightness feature maximum and the second brightness feature minimum of the map are found. The brightness features of each pixel of the map are normalized into the range 0 to 1 according to the above formula, further improving the precision of the six first brightness feature maps.
S110, fusing all the normalized first brightness feature maps into one second brightness feature map by weighted averaging.
The six first brightness feature maps obtained in S109 are fused into one second brightness feature map by weighted averaging, which improves the accuracy of the potential salient regions.
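A minimal sketch of the normalization of S104-S109 and the fusion of S110 (assuming 2-D NumPy maps; P = 255 and the small epsilon guarding division by zero are illustrative choices, not specified by the disclosure). The same operator applies unchanged to the first hue feature maps of S204-S210 below:

```python
import numpy as np

P = 255.0  # assumed brightness-feature maximum (0 < P <= 255)

def normalize_feature_map(fmap: np.ndarray, p: float = P) -> np.ndarray:
    """Normalization of one first feature map, following S104-S109:
    (1) rescale the values to [0, P];
    (2) for every interior pixel, take the max and min over its
        8-neighborhood and average all those maxima and minima into
        one mean Q;
    (3) multiply the map by (P - Q)^2 to amplify potential salient
        regions; and
    (4) rescale the result to [0, 1]."""
    f = fmap.astype(np.float64)
    f = (f - f.min()) / (f.max() - f.min() + 1e-12) * p        # step (1)

    # step (2): the 8 shifted copies of the interior region give,
    # per interior pixel, the values of its 8 neighbors
    nbrs = np.stack([f[i:i + f.shape[0] - 2, j:j + f.shape[1] - 2]
                     for i in range(3) for j in range(3)
                     if not (i == 1 and j == 1)])
    q = (nbrs.max(axis=0).mean() + nbrs.min(axis=0).mean()) / 2.0

    f = f * (p - q) ** 2                                       # step (3)
    return (f - f.min()) / (f.max() - f.min() + 1e-12)         # step (4)

def fuse_maps(maps: list, weights: list = None) -> np.ndarray:
    """S110: weighted-average fusion of the normalized maps (all maps
    must already share one shape, e.g. after interpolation)."""
    weights = weights or [1.0] * len(maps)
    return sum(w * m for w, m in zip(weights, maps)) / sum(weights)
```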
In an embodiment, extracting the hue features of the image in S20 and obtaining at least two first hue feature maps by using the image pyramid and the second multi-scale calculation relation, as shown in FIG. 4, includes:
S201, calculating, according to the three primary color components of each pixel in the image, the hue feature of each pixel by using

$$\theta = \arccos\!\left\{\frac{\tfrac{1}{2}\left[(R-G)+(R-B)\right]}{\left[(R-G)^{2}+(R-B)(G-B)\right]^{1/2}}\right\}$$

$$H = \begin{cases}\theta, & B \le G\\ 360^{\circ}-\theta, & B > G\end{cases}$$

to obtain a hue feature intermediate map, where the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and H is the hue feature.
From the conversion of the RGB color space into the HSI color space, the relation between the red, green, and blue components and the hue is given by the above formulas, so for an image given in RGB color format, the hue feature of every pixel can be calculated with them.
The hue features of all pixels constitute the hue feature intermediate map.
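A minimal NumPy sketch of the hue computation (assuming R, G, B channels in that order and hue measured in degrees; the clipping of the arccos argument and the epsilon guarding division by zero are illustrative safeguards, not part of the disclosure):

```python
import numpy as np

def hue_intermediate_map(rgb: np.ndarray) -> np.ndarray:
    """Per-pixel hue feature H from the RGB components, using the
    standard RGB -> HSI conversion: H = theta when B <= G and
    H = 360 - theta otherwise."""
    r, g, b = [rgb[..., k].astype(np.float64) for k in range(3)]
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + 1e-12
    theta = np.degrees(np.arccos(np.clip(num / den, -1.0, 1.0)))
    return np.where(b <= g, theta, 360.0 - theta)
```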
S202, inputting the hue feature intermediate map into the image pyramid to obtain sub-hue feature intermediate maps of M scales, where the image pyramid is an M-level Gaussian pyramid and M ≥ 7.
In the embodiment, the hue feature intermediate map is input into the Gaussian pyramid, and a series of Gaussian filters is used to filter and sample it. Level 0 is the original hue feature intermediate map with its size unchanged; the large-scale hue feature intermediate map of level 0 is convolved with a Gaussian filter to obtain the small-scale hue feature intermediate map of level 1, and so on for the other levels. The size of the Gaussian kernel of the Gaussian filter determines the degree of blurring of the image: the smaller the kernel, the lighter the blurring, and the larger the kernel, the heavier the blurring; the kernel size can be chosen as needed. For example, a 5 × 5 pixel Gaussian kernel can be used to filter and sample the hue feature intermediate map.
On this basis, the more Gaussian filters the Gaussian pyramid contains, the more levels it has and the more hue feature intermediate maps of different scales are obtained. Meanwhile, the higher the level of the Gaussian pyramid, the smaller the scale and the lower the resolution of the corresponding hue feature intermediate map. The number of levels of the Gaussian pyramid is determined by the size of the input hue feature intermediate map: the larger the input map, the more levels are set, and the smaller the input map, the fewer levels are set.
Based on the above description, a method of inputting the hue feature intermediate map into the Gaussian pyramid to obtain sub-hue feature intermediate maps of M scales is provided below to describe the implementation clearly.
As an example, the Gaussian pyramid is determined from the size of the input hue feature intermediate map to have 9 levels, namely level 0 to level 8, where level 0 is the original hue feature intermediate map with its size unchanged.
Level 1: the original level-0 hue feature intermediate map is first doubled in size and then filtered and sampled with a Gaussian filter so that its length and width are each halved and its area becomes 1/4, making its size 1/2 of the original hue feature intermediate map.
Level 2: the 1/2-size hue feature intermediate map obtained at level 1 is first doubled in size and then filtered and sampled with a Gaussian filter so that its length and width are each halved and its area becomes 1/4, making its size 1/4 of the original hue feature intermediate map.
The processes of level 3 to level 8 are carried out cyclically in the same manner, giving six sub-hue feature intermediate maps of 1/8, 1/16, 1/32, 1/64, 1/128, and 1/256 of the original hue feature intermediate map.
Finally, sub-hue feature intermediate maps of 9 scales are obtained, namely 1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128, and 1/256 times the input hue feature intermediate map.
S203, calculating the at least two first hue feature maps through the second multi-scale calculation relation H(c,s)=|H(c)ΘH(s)|, where H(c,s) denotes a first hue feature map, H(c) denotes the sub-hue feature intermediate map of the c-th scale, H(s) denotes the sub-hue feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
The operation of the second multi-scale calculation relation H(c,s)=|H(c)ΘH(s)| is called the center-surround difference operation. It is designed according to the physiological structure of the human eye and is used to calculate the contrast information in the image H(c,s). The receptive field of the human eye reacts strongly to high-contrast features in the visual input, for example a green center with a red surround, which is also high-contrast visual information. In the Gaussian pyramid, the larger-scale sub-hue feature intermediate maps contain more detail, while the smaller-scale sub-hue feature intermediate maps, because of the filtering and sampling operations, better reflect the local background information. Therefore, a cross-scale subtraction between a larger-scale sub-hue feature intermediate map and a smaller-scale sub-hue feature intermediate map yields the contrast between the local center and the surrounding background information.
The algorithm of the cross-scale subtraction is as follows: the smaller-scale sub-hue feature intermediate map representing the surrounding background information is first linearly interpolated so that it has the same size as the larger-scale sub-hue feature intermediate map representing the center information, and a pixel-by-pixel subtraction, the center-surround difference operation, is then performed; this cross-scale operation is denoted by the symbol Θ.
As an example, following the example in S202, 9 hue feature intermediate maps can be obtained with the 9-level Gaussian pyramid. H(2), H(3), and H(4) are selected as the sub-hue feature intermediate maps representing the center information, with c ∈ {2, 3, 4}, that is, the 1/4, 1/8, and 1/16 sub-hue feature intermediate maps obtained at levels 2, 3, and 4; H(5), H(6), H(7), and H(8) are selected as the sub-hue feature intermediate maps representing the surrounding background information, with δ ∈ {3, 4}, so from s = c + δ it follows that s ∈ {5, 6, 7, 8}, that is, the 1/32, 1/64, 1/128, and 1/256 sub-hue feature intermediate maps obtained at levels 5, 6, 7, and 8. Accordingly, six center-surround difference result maps are produced, namely the six first hue feature maps H(2,5), H(2,6), H(3,6), H(3,7), H(4,7), and H(4,8).
Here, the process of obtaining H(2,5) is taken as an example: level 2 is selected as the sub-hue feature intermediate map representing the center information and level 5 as the sub-hue feature intermediate map representing the surrounding background information; the level-5 hue feature intermediate map is interpolated and enlarged so that its size matches that of the level-2 sub-hue feature intermediate map, and the hue features of the pixels at the same rows and columns of the level-2 and level-5 sub-hue feature intermediate maps are then subtracted in turn, giving one first hue feature map.
In an embodiment, normalizing the first hue feature maps in S40 and fusing all the normalized first hue feature maps into one second hue feature map, as shown in FIG. 5, includes:
S204, setting the hue feature maximum to P.
It can be understood that P takes a value between 0 and 255.
S205, for each first hue feature map, traversing the hue features of the first hue feature map to obtain the first hue feature maximum H1max and the first hue feature minimum H1min of the map, and normalizing the hue features of the first hue feature map into the range 0 to P according to the formula

$$H \leftarrow \frac{H - H_{1\min}}{H_{1\max} - H_{1\min}} \times P$$

where H denotes the hue feature value of a pixel of the first hue feature map.
After the hue features of each first hue feature map have been traversed, the color attribute information of every pixel of that map, that is, the hue feature value of every pixel, is obtained.
As an example, for each of the six first hue feature maps obtained in the example of S203, the map is traversed to obtain the first hue feature value of every pixel, the first hue feature maximum and the first hue feature minimum of the map are found, and the hue features of the map are normalized into the range 0 to P according to the above formula. On this basis, the six first hue feature maps are brought to the same order of magnitude, which eliminates amplitude differences and improves precision.
S206, in the hue-normalized first hue feature map, for each pixel that has an 8-neighborhood, obtaining the 8-neighborhood maximum and the 8-neighborhood minimum from the hue feature values of its 8-neighborhood pixels.
It can be understood that in a first hue feature map, each pixel is surrounded by at most 8 pixels in its neighborhood. The hue feature values of these 8 neighborhood pixels are compared with one another; the largest hue feature value is taken as the 8-neighborhood maximum of the pixel and the smallest as its 8-neighborhood minimum. The pixel itself is also compared as one of the 8-neighborhood pixels of other pixels.
S207, averaging all the 8-neighborhood maxima and all the 8-neighborhood minima to obtain the hue feature average Q.
S208, multiplying the hue feature value of each pixel of the first hue feature map by (P−Q)².
For the six first hue feature maps whose hue feature values were normalized into the range 0 to P in the example of S205, the hue feature value of each pixel of each map is multiplied by (P−Q)² in turn. This amplifies the potential salient regions in each first hue feature map, making the hue features at the positions of the potential salient regions stand out more against the background regions.
S209, traversing the hue features of the first hue feature map to obtain the second hue feature maximum H2max and the second hue feature minimum H2min, and normalizing the hue features of the first hue feature map into the range 0 to 1 according to the formula

$$H \leftarrow \frac{H - H_{2\min}}{H_{2\max} - H_{2\min}}$$

For each first hue feature map obtained in S208 after the hue feature value of each of its pixels was multiplied by (P−Q)², the map is traversed to obtain the second hue feature value of every pixel, and the second hue feature maximum and the second hue feature minimum of the map are found. The hue features of each pixel of the map are normalized into the range 0 to 1 according to the above formula, further improving the precision of the six first hue feature maps.
S210, fusing all the normalized first hue feature maps into one second hue feature map by weighted averaging.
The six first hue feature maps obtained in S209 are fused into one second hue feature map by weighted averaging, which improves the accuracy of the potential salient regions.
In an embodiment, after the second brightness feature map and the second hue feature map are fused into the saliency map by weighted averaging in S50, and before the salient region is acquired according to the saliency map in S60, as shown in FIG. 6, the method for detecting a salient region of an image further includes:
S51, binarizing the saliency map with an adaptive-threshold binarization method to obtain a binary image.
The adaptive-threshold binarization method can be the maximum between-class variance method, that is, the Otsu method (OTSU). The saliency map is binarized with the maximum between-class variance method and treated as consisting of two parts, background and foreground; the foreground is the part to be segmented out according to the threshold, and the boundary value between foreground and background is the threshold to be found.
Multiple values are preset as candidate thresholds; the candidates are traversed, and the between-class variance between background and foreground is computed for each. When the between-class variance attains its maximum, the corresponding value is the threshold sought by the maximum between-class variance method, and the foreground segmented with this threshold is the salient region.
It should be noted that the larger the between-class variance, the greater the difference between the foreground and background parts of the saliency map and the smaller the chance of misclassification during segmentation. Therefore, the value at which the between-class variance attains its maximum is the threshold sought, and segmentation with it works better.
As an example, let the segmentation threshold between foreground and background of the saliency map be T; let the proportion of foreground pixels in the whole saliency map be W1 with average gray level μ1, and the proportion of background pixels be W2 with average gray level μ2. Let the total average gray level of the saliency map be μ and the between-class variance be g. Suppose the saliency map has size L × N, the number of pixels whose gray value is below the threshold T is N1, and the number of pixels whose gray value is above the threshold T is N2.
From the above:

$$W_{1} = \frac{N_{1}}{L \times N}, \qquad W_{2} = \frac{N_{2}}{L \times N}$$

N1 + N2 = L × N, W1 + W2 = 1.
μ = μ1 × W1 + μ2 × W2 --- Formula (1).
g = W1 × (μ − μ1)² + W2 × (μ − μ2)² --- Formula (2).
Substituting Formula (1) into Formula (2) gives g = W1 × W2 × (μ1 − μ2)² --- Formula (3).
Multiple values are preset as candidates for the threshold T, and the maximum of the between-class variance g in Formula (3) is found by traversal; the corresponding value of T is the optimal threshold T sought.
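A minimal sketch of this threshold search (assuming the saliency map has been scaled to 8-bit gray values; the histogram-based search below is equivalent to traversing the candidate thresholds T and maximizing g of Formula (3)):

```python
import numpy as np

def otsu_threshold(gray: np.ndarray) -> int:
    """Exhaustive search for the threshold T that maximizes the
    between-class variance g = W1 * W2 * (mu1 - mu2)^2."""
    hist, _ = np.histogram(gray.ravel(), bins=256, range=(0, 256))
    prob = hist / hist.sum()
    best_t, best_g = 0, -1.0
    for t in range(1, 256):
        w1, w2 = prob[:t].sum(), prob[t:].sum()
        if w1 == 0 or w2 == 0:
            continue                       # no pixels in one class
        mu1 = (np.arange(t) * prob[:t]).sum() / w1
        mu2 = (np.arange(t, 256) * prob[t:]).sum() / w2
        g = w1 * w2 * (mu1 - mu2) ** 2     # Formula (3)
        if g > best_g:
            best_t, best_g = t, g
    return best_t

# e.g. binary = (saliency_8bit >= otsu_threshold(saliency_8bit)).astype(np.uint8)
```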
On this basis, acquiring the salient region according to the saliency map in S60 includes:
S61, acquiring the salient region according to the binary image obtained by binarizing the saliency map.
Binarizing the saliency map with the adaptive-threshold binarization method makes the difference between foreground and background in the saliency map more obvious, so the salient region can be acquired from it more quickly and accurately.
In an embodiment, acquiring the salient region in S61 according to the binary image obtained by binarizing the saliency map, as shown in FIG. 7, includes:
S610, performing connected-domain labeling on the pixels of the binary image, and merging all pixels with the same connected-domain label into one connected domain.
By labeling the connected domains of the pixels of the binary image and treating the pixels with the same connected-domain label as one connected domain, the background pixels of the binary image can subsequently be discarded and only the foreground pixels segmented, which reduces computation and increases speed.
S620, taking the connected domain as a salient region.
Pixels with the same connected-domain label form one connected domain, and one connected domain serves as one salient region. When there are multiple connected-domain labels, the binary image is segmented into multiple connected domains, which are extracted as multiple salient regions.
In an embodiment, performing connected-domain labeling on the pixels of the binary image in S610 and merging pixels with the same connected-domain label into one connected domain, as shown in FIG. 8, includes the following steps.
It should be noted that the gray value of a pixel in the binary image is 1 or 0; to avoid confusing the initial label value with the gray values, the initial label value is chosen starting from 2.
S611, traversing the pixels of the binary image row by row.
S612, judging whether a pixel is unlabeled and has a gray value of 1.
The pixels of the binary image with gray value 1 are set as foreground pixels, and the pixels with gray value 0 as background pixels.
S613, if the gray value is not 1, or is 1 but the pixel has already been labeled, continuing to traverse the pixels of the binary image row by row.
If the gray value is not 1, or is 1 but the pixel has been labeled, the pixel is a background pixel or a pixel that already has a connected-domain label.
S614, if the gray value is 1 and the pixel is unlabeled, taking the pixel as a seed pixel, labeling the connected-domain label of the seed pixel as N, N ≥ 2, and traversing the 8-neighborhood pixels of the seed pixel.
S615, judging whether any unlabeled pixel among the 8-neighborhood pixels of the seed pixel has a gray value of 1.
The 8-neighborhood pixels of a pixel with coordinates (x, y) are the pixels with coordinates (x−1, y−1), (x−1, y), (x−1, y+1), (x, y−1), (x, y+1), (x+1, y−1), (x+1, y), (x+1, y+1); x is the row index of the pixel in the binary image, and y is the column index of the pixel in the binary image.
S616, if at least one unlabeled pixel among the 8-neighborhood pixels of the seed pixel has a gray value of 1, labeling the connected-domain label of every unlabeled pixel with gray value 1 among the 8-neighborhood pixels as N, taking all the unlabeled pixels with gray value 1 found among the 8-neighborhood pixels as seed pixels, and continuing the traversal cyclically with the label value unchanged.
Here, the label value unchanged means the label value is still N.
S617, if none of the unlabeled pixels among the 8-neighborhood pixels of the seed pixel has a gray value of 1, finishing the labeling of one connected domain and adding 1 to N, then continuing to traverse the pixels of the binary image row by row and judging whether a pixel is unlabeled and has a gray value of 1.
It can be understood that when no pixel with gray value 1 is found among the 8-neighborhood pixels of the seed pixels, S617 stops, the initial label value becomes N + 1, step S611 is restarted, and the traversal of the pixels of the binary image continues row by row, judging whether the next pixel is unlabeled and has a gray value of 1.
Based on the above description, a method of performing connected-domain labeling on the pixels of the binary image and merging pixels with the same connected-domain label into one connected domain is provided below to describe the implementation clearly.
As an example, the method of performing connected-domain labeling on the pixels of the binary image and merging pixels with the same connected-domain label into one connected domain includes:
Step 1: traverse the pixels of the binary image row by row.
Step 2: judge whether a pixel is unlabeled and has a gray value of 1.
Step 3: when the pixel with coordinates (2,3) is scanned and it is unlabeled with gray value 1, take the pixel at (2,3) as a seed pixel and label the connected-domain label of this seed pixel as 2.
Step 4: the 8-neighborhood pixels of the seed pixel at (2,3) are the pixels with coordinates (1,2), (1,3), (1,4), (2,2), (2,4), (3,2), (3,3), (3,4); traverse these 8 pixels.
Step 5: judge whether any of them is an unlabeled pixel with gray value 1.
Step 6: when the pixels at (3,2) and (3,3) are scanned and they are unlabeled with gray value 1 while the gray values of the remaining pixels are all 0, label the connected-domain labels of the pixels at (3,2) and (3,3) as 2, and take both pixels at (3,2) and (3,3) as seed pixels.
Step 7: in Step 6 the pixel at (3,3) is scanned before the pixel at (3,2), so the pixel at (3,3) is first taken as a seed pixel with label 2, its 8-neighborhood pixels are traversed, and it is judged whether any of them is unlabeled with gray value 1; then the pixel at (3,2) is taken as a seed pixel with label 2, its 8-neighborhood pixels are traversed, and it is judged whether any of them is unlabeled with gray value 1.
Step 8: in Step 7, when the pixel at (3,3) is the seed pixel, among its 8-neighborhood pixels the pixels at (2,3) and (3,2) have already been labeled 2 and the gray values of the rest are all 0; when the pixel at (3,2) is the seed pixel, among its 8-neighborhood pixels the pixels at (2,3) and (3,3) have already been labeled 2 and the gray values of the rest are all 0. The labeling of this connected domain therefore ends, and the label value becomes 3.
Step 1 is then executed again, traversing the pixels of the binary image row by row to find the next seed pixel.
Finally, the pixels at (2,3), (3,2), and (3,3) all have label value 2 and constitute one connected domain, which is extracted as one salient region.
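A minimal sketch of the seed-growing labeling described above (assuming a 2-D NumPy array of gray values 0/1; the queue replaces the recursive re-traversal of new seed pixels, which does not change the resulting labels):

```python
import numpy as np
from collections import deque

def label_connected_domains(binary: np.ndarray) -> np.ndarray:
    """Seed-growing 8-connectivity labeling: labels start at N = 2 so
    they cannot collide with the gray values 0 and 1; every unlabeled
    foreground pixel starts a new connected domain that is grown
    through the 8-neighborhoods of its seed pixels."""
    labels = np.zeros_like(binary, dtype=np.int32)
    h, w = binary.shape
    n = 2
    for x in range(h):
        for y in range(w):
            if binary[x, y] != 1 or labels[x, y] != 0:
                continue                  # background or already labeled
            queue = deque([(x, y)])
            labels[x, y] = n
            while queue:                  # grow the domain from seeds
                sx, sy = queue.popleft()
                for dx in (-1, 0, 1):
                    for dy in (-1, 0, 1):
                        nx, ny = sx + dx, sy + dy
                        if (dx or dy) and 0 <= nx < h and 0 <= ny < w \
                                and binary[nx, ny] == 1 \
                                and labels[nx, ny] == 0:
                            labels[nx, ny] = n
                            queue.append((nx, ny))
            n += 1                        # one connected domain finished
    return labels
```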
An embodiment of the present disclosure also provides a device for detecting a salient region of an image, as shown in FIG. 9, including:
an extraction module 10 configured to extract brightness features of the image and obtain at least two first brightness feature maps by using an image pyramid and a first multi-scale calculation relation;
the extraction module 10 being further configured to extract hue features of the image and obtain at least two first hue feature maps by using the image pyramid and a second multi-scale calculation relation;
a fusion module 20 configured to normalize the at least two first brightness feature maps respectively and fuse the normalized at least two first brightness feature maps into a second brightness feature map;
the fusion module 20 being further configured to normalize the at least two first hue feature maps respectively and fuse the normalized at least two first hue feature maps into a second hue feature map;
the fusion module 20 being further configured to fuse the second brightness feature map and the second hue feature map into a saliency map; and
an acquisition module 30 configured to acquire the salient region according to the saliency map.
For example, the device for detecting a salient region of an image is integrated in a server. In the device provided by the present disclosure, the extraction module 10 extracts the brightness features of the image and obtains at least two first brightness feature maps by using the image pyramid and the first multi-scale calculation relation, and extracts the hue features of the image and obtains at least two first hue feature maps by using the image pyramid and the second multi-scale calculation relation; the fusion module 20 normalizes the at least two first brightness feature maps respectively and fuses the normalized first brightness feature maps into a second brightness feature map, normalizes the at least two first hue feature maps respectively and fuses the normalized first hue feature maps into a second hue feature map, and further fuses the second brightness feature map and the second hue feature map into a saliency map; and the acquisition module acquires the salient region according to the saliency map. It can thus be seen that, based on the brightness feature and hue feature information fused in the saliency map, the embodiments of the present disclosure can quickly obtain targets approximating what the human eye attends to when acquiring the salient region. Compared with the prior art, which needs training samples for every acquisition, this improves both the accuracy and the speed of acquiring the salient region.
In an embodiment, the acquisition module 30 is further configured to binarize the saliency map with an adaptive-threshold binarization algorithm to obtain a binary image.
The acquisition module 30 is further configured to perform connected-domain labeling on the pixels of the binary image, merge all pixels with the same connected-domain label into one connected domain, and take the connected domain as a salient region.
The acquisition module 30 performs connected-domain labeling on the pixels of the binary image and merges all pixels with the same connected-domain label into one connected domain; the background pixels of the binary image can subsequently be discarded and only the foreground pixels segmented, which reduces computation and increases speed.
In an embodiment, the extraction module 10 is further configured to calculate, according to the three primary color components of each pixel in the image, the brightness feature of each pixel by using

$$I = \frac{R + G + B}{3}$$

to obtain a brightness feature intermediate map, where the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and I is the brightness feature;
to input the brightness feature intermediate map into the image pyramid to obtain sub-brightness feature intermediate maps of M scales, where the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
to calculate the at least two first brightness feature maps through the first multi-scale calculation relation I(c,s)=|I(c)ΘI(s)|, where I(c,s) denotes a first brightness feature map, I(c) denotes the sub-brightness feature intermediate map of the c-th scale, I(s) denotes the sub-brightness feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
In an embodiment, the extraction module 10 is further configured to calculate, according to the three primary color components of each pixel in the image, the hue feature of each pixel by using

$$\theta = \arccos\!\left\{\frac{\tfrac{1}{2}\left[(R-G)+(R-B)\right]}{\left[(R-G)^{2}+(R-B)(G-B)\right]^{1/2}}\right\}$$

$$H = \begin{cases}\theta, & B \le G\\ 360^{\circ}-\theta, & B > G\end{cases}$$

to obtain a hue feature intermediate map, where the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and H is the hue feature;
to input the hue feature intermediate map into the image pyramid to obtain sub-hue feature intermediate maps of M scales, where the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
to calculate the at least two first hue feature maps through the second multi-scale calculation relation H(c,s)=|H(c)ΘH(s)|, where H(c,s) denotes a first hue feature map, H(c) denotes the sub-hue feature intermediate map of the c-th scale, H(s) denotes the sub-hue feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
In the device for detecting a salient region of an image provided by the embodiments of the present disclosure, the extraction module 10 selectively extracts brightness features and hue features, which better reflect the essential attributes of the foreground objects in the image, so that the foreground objects can be extracted more completely as the salient region. The extraction module 10 extracts the brightness features of the image and obtains at least two first brightness feature maps by using the image pyramid and the first multi-scale calculation relation, and extracts the hue features of the image and obtains at least two first hue feature maps by using the image pyramid and the second multi-scale calculation relation; the fusion module 20 normalizes the at least two first brightness feature maps respectively and fuses the normalized first brightness feature maps into a second brightness feature map, normalizes the at least two first hue feature maps respectively and fuses the normalized first hue feature maps into a second hue feature map, and further fuses the second brightness feature map and the second hue feature map into a saliency map, so that the acquisition module acquires the salient region according to the saliency map. Extracting the salient region with the above method is faster, has a wider range of application, is closer to human visual perception, and gives a better extraction result.
An embodiment of the present disclosure also provides a computer device 1000, as shown in FIG. 10, including a storage unit 1010 and a processing unit 1020, where the storage unit 1010 stores a computer program that can be run on the processing unit 1020, as well as the labeling results. The processing unit 1020 implements the above method for detecting a salient region of an image when executing the computer program.
An embodiment of the present disclosure also provides a computer-readable storage medium storing a computer program; when the computer program is executed by a processor, the above method for detecting a salient region of an image is implemented.
Computer-readable storage media include permanent and non-permanent, volatile and non-volatile, removable and non-removable media, and may implement information storage by any method or technology. The information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random-access memory (SRAM), dynamic random-access memory (DRAM), other types of random-access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
The above are only specific implementations of the present disclosure, but the protection scope of the present disclosure is not limited to them; any person skilled in the art can easily think of changes or substitutions within the technical scope disclosed by the present disclosure, and they should all be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure should be subject to the protection scope of the claims.

Claims (15)

  1. A method for detecting a salient region of an image, comprising:
    extracting brightness features of the image, and obtaining at least two first brightness feature maps by using an image pyramid and a first multi-scale calculation relation;
    extracting hue features of the image, and obtaining at least two first hue feature maps by using the image pyramid and a second multi-scale calculation relation;
    normalizing the at least two first brightness feature maps respectively, and fusing the normalized at least two first brightness feature maps into a second brightness feature map;
    normalizing the at least two first hue feature maps respectively, and fusing the normalized at least two first hue feature maps into a second hue feature map;
    fusing the second brightness feature map and the second hue feature map into a saliency map; and
    acquiring the salient region according to the saliency map.
  2. The method for detecting a salient region of an image according to claim 1, further comprising:
    binarizing the saliency map with an adaptive-threshold binarization algorithm to obtain a binary image,
    wherein acquiring the salient region according to the saliency map comprises: acquiring the salient region according to the binary image.
  3. The method for detecting a salient region of an image according to claim 2, wherein acquiring the salient region according to the binary image comprises:
    performing connected-domain labeling on pixels of the binary image, and merging all pixels with the same connected-domain label into one connected domain; and
    taking the connected domain as a salient region.
  4. The method for detecting a salient region of an image according to claim 3, wherein performing connected-domain labeling on the pixels of the binary image and merging pixels with the same connected-domain label into one connected domain comprises:
    setting an initial label value to N, N ≥ 2;
    traversing each pixel of the binary image row by row, and judging whether the pixel is unlabeled and has a gray value of 1;
    if the gray value of the pixel is not 1, or is 1 but the pixel has already been labeled, continuing to traverse the pixels of the binary image row by row;
    if the gray value of the pixel is 1 and the pixel is unlabeled, taking the pixel as a seed pixel, labeling the connected-domain label of the seed pixel as N, traversing the 8-neighborhood pixels of the seed pixel, and judging whether any unlabeled pixel among the 8-neighborhood pixels of the seed pixel has a gray value of 1;
    if at least one unlabeled pixel among the 8-neighborhood pixels of the seed pixel has a gray value of 1, labeling the connected-domain label of every unlabeled pixel with gray value 1 among the 8-neighborhood pixels of the seed pixel as N, and taking all the unlabeled pixels with gray value 1 found among the 8-neighborhood pixels as seed pixels; and
    if none of the unlabeled pixels among the 8-neighborhood pixels of the seed pixel has a gray value of 1, finishing the labeling of one connected domain, adding 1 to N, and continuing to traverse the pixels of the binary image row by row,
    wherein the 8-neighborhood pixels of a pixel with coordinates (x, y) are the pixels with coordinates (x−1, y−1), (x−1, y), (x−1, y+1), (x, y−1), (x, y+1), (x+1, y−1), (x+1, y), (x+1, y+1); x is the row index of the pixel in the binary image, and y is the column index of the pixel in the binary image.
  5. The method for detecting a salient region of an image according to any one of claims 1-4, wherein extracting the brightness features of the image and obtaining at least two first brightness feature maps by using the image pyramid and the first multi-scale calculation relation comprises:
    calculating, according to the three primary color components of each pixel in the image, the brightness feature of each pixel by using

    $$I = \frac{R + G + B}{3}$$

    to obtain a brightness feature intermediate map, wherein the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and I is the brightness feature;
    inputting the brightness feature intermediate map into the image pyramid to obtain sub-brightness feature intermediate maps of M scales, wherein the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
    calculating the at least two first brightness feature maps through the first multi-scale calculation relation I(c,s)=|I(c)ΘI(s)|, wherein I(c,s) denotes a first brightness feature map, I(c) denotes the sub-brightness feature intermediate map of the c-th scale, I(s) denotes the sub-brightness feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
  6. The method for detecting a salient region of an image according to any one of claims 1-4, wherein extracting the hue features of the image and obtaining at least two first hue feature maps by using the image pyramid and the second multi-scale calculation relation comprises:
    calculating, according to the three primary color components of each pixel in the image, the hue feature of each pixel by using

    $$\theta = \arccos\!\left\{\frac{\tfrac{1}{2}\left[(R-G)+(R-B)\right]}{\left[(R-G)^{2}+(R-B)(G-B)\right]^{1/2}}\right\}$$

    $$H = \begin{cases}\theta, & B \le G\\ 360^{\circ}-\theta, & B > G\end{cases}$$

    to obtain a hue feature intermediate map, wherein the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and H is the hue feature;
    inputting the hue feature intermediate map into the image pyramid to obtain sub-hue feature intermediate maps of M scales, wherein the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
    calculating the at least two first hue feature maps through the second multi-scale calculation relation H(c,s)=|H(c)ΘH(s)|, wherein H(c,s) denotes a first hue feature map, H(c) denotes the sub-hue feature intermediate map of the c-th scale, H(s) denotes the sub-hue feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
  7. The method for detecting a salient region of an image according to claim 5, wherein normalizing the at least two first brightness feature maps respectively and fusing the normalized at least two first brightness feature maps into a second brightness feature map comprises:
    setting a brightness feature maximum P;
    for each first brightness feature map, traversing the brightness features of the first brightness feature map to obtain a first brightness feature maximum I1max and a first brightness feature minimum I1min of the map, and normalizing the brightness features of the first brightness feature map into the range 0 to P according to the formula

    $$I \leftarrow \frac{I - I_{1\min}}{I_{1\max} - I_{1\min}} \times P$$

    wherein I denotes the brightness feature value of a pixel of the first brightness feature map;
    in the brightness-normalized first brightness feature map, for each pixel that has an 8-neighborhood, obtaining an 8-neighborhood brightness feature maximum and an 8-neighborhood brightness feature minimum from the brightness feature values of its 8-neighborhood pixels;
    averaging all the 8-neighborhood brightness feature maxima and all the 8-neighborhood brightness feature minima to obtain a brightness feature average Q;
    multiplying the brightness feature value of each pixel of the first brightness feature map by (P−Q)²;
    traversing the brightness features of the first brightness feature map to obtain a second brightness feature maximum I2max and a second brightness feature minimum I2min, and normalizing the brightness features of the first brightness feature map into the range 0 to 1 according to the formula

    $$I \leftarrow \frac{I - I_{2\min}}{I_{2\max} - I_{2\min}}$$

    and
    fusing the normalized at least two first brightness feature maps into a second brightness feature map by weighted averaging.
  8. The method for detecting a salient region of an image according to claim 6, wherein normalizing the at least two first hue feature maps respectively and fusing the normalized at least two first hue feature maps into one second hue feature map comprises:
    setting a hue feature maximum P;
    for each first hue feature map, traversing the hue features of the first hue feature map to obtain a first hue feature maximum H1max and a first hue feature minimum H1min of the map, and normalizing the hue features of the first hue feature map into the range 0 to P according to the formula

    $$H \leftarrow \frac{H - H_{1\min}}{H_{1\max} - H_{1\min}} \times P$$

    wherein H denotes the hue feature value of a pixel of the first hue feature map;
    in the hue-normalized first hue feature map, for each pixel that has an 8-neighborhood, obtaining an 8-neighborhood hue feature maximum and an 8-neighborhood hue feature minimum from the hue feature values of its 8-neighborhood;
    averaging all the 8-neighborhood hue feature maxima and all the 8-neighborhood hue feature minima to obtain a hue feature average Q;
    multiplying the hue feature value of each pixel of the first hue feature map by (P−Q)²;
    traversing the hue features of the first hue feature map to obtain a second hue feature maximum H2max and a second hue feature minimum H2min, and normalizing the hue features of the first hue feature map into the range 0 to 1 according to the formula

    $$H \leftarrow \frac{H - H_{2\min}}{H_{2\max} - H_{2\min}}$$

    and
    fusing all the normalized at least two first hue feature maps into one second hue feature map by weighted averaging.
  9. A computer device, comprising a storage unit and a processing unit, wherein the storage unit stores a computer program that can be run on the processing unit, and the processing unit, when executing the computer program, implements the method for detecting a salient region of an image according to any one of claims 1-8.
  10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the method for detecting a salient region of an image according to any one of claims 1-8.
  11. A device for detecting a salient region of an image, comprising:
    an extraction module configured to extract brightness features of the image and obtain at least two first brightness feature maps by using an image pyramid and a first multi-scale calculation relation, the extraction module being further configured to extract hue features of the image and obtain at least two first hue feature maps by using the image pyramid and a second multi-scale calculation relation;
    a fusion module configured to normalize the at least two first brightness feature maps respectively and fuse the normalized at least two first brightness feature maps into a second brightness feature map, the fusion module being further configured to normalize the at least two first hue feature maps respectively and fuse the normalized at least two first hue feature maps into a second hue feature map, and the fusion module being further configured to fuse the second brightness feature map and the second hue feature map into a saliency map; and
    an acquisition module configured to acquire the salient region according to the saliency map.
  12. The device for detecting a salient region of an image according to claim 11, wherein the acquisition module is further configured to binarize the saliency map with an adaptive-threshold binarization algorithm to obtain a binary image, and to acquire the salient region according to the binary image.
  13. The device for detecting a salient region of an image according to claim 12, wherein
    the acquisition module is further configured to perform connected-domain labeling on pixels of the binary image, merge all pixels with the same connected-domain label into one connected domain, and take the connected domain as a salient region.
  14. The device for detecting a salient region of an image according to claim 11, wherein the extraction module is further configured to:
    calculate, according to the three primary color components of each pixel in the image, the brightness feature of each pixel by using

    $$I = \frac{R + G + B}{3}$$

    to obtain a brightness feature intermediate map, wherein the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and I is the brightness feature;
    input the brightness feature intermediate map into the image pyramid to obtain sub-brightness feature intermediate maps of M scales, wherein the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
    calculate the at least two first brightness feature maps through the first multi-scale calculation relation I(c,s)=|I(c)ΘI(s)|, wherein I(c,s) denotes a first brightness feature map, I(c) denotes the sub-brightness feature intermediate map of the c-th scale, I(s) denotes the sub-brightness feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
  15. The device for detecting a salient region of an image according to claim 11, wherein the extraction module is further configured to:
    calculate, according to the three primary color components of each pixel in the image, the hue feature of each pixel by using

    $$\theta = \arccos\!\left\{\frac{\tfrac{1}{2}\left[(R-G)+(R-B)\right]}{\left[(R-G)^{2}+(R-B)(G-B)\right]^{1/2}}\right\}$$

    $$H = \begin{cases}\theta, & B \le G\\ 360^{\circ}-\theta, & B > G\end{cases}$$

    to obtain a hue feature intermediate map, wherein the three primary color components of each pixel include a red component, a green component and a blue component, R is the red component, G is the green component, B is the blue component, and H is the hue feature;
    input the hue feature intermediate map into the image pyramid to obtain sub-hue feature intermediate maps of M scales, wherein the image pyramid is an M-level Gaussian pyramid and M ≥ 7; and
    calculate the at least two first hue feature maps through the second multi-scale calculation relation H(c,s)=|H(c)ΘH(s)|, wherein H(c,s) denotes a first hue feature map, H(c) denotes the sub-hue feature intermediate map of the c-th scale, H(s) denotes the sub-hue feature intermediate map of the s-th scale, c ≥ 2, δ ≥ 3, 5 ≤ s ≤ M−1, and s = c + δ.
PCT/CN2020/076000 2019-04-15 2020-02-20 Method and device for detecting salient region of image WO2020211522A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910301296.3A CN110008969B (zh) 2019-04-15 2019-04-15 Method and device for detecting salient region of image
CN201910301296.3 2019-04-15

Publications (1)

Publication Number Publication Date
WO2020211522A1 true WO2020211522A1 (zh) 2020-10-22

Family

ID=67172018

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/076000 WO2020211522A1 (zh) 2019-04-15 2020-02-20 Method and device for detecting salient region of image

Country Status (2)

Country Link
CN (1) CN110008969B (zh)
WO (1) WO2020211522A1 (zh)


Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008969B (zh) 2019-04-15 2021-05-14 京东方科技集团股份有限公司 Method and device for detecting salient region of image
CN111444929B (zh) * 2020-04-01 2023-05-09 北京信息科技大学 Saliency map calculation method and system based on fuzzy neural network
CN112669306A (zh) * 2021-01-06 2021-04-16 北京信息科技大学 Solar cell defect detection method and system based on saliency maps
CN114022747B (zh) * 2022-01-07 2022-03-15 中国空气动力研究与发展中心低速空气动力研究所 Salient object extraction method based on feature perception


Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100573523C (zh) * 2006-12-30 2009-12-23 中国科学院计算技术研究所 Image query method based on salient regions
US20100268301A1 (en) * 2009-03-06 2010-10-21 University Of Southern California Image processing algorithm for cueing salient regions
CN102521595B (zh) * 2011-12-07 2014-01-15 中南大学 Method for extracting image regions of interest based on eye-movement data and low-level features
CN107301420A (zh) * 2017-06-30 2017-10-27 武汉大学 Thermal infrared image target detection method based on saliency analysis

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150227816A1 (en) * 2014-02-10 2015-08-13 Huawei Technologies Co., Ltd. Method and apparatus for detecting salient region of image
CN108960247A (zh) * 2017-05-22 2018-12-07 阿里巴巴集团控股有限公司 Image saliency detection method and device, and electronic equipment
CN109410171A (zh) * 2018-09-14 2019-03-01 安徽三联学院 Target saliency detection method for rainy-day images
CN110008969A (zh) * 2019-04-15 2019-07-12 京东方科技集团股份有限公司 Method and device for detecting salient region of image

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
WANG, WENHAO ET AL.: "Improved Multi-scale Saliency Detection Based on HSV Space", COMPUTER ENGINEERING & SCIENCE, vol. 39, no. 2, 28 February 2017 (2017-02-28), ISSN: 1007-130X, DOI: 20200424151536X *

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112329796B (zh) * 2020-11-12 2023-05-23 北京环境特性研究所 Infrared imaging cirrus cloud detection method and device based on visual saliency
CN112329796A (zh) * 2020-11-12 2021-02-05 北京环境特性研究所 Infrared imaging cirrus cloud detection method and device based on visual saliency
CN113009443A (zh) * 2021-02-22 2021-06-22 南京邮电大学 Sea-surface target detection method and device based on graph connectivity density
CN113009443B (zh) * 2021-02-22 2023-09-12 南京邮电大学 Sea-surface target detection method and device based on graph connectivity density
CN113362356A (zh) * 2021-06-02 2021-09-07 杭州电子科技大学 Salient contour extraction method based on bilateral attention pathways
CN113362356B (zh) * 2021-06-02 2024-02-02 杭州电子科技大学 Salient contour extraction method based on bilateral attention pathways
CN114022753A (zh) * 2021-11-16 2022-02-08 北京航空航天大学 Small aerial target detection algorithm based on saliency and edge analysis
CN114022753B (zh) * 2021-11-16 2024-05-14 北京航空航天大学 Small aerial target detection algorithm based on saliency and edge analysis
CN114332572A (zh) * 2021-12-15 2022-04-12 南方医科大学 Method for extracting multi-scale fused feature parameters of breast-lesion ultrasound images with a saliency-map-guided hierarchical dense feature fusion network
CN114332572B (zh) * 2021-12-15 2024-03-26 南方医科大学 Method for extracting multi-scale fused feature parameters of breast-lesion ultrasound images with a saliency-map-guided hierarchical dense feature fusion network
CN114998189B (zh) * 2022-04-15 2024-04-16 电子科技大学 Color display point defect detection method
CN114998189A (zh) * 2022-04-15 2022-09-02 电子科技大学 Color display point defect detection method
CN115578476A (zh) * 2022-11-21 2023-01-06 山东省标筑建筑规划设计有限公司 Efficient storage method for urban and rural planning data
CN115598138A (zh) * 2022-11-23 2023-01-13 惠州威尔高电子有限公司 Power control circuit board defect detection method and system based on saliency detection
CN115810113A (zh) * 2023-02-10 2023-03-17 南京隼眼电子科技有限公司 Salient feature extraction method and device for SAR images
CN116051543A (zh) * 2023-03-06 2023-05-02 山东锦霖钢材加工有限公司 Defect identification method for steel peeling
CN116051543B (zh) * 2023-03-06 2023-06-16 山东锦霖钢材加工有限公司 Defect identification method for steel peeling
CN117455913A (zh) * 2023-12-25 2024-01-26 卡松科技股份有限公司 Intelligent detection method for hydraulic oil contamination based on image features
CN117455913B (zh) * 2023-12-25 2024-03-08 卡松科技股份有限公司 Intelligent detection method for hydraulic oil contamination based on image features

Also Published As

Publication number Publication date
CN110008969A (zh) 2019-07-12
CN110008969B (zh) 2021-05-14


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20792070

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20792070

Country of ref document: EP

Kind code of ref document: A1


32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 06/05/2022)
