WO2020107717A1 - Visual saliency region detection method and apparatus - Google Patents

Visual saliency region detection method and apparatus Download PDF

Info

Publication number
WO2020107717A1
WO2020107717A1 PCT/CN2019/075206 CN2019075206W WO2020107717A1 WO 2020107717 A1 WO2020107717 A1 WO 2020107717A1 CN 2019075206 W CN2019075206 W CN 2019075206W WO 2020107717 A1 WO2020107717 A1 WO 2020107717A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
visual saliency
value
saliency
visual
Prior art date
Application number
PCT/CN2019/075206
Other languages
French (fr)
Chinese (zh)
Inventor
陈沅涛
王进
王磊
张建明
陈曦
王志
邝利丹
谷科
Original Assignee
长沙理工大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 长沙理工大学 filed Critical 长沙理工大学
Publication of WO2020107717A1 publication Critical patent/WO2020107717A1/en

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures

Definitions

  • the embodiments of the present invention relate to the field of machine vision technology, and in particular, to a method and device for detecting a visually significant area.
  • the visual attention mechanism is an important sensory feature possessed by humans and high IQ animals. This mechanism can quickly find target objects with visual significance and interest from massive visual information, and can quickly "ignore” other uninteresting target objects. Reduce the amount of calculation in the information processing process. Therefore, introducing the visual attention mechanism into the image segmentation process can significantly improve the efficiency of the image processing process.
  • image segmentation and image parsing we usually maintain interest in certain areas on the image to be processed, and these area parts are called areas of interest or target areas.
  • the target is the region of interest with special properties in the image.
  • the image segmentation process is to divide the image into several independent areas with specific uniformity, so as to achieve the image processing method of extracting the area of interest from the complex background of the image.
  • How to detect the region of interest from the image to be processed involves the target detection method of the image and video.
  • This method changes the image sequence or video resolution, so that the image sequence and video can be perfectly displayed, and try to save the image and The key content in the video.
  • image and video target detection based on image content how to quickly detect the visual saliency area is a problem that needs to be solved.
  • the visual saliency area detection algorithm in the prior art focuses on finding fixed image pixels or target objects that human vision first notices, and can focus on understanding human vision pixel points and related applications, such as autofocus applications.
  • the visual saliency of the image pixel is expressed by the relevant area in the center of the pixel. For example, if the difference between the relevant image area block of the center pixel point and the other image area blocks in the original image is large, the pixel point can be regarded as a pixel point with high visual significance.
  • the area detection methods of visual saliency in the related art are all traditional methods that calculate the system visual saliency value of individual pixels every time.
  • the number of pixels will increase the computational complexity, and even a high-dimensional vector search tree structure can be executed.
  • This method has high time complexity and space complexity, and is suitable for small objects in the original image.
  • the embodiments of the present disclosure provide a method and a device for detecting a visual saliency area, which can quickly and accurately detect the visual saliency area in an image to be processed.
  • the embodiments of the present invention provide the following technical solutions:
  • An aspect of an embodiment of the present invention provides a visual saliency area detection method, including:
  • the weight values of the visual saliency maps of each layer are calculated to be used to fuse the visual saliency maps of each layer according to the weight values of the visual saliency maps of each layer.
  • the method further includes:
  • each preset position point according to the visual saliency value of each pixel, it is divided into an enhanced pixel point set and a weakened pixel point set, and the visual saliency value of the pixel point in the enhanced pixel point set is greater than the weakened The visual significance value of the pixels in the pixel set;
  • the image weakening method is used to weaken each pixel in the weakened pixel set.
  • generating a visual saliency map at each level of the image to be processed includes:
  • the selecting a target image area block satisfying the similarity condition from each candidate image area block includes:
  • calculating the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel includes:
  • dist(r i , r k ) is the dissimilarity measure between the current pixel position and the Kth target image block
  • K is the total number of target image blocks
  • dist color (r i , R k ) is the Euclidean distance in the HSV color space between the target image area block subjected to vectorization and the location of the current pixel
  • dist pos (r i , r k ) is the location of the current pixel The Euclidean distance between the Kth target image block.
  • generating a visual saliency map at each level of the image to be processed includes:
  • x and y are the horizontal and vertical coordinate values of the pixel
  • f(x, y) is the significance function for solving the pixel (x, y)
  • f average (x, y) is f(x, y).
  • Arithmetic average, the size of the image to be processed is M*N;
  • x and y are the horizontal and vertical coordinate values of the pixel
  • f average (x, y) is the arithmetic average value of f (x, y)
  • h (f average (x, y)) is the image to be processed The generated feature histogram
  • the local system saliency map, the global system saliency map, and the scarcity saliency map are fused to obtain visual saliency maps at various levels of the image to be processed.
  • the fusing the local system saliency map, the global system saliency map, and the scarcity saliency map to obtain visual saliency maps at various levels of the image to be processed includes:
  • the visual saliency value V final of each pixel of the image to be processed is calculated using the following formula to generate a visual saliency map:
  • V Local , V Global , and V Scarcity are the local system saliency value, global system saliency value, and scarcity saliency value of the pixels whose horizontal and vertical coordinates are x and y, and v 1 is the local system saliency value.
  • the weight value of, v 2 is the weight value of the global system saliency value
  • v 3 is the weight value of the scarcity saliency value
  • i 2
  • V 1 , V 2 , and V 3 are based on Calculated.
  • the calculating the local system saliency value of each pixel of the image to be processed based on the frequency domain includes:
  • x and y are the horizontal and vertical coordinates of the pixel
  • FFT(u,v) is the characteristic value of the pixel
  • is after the fast Fourier transform
  • ⁇ (u,v) is the phase spectrum of the image to be processed.
  • the weight values of the visual saliency maps of each layer are calculated according to the visual saliency values of the two adjacent layers of visual saliency maps, and used to fuse the visual saliency maps of each layer according to the weight values of the visual saliency maps of each layer include:
  • p is the pixel position, Is the weight value of layer i, Is the weight value of layer i-1, Is the visual significance value of layer i, Is the visual significance value of layer i-1;
  • I the visual saliency value of the pixel at the p position of the fusion visual saliency map.
  • a visual saliency area detection device including:
  • the random search module is used to select multiple candidate image area blocks for the current pixel from the to-be-processed image using a random search method, and select target image area blocks satisfying the similarity condition from each candidate image area block;
  • the visual saliency value calculation module is used to calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel;
  • a multi-layer visual saliency map generation module used to generate visual saliency maps of each level of the image to be processed based on the visual saliency value of each pixel of the image to be processed;
  • the visual saliency map fusion module is used to calculate the weight value of the visual saliency map of each layer according to the visual saliency value of the adjacent two layers of visual saliency maps, which is used to visually map each layer according to the weight value of the visual saliency map of each layer Perform fusion.
  • An embodiment of the present invention further provides a visual saliency area detection device, including a processor, which is used to implement the steps of the visual saliency area detection method described in any one of the preceding items when the processor is used to execute a computer program stored in a memory.
  • An embodiment of the present invention finally provides a computer-readable storage medium that stores a visual saliency area detection program stored on the computer-readable storage medium.
  • the visual salience area detection program is implemented by the processor as any of the foregoing The steps of the method for visually significant area detection described in the item.
  • the technical solution provided by the present application has the advantage that when calculating the visual saliency value of each pixel of the image to be processed, a random search method is used to select multiple candidate image area blocks from the image to be processed, and then multiple candidate areas Select the image area block with the highest similarity to the current pixel in the image block, no need to search from each pixel of the image to be processed, shorten the search time of the image area block with the highest similarity, and do not limit the scale of the image pixel , which solves the shortcomings in the related art that it is necessary to find the image region block with the highest similarity from all pixels in the image to be processed, greatly improving the calculation efficiency of the pixel's visual saliency value, thereby effectively improving the image to be processed
  • the detection efficiency of the visual saliency area in the image is suitable for the calculation of the visual saliency value of the pixels of the image to be processed at any scale; in addition, by combining multiple levels of visual saliency maps, it is not only beneficial to eliminate the noise signal in the visual saliency map
  • FIG. 1 is a schematic flowchart of a method for detecting a visually significant region according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of another method for detecting a visually significant region according to an embodiment of the present invention
  • FIG. 3 is a schematic diagram comparing ROC curves of the technical solution of the present application and related technologies according to an exemplary embodiment of the present disclosure
  • FIG. 4 is a schematic diagram of comparison between ROC curves of the technical solution of the present application and related technologies according to another exemplary embodiment of the present disclosure
  • FIG. 5 is a schematic diagram of target contours in a schematic image extracted by using the technical solution of the present application provided by the present disclosure
  • FIG. 6 is a structural diagram of a specific implementation manner of a visual saliency area detection device provided by an embodiment of the present invention.
  • FIG. 7 is a structural diagram of another specific implementation manner of a visual saliency area detection device provided by an embodiment of the present invention.
  • FIG. 1 is a schematic flowchart of a method for detecting a visually significant region according to an embodiment of the present invention.
  • the embodiment of the present invention may include the following:
  • S101 Use a random search method to select multiple candidate image area blocks for the current pixel from the image to be processed, and select target image area blocks that satisfy the similarity condition from each candidate image area block.
  • a random search method is used to select multiple candidate image area blocks (for example, 2K candidate image area blocks).
  • the current pixel here refers to the random search method at the current time. Select pixels of candidate image blocks in the processed image.
  • the similarity condition is to select those image area blocks that have the highest similarity to pixels from the candidate image area blocks.
  • the similarity condition can be based on the actual application scenario (such as the number of candidate image area blocks, the candidate image area block and the pixel's Similarity value) to determine a similarity threshold, and delete the image area blocks below the threshold.
  • the two-dimensional coordinate mapping function of the image to be processed is f(x): R ⁇ V, that is, the two-dimensional coordinate mapping function f(x) is defined on the coordinates of all pixels in the image to be processed, and V represents the result obtained after performing the normalization process Visual significance value.
  • V represents the result obtained after performing the normalization process Visual significance value.
  • R i is a uniformly distributed random variable, and its range is limited to [-1,1] ⁇ [-1,1]; ⁇ is 1/2 of the size of the image to be processed; ⁇ is the window
  • dissimilarity measure m i is calculated between the block and the candidate region of pixels in the r 2K block in the candidate region.
  • dissimilarity measure m i is calculated between the block and the candidate region of pixels in the r 2K block in the candidate region.
  • only the image area blocks with a small 1/2 dissimilarity value among the 2K candidate image blocks will be retained, and the remaining 1/2 image area blocks will be discarded.
  • Select 2K candidate image area blocks according to any position approximate to the pixel point r. These approximate image area blocks are part of all possible candidate areas in the image R to be processed. Since this is an incomplete sample acquisition process, it is incomplete Sample sampling will introduce a certain degree of sample error to the subsequent visual saliency map. However, as the number of samples 2K gradually becomes larger, the sample error will definitely decrease significantly.
  • Random search method 2K image areas randomly selected from all the blocks of pixels in the image, and the image areas to ensure wherein K r i highest similarity blocks and domain blocks in the image, the other image area K block is discarded.
  • This application only needs to randomly extract 2K image area blocks, and do not build an auxiliary high-dimensional vector search decision tree structure to improve search efficiency.
  • S102 Calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel.
  • dist color (r i , r j ) is the Euclidean distance between the image region blocks r i and r j in the HSV color space after vectorization, and the normalization operation corresponds to the range of [0,1]; when When dist color (r i , r j ) is relatively large with respect to any image region block r j , the pixel i is considered to be visually significant.
  • dist pos (r i , r j ) is the Euclidean distance between the image area blocks r i and r j , and the process of this Euclidean distance is normalized to correspond to the range [0,1].
  • the measurement method that can measure the dissimilarity between two corresponding image area blocks is shown in the following formula:
  • the visual saliency value S of the current pixel can be calculated using Equation 2:
  • dist(r i , r k ) is the dissimilarity measure between the current pixel position and the Kth target image block
  • K is the total number of target image blocks
  • dist color (r i , r k ) is the Euclidean distance in the HSV color space between the target image area block after vectorization and the current pixel position
  • dist pos (r i , r k ) is the position of the current pixel point and the Kth target image block European distance between.
  • S103 Generate a visual saliency map at each level of the image to be processed based on the visual saliency value of each pixel of the image to be processed.
  • the visual saliency measurement process represents the visual saliency map by calculating the visual saliency value of each image pixel and using the grayscale image with the same size and scale as the original input image.
  • the visual saliency value of each pixel represents the visual saliency value of the relevant position in the original image. The larger the visual saliency value, the more the pixel of the image can highlight its visual saliency on the original input image, and the easier it is for the attention of human observers.
  • the corresponding visual saliency map can be generated for each level according to the same method.
  • S104 Calculate the weight values of the visual saliency maps of each layer according to the visual saliency values of the adjacent two layers of visual saliency maps, and use them to fuse the visual saliency maps of each layer according to the weight values of the visual saliency maps of each layer.
  • the obtained multi-level visual saliency maps can be merged, and finally the combined multi-level visual saliency result map can be generated, for example, the following formula can be used for fusion:
  • Is the merged multi-scale visual saliency map on layer i Is the multi-scale visual saliency map of layer i, It is the detailed visual saliency result graph of the i-1th layer closest to the ith layer. in case versus There is a difference in the size, so when carrying out multi-level merge, you need to put The size of The size is close.
  • a random search method is used to select multiple candidate image area blocks from the image to be processed, and then multiple candidates Select the image area block with the highest similarity to the current pixel in the area image block, there is no need to search from each pixel of the image to be processed, which shortens the search time of the image area block with the highest similarity, and does not limit the image pixel Scale, which solves the disadvantages of the related art that it is necessary to find the image region block with the highest similarity from all pixels in the image to be processed, greatly improving the calculation efficiency of the pixel visual significance value, thereby effectively improving the processing
  • the detection efficiency of the visual saliency area in the image is suitable for the calculation of the visual saliency value of the pixels of the image to be processed at any scale; in addition, by combining multiple levels of visual saliency maps, it is not only beneficial to remove the noise in the visual saliency map
  • the signal can also accurately integrate the
  • the degree of refinement required for the visual saliency result graph is not high, but if an application scenario that requires a higher detection speed is required, using S101-S104 of the above embodiment can quickly generate a rough visual saliency map.
  • the visual saliency map obtained in the foregoing embodiment may also be enhanced and merged to generate a high-quality visual saliency result map. If a higher visual saliency value appears at a location, it indicates that the location has a higher image reliability, so you can selectively merge those visual saliency areas with high noise signals to obtain the final Graph of visual significance results.
  • the implementation of the process of generating a visual saliency map according to the visual saliency value in S103 may be as follows:
  • the area saliency value of the image to be processed depends on the difference between its own characteristics and the surrounding environment. If a certain area in the image is a salient area, there are one or more distinguishing features between the salient area and the surrounding area. In different images, the influence of the same feature on visual saliency is also different. Both brightness and color can be used as visual saliency features, so different image visual features need to be extracted during image preprocessing. For example, visual features such as brightness, direction, and color can be extracted for measurement. After many experiments in this application, it has been found that the directional visual features do not play a significant role in the image, and at the same time increase the algorithm time complexity. In view of this, this application only extracts the color and brightness features of the image to be processed.
  • the HSV color space uses hue (Hue), saturation S (Saturation), and brightness V (Value) to describe specific colors. HSV is more in line with human visual characteristics than RGB. Equation 3 can be used to convert the image to be processed from RGB space to HSV space in order to extract color and brightness features.
  • h is the H channel value in the HSV color space
  • l is the arithmetic mean of the H channel value
  • s is the S channel value in the HSV color space
  • v is the V channel value in the HSV color space
  • g is the R channel value of the RGB primary colors
  • B is the B channel value of the RGB primary colors
  • r is the R channel value of the RGB primary colors
  • max is the maximum value of each corresponding channel value
  • min is the minimum value of each corresponding channel value.
  • the visual saliency of each pixel on the image to be processed does not only depend on the characteristic value of the pixel, but also needs to be based on the difference between the pixel and its surrounding pixels. The greater the degree of difference, the more visually significant the pixel is.
  • Related technologies calculate the visual saliency through the degree of difference between each pixel of the image and the surrounding area, but the size of the area is difficult to determine, and the amount of algorithm calculation is very large.
  • the present application can analyze and calculate the local system saliency of image pixels from the frequency domain.
  • the amplitude spectrum and the phase spectrum have different functions in the image frequency domain features: the amplitude spectrum contains each pixel The specific size of the point feature value; the phase spectrum contains the image structure feature, which can reflect the change of the image pixel feature value.
  • phase spectrum and the amplitude spectrum play different roles in the image construction process.
  • Simply using the phase spectrum to construct the result image can obtain a result with a similar structure to the original image; but only using the amplitude spectrum to construct the result image, the resulting image is very different from the original image.
  • Equation 3 can be used to perform local system saliency calculation for each feature image in advance. Firstly, the Fast Fourier Transform (FFT) is performed on the input image to extract the phase spectrum and the amplitude spectrum; then the phase spectrum is applied to the image to construct the local system saliency map of each feature image.
  • FFT Fast Fourier Transform
  • x and y are the horizontal and vertical coordinates of the pixel
  • FFT(u,v) is the characteristic value of the pixel
  • is after the fast Fourier transform
  • ⁇ (u,v) is the phase spectrum of the image to be processed
  • V Local is the local system saliency value of each image pixel in the image.
  • the global system saliency of the sharply changing edge or complex background area in the image will be higher, while the local system saliency of the smooth area and the target is lower.
  • global system saliency can be introduced.
  • the global system saliency of pixels is used to measure the degree of visual saliency of the image pixels on the entire image.
  • Formula 5 can be applied to generate the global system saliency of the image to be processed.
  • V Global (x, y) in Equation 5 is the global system saliency value of each pixel in the image to be processed.
  • x and y are the horizontal and vertical coordinate values of the pixel
  • f(x, y) is the significance function for solving the pixel (x, y)
  • f average (x, y) is f(x, y).
  • the arithmetic average, the size of the image to be processed is M*N.
  • the scarcity saliency means that the probability of a certain feature value appearing on the entire image is very low, then the image pixels with this feature value are "unusual", the higher the visual significance value of the image pixel. Equation 6 can be applied to measure the scarcity saliency characteristics of each pixel in the input image.
  • H(f average (x, y)) in Equation 6 is the feature histogram generated by the image to be processed, and V Scarcity (x, y) is the scarcity significance value of the pixel:
  • the local system saliency map After calculating the local system saliency value, global system saliency value and scarcity saliency value of each pixel of the image to be processed, correspondingly, the local system saliency map, global system saliency map and scarcity saliency map can be generated.
  • the final visual saliency map is the fusion of the three visual saliency maps.
  • formula 7 can be used to calculate the local system saliency, global system saliency, and The measurement results of scarcity saliency are fused to obtain the final system saliency map.
  • V final is the visual saliency value of each pixel of the image to be processed
  • V Local (x, y), V Global (x, y), V Scarcity (x, y) (V Local , V Global , V Scarcity ) is the local system saliency value, global system saliency value, and scarcity saliency value of pixels with x and y horizontal and vertical coordinates in sequence
  • v 1 is the weight value of the local system saliency value
  • v 2 is the global system
  • v 3 is the weight value of the scarce saliency value
  • the weight value W pixel can be defined as the arithmetic average value of the Euclidean distance between each significant pixel point and the pixel point in the center of the image. The specific method is shown in Equation 9.
  • N in formula 9 is the number of salient points in the system feature saliency map, sp i is the ith salient point in the system, and center is the specific position of the pixel in the center of the image.
  • the W effect weight can be defined as the arithmetic average of the Euclidean distance between the salient points of each system, and the center id is the center of the visual salient point position, as shown in Equation 10.
  • formula 11 can be used to calculate the correlation weight of each system feature saliency map, and the system saliency maps are fused to obtain the final system saliency map.
  • V fi is the saliency map of the i-th system
  • Is the weight of the area, pixel weight, and effect weight of the i-th system feature saliency map.
  • W fi is the fusion of the three weights on the i-th system feature saliency map.
  • W i is the i-th system feature saliency.
  • the final weight to which the graph belongs, V is the saliency graph of the final system.
  • the embodiments of the present invention use specific system saliency, global system saliency, and scarcity saliency to describe and describe specific visual saliency, which will be more conducive to the rapid image segmentation process.
  • the visual saliency mechanism we constantly improve and perfect the application of the visual saliency mechanism for the image segmentation process in order to achieve a better image segmentation effect.
  • the merging method of the weighted visual saliency result graph and the merging method of the average visual saliency result graph may be used to perform the merging operation of the visual saliency graph.
  • the method of combining the weighted visual saliency result graphs is better than the average visual saliency result graph merging method.
  • the visual significance value is among them Normalize to the range of [0,1].
  • the visual saliency result graph on layer i-1 will be adjusted to the same size as the visual saliency result graph on layer i.
  • Use visual significance To represent the visual saliency result graph after operating on the image sequence The visual significance value at the position of the upper p coordinate.
  • the results of merging the refined visual saliency values of two adjacent layers are shown in Equation 12, Equation 13 and Equation 14:
  • p is the pixel position, Is the weight value of layer i, Is the weight value of layer i-1, Is the visual significance value of layer i, Is the visual significance value of layer i-1;
  • the visual saliency value is refined in the i-th layer And the i-1 layer refined visual saliency value It can be calculated if it has been determined with when Time,
  • the refined visual saliency value after merging can be calculated according to formula 12.
  • the visual saliency map after the merge operation can fuse the two levels of visual saliency multi-scale features, and the visual saliency obtained after the fusion
  • the resulting graph is smoother and clearer.
  • the generated visual saliency map will contain a lot of random noise.
  • the reason why the random noise signal is generated is that image noise is generated because the 2K image area block is randomly selected to perform Equation 1 for calculation.
  • a denoising process is performed.
  • the eight-neighbor visual saliency value may be used to perform a refinement process on the roughened visual saliency map (the visual saliency map generated by S103).
  • the eight-neighbor coordinate method can select neighboring pixels from the eight directions corresponding to the pixel r.
  • the candidate image area block obtained by the eight-neighbor coordinate method is very different from the method of randomly selecting candidate image area blocks. Because the image similarity of the neighboring area at the p-coordinate is high, the eight-neighbor coordinate method may make the visual saliency value at the p-coordinate smaller than the actual image visual saliency value; if it is done according to the visual saliency value obtained for this method Normalization will cause corresponding noise on the rough visual saliency map obtained above.
  • the visual saliency of neighboring coordinates is high and similar, for example. Therefore, it is necessary to perform refinement operations on the coordinate positions of pixel points that differ greatly from the visual saliency value of the neighboring coordinates.
  • the visual saliency value with a large difference from the eight neighbors has a low credibility, so it is possible to choose to refine the visual saliency value with a large difference from the eight neighbors.
  • FIG. 2 is a schematic flowchart of another method for detecting a visually significant region according to an embodiment of the present invention.
  • the embodiment of the present invention may be applied to an image segmentation method, for example.
  • the specific content may include the following:
  • S201 Use a random search method to select multiple candidate image area blocks for the current pixel from the image to be processed.
  • S203 Delete the candidate image area blocks corresponding to the similarity metric value lower than the similarity threshold, and use the remaining candidate image area blocks as the target image area blocks.
  • S204 Calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel.
  • S205 Based on the visual saliency value of each pixel of the image to be processed, generate rough visual saliency maps of each level of the image to be processed.
  • S207 Calculate the weight value of the visual saliency map of each layer according to the visual saliency value of the visual saliency maps of two adjacent layers.
  • S208 Integrate the visual saliency maps of each layer according to the weight value of the visual saliency maps of each layer to obtain an initial visual saliency map.
  • S209 Select multiple preset position points in the initial visual saliency map, and for each preset position point, divide it into an enhanced pixel point set and a weakened pixel point set according to the visual significance value of each pixel point.
  • the number of location points can be selected according to the size of the image to be processed and the actual situation of the visual saliency map, which is not limited in this application.
  • the visual saliency value of the pixels in the enhanced pixel set is greater than the visual saliency value of the pixels in the weakened pixel set.
  • a person skilled in the art may divide the high visual saliency value into the enhanced pixel point set according to his own experience, and divide the pixel point corresponding to the low visual saliency value into the weakened pixel point set. Or you can set a visual saliency division threshold based on the visual saliency value and total number of pixels from each position, divide the pixels above the threshold into the enhanced pixel set, and divide the pixels below the threshold The points are divided into a set of weakened pixels.
  • S210 Use the image enhancement method to perform enhancement processing on each pixel in the enhanced pixel set.
  • Any image enhancement method that can enhance the feature of the pixel point may be used, and those skilled in the art may choose according to actual needs and actual application scenarios, which is not limited in this application.
  • Any image enhancement method that can weaken the characteristics of pixel points can be used, and those skilled in the art can select according to actual needs and actual application scenarios, which is not limited in this application.
  • the invention of the embodiments of the present invention implements the detection of visually significant regions in four stages.
  • the first stage is a random detection stage, which performs random search and detection processing according to the image to be processed, and uses the information of each layer of the original image to obtain each layer Roughened visual saliency map;
  • the second stage is the refinement stage, which uses the roughened visual saliency maps at all levels to perform the refinement operation to remove image noise generated by random search and detection processing on the roughened visual saliency map ;
  • the third stage is the multi-level merge stage of the area map. Through the detailed merge of the multi-layer visual saliency map, the final merged visual saliency map combining local features and global features; the fourth stage is to update the visual saliency value.
  • the visual saliency areas with high noise signals will be selectively combined to obtain the final high-quality detailed visual saliency result map. It achieves fast and real-time generation of detailed visual saliency result maps with the same size as the original input image (to-be-processed image), thereby improving the overall quality of the target segmentation results in the video image sequence.
  • multi-scale visual saliency detection algorithm In order to confirm that the technical solution provided by this application (generally referred to as multi-scale visual saliency detection algorithm) can quickly generate a detailed visual saliency result map of the same size as the original input image, it can be applied to the The visual saliency result graph of the image sequence, thereby improving the overall quality of the target segmentation results in the video image sequence.
  • This application uses Matlab simulation environment to carry out a series of experiments to verify.
  • this multi-scale visual saliency detection algorithm is implemented.
  • a visual saliency detection algorithm is used to generate a corresponding visual saliency result map based on the original input image.
  • a large number of visual saliency results are obtained, which show that the multi-scale visual saliency detection method can obtain visual saliency images with good effects on the original input image compared with the existing mainstream visual saliency measurement methods.
  • This application is related to 8 related technologies, such as AIM (Saliency detection algorithm based on information theory), GBVS (saliency detection algorithm based on graph theory), SR (saliency detection algorithm based on spectral remainder), IS (saliency labeling algorithm based on sparse salient regions), ICL ( Dynamic visual attention detection algorithm), ITTI (the most classic model algorithm), RC (global comparison algorithm based on salient region detection), SUN (significant Bayesian frame model using natural statistics) were compared and compared.
  • AIM Ses detection algorithm based on information theory
  • GBVS saliency detection algorithm based on graph theory
  • SR saliency detection algorithm based on spectral remainder
  • IS saliency labeling algorithm based on sparse salient regions
  • ICL Dynamic visual attention detection algorithm
  • ITTI the most classic model algorithm
  • RC global comparison algorithm based on salient region detection
  • SUN significant Bayesian frame model using natural statistics
  • the receiver operating characteristic curve (Receiver Operating Characteristic, ROC) was used to analyze the quantitative results. As shown in Figures 3 and 4, the curve of this application is the highest.
  • the eight methods can be divided into two categories: algorithms that simply use prior knowledge of probabilities, including AIM, GBVS, IS, and SR; algorithms that simply use current observed image information, including ICL, ITTI, RC, and SUN. Then, this multi-scale visual saliency detection method is better than the method using only a priori probability knowledge or current observation image information.
  • FIG. 5 is an image obtained by segmentation using the technical solution of the present application.
  • the technical solution of the present application can accurately locate the salient regions in the image and use
  • the image segmentation method of the technical solution of the present application can clearly cut out the outline of a bird and the outline of a tree, and can obtain an image segmentation result that is almost completely consistent with human eye segmentation.
  • the embodiment of the present invention introduces a random search detection method for visual saliency, greatly improves the efficiency of the visual saliency detection algorithm, and obtains a detailed visual saliency map that is exactly the same size and scale as the original input image.
  • the embodiment of the present invention also provides a corresponding implementation device for the visual saliency area detection method, which further makes the method more practical.
  • the visual saliency area detection device provided by the embodiment of the present invention will be described below.
  • the visual saliency area detection device described below and the visual saliency area detection method described above can be referred to each other.
  • FIG. 6 is a structural diagram of a visually significant area detection device according to an embodiment of the present invention in a specific implementation manner.
  • the device may include:
  • the random search module 601 is used to select a plurality of candidate image area blocks for the current pixel from the image to be processed using a random search method, and select target image area blocks that satisfy the similarity condition from each candidate image area block.
  • the visual saliency value calculation module 602 is used to calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel.
  • the multi-layer visual saliency map generation module 603 is used to generate visual saliency maps at various levels of the image to be processed based on the visual saliency value of each pixel of the image to be processed.
  • the visual saliency map fusion module 604 is used to calculate the weight values of the visual saliency maps of each layer according to the visual saliency values of the adjacent two layers of visual saliency maps, which are used to visually highlight each layer according to the weight values of the visual saliency maps of each layer Figure fusion.
  • the device may further include a visual saliency update module 605, and the visual saliency update module 605 is used for visual saliency obtained after fusion.
  • the visual saliency value is greater than the visual saliency value of the pixels in the weakened pixel set; the image enhancement method is used to strengthen each pixel in the enhanced pixel set; the image weakening method is used to weaken each pixel in the weakened pixel set deal with.
  • the multi-layer visual saliency map generation module 603 may further include:
  • the rough visual saliency map generation sub-module is used to generate rough visual saliency maps at various levels of the image to be processed based on the visual saliency value of each pixel of the image to be processed;
  • the detailed visual saliency map generation sub-module is used to perform a detailed operation on each rough visual saliency map to remove image noise signals and obtain respective corresponding detailed visual saliency maps.
  • the random search module 601 can also calculate the similarity metric values of the current pixel point and each candidate image area block separately; delete the candidate image area blocks corresponding to the similarity metric value lower than the similarity threshold, and The remaining candidate image area blocks serve as modules of the target image area block.
  • the visual saliency value calculation module 602 may be, for example, a module that calculates the visual saliency value S of the current pixel using the following formula:
  • dist(r i , r k ) is the dissimilarity measure between the current pixel position and the Kth target image block
  • K is the total number of target image blocks
  • dist color (r i , r k ) is the Euclidean distance in the HSV color space between the target image area block after vectorization and the current pixel position
  • dist pos (r i , r k ) is the position of the current pixel point and the Kth target image block European distance between.
  • the multi-layer visual saliency map generation module 603 may further include:
  • the local system saliency map generation sub-module is used to calculate the local system saliency value of each pixel of the image to be processed based on the frequency domain to generate a local system saliency map;
  • the global system saliency map generation submodule is used to calculate the global system saliency value V Global (x,y) of each pixel of the image to be processed using the following formula to generate a global system saliency map:
  • x and y are the horizontal and vertical coordinate values of the pixel
  • f(x, y) is the significance function for solving the pixel (x, y)
  • f average (x, y) is f(x, y).
  • the arithmetic mean value, the size of the image to be processed is M*N;
  • the scarcity saliency map generation submodule is used to calculate the scarcity saliency value V Scarcity (x,y) of each pixel of the image to be processed using the following formula to generate a scarcity saliency map:
  • x and y are the horizontal and vertical coordinates of the pixel
  • f average (x, y) is the arithmetic average of f(x, y)
  • h(f average (x, y)) is generated by the image to be processed Feature histogram
  • the fusion sub-module is used to fuse the local system saliency map, the global system saliency map, and the scarcity saliency map to obtain visual saliency maps at various levels of the image to be processed.
  • the fusion sub-module may be, for example, a module for generating a visual saliency map by calculating the visual saliency value V final of each pixel of the image to be processed using the following formula:
  • V Local , V Global , and V Scarcity are the local system saliency value, global system saliency value, and scarcity saliency value of the pixels whose horizontal and vertical coordinates are x and y, and v 1 is the local system saliency value.
  • the weight value of, v 2 is the weight value of the global system saliency value
  • v 3 is the weight value of the scarcity saliency value
  • i 3
  • V 1 , V 2 , V 3 are based on Calculated.
  • the local system saliency map generation sub-module may also be a module that calculates the local system saliency value V Local (x, y) of each pixel of the image to be processed using the following formula:
  • x and y are the horizontal and vertical coordinates of the pixel
  • FFT(u,v) is the characteristic value of the pixel
  • is after the fast Fourier transform
  • the resulting image amplitude spectrum, ⁇ (u,v) is the phase spectrum of the image to be processed.
  • the visual saliency map fusion module 604 may also be a module that calculates the weight value of the visual saliency map of each layer using the following formula:
  • p is the pixel position, Is the weight value of layer i, Is the weight value of layer i-1, Is the visual significance value of layer i, Is the visual significance value of layer i-1;
  • the embodiment of the present invention improves the calculation efficiency of the visual saliency value of pixels, thereby effectively improving the detection efficiency of the visual saliency area of the image to be processed, and can also improve the visual saliency area of the image to be processed. Detection accuracy and precision.
  • An embodiment of the present invention also provides a visually significant area detection device, which may specifically include:
  • Memory used to store computer programs
  • the processor is configured to execute a computer program to implement the steps of the visual saliency area detection method described in any one of the above embodiments.
  • the embodiment of the present invention improves the calculation efficiency of the visual saliency value of pixels, thereby effectively improving the detection efficiency of the visual saliency area of the image to be processed, and can also improve the visual saliency area of the image to be processed. Detection accuracy and precision.
  • An embodiment of the present invention also provides a computer-readable storage medium that stores a visual saliency area detection program, and the visual saliency area detection program is executed by the processor as described in any one of the above embodiments in the visual saliency area detection method A step of.
  • the embodiment of the present invention improves the calculation efficiency of the visual saliency value of pixels, thereby effectively improving the detection efficiency of the visual saliency area of the image to be processed, and can also improve the visual saliency area of the image to be processed. Detection accuracy and precision.
  • RAM random access memory
  • ROM read-only memory
  • electrically programmable ROM electrically erasable and programmable ROM
  • registers hard disks, removable disks, CD-ROMs, or all fields of technology. Any other known storage medium.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Disclosed are a visual saliency region detection method and apparatus, wherein the method comprises: first using a random search method to select, from an image to be processed, multiple candidate image region blocks for each pixel point and then select, from the candidate image region blocks, a target image region block having the highest similarity with the pixel point; according to similarities between target image blocks and pixel points, calculating visual saliency values of the pixel points; based on the visual saliency values of the pixel points of the image to be processed, generating visual saliency maps of various layers of the image to be processed; and finally, according to the visual saliency values of visual saliency maps of two adjacent layers, calculating weighted values of the visual saliency maps of various layers, and fusing the visual saliency maps of various layers according to the weighted values. The present application improves the efficiency of calculation of a visual saliency value of a pixel point so as to effectively improve detection efficiency for a visual saliency region of an image to be processed, and further improves detection accuracy and precision for the visual saliency region in the image to be processed.

Description

视觉显著性区域检测方法及装置Visual salient region detection method and device 技术领域Technical field
本发明实施例涉及机器视觉技术领域,特别是涉及一种视觉显著性区域检测方法及装置。The embodiments of the present invention relate to the field of machine vision technology, and in particular, to a method and device for detecting a visually significant area.
背景技术Background technique
随着机器视觉技术的快速发展,基于图像处理的视觉显著性区域检测作为该技术实现过程中的关键一环,得到迫切的发展。With the rapid development of machine vision technology, visual saliency area detection based on image processing has been an urgent development as a key link in the implementation of this technology.
视觉注意机制为人类及高智商动物拥有的重要感官特征,此机制能够从海量视觉信息中迅速找到具有视觉显著性并感兴趣的目标对象,并且可以迅速“无视”其他不感兴趣的目标对象,同时降低信息处理过程中的计算量。因此将视觉注意机制引入图像分割过程中能够明显提高图像处理过程的工作效率。在图像分割和图像解析处理过程中,通常会针对待处理图像上的某些特定区域保持兴趣,这些区域部分被称为感兴趣区域或目标区域。一般情况下,目标就是图像中具有特殊性质的感兴趣区域。为了提取目标就需要从图像上执行分割操作,经过分割之后才能进行进一步分析和相关处理工作。图像分割过程就是将图像区分成若干具有特定均匀一致性的独立区域,从而达到将感兴趣区域从图像复杂背景中提取出来的图像处理方法。The visual attention mechanism is an important sensory feature possessed by humans and high IQ animals. This mechanism can quickly find target objects with visual significance and interest from massive visual information, and can quickly "ignore" other uninteresting target objects. Reduce the amount of calculation in the information processing process. Therefore, introducing the visual attention mechanism into the image segmentation process can significantly improve the efficiency of the image processing process. In the process of image segmentation and image parsing, we usually maintain interest in certain areas on the image to be processed, and these area parts are called areas of interest or target areas. In general, the target is the region of interest with special properties in the image. In order to extract the target, it is necessary to perform a segmentation operation from the image, and after the segmentation, further analysis and related processing work can be performed. The image segmentation process is to divide the image into several independent areas with specific uniformity, so as to achieve the image processing method of extracting the area of interest from the complex background of the image.
如何从待处理图像中检测到感兴趣区域,就涉及到图像及视频的目标检测方法,该方法通过改变图像序列或视频分辨率,从而使得图像序列和视频均能完美显示,并且尽量保存图像与视频里的关键内容。而在这些基于图像内容的图像及视频目标检测过程中,如何进行快速检测视觉显著性区域是需要解决的问题。How to detect the region of interest from the image to be processed involves the target detection method of the image and video. This method changes the image sequence or video resolution, so that the image sequence and video can be perfectly displayed, and try to save the image and The key content in the video. In the process of image and video target detection based on image content, how to quickly detect the visual saliency area is a problem that needs to be solved.
现有技术中的视觉显著性区域检测算法关注寻找人类视觉最先注意到的固定图像像素点或目标对象,能够针对理解人类视觉关注像素点和相关应用,比如自动聚焦应用。根据该区域块表达像素点所处位置的上下文内容,提出图像像素点的视觉显著性是由该像素点中心的相关区域进行表达。举例来说,如果中心像素点的相关图像区域块和原始图像中其他图像区域块的差异均很大,那么像素点则可看作视觉显著性高的像素点。The visual saliency area detection algorithm in the prior art focuses on finding fixed image pixels or target objects that human vision first notices, and can focus on understanding human vision pixel points and related applications, such as autofocus applications. According to the context content of the region block to express the position of the pixel, it is proposed that the visual saliency of the image pixel is expressed by the relevant area in the center of the pixel. For example, if the difference between the relevant image area block of the center pixel point and the other image area blocks in the original image is large, the pixel point can be regarded as a pixel point with high visual significance.
相关技术中的视觉显著性的区域检测方法都是采用每次计算单独像素点的系统视觉显著性值的传统方法。像素点数量规模会导致计算复杂度增加,甚至还要构建高维向量的查找树结构才能执行,这种方法的时间复杂度和空间复杂度都很高,适用于原始图像中体积较小目标。The area detection methods of visual saliency in the related art are all traditional methods that calculate the system visual saliency value of individual pixels every time. The number of pixels will increase the computational complexity, and even a high-dimensional vector search tree structure can be executed. This method has high time complexity and space complexity, and is suitable for small objects in the original image.
发明内容Summary of the invention
本公开实施例提供了一种视觉显著性区域检测方法及装置,可快速、精准地检测出待处理图像中的视觉显著性区域。The embodiments of the present disclosure provide a method and a device for detecting a visual saliency area, which can quickly and accurately detect the visual saliency area in an image to be processed.
为解决上述技术问题,本发明实施例提供以下技术方案:To solve the above technical problems, the embodiments of the present invention provide the following technical solutions:
本发明实施例一方面提供了一种视觉显著性区域检测方法,包括:An aspect of an embodiment of the present invention provides a visual saliency area detection method, including:
利用随机搜索方法从待处理图像中为当前像素点选取多个候选图像区域块,从各候选图像区域块中选取满足相似度条件的目标图像区域块;Use a random search method to select multiple candidate image area blocks for the current pixel from the image to be processed, and select target image area blocks that satisfy the similarity condition from each candidate image area block;
根据各目标图像块与所述当前像素点之间的相似度,计算所述当前像素点的视觉显著性值;Calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel;
基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的视觉显著图;Generating a visual saliency map at each level of the image to be processed based on the visual saliency value of each pixel of the image to be processed;
根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值,以用于按照各层视觉显著图的权重值将各层视觉显著图进行融合。Based on the visual saliency values of the two adjacent layers of visual saliency maps, the weight values of the visual saliency maps of each layer are calculated to be used to fuse the visual saliency maps of each layer according to the weight values of the visual saliency maps of each layer.
可选的,所述根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值之后,还包括:Optionally, after calculating the weight value of the visual saliency map of each layer according to the visual saliency value of the visual saliency maps of two adjacent layers, the method further includes:
在融合后得到的视觉显著图中选择多个预设位置点;Select multiple preset position points in the visual saliency map obtained after fusion;
对每个预设位置点,根据各个像素点的视觉显著性值,将其分为强化像素点集和弱化像素点集,所述强化像素点集中的像素点的视觉显著性值大于所述弱化像素点集中的像素点的视觉显著性值;For each preset position point, according to the visual saliency value of each pixel, it is divided into an enhanced pixel point set and a weakened pixel point set, and the visual saliency value of the pixel point in the enhanced pixel point set is greater than the weakened The visual significance value of the pixels in the pixel set;
利用图像增强方法对所述强化像素点集中的各像素点进行强化处理;Use the image enhancement method to perform enhancement processing on each pixel in the enhanced pixel set;
利用图像弱化方法对所述弱化像素点集中的各像素点进行弱化处理。The image weakening method is used to weaken each pixel in the weakened pixel set.
可选的,所述基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的视觉显著图包括:Optionally, based on the visual saliency value of each pixel of the image to be processed, generating a visual saliency map at each level of the image to be processed includes:
基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的粗略化视觉显著图;Generating rough visual saliency maps at various levels of the image to be processed based on the visual saliency value of each pixel of the image to be processed;
对各粗略化视觉显著图进行细致化操作,以去除图像噪声信号,得到各自相应的细致化视觉显著图。Perform detailed operations on each rough visual saliency map to remove image noise signals, and obtain respective corresponding detailed visual saliency maps.
可选的,所述从各候选图像区域块中选取满足相似度条件的目标图像区域块包括:Optionally, the selecting a target image area block satisfying the similarity condition from each candidate image area block includes:
分别计算所述当前像素点和各候选图像区域块的相似度量值;Calculating the similarity metric values of the current pixel point and each candidate image area block separately;
删除相似度量值低于相似度阈值对应的候选图像区域块,将剩余的候选图像区域块作为目标图像区域块。Delete the candidate image area blocks corresponding to the similarity metric value lower than the similarity threshold, and use the remaining candidate image area blocks as the target image area blocks.
可选的,所述根据各目标图像块与所述当前像素点之间的相似度,计算所述当前像素点的视觉显著性值包括:Optionally, calculating the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel includes:
利用下述公式计算所述当前像素点的视觉显著性值S:Use the following formula to calculate the visual significance value S of the current pixel:
Figure PCTCN2019075206-appb-000001
其中,
Figure PCTCN2019075206-appb-000002
Figure PCTCN2019075206-appb-000001
among them,
Figure PCTCN2019075206-appb-000002
式中,dist(r i,r k)为所述当前像素点所在位置与第K个目标图像块之间的不相似性度量值,K为目标图像块的总个数,dist color(r i,r k)为经过向量化处理的目标图像区域块与所述当前像素点所在位置在HSV颜色空间上的欧氏距离,dist pos(r i,r k)为所述当前像素点所在位置与第K个目标图像块之间的欧式距离。 Where dist(r i , r k ) is the dissimilarity measure between the current pixel position and the Kth target image block, K is the total number of target image blocks, dist color (r i , R k ) is the Euclidean distance in the HSV color space between the target image area block subjected to vectorization and the location of the current pixel, and dist pos (r i , r k ) is the location of the current pixel The Euclidean distance between the Kth target image block.
可选的,所述基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的视觉显著图包括:Optionally, based on the visual saliency value of each pixel of the image to be processed, generating a visual saliency map at each level of the image to be processed includes:
基于频域计算所述待处理图像的各像素点的局部系统显著性值,生成局部系统显著图;Calculating the local system saliency value of each pixel of the image to be processed based on the frequency domain to generate a local system saliency map;
利用下述公式计算所述待处理图像的各像素点的全局系统显著性值V Global(x,y),生成全局系统显著图: Use the following formula to calculate the global system saliency value V Global (x, y) of each pixel of the image to be processed to generate a global system saliency map:
Figure PCTCN2019075206-appb-000003
Figure PCTCN2019075206-appb-000003
式中,x、y为像素点的横纵坐标值,f(x,y)为求解像素点(x,y)的显著性函数,f average(x,y)为f(x,y)的算术平均值,所述待处理图像大小为M*N; In the formula, x and y are the horizontal and vertical coordinate values of the pixel, f(x, y) is the significance function for solving the pixel (x, y), and f average (x, y) is f(x, y). Arithmetic average, the size of the image to be processed is M*N;
利用下述公式计算所述待处理图像的各像素点的稀缺显著性值V Scarcity(x,y),生成稀缺显著图: Use the following formula to calculate the scarcity saliency value V Scarcity (x,y) of each pixel of the image to be processed to generate a scarcity saliency map:
Figure PCTCN2019075206-appb-000004
Figure PCTCN2019075206-appb-000004
式中,x、y为像素点的横纵坐标值,f average(x,y)为f(x,y)的算术平均值,h(f average(x,y))为所述待处理图像产生的特征直方图; In the formula, x and y are the horizontal and vertical coordinate values of the pixel, f average (x, y) is the arithmetic average value of f (x, y), and h (f average (x, y)) is the image to be processed The generated feature histogram;
将所述局部系统显著图、所述全局系统显著图及所述稀缺显著图进行融合,得到所述待处理图像各层次的视觉显著图。The local system saliency map, the global system saliency map, and the scarcity saliency map are fused to obtain visual saliency maps at various levels of the image to be processed.
可选的,所述将所述局部系统显著图、所述全局系统显著图及所述稀缺显著图进行融合,得到所述待处理图像各层次的视觉显著图包括:Optionally, the fusing the local system saliency map, the global system saliency map, and the scarcity saliency map to obtain visual saliency maps at various levels of the image to be processed includes:
对所述待处理图像的各层,利用下述公式计算所述待处理图像的各像素点的视觉显著性值V final,生成视觉显著图: For each layer of the image to be processed, the visual saliency value V final of each pixel of the image to be processed is calculated using the following formula to generate a visual saliency map:
Figure PCTCN2019075206-appb-000005
Figure PCTCN2019075206-appb-000005
式中,V Local、V Global、V Scarcity依次为横纵坐标值为x、y的像素点的局部系统显著性值、全局系统显著性值及稀缺显著性值;v 1为局部系统显著性值的权重值,v 2为全局系统显著性值的权重值,v 3为稀缺显著性值的权重值;i=1,V i=V 1为所述局部系统显著性值,i=2,V i=V 2为所述全部系统显著性值,i=3,V i=V 3为所述稀缺显著性值,V 1、V 2、V 3为根据
Figure PCTCN2019075206-appb-000006
计算所得。
In the formula, V Local , V Global , and V Scarcity are the local system saliency value, global system saliency value, and scarcity saliency value of the pixels whose horizontal and vertical coordinates are x and y, and v 1 is the local system saliency value. The weight value of, v 2 is the weight value of the global system saliency value, v 3 is the weight value of the scarcity saliency value; i=1, V i = V 1 is the local system saliency value, i=2, V i = V 2 is the significance value of all the systems, i = 3, V i = V 3 is the scarcity significance value, and V 1 , V 2 , and V 3 are based on
Figure PCTCN2019075206-appb-000006
Calculated.
可选的,所述基于频域计算所述待处理图像的各像素点的局部系统显著性值包括:Optionally, the calculating the local system saliency value of each pixel of the image to be processed based on the frequency domain includes:
利用下述公式计算所述待处理图像的各像素点的局部系统显著性值V Local(x,y): Use the following formula to calculate the local system saliency value V Local (x,y) of each pixel of the image to be processed:
Figure PCTCN2019075206-appb-000007
Figure PCTCN2019075206-appb-000007
式中,x、y为像素点的横纵坐标值,FFT(u,v)为像素点特征值,|FFT(u,v)e jψ(u,v)|为经过快速傅里叶变换以后所得图像幅度谱,ψ(u,v)为所述待处理图像相位谱。 In the formula, x and y are the horizontal and vertical coordinates of the pixel, FFT(u,v) is the characteristic value of the pixel, |FFT(u,v)e jψ(u,v) | is after the fast Fourier transform The obtained image amplitude spectrum, ψ(u,v) is the phase spectrum of the image to be processed.
可选的,所述根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值,以用于按照各层视觉显 著图的权重值将各层视觉显著图进行融合包括:Optionally, the weight values of the visual saliency maps of each layer are calculated according to the visual saliency values of the two adjacent layers of visual saliency maps, and used to fuse the visual saliency maps of each layer according to the weight values of the visual saliency maps of each layer include:
利用下述公式计算各层视觉显著图的权重值:Use the following formula to calculate the weight value of each layer of visual saliency map:
Figure PCTCN2019075206-appb-000008
Figure PCTCN2019075206-appb-000008
式中,p为像素点位置,
Figure PCTCN2019075206-appb-000009
为第i层的权重值,
Figure PCTCN2019075206-appb-000010
为第i-1层的权重值,
Figure PCTCN2019075206-appb-000011
为第i层的视觉化显著性值,
Figure PCTCN2019075206-appb-000012
为第i-1层的视觉显著性值;
In the formula, p is the pixel position,
Figure PCTCN2019075206-appb-000009
Is the weight value of layer i,
Figure PCTCN2019075206-appb-000010
Is the weight value of layer i-1,
Figure PCTCN2019075206-appb-000011
Is the visual significance value of layer i,
Figure PCTCN2019075206-appb-000012
Is the visual significance value of layer i-1;
利用下述公式将相邻两层的视觉显著图进行融合,得到融合视觉显著图;Use the following formula to fuse the visual saliency maps of two adjacent layers to obtain a fusion visual saliency map;
Figure PCTCN2019075206-appb-000013
Figure PCTCN2019075206-appb-000013
式中,
Figure PCTCN2019075206-appb-000014
为所述融合视觉显著图p位置的像素点视觉显著性值。
In the formula,
Figure PCTCN2019075206-appb-000014
Is the visual saliency value of the pixel at the p position of the fusion visual saliency map.
本发明实施例另一方面提供了一种视觉显著性区域检测装置,包括:Another aspect of an embodiment of the present invention provides a visual saliency area detection device, including:
随机搜索模块,用于利用随机搜索方法从待处理图像中为当前像素点选取多个候选图像区域块,从各候选图像区域块中选取满足相似度条件的目标图像区域块;The random search module is used to select multiple candidate image area blocks for the current pixel from the to-be-processed image using a random search method, and select target image area blocks satisfying the similarity condition from each candidate image area block;
视觉显著性值计算模块,用于根据各目标图像块与所述当前像素点之间的相似度,计算所述当前像素点的视觉显著性值;The visual saliency value calculation module is used to calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel;
多层视觉显著图生成模块,用于基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的视觉显著图;A multi-layer visual saliency map generation module, used to generate visual saliency maps of each level of the image to be processed based on the visual saliency value of each pixel of the image to be processed;
视觉显著图融合模块,用于根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值,以用于按照各层视觉显著图的权重值将各层视觉显著图进行融合。The visual saliency map fusion module is used to calculate the weight value of the visual saliency map of each layer according to the visual saliency value of the adjacent two layers of visual saliency maps, which is used to visually map each layer according to the weight value of the visual saliency map of each layer Perform fusion.
本发明实施例还提供了一种视觉显著性区域检测设备,包括处理器,所述处理器用于执行存储器中存储的计算机程序时实现如前任一项所述视觉显著性区域检测方法的步骤。An embodiment of the present invention further provides a visual saliency area detection device, including a processor, which is used to implement the steps of the visual saliency area detection method described in any one of the preceding items when the processor is used to execute a computer program stored in a memory.
本发明实施例最后还提供了一种计算机可读存储介质,所述计算机可读存储介质上存储有视觉显著性区域检测程序,所述视觉显著性区域检测程序被处理器执行时实现如前任一项所述视觉显著性区域检测方法的步骤。An embodiment of the present invention finally provides a computer-readable storage medium that stores a visual saliency area detection program stored on the computer-readable storage medium. The visual salience area detection program is implemented by the processor as any of the foregoing The steps of the method for visually significant area detection described in the item.
本申请提供的技术方案的优点在于,在计算待处理图像每个像素点的视觉显著性值时,先利用随机搜索方法从待处理图像中选择多个候选图像区域块,再从多个候选区域图像块中选取与当前像素点相似度最高的图像区域块,不需要从待处理图像的每个像素点进行寻找,缩短了相似度最高图像区域块的查找时间,不受限图像像素点的规模,从而解决了相关技术中需要从待处理图像中所有像素点中寻找相似度最高的图像区域块存在的弊端,大大的提高了像素点视觉显著性值计算效率,从而有效地提升了待处理图像中的视觉显著性区域的检测效率,适用于任何规模下待处理图像的像素点视觉显著性值计算场景中;此外,通过合并多层次视觉显著图,不仅有利于消除视觉显著图中的噪声信号,还可将待处理图像中的显著特征准确融合至生成视觉显著特征图中,有利于提升待处理图像中的视觉显著性区域的检测准确度和精度。The technical solution provided by the present application has the advantage that when calculating the visual saliency value of each pixel of the image to be processed, a random search method is used to select multiple candidate image area blocks from the image to be processed, and then multiple candidate areas Select the image area block with the highest similarity to the current pixel in the image block, no need to search from each pixel of the image to be processed, shorten the search time of the image area block with the highest similarity, and do not limit the scale of the image pixel , Which solves the shortcomings in the related art that it is necessary to find the image region block with the highest similarity from all pixels in the image to be processed, greatly improving the calculation efficiency of the pixel's visual saliency value, thereby effectively improving the image to be processed The detection efficiency of the visual saliency area in the image is suitable for the calculation of the visual saliency value of the pixels of the image to be processed at any scale; in addition, by combining multiple levels of visual saliency maps, it is not only beneficial to eliminate the noise signal in the visual saliency map It can also accurately merge the salient features in the image to be processed into the visual salient feature map, which is helpful to improve the detection accuracy and precision of the visual salient regions in the image to be processed.
应当理解的是,以上的一般描述和后文的细节描述仅是示例性的,并不能限制本公开。It should be understood that the above general description and the following detailed description are only exemplary and do not limit the present disclosure.
附图说明BRIEF DESCRIPTION
为了更清楚的说明本发明实施例或相关技术的技术方案,下面将对实施例或相关技术描述中所需要使用的附图作简单的介绍,显而易见地,下面描述中的附图仅仅是本发明的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。In order to more clearly explain the technical solutions of the embodiments of the present invention or related technologies, the following will briefly introduce the drawings used in the description of the embodiments or related technologies. Obviously, the drawings in the following description are only the invention For some of the embodiments, for those of ordinary skill in the art, without paying any creative labor, other drawings can also be obtained based on these drawings.
图1为本发明实施例提供的一种视觉显著性区域检测方法的流程示意图;FIG. 1 is a schematic flowchart of a method for detecting a visually significant region according to an embodiment of the present invention;
图2为本发明实施例提供的另一种视觉显著性区域检测方法的流程示意图;2 is a schematic flowchart of another method for detecting a visually significant region according to an embodiment of the present invention;
图3为本公开根据一示例性实施例示出的本申请技术方案与相关技术的ROC曲线对比示意图;FIG. 3 is a schematic diagram comparing ROC curves of the technical solution of the present application and related technologies according to an exemplary embodiment of the present disclosure;
图4为本公开根据另一示例性实施例示出的本申请技术方案与相关技术的ROC曲线对比示意图;4 is a schematic diagram of comparison between ROC curves of the technical solution of the present application and related technologies according to another exemplary embodiment of the present disclosure;
图5为本公开提供的利用本申请技术方案提取的示意性图像中的目标轮廓示意图;5 is a schematic diagram of target contours in a schematic image extracted by using the technical solution of the present application provided by the present disclosure;
图6为本发明实施例提供的视觉显著性区域检测装置的一种具体实施方式结构图;6 is a structural diagram of a specific implementation manner of a visual saliency area detection device provided by an embodiment of the present invention;
图7为本发明实施例提供的视觉显著性区域检测装置的另一种具体实施方式结构图。7 is a structural diagram of another specific implementation manner of a visual saliency area detection device provided by an embodiment of the present invention.
具体实施方式detailed description
为了使本技术领域的人员更好地理解本发明方案,下面结合附图和具体实施方式对本发明作进一步的详细说明。显然,所描述的实施例仅仅是本发明一部分实施例,而不是全部的实施例。基于本发明中的实施例,本领域普通技术人员在没有做出创造性劳动前提下所获得的所有其他实施例,都属于本发明保护的范围。In order to enable those skilled in the art to better understand the solution of the present invention, the present invention will be further described in detail below with reference to the accompanying drawings and specific embodiments. Obviously, the described embodiments are only a part of the embodiments of the present invention, but not all the embodiments. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without making creative efforts fall within the protection scope of the present invention.
本申请的说明书和权利要求书及上述附图中的术语“第一”、“第二”、“第三”“第四”等是用于区别不同的对象,而不是用于描述特定的顺序。此外术语“包括”和“具有”以及他们任何变形,意图在于覆盖不排他的包含。例如包含了一系列步骤或单元的过程、方法、系统、产品或设备没有限定于已列出的步骤或单元,而是可包括没有列出的步骤或单元。The terms “first”, “second”, “third” and “fourth” in the description and claims of this application and the above drawings are used to distinguish different objects, not to describe a specific order . In addition, the terms "including" and "having" and any variations thereof are intended to cover non-exclusive inclusions. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but may include steps or units that are not listed.
在介绍了本发明实施例的技术方案后,下面详细的说明本申请的各种非限制性实施方式。After introducing the technical solutions of the embodiments of the present invention, various non-limiting embodiments of the present application are described in detail below.
首先参见图1,图1为本发明实施例提供的一种视觉显著性区域检测方法的流程示意图,本发明实施例可包括以下内容:Referring first to FIG. 1, FIG. 1 is a schematic flowchart of a method for detecting a visually significant region according to an embodiment of the present invention. The embodiment of the present invention may include the following:
S101:利用随机搜索方法从待处理图像中为当前像素点选取多个候选图像区域块,从各候选图像区域块中选取满足相似度条件的目标图像区域块。S101: Use a random search method to select multiple candidate image area blocks for the current pixel from the image to be processed, and select target image area blocks that satisfy the similarity condition from each candidate image area block.
对待处理图像中的每个像素点,均利用随机搜索方法进行选取多个候选图像区域块(例如2K个候选图像区域块),此处的当前像素点指代在当前时刻利用随机搜索方法从待处理图像中选取候选图像块的像素点。For each pixel in the image to be processed, a random search method is used to select multiple candidate image area blocks (for example, 2K candidate image area blocks). The current pixel here refers to the random search method at the current time. Select pixels of candidate image blocks in the processed image.
相似度条件为从候选图像区域块中选择与像素点的相似度最高的那些图像区域块,相似度条件可以根据实际应用场景(例如候选图像区域块的个数、候选图像区域块与像素点的相似度值)来确定一个相似度阈值,将低于阈值的图像区域块删除。The similarity condition is to select those image area blocks that have the highest similarity to pixels from the candidate image area blocks. The similarity condition can be based on the actual application scenario (such as the number of candidate image area blocks, the candidate image area block and the pixel's Similarity value) to determine a similarity threshold, and delete the image area blocks below the threshold.
以候选图像区域块为2K个,例如K=32,目标图像区域块为K个为例,阐述S101的实现过程。Taking the candidate image area blocks as 2K, for example, K=32, and the target image area blocks as K as an example, the implementation process of S101 is described.
待处理图像的二维坐标映射函数为f(x):R→V,即待处理图像中所有像素点坐标上定义二维坐标映射函数f(x),V代表执行归一化过程之后得到的视觉显著性值。设在待处理图像R上的r像素点,待处理图像R对应在[0,1]上 的视觉显著性值为v,映射函数为f(r)=v。函数f值是归一到[0,1]范围内的视觉显著性值,视觉显著性值可存储在与R相同规模的二维数组里。The two-dimensional coordinate mapping function of the image to be processed is f(x): R→V, that is, the two-dimensional coordinate mapping function f(x) is defined on the coordinates of all pixels in the image to be processed, and V represents the result obtained after performing the normalization process Visual significance value. Set the r pixel point on the image R to be processed, the visual saliency value corresponding to the image R to be processed on [0,1] is v, and the mapping function is f(r)=v. The function f value is the visual saliency value normalized to [0,1], and the visual saliency value can be stored in a two-dimensional array of the same size as R.
设v=f(r)为通过与多个与像素点r位置接近的相关图像区域块来进行视觉显著性值计算,具体方法如公式1所示。Let v=f(r) be the calculation of the visual saliency value through a plurality of related image area blocks close to the position of the pixel point r. The specific method is shown in Equation 1.
r i=r+δβ iR i;(1) r i = r+δβ i R i ; (1)
在公式1中,R i是分布均匀的随机变量,它的范围取值限定在[-1,1]×[-1,1]内;δ是待处理图像尺寸的1/2;β是窗口衰减参数,用于将i=1增加到i=2K个图像区域块,直到查找半径ωR i缩小到单个像素点。若i<2K,则使i=1,一直到测试候选区域块数量满足2K为止,在具体实现部分中β=0.75。 In formula 1, R i is a uniformly distributed random variable, and its range is limited to [-1,1]×[-1,1]; δ is 1/2 of the size of the image to be processed; β is the window The attenuation parameter is used to increase i=1 to i=2K image area blocks until the search radius ωR i is reduced to a single pixel. If i<2K, then make i=1 until the number of test candidate area blocks satisfies 2K, in the specific implementation part β=0.75.
根据公式1在2K候选区域块里进行计算候选区域块与像素点r之间的不相似度量值m i。本申请中只会保留2K候选图像块里1/2不相似性值较小的图像区域块,舍弃剩余1/2图像区域块。按照与像素点r近似的任意位置选择2K个候选图像区域块,这些近似图像区域块是待处理图像R中所有的可能候选区域的部分区域,由于这属于不完全化样本采集过程,因此不完全样本采样会给后续得到的视觉显著图引入一定程度的样本误差。但是,伴随采样数量2K逐渐变大,样本误差肯定会明显降低。 According to Equation 1 dissimilarity measure m i is calculated between the block and the candidate region of pixels in the r 2K block in the candidate region. In this application, only the image area blocks with a small 1/2 dissimilarity value among the 2K candidate image blocks will be retained, and the remaining 1/2 image area blocks will be discarded. Select 2K candidate image area blocks according to any position approximate to the pixel point r. These approximate image area blocks are part of all possible candidate areas in the image R to be processed. Since this is an incomplete sample acquisition process, it is incomplete Sample sampling will introduce a certain degree of sample error to the subsequent visual saliency map. However, as the number of samples 2K gradually becomes larger, the sample error will definitely decrease significantly.
相较相关技术中只能检测单个宽度或高度为256个像素点的粗略视觉显著性图像区域图。因为每个像素点都需要进行视觉显著性计算,因此运算量非常大,而且为了以最快速度寻找最为相似的图像区域块,需要在稀疏网格上构造基于高维向量的查找决策树。因此,如果逐个计算每个像素点的视觉显著性值,整个过程会非常慢。本申请不在整个图像所有像素点中来寻找相似度最高的图像区域块来计算该像素点的视觉显著性值。采用随机搜索方法从图像所有像素点中随机抽取2K个图像区域块,并且保证其中K个图像区域块和r i相似度最高的图像区域块,将其他K个图像区域块丢弃。本申请只需要随机提取2K个图像区域块,而且不要建立辅助的高维向量查找决策树结构来提高查找效率。 Compared with the related art, only a single visually significant image area map with a width or height of 256 pixels can be detected. Because each pixel needs to perform visual saliency calculation, the amount of calculation is very large, and in order to find the most similar image area blocks at the fastest speed, a search decision tree based on high-dimensional vectors needs to be constructed on the sparse grid. Therefore, if the visual significance value of each pixel is calculated one by one, the whole process will be very slow. This application does not search for the image area block with the highest similarity among all pixels of the entire image to calculate the visual significance value of the pixel. Random search method 2K image areas randomly selected from all the blocks of pixels in the image, and the image areas to ensure wherein K r i highest similarity blocks and domain blocks in the image, the other image area K block is discarded. This application only needs to randomly extract 2K image area blocks, and do not build an auxiliary high-dimensional vector search decision tree structure to improve search efficiency.
S102:根据各目标图像块与当前像素点之间的相似度,计算当前像素点的视觉显著性值。S102: Calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel.
dist color(r i,r j)为经过向量化处理的图像区域块r i与r j在HSV颜色空间上的欧氏距离,并且执行归一化操作对应到[0,1]范围内;当dist color(r i,r j)相对于任意图像区域块r j都很大时,则认定像素点i具有视觉显著性。 dist color (r i , r j ) is the Euclidean distance between the image region blocks r i and r j in the HSV color space after vectorization, and the normalization operation corresponds to the range of [0,1]; when When dist color (r i , r j ) is relatively large with respect to any image region block r j , the pixel i is considered to be visually significant.
dist pos(r i,r j)为图像区域块r i与r j所在位置之间的欧氏距离,将此欧氏距离执行过程,归一化对应到[0,1]范围内。可衡量两个对应的图像区域块之间不相似性的度量方法如下述公式所示: dist pos (r i , r j ) is the Euclidean distance between the image area blocks r i and r j , and the process of this Euclidean distance is normalized to correspond to the range [0,1]. The measurement method that can measure the dissimilarity between two corresponding image area blocks is shown in the following formula:
Figure PCTCN2019075206-appb-000015
Figure PCTCN2019075206-appb-000015
鉴于此,在一种具体的实施方式中,可利用公式2计算当前像素点的视觉显著性值S:In view of this, in a specific embodiment, the visual saliency value S of the current pixel can be calculated using Equation 2:
Figure PCTCN2019075206-appb-000016
Figure PCTCN2019075206-appb-000016
其中,
Figure PCTCN2019075206-appb-000017
among them,
Figure PCTCN2019075206-appb-000017
式中,dist(r i,r k)为当前像素点所在位置与第K个目标图像块之间的不相似性度量值,K为目标图像块的总个数,dist color(r i,r k)为经过向量化处理的目标图像区域块与当前像素点所在位置在HSV颜色空间上的欧氏距离,dist pos(r i,r k)为当前像素点所在位置与第K个目标图像块之间的欧式距离。 Where dist(r i , r k ) is the dissimilarity measure between the current pixel position and the Kth target image block, K is the total number of target image blocks, dist color (r i , r k ) is the Euclidean distance in the HSV color space between the target image area block after vectorization and the current pixel position, and dist pos (r i , r k ) is the position of the current pixel point and the Kth target image block European distance between.
S103:基于待处理图像的各像素点的视觉显著性值,生成待处理图像各层次的视觉显著图。S103: Generate a visual saliency map at each level of the image to be processed based on the visual saliency value of each pixel of the image to be processed.
视觉显著性度量处理通过计算各个图像像素点的视觉显著性值,用与原始输入图像大小规模相同的灰度图像来表示视觉显著图。其中,每个像素点的视觉显著性值代表原始图像中相关位置的视觉显著性值。视觉显著性值越大则表示该图像像素点在原始输入图像上越能突出其视觉显著性,同时越容易获得人类观察者的注意。The visual saliency measurement process represents the visual saliency map by calculating the visual saliency value of each image pixel and using the grayscale image with the same size and scale as the original input image. Among them, the visual saliency value of each pixel represents the visual saliency value of the relevant position in the original image. The larger the visual saliency value, the more the pixel of the image can highlight its visual saliency on the original input image, and the easier it is for the attention of human observers.
对待处理图像的每个层次,可按照相同的方法可为每个层次生成对应的视觉显著图。For each level of the image to be processed, the corresponding visual saliency map can be generated for each level according to the same method.
S104:根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值,以用于按照各层视觉显著图的权重值将各层视觉显著图进行融合。S104: Calculate the weight values of the visual saliency maps of each layer according to the visual saliency values of the adjacent two layers of visual saliency maps, and use them to fuse the visual saliency maps of each layer according to the weight values of the visual saliency maps of each layer.
考虑到随机搜索算法会产生大量噪声,即使是采用一定的降噪手段,也无法彻底清除各视觉显著性图中红的噪声信息,为了将多尺度视觉显著性特征进一步融入视觉显著图的最终结果当中,可将得到的多层次的视觉显著性图进行合并,最终产生合并后多层次视觉显著性结果图,例如可采用下述公式进行融合:Considering that the random search algorithm will generate a lot of noise, even with certain noise reduction methods, it is impossible to completely remove the red noise information in each visual saliency map. In order to further integrate the multi-scale visual saliency features into the final result of the visual saliency map Among them, the obtained multi-level visual saliency maps can be merged, and finally the combined multi-level visual saliency result map can be generated, for example, the following formula can be used for fusion:
Figure PCTCN2019075206-appb-000018
Figure PCTCN2019075206-appb-000018
Figure PCTCN2019075206-appb-000019
为第i层上经过合并的多尺度视觉显著图,
Figure PCTCN2019075206-appb-000020
为第i层的多尺度视觉显著图,
Figure PCTCN2019075206-appb-000021
为与第i层最接近的第i-1层的细致化视觉显著性结果图。如果
Figure PCTCN2019075206-appb-000022
Figure PCTCN2019075206-appb-000023
的大小有区别,那么在进行多层次合并时,需要把
Figure PCTCN2019075206-appb-000024
的大小向
Figure PCTCN2019075206-appb-000025
大小尺寸靠近。
Figure PCTCN2019075206-appb-000019
Is the merged multi-scale visual saliency map on layer i,
Figure PCTCN2019075206-appb-000020
Is the multi-scale visual saliency map of layer i,
Figure PCTCN2019075206-appb-000021
It is the detailed visual saliency result graph of the i-1th layer closest to the ith layer. in case
Figure PCTCN2019075206-appb-000022
versus
Figure PCTCN2019075206-appb-000023
There is a difference in the size, so when carrying out multi-level merge, you need to put
Figure PCTCN2019075206-appb-000024
The size of
Figure PCTCN2019075206-appb-000025
The size is close.
在本发明实施例提供的技术方案中,在计算待处理图像每个像素点的视觉显著性值时,先利用随机搜索方法从待处理图像中选择多个候选图像区域块,再从多个候选区域图像块中选取与当前像素点相似度最高的图像区域块,不需要从待处理图像的每个像素点进行寻找,缩短了相似度最高图像区域块的查找时间,不受限图像像素点的规模,从而解决了相关技术中需要从待处理图像中所有像素点中寻找相似度最高的图像区域块存在的弊端,大大的提高了像素点视觉显著性值计算效率,从而有效地提升了待处理图像中的视觉显著性区域的检测效率,适用于任何规模下待处理图像的像素点视觉显著性值计算场景中;此外,通过合并多层次视觉显著图,不仅有利于消除视觉显著图中的噪声信号,还可将待处理图像中的显著特征准确融合至生成视觉显著特征图中,有利于提升待处理图像中的视觉显著性区域的检测准确度和精度。In the technical solution provided by the embodiment of the present invention, when calculating the visual saliency value of each pixel of the image to be processed, a random search method is used to select multiple candidate image area blocks from the image to be processed, and then multiple candidates Select the image area block with the highest similarity to the current pixel in the area image block, there is no need to search from each pixel of the image to be processed, which shortens the search time of the image area block with the highest similarity, and does not limit the image pixel Scale, which solves the disadvantages of the related art that it is necessary to find the image region block with the highest similarity from all pixels in the image to be processed, greatly improving the calculation efficiency of the pixel visual significance value, thereby effectively improving the processing The detection efficiency of the visual saliency area in the image is suitable for the calculation of the visual saliency value of the pixels of the image to be processed at any scale; in addition, by combining multiple levels of visual saliency maps, it is not only beneficial to remove the noise in the visual saliency map The signal can also accurately integrate the salient features in the image to be processed into a visual salient feature map, which is helpful to improve the detection accuracy and precision of the visually significant regions in the image to be processed.
针对视觉显著性结果图的细致化程度要求不高,但如果要求达到较高检测速度的应用场景,利用上述实施例的S101-S104可快速生成粗略化视觉显著图。当需要细致化程度和原始输入图像完全匹配相同的视觉显著图的应用场景,还可对上述实施例得到的视觉显著性图进行增强合并来生成高质量的视觉显著性结果图。如果某个位置处出现越高的视觉显著性值,表示该位置出具有越高的图像可信度,所以可通过有选择地对于那些噪声信号高的视觉显著性区域进行合并,从而获得最终的视觉显著性结果图。The degree of refinement required for the visual saliency result graph is not high, but if an application scenario that requires a higher detection speed is required, using S101-S104 of the above embodiment can quickly generate a rough visual saliency map. When it is required that the degree of refinement and the original input image completely match the application scene of the same visual saliency map, the visual saliency map obtained in the foregoing embodiment may also be enhanced and merged to generate a high-quality visual saliency result map. If a higher visual saliency value appears at a location, it indicates that the location has a higher image reliability, so you can selectively merge those visual saliency areas with high noise signals to obtain the final Graph of visual significance results.
在一种具体的实施方式中,S103中根据视觉显著性值生成视觉显著性图的过程的实现方式可如下所述:In a specific embodiment, the implementation of the process of generating a visual saliency map according to the visual saliency value in S103 may be as follows:
待处理图像的区域显著性数值依赖自身特征与周围环境的区别。如果图像中某个区域是显著区域,那么该显著区域与周围区域存在一种或多种区别特征。在不同图像中,相同特征对于视觉显著性产生的影响也存在区别。亮度、颜色都可以用作视觉显著性特征,因此在图像预处理过程中需要抽取不同图像视觉特征。例如可采用提取亮度、方向、颜色等视觉特征进行度量。本申请经过多次实验发现,方向视觉特征在图像中所起作用不明显,同时还增加了算法时间复杂度,鉴于此,本申请只提取待处理图像的颜色和亮度特征。The area saliency value of the image to be processed depends on the difference between its own characteristics and the surrounding environment. If a certain area in the image is a salient area, there are one or more distinguishing features between the salient area and the surrounding area. In different images, the influence of the same feature on visual saliency is also different. Both brightness and color can be used as visual saliency features, so different image visual features need to be extracted during image preprocessing. For example, visual features such as brightness, direction, and color can be extracted for measurement. After many experiments in this application, it has been found that the directional visual features do not play a significant role in the image, and at the same time increase the algorithm time complexity. In view of this, this application only extracts the color and brightness features of the image to be processed.
HSV颜色空间应用色调H(Hue)、饱和度S(Saturation)、明度V(Value)三个参数来描述具体颜色,HSV比RGB更加符合人体视觉特征。可利用公式3将待处理图像从RGB空间映射转换成HSV空间,以便提取颜色和亮度特征。The HSV color space uses hue (Hue), saturation S (Saturation), and brightness V (Value) to describe specific colors. HSV is more in line with human visual characteristics than RGB. Equation 3 can be used to convert the image to be processed from RGB space to HSV space in order to extract color and brightness features.
Figure PCTCN2019075206-appb-000026
Figure PCTCN2019075206-appb-000026
式中,h为HSV颜色空间中H通道值,l为H通道值的算术平均值,s为HSV颜色空间中S通道值,v为HSV颜色空间中V通道值,g为RGB三原色R通道值,b为为RGB三原色B通道值,r为RGB三原色R通道值,max为各自对应通道值的最大值,min为各自对应通道值的最小值。Where h is the H channel value in the HSV color space, l is the arithmetic mean of the H channel value, s is the S channel value in the HSV color space, v is the V channel value in the HSV color space, and g is the R channel value of the RGB primary colors , B is the B channel value of the RGB primary colors, r is the R channel value of the RGB primary colors, max is the maximum value of each corresponding channel value, and min is the minimum value of each corresponding channel value.
为了提升视觉显著性区域检测的精度,在提取待处理图像的显著性特征时,可将局部系统显著性、全局系统显著性和稀缺显著性三种属性方面来构成系统显著图。In order to improve the accuracy of visual saliency area detection, when extracting the saliency features of the image to be processed, three attributes of local system saliency, global system saliency, and scarcity saliency can be used to form a system saliency map.
待处理图像上各个像素点视觉显著性不仅仅依赖该像素点特征值,而是需要依据该像素点与其周围像素点的区别程度。区别程度越大,则说明该像素点越具有视觉显著性。相关技术通过图像各个像素点与周围区域之间的区别程度来计算其视觉显著性,但是区域大小难以确定,同时算法计算量非常庞大。为了解决相关技术中的这些问题,本申请可从频域来分析计算图像像素点的局部系统显著性,幅度谱和相位谱在图像频域特征中的作用存在不同之处:幅度谱包含各个像素点特征值的具体大小;相位谱包含有图像结构特征,它能够反映图像像素点特征值的变化情况。The visual saliency of each pixel on the image to be processed does not only depend on the characteristic value of the pixel, but also needs to be based on the difference between the pixel and its surrounding pixels. The greater the degree of difference, the more visually significant the pixel is. Related technologies calculate the visual saliency through the degree of difference between each pixel of the image and the surrounding area, but the size of the area is difficult to determine, and the amount of algorithm calculation is very large. In order to solve these problems in the related art, the present application can analyze and calculate the local system saliency of image pixels from the frequency domain. The amplitude spectrum and the phase spectrum have different functions in the image frequency domain features: the amplitude spectrum contains each pixel The specific size of the point feature value; the phase spectrum contains the image structure feature, which can reflect the change of the image pixel feature value.
本申请发明人经过研究发现,相位谱和幅度谱在图像构建过程中所起作用不同。单纯应用相位谱来针对结果图像进行构造,可得到与原始图像具有近似结构的结果;而仅仅应用幅度谱来针对结果图像进行构建,所得结果图像和原始图像的区别很大。The inventor of the present application discovered through research that the phase spectrum and the amplitude spectrum play different roles in the image construction process. Simply using the phase spectrum to construct the result image can obtain a result with a similar structure to the original image; but only using the amplitude spectrum to construct the result image, the resulting image is very different from the original image.
因此,可使用公式3来针对各个特征图像预先进行局部系统显著性计算。首先针对输入图像执行快速傅里叶变换过程(Fast Fourier Transform,FFT)来抽取相位谱和幅度谱;然后应用相位谱针对图像进行构造,从而得到各个特征图像的局部系统显著图。Therefore, Equation 3 can be used to perform local system saliency calculation for each feature image in advance. Firstly, the Fast Fourier Transform (FFT) is performed on the input image to extract the phase spectrum and the amplitude spectrum; then the phase spectrum is applied to the image to construct the local system saliency map of each feature image.
Figure PCTCN2019075206-appb-000027
Figure PCTCN2019075206-appb-000027
式中,x、y为像素点的横纵坐标值,FFT(u,v)为像素点特征值,|FFT(u,v)e jψ(u,v)|为经过快速傅里叶变换以后所得图像幅度谱,ψ(u,v)为待处理图像相位谱,V Local是图像里各个图像像素点的局部系统显著性值。 In the formula, x and y are the horizontal and vertical coordinates of the pixel, FFT(u,v) is the characteristic value of the pixel, |FFT(u,v)e jψ(u,v) | is after the fast Fourier transform The resulting image amplitude spectrum, ψ(u,v) is the phase spectrum of the image to be processed, and V Local is the local system saliency value of each image pixel in the image.
如果仅仅考虑局部系统显著性值,将会导致图像中变化剧烈的边缘或复杂背景区域的全局系统显著性较高,同时平滑区域和目标内部的局部系统显著性则较低。为了解决上述问题,可引入全局系统显著性。If only the local system saliency value is considered, the global system saliency of the sharply changing edge or complex background area in the image will be higher, while the local system saliency of the smooth area and the target is lower. To solve the above problems, global system saliency can be introduced.
像素点的全局系统显著性是用来度量该图像像素点在整幅图像上的视觉显著性程度,可应用公式5来产生待处理图像的全局系统显著性。公式5中的V Global(x,y)是待处理图像中各个像素点的全局系统显著性值。 The global system saliency of pixels is used to measure the degree of visual saliency of the image pixels on the entire image. Formula 5 can be applied to generate the global system saliency of the image to be processed. V Global (x, y) in Equation 5 is the global system saliency value of each pixel in the image to be processed.
Figure PCTCN2019075206-appb-000028
Figure PCTCN2019075206-appb-000028
式中,x、y为像素点的横纵坐标值,f(x,y)为求解像素点(x,y)的显著性函数,f average(x,y)为f(x,y)的算术平均值,待处理图像大小为M*N。 In the formula, x and y are the horizontal and vertical coordinate values of the pixel, f(x, y) is the significance function for solving the pixel (x, y), and f average (x, y) is f(x, y). The arithmetic average, the size of the image to be processed is M*N.
稀缺显著性表示某种特征值在整幅图像上出现的机会概率非常低,则具有该特征值的图像像素点“与众不同”,则图像像素点的视觉显著性值就会越高。可应用公式6来衡量输入图像中各个像素点的稀缺显著性特征。公式6的h(f average(x,y))为待处理图像产生的特征直方图,V Scarcity(x,y)为像素点的稀缺显著性值: The scarcity saliency means that the probability of a certain feature value appearing on the entire image is very low, then the image pixels with this feature value are "unusual", the higher the visual significance value of the image pixel. Equation 6 can be applied to measure the scarcity saliency characteristics of each pixel in the input image. H(f average (x, y)) in Equation 6 is the feature histogram generated by the image to be processed, and V Scarcity (x, y) is the scarcity significance value of the pixel:
Figure PCTCN2019075206-appb-000029
Figure PCTCN2019075206-appb-000029
计算得到待处理图像每个像素点的局部系统显著性值、全局系统显著性值和稀缺显著性值后,相对应的,可生成局部系统显著图、全局系统显著性图和稀缺显著性图。而最终得到的视觉显著性图为将这三个视觉显著性图进行融合,可选的,可利用公式7针对计算某个特征图(例如颜色特征)的局部系统显著性、全局系统显著性、稀缺显著性的度量结果进行融合,从而得到最终系统显著图。After calculating the local system saliency value, global system saliency value and scarcity saliency value of each pixel of the image to be processed, correspondingly, the local system saliency map, global system saliency map and scarcity saliency map can be generated. The final visual saliency map is the fusion of the three visual saliency maps. Optionally, formula 7 can be used to calculate the local system saliency, global system saliency, and The measurement results of scarcity saliency are fused to obtain the final system saliency map.
Figure PCTCN2019075206-appb-000030
Figure PCTCN2019075206-appb-000030
式中,V final为待处理图像的各像素点的视觉显著性值,V Local(x,y)、V Global(x,y)、V Scarcity(x,y)(V Local、V Global、V Scarcity)依次为横纵坐标值为x、y的像素点的局部系统显著性值、全局系统显著性值及稀缺显著性值;v 1为局部系统显著性值的权重值,v 2为全局系统显著性值的权重值,v 3为稀缺显著性值的权重值;i=1,V i=V 1为局部系统显著性值,i=2,V i=V 2为全部系统显著性值,i=3,V i=V 3为稀缺显著性值,V 1、V 2、V 3为根据
Figure PCTCN2019075206-appb-000031
计算所得。
In the formula, V final is the visual saliency value of each pixel of the image to be processed, V Local (x, y), V Global (x, y), V Scarcity (x, y) (V Local , V Global , V Scarcity ) is the local system saliency value, global system saliency value, and scarcity saliency value of pixels with x and y horizontal and vertical coordinates in sequence; v 1 is the weight value of the local system saliency value, and v 2 is the global system The weight value of the saliency value, v 3 is the weight value of the scarce saliency value; i=1, V i = V 1 is the local system saliency value, i=2, V i = V 2 is the saliency value of all systems, i = 3, V i = V 3 is the scarcity significance value, V 1 , V 2 , V 3 are based on
Figure PCTCN2019075206-appb-000031
Calculated.
不同的特征显著图对图像分割的贡献存在不同。有些视觉显著图能够有效表达视觉显著区域,而有些视觉显著图则不能达到理想效果。因此,需要合理的特征融合策略对于所得多个特征显著图进行融合并最终生成系统视觉显著图。从视觉显著性点数量、位置、分布情况来进行动态选择特征以及加权处理过程。Different feature saliency maps have different contributions to image segmentation. Some visual saliency maps can effectively express visual saliency areas, while some visual saliency maps cannot achieve the desired effect. Therefore, a reasonable feature fusion strategy is needed to fuse multiple obtained feature saliency maps and finally generate a system visual saliency map. From the number, location, and distribution of visual salience points, dynamic feature selection and weighting processing are performed.
首先需要针对特征显著图执行图像阈值计算,抽取视觉显著值比阈值大的像素点作为显著点。根据稀缺显著性准则,特征显著图上的显著性点数量越多,则此特征显著图对系统显著图的贡献程度就越小。定义权值W region是显著性点数量,如公式8所示。 First, it is necessary to perform image threshold calculation for the feature saliency map, and extract pixel points whose visual saliency value is greater than the threshold value as salient points. According to the scarcity saliency criterion, the greater the number of saliency points on the feature saliency map, the smaller the contribution of this feature saliency map to the system saliency map. The weight W region is defined as the number of salient points, as shown in Equation 8.
W region=N saliency;(8) W region = N saliency ; (8)
人类针对图像的注意区域很容易集中在图像中心区域部分。图像中心区域像素点以及周围区域较容易成为显著性区域。因此,可定义权值W pixel为各个显著像素点与图像中心像素点的欧氏距离算术平均值,具体方法如公式9所示。 The attention area of the human for the image is easily concentrated in the central area of the image. The pixels in the central area of the image and the surrounding area are more likely to be prominent areas. Therefore, the weight value W pixel can be defined as the arithmetic average value of the Euclidean distance between each significant pixel point and the pixel point in the center of the image. The specific method is shown in Equation 9.
Figure PCTCN2019075206-appb-000032
Figure PCTCN2019075206-appb-000032
公式9中的N是系统特征显著图中显著点数量,sp i是第i个系统显著点,center是图像中心像素点具体位置。 N in formula 9 is the number of salient points in the system feature saliency map, sp i is the ith salient point in the system, and center is the specific position of the pixel in the center of the image.
如果系统特征显著图上的各个视觉显著点位置分散于显著图的各个区域,则认定本系统特征显著图对系统最终显著图的贡献程度不够。因此,可定义W effect权值是各个系统显著点之间的欧氏距离算术平均值,center id为视觉显著点位置中心,具体如公式10所示。 If the positions of the visual saliency points on the saliency map of the system are scattered in the areas of the saliency map, the system's saliency map is deemed to have insufficient contribution to the final saliency map of the system. Therefore, the W effect weight can be defined as the arithmetic average of the Euclidean distance between the salient points of each system, and the center id is the center of the visual salient point position, as shown in Equation 10.
Figure PCTCN2019075206-appb-000033
Figure PCTCN2019075206-appb-000033
根据系统视觉显著点分布位置和数量情况,可应用公式11计算各个系统特征显著图的相关权值,对各个系统特征显著图进行融合而得最终系统显著图。According to the distribution position and number of visual saliency points of the system, formula 11 can be used to calculate the correlation weight of each system feature saliency map, and the system saliency maps are fused to obtain the final system saliency map.
Figure PCTCN2019075206-appb-000034
Figure PCTCN2019075206-appb-000034
在公式11中,V fi是第i个系统特征显著图,
Figure PCTCN2019075206-appb-000035
是第i个系统特征显著图所属区域权值、像素点权值、效果权值,W fi是第i个系统特征显著图上的三种权值的融合,W i是第i个系统特征显著图所属最终权值,V是最终系统显著图。
In formula 11, V fi is the saliency map of the i-th system,
Figure PCTCN2019075206-appb-000035
Is the weight of the area, pixel weight, and effect weight of the i-th system feature saliency map. W fi is the fusion of the three weights on the i-th system feature saliency map. W i is the i-th system feature saliency. The final weight to which the graph belongs, V is the saliency graph of the final system.
由上可知,本发明实施例通过局部系统显著性、全局系统显著性、稀缺显著性来进行具体视觉显著性阐述描述,将更有利于进行快速图像分割过程。通过进行视觉显著性机制完善,不断改进和完善应用视觉显著性机制进行图像分割过程,以便达到更好的图像分割效果。It can be seen from the above that the embodiments of the present invention use specific system saliency, global system saliency, and scarcity saliency to describe and describe specific visual saliency, which will be more conducive to the rapid image segmentation process. Through the improvement of the visual saliency mechanism, we constantly improve and perfect the application of the visual saliency mechanism for the image segmentation process in order to achieve a better image segmentation effect.
针对S104,在另外一种实施方式中,可使用加权视觉显著性结果图的合并方法与平均视觉显著性结果图的合并方法来进行视觉显著性图的合并操作。经过本申请发明人的多次试验证明,加权视觉显著性结果图的合并方法比平均视觉显著性结果图合并方法效果更好。在第i层上的p坐标位置,视觉显著性值是
Figure PCTCN2019075206-appb-000036
其中
Figure PCTCN2019075206-appb-000037
进行归一化到[0,1]区域范围内。为计算第i层上的合并视觉显著性值,第i-1层上的视觉显著性结果图会被调整成和第i层上视觉显著性结果图相同大小。使用视觉显著性值
Figure PCTCN2019075206-appb-000038
来表示在图像序列上操作后的视觉显著性结果图
Figure PCTCN2019075206-appb-000039
上p坐标位置的视觉显著性值。相邻两层的细致化视觉显著值进行合并之后的结果如公式12、公式13和公式14所示:
For S104, in another embodiment, the merging method of the weighted visual saliency result graph and the merging method of the average visual saliency result graph may be used to perform the merging operation of the visual saliency graph. After several experiments by the inventor of the present application, it is proved that the method of combining the weighted visual saliency result graphs is better than the average visual saliency result graph merging method. At the p-coordinate position on the i-th layer, the visual significance value is
Figure PCTCN2019075206-appb-000036
among them
Figure PCTCN2019075206-appb-000037
Normalize to the range of [0,1]. To calculate the combined visual saliency value on layer i, the visual saliency result graph on layer i-1 will be adjusted to the same size as the visual saliency result graph on layer i. Use visual significance
Figure PCTCN2019075206-appb-000038
To represent the visual saliency result graph after operating on the image sequence
Figure PCTCN2019075206-appb-000039
The visual significance value at the position of the upper p coordinate. The results of merging the refined visual saliency values of two adjacent layers are shown in Equation 12, Equation 13 and Equation 14:
Figure PCTCN2019075206-appb-000040
Figure PCTCN2019075206-appb-000040
Figure PCTCN2019075206-appb-000041
Figure PCTCN2019075206-appb-000041
Figure PCTCN2019075206-appb-000042
Figure PCTCN2019075206-appb-000042
式中,p为像素点位置,
Figure PCTCN2019075206-appb-000043
为第i层的权重值,
Figure PCTCN2019075206-appb-000044
为第i-1层的权重值,
Figure PCTCN2019075206-appb-000045
为第i层的视觉化显著性值,
Figure PCTCN2019075206-appb-000046
为第i-1层的视觉显著性值;
Figure PCTCN2019075206-appb-000047
为融合视觉显著图p位置的像素点视觉显著性值。
In the formula, p is the pixel position,
Figure PCTCN2019075206-appb-000043
Is the weight value of layer i,
Figure PCTCN2019075206-appb-000044
Is the weight value of layer i-1,
Figure PCTCN2019075206-appb-000045
Is the visual significance value of layer i,
Figure PCTCN2019075206-appb-000046
Is the visual significance value of layer i-1;
Figure PCTCN2019075206-appb-000047
To fuse the visual saliency value of the pixels at the position of the visual saliency map
根据公式13和公式14可知,在第i层细致化视觉显著性值
Figure PCTCN2019075206-appb-000048
和第i-1层细致化视觉显著性值
Figure PCTCN2019075206-appb-000049
已经确定的情况下,就可计算
Figure PCTCN2019075206-appb-000050
Figure PCTCN2019075206-appb-000051
Figure PCTCN2019075206-appb-000052
时,
Figure PCTCN2019075206-appb-000053
可根据式公式12可计算合并后的细致化视觉显著性值。
According to Equation 13 and Equation 14, the visual saliency value is refined in the i-th layer
Figure PCTCN2019075206-appb-000048
And the i-1 layer refined visual saliency value
Figure PCTCN2019075206-appb-000049
It can be calculated if it has been determined
Figure PCTCN2019075206-appb-000050
with
Figure PCTCN2019075206-appb-000051
when
Figure PCTCN2019075206-appb-000052
Time,
Figure PCTCN2019075206-appb-000053
The refined visual saliency value after merging can be calculated according to formula 12.
经过本申请发明人利用上述实施例的方法进行实验的结果可知,经过合并操作后的视觉显著性图能够将两个层次的各自视觉显著性多尺度特征进行融合,经过融合之后得到的视觉显著性结果图更为平滑和清晰。According to the results of experiments conducted by the inventors of the present application using the method of the above embodiment, the visual saliency map after the merge operation can fuse the two levels of visual saliency multi-scale features, and the visual saliency obtained after the fusion The resulting graph is smoother and clearer.
在使用随机搜索检测方法选择候选图像区域块过程中,由于样本采集数量不够以及样本误差等随机原因将导致生成的视觉显著性图中包含大量噪声的随机存在。随机噪声信号产生的原因是因为随机选择2K图像区域块执行公式1来进行计算而产生图像噪声。为了提升视觉显著性图的准确度,在S103之后,进行去噪处理。In the process of selecting candidate image region blocks using the random search detection method, due to insufficient sample collection and random reasons such as sample error, the generated visual saliency map will contain a lot of random noise. The reason why the random noise signal is generated is that image noise is generated because the 2K image area block is randomly selected to perform Equation 1 for calculation. In order to improve the accuracy of the visual saliency map, after S103, a denoising process is performed.
在一种实施方式中,可使用八邻域视觉显著性值可针对粗略化的视觉显著图(S103生成的视觉显著图)执行细致化过程。八邻域坐标方法可从像素点r对应的八个方向进行邻近像素点选择,通过八邻域坐标法获取得到的候选图像区域块与随机选择候选图像区域块方法有很大区别。因为p坐标处邻近区域图像相似性高,所以八邻域坐标法可能会使得p坐标处视觉显著性值能够比实际图像视觉显著性值偏小;如果根据对于此方法获得的视觉显著性值进行归一化处理,则会导致前述所得粗略化的视觉显著性图上产生相应噪声。但是邻近坐标的视觉显著性比如相似度高且相邻,因此需要针对与邻域视觉显著性值差异较大像素点坐标位置进行细致化操作。考虑到视觉显著性的可信度因素,与八邻域间区别较大的视觉显著性值的可信度较低,故可选择细致化与八邻域间区别较大的视觉显著性值。In one embodiment, the eight-neighbor visual saliency value may be used to perform a refinement process on the roughened visual saliency map (the visual saliency map generated by S103). The eight-neighbor coordinate method can select neighboring pixels from the eight directions corresponding to the pixel r. The candidate image area block obtained by the eight-neighbor coordinate method is very different from the method of randomly selecting candidate image area blocks. Because the image similarity of the neighboring area at the p-coordinate is high, the eight-neighbor coordinate method may make the visual saliency value at the p-coordinate smaller than the actual image visual saliency value; if it is done according to the visual saliency value obtained for this method Normalization will cause corresponding noise on the rough visual saliency map obtained above. However, the visual saliency of neighboring coordinates is high and similar, for example. Therefore, it is necessary to perform refinement operations on the coordinate positions of pixel points that differ greatly from the visual saliency value of the neighboring coordinates. Considering the credibility factor of visual saliency, the visual saliency value with a large difference from the eight neighbors has a low credibility, so it is possible to choose to refine the visual saliency value with a large difference from the eight neighbors.
此外,本申请还提供了另外一个实施例,请参阅图2,图2为本发明实施例提供的另一种视觉显著性区域检测方法的流程示意图,本发明实施例例如可应用于图像分割方法中,具体的可包括以下内容:In addition, this application also provides another embodiment. Please refer to FIG. 2. FIG. 2 is a schematic flowchart of another method for detecting a visually significant region according to an embodiment of the present invention. The embodiment of the present invention may be applied to an image segmentation method, for example. Among them, the specific content may include the following:
S201:利用随机搜索方法从待处理图像中为当前像素点选取多个候选图像区域块。S201: Use a random search method to select multiple candidate image area blocks for the current pixel from the image to be processed.
S202:分别计算当前像素点和各候选图像区域块的相似度量值。S202: Calculate the similarity metric values of the current pixel and each candidate image area block separately.
S203:删除相似度量值低于相似度阈值对应的候选图像区域块,将剩余的候选图像区域块作为目标图像区域块。S203: Delete the candidate image area blocks corresponding to the similarity metric value lower than the similarity threshold, and use the remaining candidate image area blocks as the target image area blocks.
S204:根据各目标图像块与当前像素点之间的相似度,计算当前像素点的视觉显著性值。S204: Calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel.
S205:基于待处理图像的各像素点的视觉显著性值,生成待处理图像各层次的粗略化视觉显著图。S205: Based on the visual saliency value of each pixel of the image to be processed, generate rough visual saliency maps of each level of the image to be processed.
S206:对各粗略化视觉显著图进行细致化操作,以去除图像噪声信号,得到各自相应的细致化视觉显著图。S206: Perform a refinement operation on each roughened visual saliency map to remove image noise signals, and obtain respective corresponding detailed visual saliency maps.
S207:根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值。S207: Calculate the weight value of the visual saliency map of each layer according to the visual saliency value of the visual saliency maps of two adjacent layers.
S208:按照各层视觉显著图的权重值将各层视觉显著图进行融合,得到初始视觉显著图。S208: Integrate the visual saliency maps of each layer according to the weight value of the visual saliency maps of each layer to obtain an initial visual saliency map.
S209:在初始视觉显著图中选择多个预设位置点,对每个预设位置点,根据各个像素点的视觉显著性值,将其分为强化像素点集和弱化像素点集。S209: Select multiple preset position points in the initial visual saliency map, and for each preset position point, divide it into an enhanced pixel point set and a weakened pixel point set according to the visual significance value of each pixel point.
位置点的个数可根据待处理图像的大小、视觉显著图的实际情况进行选择,本申请对此不做任何限定。The number of location points can be selected according to the size of the image to be processed and the actual situation of the visual saliency map, which is not limited in this application.
强化像素点集中的像素点的视觉显著性值大于弱化像素点集中的像素点的视觉显著性值。The visual saliency value of the pixels in the enhanced pixel set is greater than the visual saliency value of the pixels in the weakened pixel set.
本领域技术人员可根据自身经验将视觉显著性值高的划分至强化像素点集,将低视觉显著性值对应的像素点划分至弱化像素点集。或者也可根据各位置出的像素点的视觉显著性值和总个数,设置一个视觉显著性划分阈值,将高于该阈值的像素点划分至强化像素点集,将低于该阈值的像素点划分至弱化像素点集。A person skilled in the art may divide the high visual saliency value into the enhanced pixel point set according to his own experience, and divide the pixel point corresponding to the low visual saliency value into the weakened pixel point set. Or you can set a visual saliency division threshold based on the visual saliency value and total number of pixels from each position, divide the pixels above the threshold into the enhanced pixel set, and divide the pixels below the threshold The points are divided into a set of weakened pixels.
S210:利用图像增强方法对强化像素点集中的各像素点进行强化处理。S210: Use the image enhancement method to perform enhancement processing on each pixel in the enhanced pixel set.
可采用任何一种可实现增强像素点特征的图像增强方法,本领域技术人员可根据实际需求和实际应用场景进行选择,本申请对此不做任何限定。Any image enhancement method that can enhance the feature of the pixel point may be used, and those skilled in the art may choose according to actual needs and actual application scenarios, which is not limited in this application.
至于如何实现图像增强的过程,可参阅相关技术记载的实现方案,此处,便不再赘述。As for how to implement the process of image enhancement, please refer to the implementation scheme recorded in the related art, which will not be repeated here.
S211:利用图像弱化方法对弱化像素点集中的各像素点进行弱化处理。S211: Use the image weakening method to weaken each pixel in the weakened pixel set.
可采用任何一种可实现弱化像素点特征的图像增强方法,本领域技术人员可根据实际需求和实际应用场景进行选择,本申请对此不做任何限定。Any image enhancement method that can weaken the characteristics of pixel points can be used, and those skilled in the art can select according to actual needs and actual application scenarios, which is not limited in this application.
至于如何实现图像弱化的过程,可参阅相关技术记载的实现方案,此处,便不再赘述。As for the process of how to weaken the image, please refer to the implementation scheme recorded in the related technology, and it will not be repeated here.
本发明实施例中与上述实施例中相同的步骤或方法,可参阅上述实施例描述的实现过程,此处,不再赘述。For the steps or methods in the embodiments of the present invention that are the same as those in the foregoing embodiments, refer to the implementation process described in the foregoing embodiments, and details are not described here.
由上可知,本发明实施例发明分4个阶段实现视觉显著性区域的检测,第1个阶段为随机检测阶段,根据待处理图像进行随机搜索和检测处理,应用原始图像各层次信息得到各层次的粗略化视觉显著图;第2个阶段为细致化阶段,利用各层次上的粗略化视觉显著图进行细致化操作,去掉在粗略化视觉显著图上因为随机搜索和检测处理而产生的图像噪声;第3个阶段为区域图的多层次合并阶段,通过对多层视觉显著图进行细致化合并最终产生合并后的结合局部特征和全局特征的视觉显著图;第4个阶段为更新视觉显著值阶段,将有选择地对于那些噪声信号高的视觉显著性区域进行合并,从而获得最终的高质量细致化视觉显著性结果图。实现了快速、实时产生和原始输入图像(待处理图像)大小相同的细致化视觉显著性结果图,从而提高视频图像序列中目标分割结果的总体质量。It can be seen from the above that the invention of the embodiments of the present invention implements the detection of visually significant regions in four stages. The first stage is a random detection stage, which performs random search and detection processing according to the image to be processed, and uses the information of each layer of the original image to obtain each layer Roughened visual saliency map; the second stage is the refinement stage, which uses the roughened visual saliency maps at all levels to perform the refinement operation to remove image noise generated by random search and detection processing on the roughened visual saliency map ; The third stage is the multi-level merge stage of the area map. Through the detailed merge of the multi-layer visual saliency map, the final merged visual saliency map combining local features and global features; the fourth stage is to update the visual saliency value. At this stage, the visual saliency areas with high noise signals will be selectively combined to obtain the final high-quality detailed visual saliency result map. It achieves fast and real-time generation of detailed visual saliency result maps with the same size as the original input image (to-be-processed image), thereby improving the overall quality of the target segmentation results in the video image sequence.
为了证实本申请提供的技术方案(概括称为多尺度化视觉显著性检测算法)能够快速产生和原始输入图像大小相同的细致化视觉显著性结果图,可以应用于对于实时性要求较高的基于图像序列的视觉显著性结果图,从而提高视频图像序列中目标分割结果的总体质量。本申请使用Matlab仿真环境进行了一系列实验进行验证。In order to confirm that the technical solution provided by this application (generally referred to as multi-scale visual saliency detection algorithm) can quickly generate a detailed visual saliency result map of the same size as the original input image, it can be applied to the The visual saliency result graph of the image sequence, thereby improving the overall quality of the target segmentation results in the video image sequence. This application uses Matlab simulation environment to carry out a series of experiments to verify.
在Microsoft Windows10操作系统环境下,实现本多尺度化视觉显著性检测算法。采用视觉显著性检测算法,依据原始输入图像生成对应视觉显著性结果图。得到大量视觉显著性结果图表明多尺度化视觉显著性检测方法较之现有主流视觉显著性度量方法可在原始输入图像上得到效果优良的视觉显著图,本申请与8种相关技术,例如AIM(基于信息论的显著性检测算法)、GBVS(基于图论的显著性检测算法)、SR(基于光谱剩余量的显著性检测算法)、IS(基于稀疏显著区域的显著性标注算法)、ICL(动态视觉注意检测算法)、ITTI(最经典模型算法)、RC(基于显著区域检测的全局对比算法)、SUN(应用自然统计的显著性贝叶斯框架模型)进行了对比比较。Under the Microsoft Windows 10 operating system environment, this multi-scale visual saliency detection algorithm is implemented. A visual saliency detection algorithm is used to generate a corresponding visual saliency result map based on the original input image. A large number of visual saliency results are obtained, which show that the multi-scale visual saliency detection method can obtain visual saliency images with good effects on the original input image compared with the existing mainstream visual saliency measurement methods. This application is related to 8 related technologies, such as AIM (Saliency detection algorithm based on information theory), GBVS (saliency detection algorithm based on graph theory), SR (saliency detection algorithm based on spectral remainder), IS (saliency labeling algorithm based on sparse salient regions), ICL ( Dynamic visual attention detection algorithm), ITTI (the most classic model algorithm), RC (global comparison algorithm based on salient region detection), SUN (significant Bayesian frame model using natural statistics) were compared and compared.
从Bruce数据库的120幅原始图像中选择了6幅测试图像用来比较本发明方法与八种相关技术算法。从效果准确度来讲,本多尺度化方法的执行准确度最高。Six test images were selected from 120 original images of Bruce database to compare the method of the present invention with eight related technical algorithms. In terms of effect accuracy, this multi-scale method has the highest execution accuracy.
在实验结果中可知,针对手持卡片的人脸图像,人脸与手持卡片是人眼关注重点,这个情况只有本申请才能较好检测到范围。针对人骑车图像中,骑车的人是显著性区域,RC和SUN方法都能较好地检测到图像中的显著性区域,本文方法也检测到了对应的显著性区域。针对松树图像,图像中松针较稀疏的部分隐约形成的“空洞”是人眼最关注的显著区域,此图像的视觉显著性检测难度较大,八种相关技术方法均没有检测出范围,绝大多数方法都能把图像左下角对比度较强的白色高亮区域误认成视觉显著区域,而本申请则基本检测出了空洞大概区域。从视觉显著区域和视觉非显著区域的对比度上,本申请也是最优。针对一幅包含人像的景色图片,虽然IS和ICL都能检测到图像里的人物,但本申请对比度最高。It can be seen from the experimental results that, for the face image of the hand-held card, the face and the hand-held card are the focus of attention of the human eye. This situation can only be better detected by this application. In the image of people riding bicycles, the cyclist is a saliency region. Both the RC and SUN methods can detect the saliency region in the image well. The method in this paper also detects the corresponding saliency region. For the pine tree image, the vacancy formed by the sparse part of the pine needle in the image is the most noticeable area of the human eye. The visual saliency of this image is difficult to detect. None of the eight related technical methods detect the range, which is extremely large. Most methods can mistake the white highlighted area with strong contrast in the lower left corner of the image as a visually significant area, and this application basically detects the approximate hollow area. This application is also optimal in terms of the contrast between the visually significant area and the visually insignificant area. For a landscape picture containing a portrait, although both IS and ICL can detect people in the image, this application has the highest contrast.
为了客观评价本申请的具体表现,采用受试者工作特征曲线(Receiver Operating Characteristic,ROC)来进行定量结果分析。如图3和图4所示,本申请曲线最高。应用八种方法可以分成两类:单纯使用先验概率知识的算法,它们 包括AIM、GBVS、IS和SR;单纯使用当前观测图像信息的算法,它们包括ICL、ITTI、RC和SUN。那么,本多尺度化视觉显著性检测方法比仅使用先验概率知识或当前观测图像信息的方法效果要更好。In order to objectively evaluate the specific performance of this application, the receiver operating characteristic curve (Receiver Operating Characteristic, ROC) was used to analyze the quantitative results. As shown in Figures 3 and 4, the curve of this application is the highest. The eight methods can be divided into two categories: algorithms that simply use prior knowledge of probabilities, including AIM, GBVS, IS, and SR; algorithms that simply use current observed image information, including ICL, ITTI, RC, and SUN. Then, this multi-scale visual saliency detection method is better than the method using only a priori probability knowledge or current observation image information.
评估彩色图像分割算法目前大多数情况都是由人眼执行主观判断,将本申请技术方案应用于图像分割后的结果与Berkeley图像标准库中人眼分割结果进行比较来对本算法进行定性评估,本申请具有好的图像分割效果。例如针对一幅鸟在树上的风景图,图5为利用本申请技术方案分割得到的图像,由图5分割结果中可见,本申请技术方案可在图像中对显著性区域进行准确定位,利用本申请技术方案的图像分割方法可清楚的切割出鸟的轮廓和树的轮廓,能够获得与人眼分割几乎完全一致的图像分割结果。Evaluation of color image segmentation algorithms At present, most of the cases are subjectively judged by the human eye. The results of applying the technical solution of this application to image segmentation are compared with the results of human eye segmentation in the Berkeley image standard library to evaluate the algorithm qualitatively. The application has good image segmentation effect. For example, for a landscape image of a bird on a tree, FIG. 5 is an image obtained by segmentation using the technical solution of the present application. As can be seen from the segmentation result of FIG. 5, the technical solution of the present application can accurately locate the salient regions in the image and use The image segmentation method of the technical solution of the present application can clearly cut out the outline of a bird and the outline of a tree, and can obtain an image segmentation result that is almost completely consistent with human eye segmentation.
由上可知,本发明实施例针对视觉显著性引入随机搜索检测方法,大幅度提高视觉显著性检测算法的效率,并得到与原始输入图像的大小规模完全相同的细致化的视觉显著性图。It can be seen from the above that the embodiment of the present invention introduces a random search detection method for visual saliency, greatly improves the efficiency of the visual saliency detection algorithm, and obtains a detailed visual saliency map that is exactly the same size and scale as the original input image.
本发明实施例还针对视觉显著性区域检测方法提供了相应的实现装置,进一步使得所述方法更具有实用性。下面对本发明实施例提供的视觉显著性区域检测装置进行介绍,下文描述的视觉显著性区域检测装置与上文描述的视觉显著性区域检测方法可相互对应参照。The embodiment of the present invention also provides a corresponding implementation device for the visual saliency area detection method, which further makes the method more practical. The visual saliency area detection device provided by the embodiment of the present invention will be described below. The visual saliency area detection device described below and the visual saliency area detection method described above can be referred to each other.
参见图6,图6为本发明实施例提供的视觉显著性区域检测装置在一种具体实施方式下的结构图,该装置可包括:Referring to FIG. 6, FIG. 6 is a structural diagram of a visually significant area detection device according to an embodiment of the present invention in a specific implementation manner. The device may include:
随机搜索模块601,用于利用随机搜索方法从待处理图像中为当前像素点选取多个候选图像区域块,从各候选图像区域块中选取满足相似度条件的目标图像区域块。The random search module 601 is used to select a plurality of candidate image area blocks for the current pixel from the image to be processed using a random search method, and select target image area blocks that satisfy the similarity condition from each candidate image area block.
视觉显著性值计算模块602,用于根据各目标图像块与当前像素点之间的相似度,计算当前像素点的视觉显著性值。The visual saliency value calculation module 602 is used to calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel.
多层视觉显著图生成模块603,用于基于待处理图像的各像素点的视觉显著性值,生成待处理图像各层次的视觉显著图。The multi-layer visual saliency map generation module 603 is used to generate visual saliency maps at various levels of the image to be processed based on the visual saliency value of each pixel of the image to be processed.
视觉显著图融合模块604,用于根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值,以用于按照各层视觉显著图的权重值将各层视觉显著图进行融合。The visual saliency map fusion module 604 is used to calculate the weight values of the visual saliency maps of each layer according to the visual saliency values of the adjacent two layers of visual saliency maps, which are used to visually highlight each layer according to the weight values of the visual saliency maps of each layer Figure fusion.
可选的,在本实施例的一些实施方式中,请参阅图7,所述装置例如还可以包括视觉显著性更新模块605,所述视觉显著性更新模块605用于在融合后得到的视觉显著图中选择多个预设位置点;对每个预设位置点,根据各个像素点的视觉显著性值,将其分为强化像素点集和弱化像素点集,强化像素点集中的像素点的视觉显著性值大于弱化像素点集中的像素点的视觉显著性值;利用图像增强方法对强化像素点集中的各像素点进行强化处理;利用图像弱化方法对弱化像素点集中的各像素点进行弱化处理。Optionally, in some implementations of this embodiment, please refer to FIG. 7, for example, the device may further include a visual saliency update module 605, and the visual saliency update module 605 is used for visual saliency obtained after fusion. Select multiple preset position points in the picture; for each preset position point, according to the visual saliency value of each pixel point, it is divided into the enhanced pixel point set and the weakened pixel point set, and the enhanced pixel points in the pixel set The visual saliency value is greater than the visual saliency value of the pixels in the weakened pixel set; the image enhancement method is used to strengthen each pixel in the enhanced pixel set; the image weakening method is used to weaken each pixel in the weakened pixel set deal with.
在一种具体的实施方式中,所述多层视觉显著图生成模块603还可包括:In a specific embodiment, the multi-layer visual saliency map generation module 603 may further include:
粗略化视觉显著图生成子模块,用于基于待处理图像的各像素点的视觉显著性值,生成待处理图像各层次的粗略化视觉显著图;The rough visual saliency map generation sub-module is used to generate rough visual saliency maps at various levels of the image to be processed based on the visual saliency value of each pixel of the image to be processed;
细致化视觉显著图生成子模块,用于对各粗略化视觉显著图进行细致化操作,以去除图像噪声信号,得到各自相应的细致化视觉显著图。The detailed visual saliency map generation sub-module is used to perform a detailed operation on each rough visual saliency map to remove image noise signals and obtain respective corresponding detailed visual saliency maps.
在其他的一些实施方式中,所述随机搜索模块601还可为分别计算当前像素点和各候选图像区域块的相似度量值;删除相似度量值低于相似度阈值对应的候选图像区域块,将剩余的候选图像区域块作为目标图像区域块的模块。In some other embodiments, the random search module 601 can also calculate the similarity metric values of the current pixel point and each candidate image area block separately; delete the candidate image area blocks corresponding to the similarity metric value lower than the similarity threshold, and The remaining candidate image area blocks serve as modules of the target image area block.
可选的,在一些具体的实施方式中,所述视觉显著性值计算模块602例如可为利用下述公式计算当前像素点的视觉显著性值S的模块:Optionally, in some specific embodiments, the visual saliency value calculation module 602 may be, for example, a module that calculates the visual saliency value S of the current pixel using the following formula:
Figure PCTCN2019075206-appb-000054
其中,
Figure PCTCN2019075206-appb-000055
Figure PCTCN2019075206-appb-000054
among them,
Figure PCTCN2019075206-appb-000055
式中,dist(r i,r k)为当前像素点所在位置与第K个目标图像块之间的不相似性度量值,K为目标图像块的总个数,dist color(r i,r k)为经过向量化处理的目标图像区域块与当前像素点所在位置在HSV颜色空间上的欧氏距离,dist pos(r i,r k)为当前像素点所在位置与第K个目标图像块之间的欧式距离。 Where dist(r i , r k ) is the dissimilarity measure between the current pixel position and the Kth target image block, K is the total number of target image blocks, dist color (r i , r k ) is the Euclidean distance in the HSV color space between the target image area block after vectorization and the current pixel position, and dist pos (r i , r k ) is the position of the current pixel point and the Kth target image block European distance between.
此外,所述多层视觉显著图生成模块603还可包括:In addition, the multi-layer visual saliency map generation module 603 may further include:
局部系统显著图生成子模块,用于基于频域计算待处理图像的各像素点的局部系统显著性值,生成局部系统显著图;The local system saliency map generation sub-module is used to calculate the local system saliency value of each pixel of the image to be processed based on the frequency domain to generate a local system saliency map;
全局系统显著图生成子模块,用于利用下述公式计算待处理图像的各像素点的全局系统显著性值V Global(x,y),生成全局系统显著图: The global system saliency map generation submodule is used to calculate the global system saliency value V Global (x,y) of each pixel of the image to be processed using the following formula to generate a global system saliency map:
Figure PCTCN2019075206-appb-000056
Figure PCTCN2019075206-appb-000056
式中,x、y为像素点的横纵坐标值,f(x,y)为求解像素点(x,y)的显著性函数,f average(x,y)为f(x,y)的算术平均值,待处理图像大小为M*N; In the formula, x and y are the horizontal and vertical coordinate values of the pixel, f(x, y) is the significance function for solving the pixel (x, y), and f average (x, y) is f(x, y). The arithmetic mean value, the size of the image to be processed is M*N;
稀缺显著图生成子模块,用于利用下述公式计算待处理图像的各像素点的稀缺显著性值V Scarcity(x,y),生成稀缺显著图: The scarcity saliency map generation submodule is used to calculate the scarcity saliency value V Scarcity (x,y) of each pixel of the image to be processed using the following formula to generate a scarcity saliency map:
Figure PCTCN2019075206-appb-000057
Figure PCTCN2019075206-appb-000057
式中,x、y为像素点的横纵坐标值,f average(x,y)为f(x,y)的算术平均值,h(f average(x,y))为待处理图像产生的特征直方图; In the formula, x and y are the horizontal and vertical coordinates of the pixel, f average (x, y) is the arithmetic average of f(x, y), and h(f average (x, y)) is generated by the image to be processed Feature histogram;
融合子模块,用于将局部系统显著图、全局系统显著图及稀缺显著图进行融合,得到待处理图像各层次的视觉显著图。The fusion sub-module is used to fuse the local system saliency map, the global system saliency map, and the scarcity saliency map to obtain visual saliency maps at various levels of the image to be processed.
在本发明实施例中,所述融合子模块例如还可为对待处理图像的各层,利用下述公式计算待处理图像的各像素点的视觉显著性值V final,生成视觉显著图的模块: In the embodiment of the present invention, the fusion sub-module may be, for example, a module for generating a visual saliency map by calculating the visual saliency value V final of each pixel of the image to be processed using the following formula:
Figure PCTCN2019075206-appb-000058
Figure PCTCN2019075206-appb-000058
式中,V Local、V Global、V Scarcity依次为横纵坐标值为x、y的像素点的局部系统显著性值、全局系统显著性值及稀缺显著性值;v 1为局部系统显著性值的权重值,v 2为全局系统显著性值的权重值,v 3为稀缺显著性值的权重值,i=1,V i=V 1为局部系统显著性值,i=2,V i=V 2为全部系统显著性值,i=3,V i=V 3为稀缺显著性值,V 1、V 2、V 3为根据
Figure PCTCN2019075206-appb-000059
计算所得。
In the formula, V Local , V Global , and V Scarcity are the local system saliency value, global system saliency value, and scarcity saliency value of the pixels whose horizontal and vertical coordinates are x and y, and v 1 is the local system saliency value. The weight value of, v 2 is the weight value of the global system saliency value, v 3 is the weight value of the scarcity saliency value, i = 1, V i = V 1 is the local system saliency value, i = 2, V i = V 2 is the significance value of all systems, i = 3, V i = V 3 is the scarcity significance value, V 1 , V 2 , V 3 are based on
Figure PCTCN2019075206-appb-000059
Calculated.
可选的,所述局部系统显著图生成子模块还可为利用下述公式计算待处理图像的各像素点的局部系统显著性值V Local(x,y)的模块: Optionally, the local system saliency map generation sub-module may also be a module that calculates the local system saliency value V Local (x, y) of each pixel of the image to be processed using the following formula:
Figure PCTCN2019075206-appb-000060
Figure PCTCN2019075206-appb-000060
式中,x、y为像素点的横纵坐标值,FFT(u,v)为像素点特征值,|FFT(u,v)e jψ(u,v)|为经过快速傅里叶变换以后所得图像幅度谱,ψ(u,v)为待处理图像相位谱。 In the formula, x and y are the horizontal and vertical coordinates of the pixel, FFT(u,v) is the characteristic value of the pixel, |FFT(u,v)e jψ(u,v) | is after the fast Fourier transform The resulting image amplitude spectrum, ψ(u,v) is the phase spectrum of the image to be processed.
在一些实施方式中,所述视觉显著图融合模块604还可为利用下述公式计算各层视觉显著图的权重值的模块:In some embodiments, the visual saliency map fusion module 604 may also be a module that calculates the weight value of the visual saliency map of each layer using the following formula:
Figure PCTCN2019075206-appb-000061
Figure PCTCN2019075206-appb-000061
式中,p为像素点位置,
Figure PCTCN2019075206-appb-000062
为第i层的权重值,
Figure PCTCN2019075206-appb-000063
为第i-1层的权重值,
Figure PCTCN2019075206-appb-000064
为第i层的视觉化显著性值,
Figure PCTCN2019075206-appb-000065
为第i-1层的视觉显著性值;
In the formula, p is the pixel position,
Figure PCTCN2019075206-appb-000062
Is the weight value of layer i,
Figure PCTCN2019075206-appb-000063
Is the weight value of layer i-1,
Figure PCTCN2019075206-appb-000064
Is the visual significance value of layer i,
Figure PCTCN2019075206-appb-000065
Is the visual significance value of layer i-1;
利用下述公式将相邻两层的视觉显著图进行融合,得到融合视觉显著图;Use the following formula to fuse the visual saliency maps of two adjacent layers to obtain a fusion visual saliency map;
Figure PCTCN2019075206-appb-000066
Figure PCTCN2019075206-appb-000066
式中,
Figure PCTCN2019075206-appb-000067
为融合视觉显著图p位置的像素点视觉显著性值。
In the formula,
Figure PCTCN2019075206-appb-000067
To fuse the visual saliency value of pixels at the position of p of the visual saliency map.
本发明实施例所述视觉显著性区域检测装置的各功能模块的功能可根据上述方法实施例中的方法具体实现,其具体实现过程可以参照上述方法实施例的相关描述,此处不再赘述。The functions of the functional modules of the visual saliency area detection device according to the embodiments of the present invention may be specifically implemented according to the method in the above method embodiments. For the specific implementation process, reference may be made to the related descriptions in the above method embodiments, and details are not described here.
由上可知,本发明实施例提高了像素点视觉显著性值的计算效率,从而有效地提升了待处理图像的视觉显著性区域的检测效率,还可提升待处理图像中的视觉显著性区域的检测准确度和精度。It can be seen from the above that the embodiment of the present invention improves the calculation efficiency of the visual saliency value of pixels, thereby effectively improving the detection efficiency of the visual saliency area of the image to be processed, and can also improve the visual saliency area of the image to be processed. Detection accuracy and precision.
本发明实施例还提供了一种视觉显著性区域检测设备,具体可包括:An embodiment of the present invention also provides a visually significant area detection device, which may specifically include:
存储器,用于存储计算机程序;Memory, used to store computer programs;
处理器,用于执行计算机程序以实现如上任意一实施例所述视觉显著性区域检测方法的步骤。The processor is configured to execute a computer program to implement the steps of the visual saliency area detection method described in any one of the above embodiments.
本发明实施例所述视觉显著性区域检测设备的各功能模块的功能可根据上述方法实施例中的方法具体实现,其具体实现过程可以参照上述方法实施例的相关描述,此处不再赘述。The functions of the functional modules of the visual saliency area detection device according to the embodiments of the present invention may be specifically implemented according to the methods in the above method embodiments. For the specific implementation process, reference may be made to the related descriptions in the above method embodiments, and details are not described here.
由上可知,本发明实施例提高了像素点视觉显著性值的计算效率,从而有效地提升了待处理图像的视觉显著性区域的检测效率,还可提升待处理图像中的视觉显著性区域的检测准确度和精度。It can be seen from the above that the embodiment of the present invention improves the calculation efficiency of the visual saliency value of pixels, thereby effectively improving the detection efficiency of the visual saliency area of the image to be processed, and can also improve the visual saliency area of the image to be processed. Detection accuracy and precision.
本发明实施例还提供了一种计算机可读存储介质,存储有视觉显著性区域检测程序,所述视觉显著性区域检测程序被处理器执行时如上任意一实施例所述视觉显著性区域检测方法的步骤。An embodiment of the present invention also provides a computer-readable storage medium that stores a visual saliency area detection program, and the visual saliency area detection program is executed by the processor as described in any one of the above embodiments in the visual saliency area detection method A step of.
本发明实施例所述计算机可读存储介质的各功能模块的功能可根据上述方法实施例中的方法具体实现,其具体实现过程可以参照上述方法实施例的相关描述,此处不再赘述。The functions of the functional modules of the computer-readable storage medium according to the embodiments of the present invention may be specifically implemented according to the method in the foregoing method embodiments. For the specific implementation process, reference may be made to the related descriptions of the foregoing method embodiments, and details are not described here.
由上可知,本发明实施例提高了像素点视觉显著性值的计算效率,从而有效地提升了待处理图像的视觉显著性区域的检测效率,还可提升待处理图像中的视觉显著性区域的检测准确度和精度。It can be seen from the above that the embodiment of the present invention improves the calculation efficiency of the visual saliency value of pixels, thereby effectively improving the detection efficiency of the visual saliency area of the image to be processed, and can also improve the visual saliency area of the image to be processed. Detection accuracy and precision.
本说明书中各个实施例采用递进的方式描述,每个实施例重点说明的都是与其它实施例的不同之处,各个实施例之间相同或相似部分互相参见即可。对于实施例公开的装置而言,由于其与实施例公开的方法相对应,所以描述的比较简单,相关之处参见方法部分说明即可。The embodiments in this specification are described in a progressive manner. Each embodiment focuses on the differences from other embodiments, and the same or similar parts between the embodiments may refer to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant part can be referred to the description in the method part.
专业人员还可以进一步意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各示例的组成及步骤。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本发明的范围。Professionals can further realize that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, computer software, or a combination of the two. In order to clearly illustrate the hardware and software Interchangeability, in the above description, the composition and steps of each example have been generally described according to function. Whether these functions are executed in hardware or software depends on the specific application of the technical solution and design constraints. Professional technicians can use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the present invention.
结合本文中所公开的实施例描述的方法或算法的步骤可以直接用硬件、处理器执行的软件模块,或者二者的结合来实施。软件模块可以置于随机存储器(RAM)、内存、只读存储器(ROM)、电可编程ROM、电可擦除可编程ROM、寄存器、硬盘、可移动磁盘、CD-ROM、或技术领域内所公知的任意其它形式的存储介质中。The steps of the method or algorithm described in conjunction with the embodiments disclosed herein may be implemented directly by hardware, a software module executed by a processor, or a combination of both. Software modules can be placed in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable and programmable ROM, registers, hard disks, removable disks, CD-ROMs, or all fields of technology. Any other known storage medium.
以上对本发明所提供的一种视觉显著性区域检测方法及装置进行了详细介绍。本文中应用了具体个例对本发明的原理及实施方式进行了阐述,以上实施例的说明只是用于帮助理解本发明的方法及其核心思想。应当指出,对于本技术领域的普通技术人员来说,在不脱离本发明原理的前提下,还可以对本发明进行若干改进和修饰,这些改进和修饰也落入本发明权利要求的保护范围内。The method and device for detecting visually significant regions provided by the present invention have been described in detail above. In this article, specific examples are used to explain the principles and implementations of the present invention. The descriptions of the above examples are only used to help understand the method and the core idea of the present invention. It should be noted that for those of ordinary skill in the art, without departing from the principles of the present invention, the present invention may also be subject to several improvements and modifications, and these improvements and modifications also fall within the protection scope of the claims of the present invention.

Claims (10)

  1. 一种视觉显著性区域检测方法,其特征在于,包括:A visual saliency area detection method, characterized in that it includes:
    利用随机搜索方法从待处理图像中为当前像素点选取多个候选图像区域块,从各候选图像区域块中选取满足相似度条件的目标图像区域块;Use a random search method to select multiple candidate image area blocks for the current pixel from the image to be processed, and select target image area blocks that satisfy the similarity condition from each candidate image area block;
    根据各目标图像块与所述当前像素点之间的相似度,计算所述当前像素点的视觉显著性值;Calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel;
    基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的视觉显著图;Generating a visual saliency map at each level of the image to be processed based on the visual saliency value of each pixel of the image to be processed;
    根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值,以用于按照各层视觉显著图的权重值将各层视觉显著图进行融合。Based on the visual saliency values of the two adjacent layers of visual saliency maps, the weight values of the visual saliency maps of each layer are calculated to be used to fuse the visual saliency maps of each layer according to the weight values of the visual saliency maps of each layer.
  2. 根据权利要求1所述的视觉显著性区域检测方法,其特征在于,所述根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值之后,还包括:The method for detecting a visual saliency area according to claim 1, wherein after calculating the weight values of the visual saliency maps of each layer according to the visual saliency values of the visual saliency maps of two adjacent layers, the method further comprises:
    在融合后得到的视觉显著图中选择多个预设位置点;Select multiple preset position points in the visual saliency map obtained after fusion;
    对每个预设位置点,根据各个像素点的视觉显著性值,将其分为强化像素点集和弱化像素点集,所述强化像素点集中的像素点的视觉显著性值均大于所述弱化像素点集中的像素点的视觉显著性值;For each preset position point, according to the visual saliency value of each pixel point, it is divided into an enhanced pixel point set and a weakened pixel point set, and the visual saliency value of the pixels in the enhanced pixel point set is greater than the To weaken the visual significance value of the pixels in the pixel set;
    利用图像增强方法对所述强化像素点集中的各像素点进行强化处理;Use the image enhancement method to perform enhancement processing on each pixel in the enhanced pixel set;
    利用图像弱化方法对所述弱化像素点集中的各像素点进行弱化处理。The image weakening method is used to weaken each pixel in the weakened pixel set.
  3. 根据权利要求1所述的视觉显著性区域检测方法,其特征在于,所述基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的视觉显著图包括:The method for detecting a visual saliency region according to claim 1, wherein the generating the visual saliency maps of each level of the image to be processed based on the visual saliency value of each pixel of the image to be processed includes:
    基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的粗略化视觉显著图;Generating rough visual saliency maps at various levels of the image to be processed based on the visual saliency value of each pixel of the image to be processed;
    对各粗略化视觉显著图进行细致化操作,以去除图像噪声信号,得到各自相应的细致化视觉显著图。Perform detailed operations on each rough visual saliency map to remove image noise signals, and obtain respective corresponding detailed visual saliency maps.
  4. 根据权利要求1所述的视觉显著性区域检测方法,其特征在于,所述从各候选图像区域块中选取满足相似度条件的目标图像区域块包括:The visual saliency area detection method according to claim 1, wherein the selection of target image area blocks satisfying the similarity condition from each candidate image area block includes:
    分别计算所述当前像素点和各候选图像区域块的相似度量值;Calculating the similarity metric values of the current pixel point and each candidate image area block separately;
    删除相似度量值低于相似度阈值对应的候选图像区域块,将剩余的候选图像区域块作为目标图像区域块。Delete the candidate image area blocks corresponding to the similarity metric value lower than the similarity threshold, and use the remaining candidate image area blocks as the target image area blocks.
  5. 根据权利要求1至4任意一项所述的视觉显著性区域检测方法,其特征在于,所述根据各目标图像块与所述当前像素点之间的相似度,计算所述当前像素点的视觉显著性值包括:The visual saliency area detection method according to any one of claims 1 to 4, wherein the vision of the current pixel is calculated according to the similarity between each target image block and the current pixel Significance values include:
    利用下述公式计算所述当前像素点的视觉显著性值S:Use the following formula to calculate the visual significance value S of the current pixel:
    Figure PCTCN2019075206-appb-100001
    其中,
    Figure PCTCN2019075206-appb-100002
    Figure PCTCN2019075206-appb-100001
    among them,
    Figure PCTCN2019075206-appb-100002
    式中,dist(r i,r k)为所述当前像素点所在位置与第K个目标图像块之间的不相似性度量值,K为目标图像块的总个数,dist color(r i,r k)为经过向量化处理的目标图像区域块与所述当前像素点所在位置在HSV颜色空间上的欧氏距离,dist pos(r i,r k)为所述当前像素点所在位置与第K个目标图像块之间的欧式距离。 Where dist(r i , r k ) is the dissimilarity measure between the current pixel position and the Kth target image block, K is the total number of target image blocks, dist color (r i , R k ) is the Euclidean distance in the HSV color space between the target image area block subjected to vectorization and the location of the current pixel, and dist pos (r i , r k ) is the location of the current pixel The Euclidean distance between the Kth target image block.
  6. 根据权利要求1至4任意一项所述的视觉显著性区域检测方法,其特征在于,所述基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的视觉显著图包括:The visual saliency area detection method according to any one of claims 1 to 4, characterized in that, based on the visual saliency value of each pixel of the image to be processed, each level of the image to be processed is generated Visual saliency maps include:
    基于频域计算所述待处理图像的各像素点的局部系统显著性值,生成局部系统显著图;Calculating the local system saliency value of each pixel of the image to be processed based on the frequency domain to generate a local system saliency map;
    利用下述公式计算所述待处理图像的各像素点的全局系统显著性值V Global(x,y),生成全局系统显著图: Use the following formula to calculate the global system saliency value V Global (x, y) of each pixel of the image to be processed to generate a global system saliency map:
    Figure PCTCN2019075206-appb-100003
    Figure PCTCN2019075206-appb-100003
    式中,x、y为像素点的横纵坐标值,f(x,y)为求解像素点(x,y)的显著性函数,f average(x,y)为f(x,y)的算术平均值,所述待处理图像大小为M*N; In the formula, x and y are the horizontal and vertical coordinate values of the pixel, f(x, y) is the significance function for solving the pixel (x, y), and f average (x, y) is f(x, y). Arithmetic average, the size of the image to be processed is M*N;
    利用下述公式计算所述待处理图像的各像素点的稀缺显著性值V Scarcity(x,y),生成稀缺显著图: Use the following formula to calculate the scarcity saliency value V Scarcity (x,y) of each pixel of the image to be processed to generate a scarcity saliency map:
    Figure PCTCN2019075206-appb-100004
    Figure PCTCN2019075206-appb-100004
    式中,x、y为像素点的横纵坐标值,f average(x,y)为f(x,y)的算术平均值,h(f average(x,y))为所述待处理图像产生的特征直方图; In the formula, x and y are the horizontal and vertical coordinate values of the pixel, f average (x, y) is the arithmetic average value of f (x, y), and h (f average (x, y)) is the image to be processed The generated feature histogram;
    将所述局部系统显著图、所述全局系统显著图及所述稀缺显著图进行融合,得到所述待处理图像各层次的视觉显著图。The local system saliency map, the global system saliency map, and the scarcity saliency map are fused to obtain visual saliency maps at various levels of the image to be processed.
  7. 根据权利要求6所述的视觉显著性区域检测方法,其特征在于,所述将所述局部系统显著图、所述全局系统显著图及所述稀缺显著图进行融合,得到所述待处理图像各层次的视觉显著图包括:The visual saliency area detection method according to claim 6, wherein the fusion of the local system saliency map, the global system saliency map, and the scarcity saliency map to obtain each of the to-be-processed images Hierarchical visual saliency maps include:
    对所述待处理图像的各层,利用下述公式计算所述待处理图像的各像素点的视觉显著性值V final,生成视觉显著图: For each layer of the image to be processed, the visual saliency value V final of each pixel of the image to be processed is calculated using the following formula to generate a visual saliency map:
    Figure PCTCN2019075206-appb-100005
    Figure PCTCN2019075206-appb-100005
    式中,V Local、V Global、V Scarcity依次为横纵坐标值为x、y的像素点的局部系统显著性值、全局系统显著性值及稀缺显著性值;v 1为局部系统显著性值的权重值,v 2为全局系统显著性值的权重值,v 3为稀缺显著性值的权重值;i=1,V i=V 1为所述局部系统显著性值,i=2,V i=V 2为所述全部系统显著性值,i=3,V i=V 3为所述稀缺显著性值,V 1、V 2、V 3为根据
    Figure PCTCN2019075206-appb-100006
    计算所得。
    In the formula, V Local , V Global , and V Scarcity are the local system saliency value, global system saliency value, and scarcity saliency value of the pixels whose horizontal and vertical coordinates are x and y, and v 1 is the local system saliency value. The weight value of, v 2 is the weight value of the global system saliency value, v 3 is the weight value of the scarcity saliency value; i=1, V i = V 1 is the local system saliency value, i=2, V i = V 2 is the significance value of all the systems, i = 3, V i = V 3 is the scarcity significance value, and V 1 , V 2 , and V 3 are based on
    Figure PCTCN2019075206-appb-100006
    Calculated.
  8. 根据权利要求6所述的视觉显著性区域检测方法,其特征在于,所述基于频域计算所述待处理图像的各像素点的局部系统显著性值包括:The method for detecting a visual saliency region according to claim 6, wherein the calculating the local system saliency value of each pixel of the image to be processed based on the frequency domain includes:
    利用下述公式计算所述待处理图像的各像素点的局部系统显著性值V Local(x,y): Use the following formula to calculate the local system saliency value V Local (x,y) of each pixel of the image to be processed:
    Figure PCTCN2019075206-appb-100007
    Figure PCTCN2019075206-appb-100007
    式中,x、y为像素点的横纵坐标值,FFT(u,v)为像素点特征值,|FFT(u,v)e jψ(u,v)|为经过快速傅里叶变换以后所得图像幅度谱,ψ(u,v)为所述待处理图像相位谱。 In the formula, x and y are the horizontal and vertical coordinates of the pixel, FFT(u,v) is the characteristic value of the pixel, |FFT(u,v)e jψ(u,v) | is after the fast Fourier transform The obtained image amplitude spectrum, ψ(u,v) is the phase spectrum of the image to be processed.
  9. 根据权利要求1至4任意一项所述的视觉显著性区域检测方法,其特征在于,所述根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值,以用于按照各层视觉显著图的权重值将各层视觉显著图进行融合包括:The method for detecting a visual saliency area according to any one of claims 1 to 4, wherein the weight value of the visual saliency map of each layer is calculated according to the visual saliency value of the visual saliency maps of two adjacent layers, to It is used to fuse the visual saliency map of each layer according to the weight value of the visual saliency map of each layer, including:
    利用下述公式计算各层视觉显著图的权重值:Use the following formula to calculate the weight value of each layer of visual saliency map:
    Figure PCTCN2019075206-appb-100008
    Figure PCTCN2019075206-appb-100008
    式中,p为像素点位置,
    Figure PCTCN2019075206-appb-100009
    为第i层的权重值,
    Figure PCTCN2019075206-appb-100010
    为第i-1层的权重值,
    Figure PCTCN2019075206-appb-100011
    为第i层的视觉化显著性值,
    Figure PCTCN2019075206-appb-100012
    为第i-1层的视觉显著性值;
    In the formula, p is the pixel position,
    Figure PCTCN2019075206-appb-100009
    Is the weight value of layer i,
    Figure PCTCN2019075206-appb-100010
    Is the weight value of layer i-1,
    Figure PCTCN2019075206-appb-100011
    Is the visual significance value of layer i,
    Figure PCTCN2019075206-appb-100012
    Is the visual significance value of layer i-1;
    利用下述公式将相邻两层的视觉显著图进行融合,得到融合视觉显著图;Use the following formula to fuse the visual saliency maps of two adjacent layers to obtain a fusion visual saliency map;
    Figure PCTCN2019075206-appb-100013
    Figure PCTCN2019075206-appb-100013
    式中,
    Figure PCTCN2019075206-appb-100014
    为所述融合视觉显著图p位置的像素点视觉显著性值。
    In the formula,
    Figure PCTCN2019075206-appb-100014
    Is the visual saliency value of the pixel at the p position of the fusion visual saliency map.
  10. 一种视觉显著性区域检测装置,其特征在于,包括:A visual saliency area detection device, characterized in that it includes:
    随机搜索模块,用于利用随机搜索方法从待处理图像中为当前像素点选取多个候选图像区域块,从各候选图像区域块中选取满足相似度条件的目标图像区域块;The random search module is used to select multiple candidate image area blocks for the current pixel from the to-be-processed image using a random search method, and select target image area blocks satisfying the similarity condition from each candidate image area block;
    视觉显著性值计算模块,用于根据各目标图像块与所述当前像素点之间的相似度,计算所述当前像素点的视觉显著性值;The visual saliency value calculation module is used to calculate the visual saliency value of the current pixel according to the similarity between each target image block and the current pixel;
    多层视觉显著图生成模块,用于基于所述待处理图像的各像素点的视觉显著性值,生成所述待处理图像各层次的视觉显著图;A multi-layer visual saliency map generation module, used to generate visual saliency maps of each level of the image to be processed based on the visual saliency value of each pixel of the image to be processed;
    视觉显著图融合模块,用于根据相邻两层视觉显著图的视觉显著性值,计算各层视觉显著图的权重值,以用于按照各层视觉显著图的权重值将各层视觉显著图进行融合。The visual saliency map fusion module is used to calculate the weight value of the visual saliency map of each layer according to the visual saliency value of the adjacent two layers of visual saliency maps, which is used to visually map each layer according to the weight value of the visual saliency map of each layer Perform fusion.
PCT/CN2019/075206 2018-11-30 2019-02-15 Visual saliency region detection method and apparatus WO2020107717A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811455470.1 2018-11-30
CN201811455470.1A CN109543701A (en) 2018-11-30 2018-11-30 Vision significance method for detecting area and device

Publications (1)

Publication Number Publication Date
WO2020107717A1 true WO2020107717A1 (en) 2020-06-04

Family

ID=65851847

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/075206 WO2020107717A1 (en) 2018-11-30 2019-02-15 Visual saliency region detection method and apparatus

Country Status (2)

Country Link
CN (1) CN109543701A (en)
WO (1) WO2020107717A1 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860643A (en) * 2020-07-20 2020-10-30 苏州大学 Robustness improving method for visual template matching based on frequency modulation model
CN112418223A (en) * 2020-12-11 2021-02-26 互助土族自治县北山林场 Wild animal image significance target detection method based on improved optimization
CN113256581A (en) * 2021-05-21 2021-08-13 中国科学院自动化研究所 Automatic defect sample labeling method and system based on visual attention modeling fusion
CN114022778A (en) * 2021-10-25 2022-02-08 电子科技大学 SAR (synthetic Aperture Radar) berthing ship detection method based on significance CNN (CNN)
CN114926657A (en) * 2022-06-09 2022-08-19 山东财经大学 Method and system for detecting saliency target
CN115147409A (en) * 2022-08-30 2022-10-04 深圳市欣冠精密技术有限公司 Mobile phone shell production quality detection method based on machine vision
CN115272335A (en) * 2022-09-29 2022-11-01 江苏万森绿建装配式建筑有限公司 Metallurgical metal surface defect detection method based on significance detection
CN116645368A (en) * 2023-07-27 2023-08-25 青岛伟东包装有限公司 Online visual detection method for edge curl of casting film
CN116721107A (en) * 2023-08-11 2023-09-08 青岛胶州电缆有限公司 Intelligent monitoring system for cable production quality
CN116740098A (en) * 2023-08-11 2023-09-12 中色(天津)新材料科技有限公司 Aluminum alloy argon arc welding image segmentation method and system
CN117419650B (en) * 2023-12-18 2024-02-27 湖南西欧新材料有限公司 Alumina ceramic surface glaze layer thickness measuring method based on visual analysis

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110264545A (en) * 2019-06-19 2019-09-20 北京字节跳动网络技术有限公司 Picture Generation Method, device, electronic equipment and storage medium
CN111079730B (en) * 2019-11-20 2023-12-22 北京云聚智慧科技有限公司 Method for determining area of sample graph in interface graph and electronic equipment
CN110910830B (en) * 2019-11-29 2021-02-12 京东方科技集团股份有限公司 Display brightness adjusting method, display system, computer device and medium
CN111083477B (en) * 2019-12-11 2020-11-10 北京航空航天大学 HEVC (high efficiency video coding) optimization algorithm based on visual saliency
CN112101376A (en) * 2020-08-14 2020-12-18 北京迈格威科技有限公司 Image processing method, image processing device, electronic equipment and computer readable medium
CN115643811A (en) * 2020-12-31 2023-01-24 华为技术有限公司 Image processing method, data acquisition method and equipment
CN113140005B (en) * 2021-04-29 2024-04-16 上海商汤科技开发有限公司 Target object positioning method, device, equipment and storage medium
CN117058723B (en) * 2023-10-11 2024-01-19 腾讯科技(深圳)有限公司 Palmprint recognition method, palmprint recognition device and storage medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103049893A (en) * 2011-10-14 2013-04-17 深圳信息职业技术学院 Method and device for evaluating image fusion quality
US8437543B2 (en) * 2010-09-16 2013-05-07 Thomson Licensing Method and device of determining a saliency map for an image
CN104463907A (en) * 2014-11-13 2015-03-25 南京航空航天大学 Self-adaptation target tracking method based on vision saliency characteristics
CN107103608A (en) * 2017-04-17 2017-08-29 大连理工大学 A kind of conspicuousness detection method based on region candidate samples selection

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8374436B2 (en) * 2008-06-30 2013-02-12 Thomson Licensing Method for detecting layout areas in a video image and method for generating an image of reduced size using the detection method
CN106407927B (en) * 2016-09-12 2019-11-05 河海大学常州校区 The significance visual method suitable for underwater target detection based on polarization imaging

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8437543B2 (en) * 2010-09-16 2013-05-07 Thomson Licensing Method and device of determining a saliency map for an image
CN103049893A (en) * 2011-10-14 2013-04-17 深圳信息职业技术学院 Method and device for evaluating image fusion quality
CN104463907A (en) * 2014-11-13 2015-03-25 南京航空航天大学 Self-adaptation target tracking method based on vision saliency characteristics
CN107103608A (en) * 2017-04-17 2017-08-29 大连理工大学 A kind of conspicuousness detection method based on region candidate samples selection

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YUANTAO CHEN: "The Study on Image Segmentation Based on Visual Saliency of Image and Improved SVM", CHINESE DOCTORAL DISSERTATIONS FULL-TEXT DATABASE, 1 June 2014 (2014-06-01), pages 1 - 128, XP055711166 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111860643B (en) * 2020-07-20 2023-10-03 苏州大学 Visual template matching robustness improving method based on frequency modulation model
CN111860643A (en) * 2020-07-20 2020-10-30 苏州大学 Robustness improving method for visual template matching based on frequency modulation model
CN112418223A (en) * 2020-12-11 2021-02-26 互助土族自治县北山林场 Wild animal image significance target detection method based on improved optimization
CN113256581B (en) * 2021-05-21 2022-09-02 中国科学院自动化研究所 Automatic defect sample labeling method and system based on visual attention modeling fusion
CN113256581A (en) * 2021-05-21 2021-08-13 中国科学院自动化研究所 Automatic defect sample labeling method and system based on visual attention modeling fusion
CN114022778B (en) * 2021-10-25 2023-04-07 电子科技大学 SAR (synthetic Aperture Radar) berthing ship detection method based on significance CNN (CNN)
CN114022778A (en) * 2021-10-25 2022-02-08 电子科技大学 SAR (synthetic Aperture Radar) berthing ship detection method based on significance CNN (CNN)
CN114926657A (en) * 2022-06-09 2022-08-19 山东财经大学 Method and system for detecting saliency target
CN114926657B (en) * 2022-06-09 2023-12-19 山东财经大学 Saliency target detection method and system
CN115147409A (en) * 2022-08-30 2022-10-04 深圳市欣冠精密技术有限公司 Mobile phone shell production quality detection method based on machine vision
CN115272335A (en) * 2022-09-29 2022-11-01 江苏万森绿建装配式建筑有限公司 Metallurgical metal surface defect detection method based on significance detection
CN116645368A (en) * 2023-07-27 2023-08-25 青岛伟东包装有限公司 Online visual detection method for edge curl of casting film
CN116645368B (en) * 2023-07-27 2023-10-03 青岛伟东包装有限公司 Online visual detection method for edge curl of casting film
CN116721107A (en) * 2023-08-11 2023-09-08 青岛胶州电缆有限公司 Intelligent monitoring system for cable production quality
CN116740098A (en) * 2023-08-11 2023-09-12 中色(天津)新材料科技有限公司 Aluminum alloy argon arc welding image segmentation method and system
CN116740098B (en) * 2023-08-11 2023-10-27 中色(天津)新材料科技有限公司 Aluminum alloy argon arc welding image segmentation method and system
CN116721107B (en) * 2023-08-11 2023-11-03 青岛胶州电缆有限公司 Intelligent monitoring system for cable production quality
CN117419650B (en) * 2023-12-18 2024-02-27 湖南西欧新材料有限公司 Alumina ceramic surface glaze layer thickness measuring method based on visual analysis

Also Published As

Publication number Publication date
CN109543701A (en) 2019-03-29

Similar Documents

Publication Publication Date Title
WO2020107717A1 (en) Visual saliency region detection method and apparatus
WO2019114036A1 (en) Face detection method and device, computer device, and computer readable storage medium
CN105184763B (en) Image processing method and device
US8571271B2 (en) Dual-phase red eye correction
TWI281126B (en) Image detection method based on region
JP6330385B2 (en) Image processing apparatus, image processing method, and program
WO2017092431A1 (en) Human hand detection method and device based on skin colour
EP3101594A1 (en) Saliency information acquisition device and saliency information acquisition method
Ishikura et al. Saliency detection based on multiscale extrema of local perceptual color differences
CN111340824B (en) Image feature segmentation method based on data mining
WO2019071976A1 (en) Panoramic image saliency detection method based on regional growth and eye movement model
US20180349716A1 (en) Apparatus and method for recognizing traffic signs
JP2012238175A (en) Information processing device, information processing method, and program
CN111160110A (en) Method and device for identifying anchor based on face features and voice print features
CN108256454B (en) Training method based on CNN model, and face posture estimation method and device
CN110991547A (en) Image significance detection method based on multi-feature optimal fusion
WO2019119919A1 (en) Image recognition method and electronic device
CN112329851A (en) Icon detection method and device and computer readable storage medium
Song et al. Depth-aware saliency detection using discriminative saliency fusion
Manh et al. Small object segmentation based on visual saliency in natural images
KR101833943B1 (en) Method and system for extracting and searching highlight image
Chen et al. Fresh tea sprouts detection via image enhancement and fusion SSD
Cozzolino et al. A novel framework for image forgery localization
JP7225978B2 (en) Active learning method and active learning device
CN109460763B (en) Text region extraction method based on multilevel text component positioning and growth

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19890734

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19890734

Country of ref document: EP

Kind code of ref document: A1