WO2020107716A1 - Target image segmentation method, device and equipment - Google Patents

Target image segmentation method, device and equipment

Info

Publication number
WO2020107716A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
pixel
image
area
initial target
Prior art date
Application number
PCT/CN2019/075205
Other languages
English (en)
French (fr)
Inventor
陈沅涛
王进
王磊
张建明
陈曦
王志
桂彦
谷科
Original Assignee
长沙理工大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 长沙理工大学 filed Critical 长沙理工大学
Publication of WO2020107716A1 publication Critical patent/WO2020107716A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines

Definitions

  • Embodiments of the present invention relate to the technical field of image processing, and in particular, to a target image segmentation method, device, and equipment.
  • The embodiments of the present disclosure provide a target image segmentation method, device and equipment, which solve the problem of unstable image segmentation caused by using color features alone in the related art, and also effectively resolve the instability of target image segmentation caused by changes in light intensity, target deformation, and similar color distributions. This improves the stability and accuracy of target image segmentation, and also improves the efficiency of image segmentation.
  • the embodiments of the present invention provide the following technical solutions:
  • An aspect of an embodiment of the present invention provides a target image segmentation method, including:
  • According to the initial target feature model and each sample feature model, calculate the weight of each pixel in the saliency area, and delete the pixels in the saliency area whose weights do not meet the preset condition to obtain the final area, which is extracted and segmented as the target area.
  • using the pre-built SVM model to select a plurality of target pixels of the support vector from the image to be processed includes:
  • Collecting a plurality of pixels of the image to be processed to form an original sample set, where the original sample set contains both non-support-vector and support-vector pixels, and all pixels have the same weight;
  • the determining the saliency area in the image to be processed based on the calculated characteristic distance between each target pixel point and the initial target area includes:
  • Dividing the multiple target pixels into a first pixel set and a second pixel set, where the characteristic distance corresponding to each target pixel in the first pixel set is smaller than the characteristic distance corresponding to each target pixel in the second pixel set;
  • Each target pixel in the first set of pixels is used to replace the target pixel in the second set of pixels to obtain a saliency area.
  • The characteristic distance between each target pixel and the initial target area is calculated as follows (the original equation image is reconstructed here as the standard kernel-induced feature-space distance that the definitions below describe):
  • d_F(x_1i, x_2) = √( K(x_1i, x_1i) − 2K(x_1i, x_2) + K(x_2, x_2) )
  • x_1i is the i-th target pixel;
  • x_2 is the initial target area;
  • K(x_1i, x_1i) is the Euclidean distance between the target pixel x_1i and itself;
  • K(x_1i, x_2) is the Euclidean distance between the target pixel x_1i and the center of the initial target area;
  • K(x_2, x_2) is the Euclidean distance between the center of the initial target area and itself.
  • calculating the weight of each pixel in the saliency area according to the initial target feature model and each sample feature model includes:
  • The calculating of the color histogram of the initial target area includes (the original equation image is reconstructed here as the standard kernel-weighted color histogram that these definitions describe):
  • q̂_u(x) = C · Σ_{i=1..N} k(‖(x − x_i)/a‖²) · δ(b(x_i) − u)
  • x_i is the i-th pixel in the initial target area;
  • C is the normalization factor;
  • k is the kernel function;
  • u is the histogram bin index (the original calls it the "weak limit");
  • a is the size of the initial target area;
  • N is the total number of pixels in the initial target area;
  • b(x_i) assigns the color feature of the i-th pixel to the corresponding bin of the color histogram;
  • δ(·) is the Dirac function.
  • The calculation of the visual saliency histogram of the initial target area includes (the saliency formula below is reconstructed from the phase-spectrum description in the detailed description):
  • V(x, y) = ‖IDFT[e^{jφ(u,v)}]‖²
  • the size of the image to be processed is M*N;
  • x and y are the horizontal and vertical coordinates of a pixel;
  • φ(u,v) is the phase spectrum obtained after the fast Fourier transform of the image to be processed;
  • V = w_H × V_H + w_S × V_S + w_V × V_V;
  • V_H, V_S and V_V are the visual saliency maps obtained by the visual saliency calculation on the corresponding HSV color space features;
  • w_H, w_S and w_V are the feature weights corresponding to the respective color space features.
  • Optionally, before selecting the target pixels of multiple support vectors from the image to be processed using the pre-built SVM model, the method further includes denoising the image with the autoregressive model X_t = A·X_{t−1} + v_{t−1}, where:
  • v_{t−1} is the process noise signal;
  • A is the state transition function;
  • t is the time;
  • X_t is the pixel.
  • a target image segmentation device including:
  • the initial target area positioning module is used to locate the initial target area in the image to be processed according to the preset initial target area setting conditions, and to calculate the color histogram and visual saliency histogram of the initial target area to form an initial target feature model;
  • the sample point selection module is used to select the target pixels of multiple support vectors from the image to be processed using the pre-built SVM model, and to calculate the color histogram and visual saliency histogram of each target pixel to form a sample feature model;
  • the saliency area determination module is configured to determine a saliency area in the image to be processed based on the calculated characteristic distance between each target pixel and the initial target area;
  • the target area determination module is used to calculate the weight of each pixel in the saliency area according to the initial target feature model and each sample feature model, and to delete the pixels in the saliency area whose weights do not meet the preset condition, obtaining the final area that is extracted and segmented as the target area.
  • An embodiment of the present invention further provides a target image segmentation device, including a processor, which is used to implement the steps of the target image segmentation method according to any one of the preceding items when the computer program stored in the memory is executed.
  • An embodiment of the present invention finally provides a computer-readable storage medium on which a target image segmentation program is stored; when the target image segmentation program is executed by a processor, the steps of the target image segmentation method according to any one of the preceding items are implemented.
  • The advantage of the technical solution provided by this application is that the visual saliency feature and the color feature are used together as the features describing the target. Because visual saliency has high robustness and high anti-interference ability, this not only solves the problem of unstable image segmentation caused by using color features alone, but also effectively solves the detection difficulties caused by target deformation, illumination changes, and similar color distributions of target and background, thereby effectively improving the stability and accuracy of target image segmentation. In addition, using the SVM model to select support vectors from the image to be processed, and selecting from the support vectors the effective pixels with high similarity to the segmentation target area to determine the final target area, not only improves the efficiency of target image segmentation but also further improves its accuracy.
  • The embodiments of the present invention also provide a corresponding implementation device and equipment for the target image segmentation method, which further make the method more practical; the device and equipment have corresponding advantages.
  • FIG. 1 is a schematic flowchart of a target image segmentation method according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of a visual saliency calculation method provided by an embodiment of the present invention;
  • FIG. 3 is a schematic flowchart of another target image segmentation method according to an embodiment of the present invention.
  • FIG. 4 is a structural diagram of a specific implementation manner of a target image segmentation device according to an embodiment of the present invention.
  • FIG. 5 is a structural diagram of another specific implementation manner of a target image segmentation device according to an embodiment of the present invention.
  • FIG. 1 is a schematic flowchart of a target image segmentation method according to an embodiment of the present invention.
  • the embodiment of the present invention may include the following:
  • S101 Locate the initial target area in the to-be-processed image according to the preset initial target area setting conditions, and calculate the color histogram and visual saliency histogram of the initial target area to form an initial target feature model.
  • The initial target area setting condition is a condition, set in advance according to the position of the target to be segmented in the image to be processed and its own parameter information, that locates a rough area in the image to be processed.
  • For example, the target center can be located in the image to be processed first, and the extent of the area can then be determined from the height and width, so that the initial target area is detected in the image to be processed.
  • S102 Use the pre-built SVM model to select multiple target pixel points of the support vector from the image to be processed, and calculate the color histogram and visual significance histogram of each target pixel point to form a sample feature model.
  • the SVM model is a classifier trained in advance based on adaptive incremental learning and decrement learning algorithms.
  • the implementation process of adaptive incremental learning and decrement learning algorithms can be described as follows:
  • Because the incremental and decremental learning algorithms lack a selective elimination of training set data, processing time and processing accuracy are strongly affected. If the non-support vectors among the pixel samples are discarded directly during the incremental execution, some previously discarded non-support vectors may nevertheless become support vectors as the subsequent incremental training proceeds. Moreover, directly discarding the non-support vectors of pixel samples in a single training round is very likely to throw away important, effective information, reducing the accuracy of classification.
  • To solve the above problems, adaptive incremental and decremental learning is obtained by introducing a threshold, where the threshold is the maximum number of times a pixel sample in the training set is tolerated to be a non-support vector.
  • each pixel of the image to be processed can be labeled with a sample label in advance.
  • l(x_i) is called the sample label of the pixel sample (x_i, y_i); a natural assignment (reconstructed from the usage below) is l(x_i) = 1 if (x_i, y_i) is a non-support vector and l(x_i) = 0 if it is a support vector.
  • The sample label can be used to determine whether a pixel sample can be discarded.
  • A threshold can be introduced to represent the maximum number of times a sample point in the training set may be a non-support vector.
  • the threshold value needs to consider the balance between training time and training accuracy, which is not limited in this application.
  • the discarded sample set is initialized as T_d ← ∅;
  • when the non-support-vector count of a sample (x_j, y_j) reaches the threshold, T_d ← T_d ∪ {(x_j, y_j)}, and if T_d ≠ ∅ the decremental learning step is executed with T = T − T_d as the training set.
  • thresholds can implement adaptive strategies, thereby improving SVM training accuracy and reducing training time.
  • the process of selecting target pixels of multiple support vectors from the image to be processed using the SVM model may include:
  • a plurality of pixels of the image to be processed form an original sample set, and the original sample set contains pixels with non-support vectors and support vectors, and the weights of the pixels are the same.
  • the original sample set contains all the salient feature pixels in the image to be processed.
  • the importance weight of each pixel is the same. For example, it can be assigned a value of 1/N, and N is the total number of samples in the original sample set.
  • the color feature and visual saliency feature of the target pixel are also extracted.
  • the color feature and the visual saliency feature are used to describe each target sample pixel.
  • the method of generating the sample feature model of each target pixel can be the same as the method of generating the initial target feature model of the initial target area, of course, different methods can also be used.
  • the target pixel extracted in step S102 is a pixel with obvious distinctive features in the entire image to be processed, that is, a pixel with the highest similarity to the target to be segmented.
  • the characteristic distance between each target pixel and the initial target area is used as a measure of the similarity between each target pixel and the target.
  • The initial target area is the area containing the target to be segmented. The smaller the distance between a pixel and the initial target area, the higher their similarity, and the more likely the target pixel belongs to the target to be segmented.
  • the effective salient area of the image to be processed is extracted, and the effective salient area is the area containing the target to be segmented.
  • S104 Calculate the weight of each pixel in the saliency area according to the initial target feature model and each sample feature model, and delete the pixels in the saliency area whose weight does not meet the preset conditions to obtain the final area, which is used as the target area. Extract segmentation.
  • the characteristics of the initial target area and each target pixel are described by the color of the pixel and the visual saliency characteristics of the pixel.
  • The difference between the specific color feature values and the specific visual saliency values of the initial target area and of each target pixel can be used to describe the similarity between the two, and this similarity is used to calculate the weight of each pixel.
  • The weight represents the similarity measure between the pixel and the target to be segmented: the larger the weight, the higher the similarity.
  • The preset condition selects the pixels with larger weights; that is, the pixels whose weights do not meet the preset condition are deleted, for example, pixels whose weight is lower than 80 are discarded.
  • The resulting saliency area is the area with the highest similarity to the target to be segmented, and this area is extracted as the final area, thereby realizing image segmentation.
  • The visual saliency feature and the color feature are used together as the features describing the target. Because visual saliency has high robustness and high anti-interference ability, this not only solves the problem of unstable image segmentation caused by using color features alone, but also effectively solves the detection difficulties caused by target deformation, illumination changes, and similar color distributions of target and background, thereby effectively improving the stability and accuracy of target image segmentation.
  • The methods of calculating the color histogram and the visual saliency histogram in steps S101 and S102 may be implemented as follows. Equation 1 switches the image from the RGB primary color space to the HSV color space (the original equation image is not recoverable from this extraction; it is the standard RGB-to-HSV conversion), in which:
  • h is the H channel value in the HSV color space;
  • L is the arithmetic average of the H channel values;
  • s is the S channel value in the HSV color space;
  • v is the V channel value in the HSV color space;
  • g is the G channel value of the RGB primary colors;
  • b is the B channel value of the RGB primary colors;
  • r is the R channel value of the RGB primary colors;
  • max is the maximum of the corresponding channel values;
  • min is the minimum of the corresponding channel values.
  • A color histogram containing m histogram bins is obtained by calculating the frequency with which the color vectors appear in each sub-area. To fuse spatial information, a kernel function k(r) is applied (Equation 2, reconstructed from the definitions below as a Gaussian):
  • k(r) = a′ · exp(−(r − b′)² / (2c′²))
  • a′ is the height of the resulting curve;
  • b′ is the center of the curve on the x-axis;
  • c′ is the width (related to the full width at half maximum);
  • r is the number of categories of the kernel function K.
  • The color distribution model of the area centered at pixel x is then (Equation 3, reconstructed as the standard kernel-weighted color histogram that these definitions describe): q̂_u(x) = C · Σ_{i=1..N} k(‖(x − x_i)/a‖²) · δ(b(x_i) − u), where:
  • x_i is the i-th pixel in the initial target area;
  • C is the normalization factor;
  • k is the kernel function;
  • u is the histogram bin index (the "weak limit" in the original wording);
  • a is the size of the initial target area;
  • N is the total number of pixels in the initial target area;
  • b(x_i) assigns the color feature of the i-th pixel to the corresponding bin of the color histogram;
  • δ(·) is the Dirac function.
  • Color is very sensitive to changes in lighting conditions.
  • When the color range of the detection target is close to the color range of the background, simply using the color feature as the target feature expression model rarely achieves an ideal detection result.
  • In view of this, the fusion of visual saliency and color features can be used as the representation model of the target to be detected.
  • The visual saliency measure is generated from the HSV color features of the image to be processed. Compared with pure color features, visual saliency has high robustness and high anti-interference ability.
  • The specific process of the visual saliency calculation method is shown in FIG. 2. The saliency feature is expressed as M(i,j) = S(i,j,t) − S(i,j,t−1) (Equation 4), where:
  • S(i,j,t) represents the value of the pixel (i,j) at time t;
  • S(i,j,t−1) represents the value of the pixel (i,j) at time t−1.
  • the visual saliency feature is a feature resulting from the visual change between the image area and the background environment. The more obvious the change, the greater the visual saliency value.
  • First, the image features are transformed from the spatial domain to the frequency domain, yielding the two features of the image: the amplitude spectrum |F(u,v)| and the phase spectrum φ(u,v) (Equation 5, the standard 2-D discrete Fourier transform: F(u,v) = Σ_{i=0..M−1} Σ_{j=0..N−1} f(i,j) · e^{−j2π(ui/M + vj/N)}).
  • f(i,j) is the specific feature value of the pixel (i,j), and M×N is the scale of the image to be processed.
  • the image phase spectrum and image amplitude spectrum contain the specific information of the image.
  • the characteristics of the image amplitude spectrum represent the amount of information change at each frequency point in the image, and the characteristics of the image phase spectrum represent specific information of the location of the information change.
  • The image phase spectrum features are used for image restoration; the pixel positions with large visual saliency values output by this restoration correspond to the positions where the feature values of the original image change greatly, and these positions are the visually salient regions. Therefore, the original image is reconstructed using only the phase spectrum feature, and the recovered image after the inverse Fourier transform (IDFT) reflects the visual saliency map of each part of the image, as shown in Equation 6 (reconstructed from this description): V(x,y) = ‖IDFT[e^{jφ(u,v)}]‖².
  • The size of the image to be processed is M*N;
  • x and y are the horizontal and vertical coordinates of a pixel;
  • φ(u,v) is the phase spectrum obtained after the fast Fourier transform of the image to be processed.
  • the visual saliency maps of the HSV color features can be fused using the following formula 7 to obtain the final visual saliency histogram V:
  • V = w_H × V_H + w_S × V_S + w_V × V_V; (7)
  • V H , V S , and V V are the corresponding visual saliency maps obtained by calculating the visual saliency of the HSV color space features
  • w H , w S , and w V are the feature weights corresponding to the color space features, respectively.
  • the feature weights can each take their arithmetic average to represent average feature fusion.
  • the comprehensive visual saliency map is a grayscale image with the same scale as the image to be processed.
  • the value of each pixel indicates the value of the visual saliency of the pixel at the corresponding position in the image to be processed.
  • Considering that the image to be processed contains noise that affects the precision of subsequent detection, the present application can also apply an autoregressive model (Equation 8: X_t = A·X_{t−1} + v_{t−1}) to denoise the image:
  • v t-1 is the process noise signal
  • A is the state transfer function
  • t is the time
  • X t is the pixel.
  • an implementation of S103 may be implemented according to the following method:
  • The multiple target pixels are divided into a first pixel set and a second pixel set, where the feature distance corresponding to each target pixel in the first pixel set is smaller than the feature distance corresponding to each target pixel in the second pixel set;
  • Each target pixel in the first pixel set is used to replace the target pixel in the second pixel set to obtain a saliency area. That is, select the visually significant area with the smallest distance to replace the support vector with a larger distance, and then perform the support vector update operation.
  • For each target pixel, the characteristic distance d_F(x_1i, x_2) between the target pixel and the initial target area can be calculated using Equation 9 (reconstructed as the standard kernel-induced feature-space distance): d_F(x_1i, x_2) = √( K(x_1i, x_1i) − 2K(x_1i, x_2) + K(x_2, x_2) )
  • x_1i is the i-th target pixel;
  • x_2 is the initial target area;
  • K(x_1i, x_1i) is the Euclidean distance between the target pixel x_1i and itself;
  • K(x_1i, x_2) is the Euclidean distance between the target pixel x_1i and the center of the initial target area;
  • K(x_2, x_2) is the Euclidean distance between the center of the initial target area and itself.
  • The weight of each pixel in the saliency area can be calculated using Equation 10 (the original equation image is not recoverable from this extraction; it combines the color similarity and saliency similarity of each pixel with the prior probability density and an adjustment parameter λ). The weight calculation relies on a particle-filter formulation of the system state:
  • x_t represents the system state vector at time t, composed of the specific position, size and acceleration of the target to be detected;
  • z_t represents the observation of the system state at time t;
  • f(x) and h(x) are the system state transition function and the system state observation function;
  • v_{t−1} and n_t are the system state noise signal and the observation noise signal.
  • The filtering process can be divided into a prediction step and an update step. The prediction step refers to the case where the observation z_t of the system state at time t has not yet been obtained: the posterior probability density P(x_{t−1}|z_{t−1}) at time t−1 is propagated to obtain the prior probability density at time t, as shown in Equation 12: P(x_t|z_{t−1}) = ∫ P(x_t|x_{t−1}) P(x_{t−1}|z_{t−1}) dx_{t−1}.
  • The function of the update step is to correct, that is, to apply the latest observation z_t of the system state at time t together with the previously obtained prior probability density P(x_t|z_{t−1}) to obtain the posterior probability density P(x_t|z_t) at time t (Equation 13, reconstructed as the standard Bayesian update): P(x_t|z_t) = P(z_t|x_t) P(x_t|z_{t−1}) / P(z_t|z_{t−1}).
  • Let {x_{0:t}^j, w_t^j}, j = 1..N (N represents the number of particles) be the sample set and sample weights obtained by sampling from the posterior probability density P(x_{0:t}|z_{1:t}).
  • The posterior probability density at time t can then be approximated with the discrete weighting formula of Equation 14: P(x_{0:t}|z_{1:t}) ≈ Σ_{j=1..N} w_t^j · δ(x_{0:t} − x_{0:t}^j).
  • In Equation 14, δ(·) is the Dirac function (unit impulse function).
  • The filtered estimate of the system state x_t at time t is shown in Equation 15: x̂_t = Σ_{j=1..N} w_t^j · x_t^j.
  • Referring to FIG. 3, another embodiment may include:
  • S301: Locate the initial target area in the image to be processed according to the preset initial target area setting conditions, calculate the color histogram and visual saliency histogram of the initial target area, and form an initial target feature model.
  • S302 De-noise the image to be processed using an autoregressive model.
  • S303: Collect a plurality of pixels of the image to be processed to form an original sample set, each pixel of the image to be processed having been marked with a sample label in advance.
  • S304: Based on the adaptive incremental and decremental learning algorithm, count, during the SVM training of the original sample set, the number of times each pixel is a non-support vector.
  • S305 Delete the pixel points of which the number of non-support vectors exceeds the preset threshold to obtain a reduced sample set, select multiple target pixels of the support vector from the reduced sample set, and calculate the color histogram and visual significance histogram of each target pixel, Constitute a sample feature model.
  • S306: Calculate the characteristic distance between each target pixel and the initial target area, and divide the multiple target pixels into a first pixel set and a second pixel set.
  • S307: Replace the target pixels in the second pixel set with the target pixels in the first pixel set to obtain the saliency area.
  • S308: Calculate the weight of each pixel in the saliency area according to the initial target feature model and each sample feature model.
  • S309: Delete from the saliency area the pixels whose weights do not meet the preset condition to obtain the final area, which is extracted and segmented as the target area.
  • As can be seen from the above, the embodiments of the present invention can avoid the instability of target image segmentation caused by using a single color feature.
  • The target can be detected correctly under large changes in target posture, illumination and shape, and under occlusion.
  • By applying the color features and the visual saliency features through a reasonable expression model, a more robust target image segmentation method is achieved.
  • AVLSVM adaptive incremental and decrement learning algorithm
  • Below, On-line is used directly to denote the online incremental learning algorithm proposed in the related art, and AVLSVM is used to denote the adaptive incremental and decremental learning algorithm; the relevant numerical experiments are performed for the linear and non-linear cases respectively.
  • The penalty parameter C is the optimal value selected, through the training process, on a tuning set drawn from the training set.
  • AVLSVM is better than On-line in the classification success rate, and significantly better than On-line in CPU execution time.
  • the execution time of AVLSVM is 1.85 seconds.
  • the execution time of On-line is 15.4012 seconds.
  • For the non-linear case, the radial basis kernel function K(x,y) = exp(−p‖x−y‖²) is used.
  • The numerical experiment results for the non-linear case are shown in Table 2, where p is the kernel function parameter. From the experimental results shown in Table 2 it can be concluded that AVLSVM's CPU execution time is significantly shorter than On-line's, and AVLSVM's training accuracy and test accuracy are significantly higher than those of the On-line method.
  • This part of the experiment performs video sequence image saliency calculation on the video data set published by Itti.
  • This data set includes day and night video, indoor and outdoor video, sports video, news video and other situation videos.
  • For convenient comparison of effects, shape annotations can first be drawn on the original input image, so that the quality of the model results is easy to judge on the visual saliency map.
  • the visual saliency map obtained by integrating the AVLSVM method can better reflect the characteristics of the original image.
  • the visual saliency map with good effect can play a very good auxiliary role in the subsequent target image segmentation process.
  • the target image segmentation experiment was selected on the face tracking test video released by Stan Birchfield of Stanford University.
  • The experimental software environment uses Matlab simulation to implement target image segmentation, focusing on testing the robustness of the algorithm, including the detection results under light intensity changes, target shape changes and target occlusion, and comparing the execution results with target image segmentation algorithms that use a single feature.
  • the first group of experiments is the segmentation of the face target image based on the video image sequence (128 ⁇ 96) of the occlusion situation (the video file name is movie_cubicle, a total of 95 frames of video images).
  • the three algorithms compared in the experiment are the target image segmentation algorithm considering color features, the target image segmentation algorithm using visual saliency features, and the target image segmentation algorithm using AVLSVM and overall visual saliency features.
  • The target image segmentation result maps are frames 1, 16, 21, 34, 51, 62, 70 and 88, respectively.
  • the target in the video image sequence will be blocked during the movement, and the color of the blocked object is similar to the color of the target to be detected.
  • For the target image segmentation method that considers only color features, the segmentation effect is poor because the background color distribution is similar to the target color. This application fuses AVLSVM with the overall visual saliency features: even under occlusion, the algorithm can accurately locate the target, reflecting its robustness.
  • the second set of experiments is to perform head target image segmentation on the video image sequence when the target performs rotation, shape change, etc.
  • the video file is movie_mb, 500 frames.
  • The test video has 500 frames in total. During shooting, both the target object and the camera moved significantly, and the shape, size and posture of the target changed noticeably.
  • The shown video frames are frames 1, 55, 78, 95, 115, 160, 195 and 285, respectively. From the experimental results it can be seen that when the target deforms due to back-and-forth movement, its color characteristics do not change significantly and all three algorithms detect it correctly; when the target rotates, the color distribution characteristics of the target change significantly, and segmentation using color features alone as the expression model often fails and loses the target.
  • This method uses AVLSVM to integrate the overall features of visual saliency, which is more robust than the single feature method. After related tests, this method can obtain the ideal head target image segmentation effect when the target rotates, changes in shape, and changes in color.
  • The third group of experiments performs face target image segmentation on a video sequence that includes changes in light intensity, target posture and target occlusion (video file movie_sb, 500 frames); the shown video frames are frames 1, 24, 55, 78, 175, 181, 238 and 342.
  • the video image sequence includes the comprehensive situation of light intensity change, target occlusion, and target posture change.
  • For detection algorithms that consider only a single feature, when the illumination and the target posture change greatly, the target image segmentation results contain corresponding errors. Because this application fuses visual saliency features with color features, it can locate the target accurately and guarantees the correctness and efficiency of the target image segmentation algorithm.
  • Based on the traditional target image segmentation algorithm, this application incorporates visual saliency features and the color feature model together as the feature representation model for target image segmentation to ensure the effectiveness of the algorithm; it does not noticeably increase the space and time complexity of the algorithm, and it preserves real-time performance.
  • In the above three groups of experiments, when the number of support vectors reaches 500, the average time of this application is 24 ms per video frame, while the average time of the traditional color-feature-based target image segmentation algorithm is 19 ms per video frame. Because this application needs to compute the visual saliency, its time consumption increases; to improve the accuracy of the target image segmentation algorithm, this time consumption is unavoidable. When the number of support vectors is reduced to 200, the average time of this application is 15 ms per video frame while maintaining high segmentation accuracy.
  • This application combines the overall visual saliency features and AVLSVM in the target image segmentation process, and expresses the visual saliency features and color features together as the overall target features. According to the results of relevant experiments conducted in the video image sequence, it is known that the present application has achieved excellent results in the effectiveness and real-time performance of target image segmentation.
  • the embodiment of the present invention also provides a corresponding implementation device for the target image segmentation method, which further makes the method more practical.
  • the target image segmentation device provided by the embodiment of the present invention will be described below.
  • the target image segmentation device described below and the target image segmentation method described above can be referred to each other.
  • FIG. 4 is a structural diagram of a target image segmentation device according to an embodiment of the present invention in a specific implementation manner.
  • the device may include:
  • the initial target area positioning module 401 is used to locate the initial target area in the image to be processed according to the preset initial target area setting conditions, calculate the color histogram and the visual saliency histogram of the initial target area, and form an initial target feature model.
  • the sample point selection module 402 is used to select target pixels of multiple support vectors from the to-be-processed image using the pre-built SVM model, calculate the color histogram and visual significance histogram of each target pixel, and form a sample feature model.
  • the saliency area determination module 403 is used to determine a saliency area in the image to be processed based on the calculated characteristic distance between each target pixel point and the initial target area.
  • the target area determination module 404 is used to calculate the weight of each pixel in the saliency area according to the initial target feature model and each sample feature model, and delete the pixels in the saliency area whose weight does not meet the preset conditions to obtain the final area To extract and segment the target area.
  • The sample point selection module 402 may be a module that labels the pixels of the image to be processed with sample labels in advance; collects multiple pixels of the image to be processed to form the original sample set, where the original sample set contains both non-support-vector and support-vector pixels and all pixels have the same weight; counts, during the SVM training of the original sample set, the number of times each pixel is a non-support vector; deletes the pixel samples whose non-support-vector count exceeds the preset threshold to obtain a reduced sample set; and selects the target pixels of multiple support vectors from the reduced sample set.
  • The sample point selection module 402 may also be a module that calculates the characteristic distance d_F(x_1i, x_2) between each target pixel and the initial target area using the formula (reconstructed as above): d_F(x_1i, x_2) = √( K(x_1i, x_1i) − 2K(x_1i, x_2) + K(x_2, x_2) )
  • x_1i is the i-th target pixel;
  • x_2 is the initial target area;
  • K(x_1i, x_1i) is the Euclidean distance between the target pixel x_1i and itself;
  • K(x_1i, x_2) is the Euclidean distance between the target pixel x_1i and the center of the initial target area;
  • K(x_2, x_2) is the Euclidean distance between the center of the initial target area and itself.
  • The saliency area determination module 403 may further be a module that calculates the characteristic distance between each target pixel and the initial target area; divides the multiple target pixels into a first pixel set and a second pixel set, where the feature distance corresponding to each target pixel in the first pixel set is smaller than the feature distance corresponding to each target pixel in the second pixel set; and replaces the target pixels in the second pixel set with the target pixels in the first pixel set to obtain the saliency area.
  • The target area determination module 404 may also be a module that calculates the weight of each pixel in the saliency area using Equation 10 (whose image is not recoverable from this extraction):
  • The initial target area positioning module 401 may be a module that switches the image to be processed to the HSV color space, using the H channel and the S channel as color features and the V channel as the lightness feature, and calculates the color histogram feature of the initial target area using the formula (reconstructed as above): q̂_u(x) = C · Σ_{i=1..N} k(‖(x − x_i)/a‖²) · δ(b(x_i) − u)
  • x_i is the i-th pixel in the initial target area;
  • C is the normalization factor;
  • k is the kernel function;
  • u is the histogram bin index;
  • a is the size of the initial target area;
  • N is the total number of pixels in the initial target area;
  • b(x_i) assigns the color feature of the i-th pixel to the corresponding bin of the color histogram;
  • δ(·) is the Dirac function.
  • The initial target area positioning module 401 may also convert the HSV color space features of the image to be processed from the spatial domain to the frequency domain to obtain the image amplitude spectrum and the image phase spectrum; for each color space feature, the visual saliency value V(i,j) of each pixel in the initial target area is obtained using the phase-spectrum reconstruction V(x,y) = ‖IDFT[e^{jφ(u,v)}]‖² (reconstructed as above):
  • the size of the image to be processed is M*N;
  • x and y are the horizontal and vertical coordinates of a pixel;
  • φ(u,v) is the phase spectrum obtained after the fast Fourier transform of the image to be processed;
  • V = w_H × V_H + w_S × V_S + w_V × V_V;
  • V_H, V_S and V_V are the visual saliency maps obtained by the visual saliency calculation on the corresponding HSV color space features;
  • w_H, w_S and w_V are the feature weights corresponding to the respective color space features.
  • The apparatus may further include a denoising module 405, which denoises the image to be processed using the formula X_t = A·X_{t−1} + v_{t−1}:
  • v t-1 is the process noise signal
  • A is the state transfer function
  • t is the time
  • X t is the pixel.
  • each function module of the target image segmentation device may be specifically implemented according to the method in the above method embodiments, and the specific implementation process may refer to the related description of the above method embodiments, which will not be repeated here.
  • As can be seen from the above, the embodiments of the present invention solve the problem of unstable image segmentation caused by using color features alone in the related art, effectively solve the instability of target image segmentation caused by changes in light intensity, target deformation and similar color distributions, improve the stability and accuracy of target image segmentation, and also improve the efficiency of image segmentation.
  • An embodiment of the present invention also provides a target image segmentation device, which may specifically include:
  • Memory used to store computer programs
  • the processor is configured to execute a computer program to implement the steps of the target image segmentation method described in any one of the above embodiments.
  • each function module of the target image segmentation device may be specifically implemented according to the method in the above method embodiments.
  • specific implementation process reference may be made to the related description of the above method embodiments, and details are not described herein again.
  • As can be seen from the above, the embodiments of the present invention solve the problem of unstable image segmentation caused by using color features alone in the related art, effectively solve the instability of target image segmentation caused by changes in light intensity, target deformation and similar color distributions, improve the stability and accuracy of target image segmentation, and also improve the efficiency of image segmentation.
  • An embodiment of the present invention also provides a computer-readable storage medium that stores a target image segmentation program.
  • the target image segmentation program is executed by a processor, the steps of the target image segmentation method described in any one of the above embodiments are performed.
  • As can be seen from the above, the embodiments of the present invention solve the problem of unstable image segmentation caused by using color features alone in the related art, effectively solve the instability of target image segmentation caused by changes in light intensity, target deformation and similar color distributions, improve the stability and accuracy of target image segmentation, and also improve the efficiency of image segmentation.
  • The software modules may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The embodiments of the present invention disclose a target image segmentation method, device and equipment. The method includes: locating, in the image to be processed, the initial area where the target is located according to initial target area setting conditions, and using the color histogram and visual saliency histogram of the initial target area as the target features; selecting the target pixels of multiple support vectors from the image to be processed using a pre-built SVM model, the sample features of each target pixel consisting of its own color histogram and visual saliency histogram; determining a saliency area in the image to be processed based on the calculated characteristic distance between each target pixel and the initial target area; and calculating, according to the target features and each sample feature, the weight of each pixel in the saliency area, and deleting the pixels in the saliency area whose weights do not meet the preset condition to obtain the final area as the target area, which is used for image extraction and segmentation. The present application improves the stability and accuracy of target image segmentation.

Description

Target image segmentation method, device and equipment

Technical field

The embodiments of the present invention relate to the technical field of image processing, and in particular to a target image segmentation method, device and equipment.
Background

With the rapid development of vision technology, image segmentation based on image visual saliency is more and more widely applied in the fields of target recognition and tracking. An important problem faced by target tracking and detection is the unpredictability of machine vision conditions, which causes deviations in the subsequent states of the expected target; various factors such as changes in light intensity, changes in target shape and size, complex changing backgrounds, and unforeseen object occlusions all affect the system robustness of target image segmentation methods.

At present, the related art usually uses color and edge contours to represent the features of the target. Although color features are very effective against changes in light intensity, they perform poorly against noise signals and occluded parts, and when the foreground color of the target subject is very similar to the background color, the target cannot be effectively distinguished from the background, resulting in unstable image segmentation.
Summary of the invention

The embodiments of the present disclosure provide a target image segmentation method, device and equipment, which solve the problem of unstable image segmentation caused by using color features alone in the related art, effectively solve the instability of target image segmentation caused by changes in light intensity, target deformation and similar color distributions, improve the stability and accuracy of target image segmentation, and also improve the efficiency of image segmentation.

To solve the above technical problems, the embodiments of the present invention provide the following technical solutions:

One aspect of the embodiments of the present invention provides a target image segmentation method, including:

locating the initial target area in the image to be processed according to preset initial target area setting conditions, and calculating the color histogram and visual saliency histogram of the initial target area to form an initial target feature model;

selecting the target pixels of multiple support vectors from the image to be processed using a pre-built SVM model, and calculating the color histogram and visual saliency histogram of each target pixel to form a sample feature model;

determining a saliency area in the image to be processed based on the calculated characteristic distance between each target pixel and the initial target area;

calculating, according to the initial target feature model and each sample feature model, the weight of each pixel in the saliency area, and deleting the pixels in the saliency area whose weights do not meet the preset condition to obtain the final area, which is extracted and segmented as the target area.
Optionally, selecting the target pixels of multiple support vectors from the image to be processed using the pre-built SVM model includes:

labeling each pixel of the image to be processed with a sample label in advance;

collecting multiple pixels of the image to be processed to form an original sample set, where the original sample set contains both non-support-vector and support-vector pixels and all pixels have the same weight;

counting, during the SVM training of the original sample set, the number of times each pixel is a non-support vector;

deleting the pixels whose non-support-vector count exceeds a preset threshold to obtain a reduced sample set;

selecting the target pixels of multiple support vectors from the reduced sample set.
Optionally, determining the saliency area in the image to be processed based on the calculated characteristic distance between each target pixel and the initial target area includes:

calculating the characteristic distance between each target pixel and the initial target area;

dividing the multiple target pixels into a first pixel set and a second pixel set, where the characteristic distance corresponding to each target pixel in the first pixel set is smaller than the characteristic distance corresponding to each target pixel in the second pixel set;

replacing the target pixels in the second pixel set with the target pixels in the first pixel set to obtain the saliency area.
Optionally, the characteristic distance between each target pixel and the initial target area is calculated as follows:

the characteristic distance d_F(x_1i, x_2) between each target pixel and the initial target area is calculated using the following formula (the original equation image is reconstructed here as the standard kernel-induced feature-space distance):

d_F(x_1i, x_2) = √( K(x_1i, x_1i) − 2K(x_1i, x_2) + K(x_2, x_2) )

where x_1i is the i-th target pixel, x_2 is the initial target area, K(x_1i, x_1i) is the Euclidean distance between the target pixel x_1i and itself, K(x_1i, x_2) is the Euclidean distance between the target pixel x_1i and the center of the initial target area, and K(x_2, x_2) is the Euclidean distance between the center of the initial target area and itself.
Optionally, calculating the weight of each pixel in the saliency area according to the initial target feature model and each sample feature model includes:

calculating the weight w_i of each pixel in the saliency area using Equation 10 (the original equation images are not recoverable from this extraction; the notation below is introduced for the quantities they denote), in which:

ρ_i^c is the similarity between the color of the i-th pixel and the color of the initial target area; ρ_i^v is the similarity between the visual saliency of the i-th pixel and the visual saliency of the initial target area; ρ̄^c is the arithmetic mean of all ρ_i^c; ρ̄^v is the arithmetic mean of all ρ_i^v; P(x_t|z_{t−1}) is the prior probability density at time t; and λ is an adjustment parameter.
Optionally, calculating the color histogram of the initial target area includes:

switching the image to be processed to the HSV color space, using the H channel and the S channel as color features and the V channel as the lightness feature;

calculating the color histogram feature of the initial target area using the following formula (the original equation image is reconstructed here as the standard kernel-weighted color histogram that the definitions describe):

q̂_u(x) = C · Σ_{i=1..N} k(‖(x − x_i)/a‖²) · δ(b(x_i) − u)

where x_i is the i-th pixel in the initial target area, q̂_u(x) is the color distribution model of the area centered at pixel x, C is the normalization factor, k is the kernel function, u is the histogram bin index, a is the size of the initial target area, N is the total number of pixels in the initial target area, b(x_i) assigns the color feature of the i-th pixel to the corresponding bin of the color histogram, and δ(·) is the Dirac function.
Optionally, calculating the visual saliency histogram of the initial target area includes:

converting the HSV color space features of the image to be processed from the spatial domain to the frequency domain to obtain the image amplitude spectrum and image phase spectrum of the image to be processed;

for each color space feature, obtaining the visual saliency value V(i,j) of each pixel of the initial target area using the following formula (reconstructed from the phase-spectrum description in the detailed description): V(x,y) = ‖IDFT[e^{jφ(u,v)}]‖²

where the size of the image to be processed is M*N, x and y are the horizontal and vertical coordinates of a pixel, and φ(u,v) is the phase spectrum obtained after the fast Fourier transform of the image to be processed;

fusing the visual saliency maps of the HSV color features using the following formula to obtain the final visual saliency histogram V:

V = w_H × V_H + w_S × V_S + w_V × V_V

where V_H, V_S and V_V are the visual saliency maps obtained by the visual saliency calculation on the corresponding HSV color space features, and w_H, w_S and w_V are the feature weights corresponding to the respective color space features.
Optionally, before selecting the target pixels of multiple support vectors from the image to be processed using the pre-built SVM model, the method further includes:

denoising the image to be processed using the following formula:

X_t = A·X_{t−1} + v_{t−1}

where v_{t−1} is the process noise signal, A is the state transition function, t is the time, and X_t is the pixel.
Another aspect of the embodiments of the present invention provides a target image segmentation device, including:

an initial target area positioning module, configured to locate the initial target area in the image to be processed according to preset initial target area setting conditions, and to calculate the color histogram and visual saliency histogram of the initial target area to form an initial target feature model;

a sample point selection module, configured to select the target pixels of multiple support vectors from the image to be processed using a pre-built SVM model, and to calculate the color histogram and visual saliency histogram of each target pixel to form a sample feature model;

a saliency area determination module, configured to determine a saliency area in the image to be processed based on the calculated characteristic distance between each target pixel and the initial target area;

a target area determination module, configured to calculate, according to the initial target feature model and each sample feature model, the weight of each pixel in the saliency area, and to delete the pixels in the saliency area whose weights do not meet the preset condition to obtain the final area, which is extracted and segmented as the target area.
An embodiment of the present invention further provides target image segmentation equipment, including a processor, where the processor is configured to implement the steps of the target image segmentation method according to any one of the preceding items when executing a computer program stored in a memory.

An embodiment of the present invention finally provides a computer-readable storage medium on which a target image segmentation program is stored; when the target image segmentation program is executed by a processor, the steps of the target image segmentation method according to any one of the preceding items are implemented.
The advantage of the technical solution provided by this application is that the visual saliency feature and the color feature are used together as the features describing the target. Because visual saliency has high robustness and high anti-interference ability, this not only solves the problem of unstable image segmentation caused by using color features alone, but also effectively solves the detection difficulties caused by target deformation, illumination changes, and similar color distributions of target and background, thereby effectively improving the stability and accuracy of target image segmentation. In addition, the SVM model is used to select support vectors from the image to be processed, and the effective pixels with high similarity to the segmentation target area are selected from the support vectors to determine the final target area, which not only improves the efficiency of target image segmentation but also further improves its accuracy.

In addition, the embodiments of the present invention also provide a corresponding implementation device and equipment for the target image segmentation method, making the method even more practical; the device and equipment have corresponding advantages.

It should be understood that the above general description and the following detailed description are merely exemplary and do not limit the present disclosure.
Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or the related art more clearly, the drawings needed in the description of the embodiments or the related art are briefly introduced below. Obviously, the drawings described below are merely some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of a target image segmentation method according to an embodiment of the present invention;

FIG. 2 is a schematic flowchart of a visual saliency calculation method according to an embodiment of the present invention;

FIG. 3 is a schematic flowchart of another target image segmentation method according to an embodiment of the present invention;

FIG. 4 is a structural diagram of a specific implementation of a target image segmentation device according to an embodiment of the present invention;

FIG. 5 is a structural diagram of another specific implementation of a target image segmentation device according to an embodiment of the present invention.
Detailed description

To enable those skilled in the art to better understand the solutions of the present invention, the present invention is further described in detail below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort fall within the protection scope of the present invention.

The terms "first", "second", "third", "fourth", etc. in the specification, claims and drawings of the present application are used to distinguish different objects, not to describe a specific order. Furthermore, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device comprising a series of steps or units is not limited to the listed steps or units, but may include steps or units that are not listed.

Having introduced the technical solutions of the embodiments of the present invention, the various non-limiting implementations of the present application are described in detail below.
Referring first to FIG. 1, FIG. 1 is a schematic flowchart of a target image segmentation method according to an embodiment of the present invention, which may include the following:

S101: Locate the initial target area in the image to be processed according to preset initial target area setting conditions, and calculate the color histogram and visual saliency histogram of the initial target area to form an initial target feature model.

The initial target area setting condition is a condition, set in advance according to the position of the target to be segmented in the image to be processed and its own parameter information, that can locate a rough area in the image to be processed. For example, the initial target area setting condition may be X_0 = (x_0, y_0, h_x, h_y), where (x_0, y_0) is the coordinate position of the target center and (h_x, h_y) is the width and height of the target area. According to this condition, the target center can be located in the image to be processed first, and the extent of the area can then be determined from the height and width, so that the initial target area is detected in the image to be processed.

After the initial target area is obtained, the color features and visual saliency features of the initial target area are extracted, and the initial target area is described jointly by its color features and visual saliency features; that is, the initial target feature model includes the color features and visual saliency features of the initial target area.
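As a minimal illustration of this localization step (not part of the original disclosure; it only assumes the center/size convention X_0 = (x_0, y_0, h_x, h_y) described above, and the function and variable names are ours), the initial target area can be cropped as follows:

```python
import numpy as np

def locate_initial_target(image: np.ndarray, x0: int, y0: int,
                          hx: int, hy: int) -> np.ndarray:
    """Crop the initial target area from an H x W (x C) image.

    (x0, y0) is the target center and (hx, hy) the area's width/height,
    following the X_0 = (x_0, y_0, h_x, h_y) convention in the text.
    """
    h, w = image.shape[:2]
    left = max(0, x0 - hx // 2)
    right = min(w, x0 + hx // 2)
    top = max(0, y0 - hy // 2)
    bottom = min(h, y0 + hy // 2)
    return image[top:bottom, left:right]

# Usage: a 96 x 128 frame with a hypothetical target centered at (64, 48).
frame = np.zeros((96, 128, 3), dtype=np.uint8)
initial_area = locate_initial_target(frame, x0=64, y0=48, hx=30, hy=40)
print(initial_area.shape)  # (40, 30, 3)
```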
S102: Select the target pixels of multiple support vectors from the image to be processed using the pre-built SVM model, and calculate the color histogram and visual saliency histogram of each target pixel to form a sample feature model.

The SVM model is a classifier trained in advance with the adaptive incremental and decremental learning algorithm, whose implementation can be described as follows:

Because the incremental and decremental learning algorithms lack a selective elimination of training set data, processing time and processing accuracy are strongly affected. If the non-support vectors among the pixel samples are discarded directly during the incremental execution, some previously discarded non-support vectors may nevertheless become support vectors as the subsequent incremental training proceeds. Moreover, directly discarding the non-support vectors of pixel samples in a single training round is very likely to throw away important, effective information, reducing classification accuracy. To solve these problems, adaptive incremental and decremental learning is achieved by introducing a threshold, where the threshold is the maximum number of times a pixel sample in the training set is tolerated to be a non-support vector.

To implement the adaptive algorithm, each pixel of the image to be processed can be labeled with a sample label in advance. For each pixel sample (x_i, y_i), l(x_i) is called the sample label of the pixel sample (x_i, y_i); the assignment condition can be (the original equation image is reconstructed from the usage below):

l(x_i) = 1 if (x_i, y_i) is a non-support vector; l(x_i) = 0 otherwise.

The sample label is the main basis for determining whether a pixel sample can be discarded. In the incremental learning of the support vector machine, r(x_i) can be used to store the number of occurrences; that is, when l(x_i) = 1, r(x_i) is incremented by 1.

As the training process deepens, some samples produce a "sample oscillation" problem. To solve this problem, a threshold can be introduced to represent the maximum number of times a sample point in the training set may be a non-support vector. The value of the threshold needs to balance training time against training accuracy, which this application does not limit in any way.

For example, for a sample (x_i, y_i), once r(x_i) reaches the preset threshold l (for example, 3 times), that is, r(x_i) ≥ l, the pixel sample (x_i, y_i) can be discarded from the entire training set. This method reduces the influence of "oscillating" samples on the classifier.

After the threshold l is determined, taking the pixel sample (x_i, y_i) as an example, the specific number of times r(x_i) the pixel sample is a non-support vector is recorded during SVM training; once r(x_i) reaches the predetermined threshold l, the pixel sample (x_i, y_i) is discarded from the training set. The specific steps of the adaptive incremental and decremental learning algorithm are as follows:

Initialization: set the specific value of the threshold l, and take pixels from the sample image as sample points until both support-vector and non-support-vector pixel samples have appeared; let the currently obtained pixel sample set be T. Execute the training process with T as the training set to obtain the sample labels l(x_i), and let r(x_i) = l(x_i), i = 1, ..., |T|.

Let the current pixel sample set be T with |T| = m. A newly obtained pixel sample is expressed as (x_{m+1}, y_{m+1}); let r(x_{m+1}) = 0 and execute retraining with T = T ∪ {(x_{m+1}, y_{m+1})} as the training set.

Let the discarded sample set be T_d, initialized as T_d ← ∅.

Traverse each pixel sample in the current pixel sample set in turn:

if l(x_j) = 1, that is, x_j is a non-support vector, then r(x_j) ← r(x_j) + 1;

if l(x_j) ≠ 1, r(x_j) remains unchanged.

Judge whether r(x_j) ≥ l holds; if it does, then T_d ← T_d ∪ {(x_j, y_j)}; otherwise keep the pixel sample.

If T_d ≠ ∅, execute the decremental learning process with T = T − T_d as the training set.

When new samples exist, continue the incremental and decremental learning processes according to the above procedure.

During SVM training, the introduction of the threshold implements an adaptive strategy, improving SVM training accuracy and reducing training time.
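The threshold-based bookkeeping described above can be sketched as follows. This is a simplified illustration, not the patent's implementation: the SVM fitting itself is delegated to scikit-learn's SVC (the patent describes true incremental training), and retraining on the full remaining set in each round stands in for the incremental updates.

```python
import numpy as np
from sklearn.svm import SVC

def avlsvm_prune(X: np.ndarray, y: np.ndarray, l: int = 4,
                 rounds: int = 5, C: float = 1.0):
    """Adaptive discard bookkeeping: a sample is dropped once it has
    been a non-support vector in l (re)training rounds, mimicking the
    threshold rule r(x_i) >= l described in the text."""
    r = np.zeros(len(X), dtype=int)   # non-support-vector counters r(x_i)
    keep = np.arange(len(X))          # indices still in the training set T
    clf = None
    for _ in range(rounds):
        clf = SVC(kernel="rbf", C=C).fit(X[keep], y[keep])
        sv = set(keep[clf.support_])  # current support vectors
        for idx in keep:              # l(x_j) = 1 for non-support vectors
            if idx not in sv:
                r[idx] += 1
        discard = {idx for idx in keep if r[idx] >= l}   # T_d
        if discard:                   # decremental step: T = T - T_d
            keep = np.array([idx for idx in keep if idx not in discard])
    return clf, keep

# Usage on a toy two-class problem.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
model, kept = avlsvm_prune(X, y)
print(f"{len(kept)} of {len(X)} samples kept as the reduced set")
```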
The process of selecting the target pixels of multiple support vectors from the image to be processed using the SVM model may include:

collecting multiple pixels of the image to be processed to form an original sample set, where the original sample set contains both non-support-vector and support-vector pixels and all pixels have the same weight. The original sample set contains all pixels with salient features in the image to be processed, and the importance weight of every pixel is the same; for example, it can be assigned the value 1/N, where N is the total number of samples in the original sample set.

During the SVM training of the original sample set, the number of times each pixel is a non-support vector is counted, and the pixels whose non-support-vector count exceeds the preset threshold are deleted to obtain a reduced sample set; the target pixels of multiple support vectors are selected from the reduced sample set.

When the features of each target pixel are extracted, the color features and visual saliency features of the target pixel are likewise extracted, and each target sample pixel is described jointly by its color features and visual saliency features.

The method of generating the sample feature model of each target pixel can be the same as the method of generating the initial target feature model of the initial target area; of course, a different method can also be used.
S103: Determine a saliency area in the image to be processed based on the calculated characteristic distance between each target pixel and the initial target area.

The target pixels extracted in step S102 are the pixels with obvious salient features in the entire image to be processed, that is, the pixels with the highest similarity to the target to be segmented.

The characteristic distance between each target pixel and the initial target area serves as the measure of similarity between each target pixel and the target. The initial target area is the area containing the target to be segmented; the smaller the distance between a pixel and the initial target area, the higher their similarity, and the more likely the target pixel belongs to the target to be segmented.

Based on the characteristic distances between the target pixels and the initial target area, the effective salient area of the image to be processed is extracted; the effective salient area is the area containing the target to be segmented.
S104: Calculate, according to the initial target feature model and each sample feature model, the weight of each pixel in the saliency area, and delete the pixels in the saliency area whose weights do not meet the preset condition to obtain the final area, which is extracted and segmented as the target area.

The features of the initial target area and of each target pixel are described by pixel color and pixel visual saliency. The difference between the specific color feature values and the specific visual saliency values of the initial target area and of each target pixel can be used to describe the similarity between the two, and this similarity is used to calculate the weight of each pixel. The weight represents the similarity measure between the pixel and the target to be segmented: the larger the weight, the higher the similarity.

To locate the area where the target lies accurately and precisely, the pixels with smaller weights can be discarded. The preset condition is exactly the rule for selecting the pixels with larger weights, that is, the pixels whose weights do not meet the preset condition are deleted; for example, pixels whose weight is lower than 80 are discarded.

After the low-weight pixels are deleted from the saliency area and the pixels of the saliency area are updated, the resulting saliency area is the area with the highest similarity to the target to be segmented; this area is extracted as the final area, thereby realizing image segmentation.
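Since the exact weight formula (Equation 10) is not recoverable from this extraction, the following sketch only illustrates the pruning rule of S104 on hypothetical, already-computed weights, using the "discard pixels whose weight is lower than 80" example from the text:

```python
import numpy as np

def prune_saliency_area(coords: np.ndarray, weights: np.ndarray,
                        threshold: float = 80.0) -> np.ndarray:
    """Keep only the saliency-area pixels whose weight meets the preset
    condition (here: weight >= threshold)."""
    return coords[weights >= threshold]

# Usage: 5 pixel coordinates with illustrative weights.
coords = np.array([[10, 12], [11, 12], [40, 3], [41, 3], [12, 13]])
weights = np.array([95.0, 88.0, 42.0, 55.0, 81.0])
final_area = prune_saliency_area(coords, weights)
print(final_area)  # the three pixels with weight >= 80 remain
```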
In the technical solution provided by the embodiments of the present invention, the visual saliency feature and the color feature are used together as the features describing the target. Because visual saliency has high robustness and high anti-interference ability, this not only solves the problem of unstable image segmentation caused by using color features alone, but also effectively solves the detection difficulties caused by target deformation, illumination changes, and similar color distributions of target and background, thereby effectively improving the stability and accuracy of target image segmentation. In addition, the SVM model is used to select support vectors from the image to be processed, and the effective pixels with high similarity to the segmentation target area are selected from the support vectors to determine the final target area, which not only improves the efficiency of target image segmentation but also further improves its accuracy.
In a specific implementation, the methods of calculating the color histogram and visual saliency histogram in steps S101 and S102 can be implemented as follows:

In the adaptive support vector machine, a variety of related features can be used to express the target to be detected. For color feature extraction, color features are very widely applied because color is not very sensitive to noise signals and occluded parts and the calculation process is relatively simple. The HSV color space is similar to the characteristics of the human visual system, and Equation 1 is used to switch the processed image from the RGB primary color space to the HSV color space (the original equation image is not recoverable from this extraction; it is the standard RGB-to-HSV conversion), where:

h is the H channel value in the HSV color space, L is the arithmetic average of the H channel values, s is the S channel value in the HSV color space, v is the V channel value in the HSV color space, g is the G channel value of the RGB primary colors, b is the B channel value of the RGB primary colors, r is the R channel value of the RGB primary colors, max is the maximum of the corresponding channel values, and min is the minimum of the corresponding channel values.

A color histogram is applied as the color model of the target area. Assuming the color space is divided into m sub-areas, the color histogram containing m histogram bins is obtained by calculating the frequency with which the color vectors appear in each sub-area. Considering the influence of a pixel's position in the target area on the color distribution, a kernel function k(r) can be added to fuse the spatial information; the specific operation is Equation 2 (reconstructed from the definitions as a Gaussian):

k(r) = a′ · exp(−(r − b′)² / (2c′²))   (2)

where a′ is the height of the resulting curve, b′ is the center of the curve on the x-axis, c′ is the width (related to the full width at half maximum), and r is the number of categories of the kernel function K.

Let q̂_u(x) denote the color distribution model of the area centered at pixel x; then (Equation 3, reconstructed as the standard kernel-weighted color histogram):

q̂_u(x) = C · Σ_{i=1..N} k(‖(x − x_i)/a‖²) · δ(b(x_i) − u)   (3)

where x_i is the i-th pixel in the initial target area, C is the normalization factor, k is the kernel function, u is the histogram bin index, a is the size of the initial target area, N is the total number of pixels in the initial target area, b(x_i) assigns the color feature of the i-th pixel to the corresponding bin of the color histogram, and δ(·) is the Dirac function.
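A sketch of Equations 2-3 as reconstructed above (not the patent's implementation): pixels near the region center contribute more to the histogram via a Gaussian spatial kernel, and the bin assignment b(x_i) plays the role of the delta term. The H-channel-only binning and the kernel parameters are simplifying assumptions of this example.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv

def kernel_color_histogram(region_rgb: np.ndarray, m: int = 16) -> np.ndarray:
    """Kernel-weighted H-channel histogram of a target region."""
    hsv = rgb_to_hsv(region_rgb.astype(float) / 255.0)
    h, w = hsv.shape[:2]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = max(h, w) / 2.0                       # region size parameter a
    ys, xs = np.mgrid[0:h, 0:w]
    r2 = ((ys - cy) ** 2 + (xs - cx) ** 2) / a ** 2
    k = np.exp(-r2 / 2.0)                     # Gaussian kernel k(||x_i/a||^2)
    bins = np.minimum((hsv[..., 0] * m).astype(int), m - 1)  # b(x_i)
    hist = np.zeros(m)
    np.add.at(hist, bins.ravel(), k.ravel())  # sum kernel weights per bin
    return hist / hist.sum()                  # normalization factor C

# Usage: a synthetic 40 x 30 reddish region concentrates in bin 0 (hue ~ 0).
region = np.zeros((40, 30, 3), dtype=np.uint8)
region[..., 0] = 200
q = kernel_color_histogram(region, m=16)
print(q.argmax(), round(q.max(), 3))
```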
Color is very sensitive to illumination changes. When the color range of the detection target is close to the color range of the background, simply using color features as the target feature expression model rarely achieves an ideal detection result. In view of this, the fusion of visual saliency and color features can be used as the representation model of the target to be detected.

The visual saliency measure is generated by the action of the HSV color features on the image to be processed. Compared with pure color features, visual saliency has high robustness and high anti-interference ability. The specific process of the visual saliency calculation method is shown in FIG. 2.

Equation 1 is applied to switch the image to be processed from the RGB space to the HSV space, with the H channel and S channel as color features and the V channel as the lightness feature. The saliency feature is expressed as Equation 4:

M(i,j) = S(i,j,t) − S(i,j,t−1);   (4)

where S(i,j,t) represents the value of the pixel (i,j) at time t, and S(i,j,t−1) represents the value of the pixel (i,j) at time t−1.

The visual saliency feature is the feature produced by the visual change between an image area and the background environment: the more obvious the change, the larger the visual saliency value. First the image features are transformed from the spatial domain to the frequency domain, yielding the two features of the image: the amplitude spectrum |F(u,v)| and the phase spectrum φ(u,v), expressed as Equation 5 (the standard 2-D discrete Fourier transform):

F(u,v) = Σ_{i=0..M−1} Σ_{j=0..N−1} f(i,j) · e^{−j2π(ui/M + vj/N)}   (5)

where f(i,j) is the specific feature value of the pixel (i,j), and M×N is the scale of the image to be processed.

The image phase spectrum and image amplitude spectrum contain the specific information of the image. The amplitude spectrum features represent the amount of information change at each frequency point in the image, and the phase spectrum features represent the specific information about the positions where the information changes. The visual saliency is computed for every pixel of the image, and the positions with obvious visual saliency are sought. Using the phase spectrum features for image restoration, the output pixel positions with large visual saliency values correspond to the positions in the original image where the feature values change greatly, and these positions are the visually salient regions. Therefore, the original image is constructed using only the phase spectrum feature, and the recovered image after the inverse Fourier transform (IDFT) reflects the visual saliency map of each part of the image, as shown in Equation 6 (reconstructed from this description):

V(x,y) = ‖IDFT[e^{jφ(u,v)}]‖²   (6)

where the size of the image to be processed is M*N, x and y are the horizontal and vertical coordinates of a pixel, and φ(u,v) is the phase spectrum obtained after the fast Fourier transform of the image to be processed.

After the saliency maps of the HSV color space features are obtained, the visual saliency maps of the HSV color features can be fused using the following Equation 7 to obtain the final visual saliency histogram V:

V = w_H × V_H + w_S × V_S + w_V × V_V;   (7)

where V_H, V_S and V_V are the visual saliency maps obtained by the visual saliency calculation on the corresponding HSV color space features, and w_H, w_S and w_V are the feature weights corresponding to the respective color space features. Optionally, the feature weights can each take the arithmetic average to represent averaged feature fusion.

The comprehensive visual saliency map is a grayscale image with exactly the same scale as the image to be processed; the value of each pixel indicates the visual saliency value of the pixel at the corresponding position in the image to be processed.
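A minimal sketch of the phase-spectrum saliency of Equations 5-7 as reconstructed above. The Gaussian smoothing and the equal fusion weights w_H = w_S = w_V = 1/3 are assumptions of this example (the patent only states that the weights may take arithmetic averages), and the per-map normalization is ours.

```python
import numpy as np
from matplotlib.colors import rgb_to_hsv
from scipy.ndimage import gaussian_filter

def phase_saliency(channel: np.ndarray) -> np.ndarray:
    """Phase-only reconstruction: keep phi(u,v), drop the amplitude,
    invert, and square (Equation 6 as reconstructed above)."""
    f = np.fft.fft2(channel)
    phase_only = np.exp(1j * np.angle(f))     # e^{j phi(u,v)}
    sal = np.abs(np.fft.ifft2(phase_only)) ** 2
    return gaussian_filter(sal, sigma=2)      # smoothing (our assumption)

def fused_saliency(image_rgb: np.ndarray) -> np.ndarray:
    hsv = rgb_to_hsv(image_rgb.astype(float) / 255.0)
    maps = [phase_saliency(hsv[..., c]) for c in range(3)]  # V_H, V_S, V_V
    return sum(m / m.max() for m in maps) / 3.0   # averaged fusion (Eq. 7)

# Usage: a dark frame with one bright block, which should pop out.
img = np.zeros((96, 128, 3), dtype=np.uint8)
img[40:60, 50:80] = (250, 120, 30)
v = fused_saliency(img)
iy, ix = np.unravel_index(v.argmax(), v.shape)
print(iy, ix)  # the most salient pixel tends to lie on the block's border
```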
Considering that the image to be processed contains noise signals whose presence affects the precision and accuracy of subsequent image detection, before sampling from the image to be processed this application can also apply an autoregressive model (Equation 8) to denoise the image to be processed:

X_t = A·X_{t−1} + v_{t−1};   (8)

where v_{t−1} is the process noise signal, A is the state transition function, t is the time, and X_t is the pixel.
To further raise the similarity between the pixels in the saliency area and the target and improve the accuracy of saliency area detection, one implementation of S103 can proceed as follows:

Calculate the characteristic distance between each target pixel and the initial target area.

Divide the multiple target pixels into a first pixel set and a second pixel set, where the characteristic distance corresponding to each target pixel in the first pixel set is smaller than the characteristic distance corresponding to each target pixel in the second pixel set;

replace the target pixels in the second pixel set with the target pixels in the first pixel set to obtain the saliency area. That is, the visually salient areas with the smallest distances are selected to replace the support vectors with larger distances, and the support vector update operation is then executed.

For each target pixel, Equation 9 can be used to calculate the characteristic distance d_F(x_1i, x_2) between the target pixel and the initial target area (the original equation image is reconstructed here as the standard kernel-induced feature-space distance):

d_F(x_1i, x_2) = √( K(x_1i, x_1i) − 2K(x_1i, x_2) + K(x_2, x_2) )   (9)

where x_1i is the i-th target pixel, x_2 is the initial target area, K(x_1i, x_1i) is the Euclidean distance between the target pixel x_1i and itself, K(x_1i, x_2) is the Euclidean distance between the target pixel x_1i and the center of the initial target area, and K(x_2, x_2) is the Euclidean distance between the center of the initial target area and itself.

Of course, other methods can also be used to calculate the relevant distance between a target pixel and the initial target area as the measure of similarity between the two; this application does not limit this in any way.
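A sketch of Equation 9 as reconstructed above, using the radial basis kernel K(x,y) = exp(−p‖x−y‖²) named in the experiments section. The feature vectors, the parameter p, and the half/half near-far split are illustrative assumptions, not values from the patent.

```python
import numpy as np

def rbf_kernel(x: np.ndarray, y: np.ndarray, p: float = 0.5) -> float:
    """Radial basis kernel K(x, y) = exp(-p * ||x - y||^2)."""
    return float(np.exp(-p * np.sum((x - y) ** 2)))

def feature_distance(x1i: np.ndarray, x2: np.ndarray, p: float = 0.5) -> float:
    """Equation 9 as reconstructed above: the kernel-induced feature-space
    distance between a target pixel's feature vector and the initial
    target area's feature vector."""
    d2 = (rbf_kernel(x1i, x1i, p) - 2 * rbf_kernel(x1i, x2, p)
          + rbf_kernel(x2, x2, p))
    return float(np.sqrt(max(d2, 0.0)))

# Usage: rank three hypothetical pixel feature vectors by distance to the
# initial target area's feature vector, then split into near/far sets.
target = np.array([0.8, 0.1])
pixels = np.array([[0.75, 0.12], [0.2, 0.9], [0.7, 0.2]])
d = np.array([feature_distance(px, target) for px in pixels])
order = np.argsort(d)
near, far = order[: len(order) // 2 + 1], order[len(order) // 2 + 1:]
print(d.round(3), near, far)
```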
In some other implementations, in S104, Equation 10 can be used to calculate the weight w_i of each pixel in the saliency area (the original equation images are not recoverable from this extraction; the notation below is introduced for the quantities they denote):

ρ_i^c is the similarity between the color of the i-th pixel and the color of the initial target area; ρ_i^v is the similarity between the visual saliency of the i-th pixel and the visual saliency of the initial target area; ρ̄^c is the arithmetic mean of all ρ_i^c; ρ̄^v is the arithmetic mean of all ρ_i^v; P(x_t|z_{t−1}) is the prior probability density at time t; and λ is an adjustment parameter used to differentiate the similarities, with value range (0, 1).

When calculating P(x_t|z_{t−1}), the particle filtering method can be consulted; its implementation principle is as follows:
The core idea of the particle filter algorithm is to perform iterative estimation with probabilistic methods, based on the system state results Z_t = {z_0, z_1, …, z_t} up to time t, to obtain the system state x_t at time t, which is equivalent to finding the posterior probability distribution function P(x_t|z_t).

Assume the system state space model is expressed as Equation 11 (reconstructed from the definitions below):

x_t = f(x_{t−1}) + v_{t−1};  z_t = h(x_t) + n_t   (11)

In Equation 11, x_t represents the system vector at time t, composed of features such as the specific position, size and acceleration of the target to be detected; z_t represents the observation of the system state at time t; f(x) and h(x) are the system state transition function and system state observation function; v_{t−1} and n_t are the system state noise signal and the observation noise signal.

The filtering process can be divided into a prediction step and an update step. The prediction step refers to the case where the observation z_t of the system state at time t has not yet been obtained: the posterior probability density P(x_{t−1}|z_{t−1}) at time t−1 is propagated to derive the prior probability density at time t, as shown in Equation 12:

P(x_t|z_{t−1}) = ∫ P(x_t|x_{t−1}) P(x_{t−1}|z_{t−1}) dx_{t−1};   (12)

The function of the update step is to correct, that is, to apply the latest observation z_t of the system state at time t together with the previously obtained prior probability density P(x_t|z_{t−1}) to obtain the posterior probability density P(x_t|z_t) at time t, as shown in Equation 13 (reconstructed as the standard Bayesian update):

P(x_t|z_t) = P(z_t|x_t) P(x_t|z_{t−1}) / P(z_t|z_{t−1})   (13)

Assume {x_{0:t}^j, w_t^j}, j = 1, …, N (N represents the number of particles) is the sample set and sample weights obtained by sampling from the posterior probability density P(x_{0:t}|z_{1:t}), where x_{0:t} = {x_j, j = 0, 1, ..., t} is the sample set. According to the principle of statistical simulation methods, the posterior probability density at time t can be approximated with the discrete weighting formula of Equation 14:

P(x_{0:t}|z_{1:t}) ≈ Σ_{j=1..N} w_t^j · δ(x_{0:t} − x_{0:t}^j)   (14)

In Equation 14, δ(·) is the Dirac function (unit impulse function). The filtered estimate of the system state x_t at time t is shown in Equation 15:

x̂_t = Σ_{j=1..N} w_t^j · x_t^j   (15)
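A generic bootstrap particle filter illustrating Equations 12-15 (predict with the state model, reweight with the observation likelihood, estimate with the weighted sum, then resample). The 1-D linear-Gaussian model and all numeric parameters are illustrative assumptions, not the patent's state model.

```python
import numpy as np

def bootstrap_particle_filter(observations, n_particles=500, seed=0):
    """Predict (Eq. 12), weight update (Eq. 13), discrete posterior
    approximation (Eq. 14), and filtered estimate (Eq. 15)."""
    rng = np.random.default_rng(seed)
    particles = rng.normal(0.0, 1.0, n_particles)     # samples x_t^j
    estimates = []
    for z in observations:
        # Prediction step (Equation 12): propagate through the state model.
        particles = 0.95 * particles + rng.normal(0.0, 0.3, n_particles)
        # Update step (Equation 13): weights from the likelihood P(z|x).
        w = np.exp(-0.5 * ((z - particles) / 0.5) ** 2)
        w /= w.sum()
        # Filtered estimate (Equation 15): weighted sum of particles.
        estimates.append(float(np.sum(w * particles)))
        # Resample according to the discrete approximation (Equation 14).
        particles = rng.choice(particles, size=n_particles, p=w)
    return estimates

# Usage: track a slowly drifting 1-D state from noisy observations.
true_state = np.cumsum(np.full(50, 0.1))
obs = true_state + np.random.default_rng(1).normal(0.0, 0.5, 50)
est = bootstrap_particle_filter(obs)
print(round(est[-1], 2), round(true_state[-1], 2))  # estimate tracks ~5.0
```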
In addition, this application also provides another embodiment; referring to FIG. 3, it may specifically include:

S301: Locate the initial target area in the image to be processed according to preset initial target area setting conditions, and calculate the color histogram and visual saliency histogram of the initial target area to form an initial target feature model.

S302: Denoise the image to be processed using the autoregressive model.

S303: Collect multiple pixels of the image to be processed to form an original sample set, each pixel of the image to be processed having been marked with a sample label in advance.

S304: Based on the adaptive incremental and decremental learning algorithm, count, during the SVM training of the original sample set, the number of times each pixel is a non-support vector.

S305: Delete the pixels whose non-support-vector count exceeds the preset threshold to obtain a reduced sample set; select the target pixels of multiple support vectors from the reduced sample set, and calculate the color histogram and visual saliency histogram of each target pixel to form a sample feature model.

S306: Calculate the characteristic distance between each target pixel and the initial target area, and divide the multiple target pixels into a first pixel set and a second pixel set.

S307: Replace the target pixels in the second pixel set with the target pixels in the first pixel set to obtain the saliency area.

S308: Calculate the weight of each pixel in the saliency area according to the initial target feature model and each sample feature model.

S309: Delete from the saliency area the pixels whose weights do not meet the preset condition to obtain the final area, which is extracted and segmented as the target area.

For the implementation of each step, refer to the corresponding steps or methods of the above embodiments; details are not repeated here.
As can be seen from the above, the embodiments of the present invention can avoid the instability of target image segmentation caused by applying a single color feature, and the target can be detected correctly under large changes in target posture, illumination and shape, and under occlusion. Although detection may still fail if the target is occluded for a long time or undergoes drastic illumination changes, by applying color features together with visual saliency features in a reasonable expression model, the present invention realizes a more robust target image segmentation method.

To confirm that the technical solution provided by this application can overcome the target segmentation instability caused by using a single color feature, and can effectively solve the target segmentation problems caused by target deformation, illumination changes, and similar color distributions of target and background, this application conducts relevant experiments on video image sequences; both the effectiveness and the real-time performance of target segmentation on the Stanford University standard video library achieve excellent results.
AVLSVM experiments

To evaluate the specific performance of the adaptive incremental and decremental learning algorithm (abbreviated AVLSVM), AVLSVM is compared with the online incremental learning algorithm of the related art on three factors: training accuracy, test accuracy, and CPU execution time. Below, On-line denotes the online incremental learning algorithm proposed in the related art and AVLSVM denotes the adaptive incremental and decremental learning algorithm; the numerical experiments are performed for the linear and non-linear cases respectively.

In the numerical experiments, relevant data sets from the UCI machine learning repository [13] are first selected, and the online situation is simulated by adding samples to the training set one by one. λ = 1.9/C is taken, where C is the penalty parameter, and ε is required to reach 10^−5. The penalty parameter C is the optimal value selected, through the training process, on a tuning set drawn from the training set. The threshold l for the numerical experiments was selected through continuous adjustment and testing on the various UCI machine learning data sets. According to the results of the numerical experiments, the threshold l = 4 was found to be the optimal solution for ensuring training success rate, test success rate and CPU execution time.

The experimental results for the linear case are shown in Table 1. From Table 1 it can be seen that AVLSVM is better than On-line in classification success rate and clearly better than On-line in CPU execution time; for example, on the higher-dimensional Pima-diabetes data set, the execution time of AVLSVM is 1.85 seconds while that of On-line is 15.4012 seconds.

Table 1: Numerical experiment results for the linear case (table image not recoverable from this extraction)

Table 2: Numerical experiment results for the non-linear case (table image not recoverable from this extraction)

For the non-linear case, the radial basis kernel function K(x,y) = exp(−p‖x−y‖²) is adopted; the numerical experiment results for the non-linear case are shown in Table 2, where p is the kernel function parameter. From the experimental results shown in Table 2 it can be concluded that AVLSVM's CPU execution time is clearly smaller than On-line's, and AVLSVM's training accuracy and test accuracy are clearly higher than those of the On-line method.
Visual saliency map experiments

This part of the experiments performs visual saliency calculation on video sequence images from the video data set published by Itti, which includes daytime and nighttime video, indoor and outdoor video, sports video, news video and other situations. For convenient comparison of effects, shape annotations can first be drawn on the original input images, so that the quality of the model results is easy to judge on the visual saliency maps.

According to the visual saliency map results, the visual saliency maps obtained by incorporating the AVLSVM method better reflect the characteristics of the original images. Moreover, a good visual saliency map plays a very good auxiliary role in the subsequent target image segmentation process.
目标图像分割算法测试实验
为了验证本申请的目标图像分割的正确率,选择在斯坦福大学的Stan Birchfield所发布的人脸跟踪测试视频上进行目标图像分割实验。实验测试软件环境是利用Matlab仿真环境实现目标图像分割,重点针对算法鲁棒性进行测试,包括光照强度变化、目标形状变化、目标遮挡情况下的检测结果,并与单独特征的目标图像分割算法执行效果比较。
第一组实验是针对遮挡情况的视频图像序列(128×96)进行人脸目标图像分割(视频文件名为movie_cubicle,共95帧视频图像)。实验对比的三种算法分别是考虑颜色特征的目标图像分割算法、应用视觉显著性特征的目标图像分割算法、应用AVLSVM和视觉显著性整体特征的目标图像分割算法。首先在遮挡情况下利用三种算法目标图像分割结果图。视频图像序列分别是第1、16、21、34、51、62、70、88帧图像。视频图像序列中的目标在运动过程中会出现遮挡情况,遮挡物体的颜色与待检测目标颜色相近。在考虑颜色特征的目标图像分割方法中,因为背景颜色分布与目标颜色近似,所以目标图像分割效果不佳;本申请考虑AVLSVM与视觉显著性整体特征进行融合。即使存在遮挡情况,本算法也能准确定位目标,体现算法鲁棒性。
第二组实验针对在目标执行旋转、形状变化等情况下,对视频图像序列执行头部目标图像分割(视频文件为movie_mb,500帧),该测试视频图像共有500帧视频图像序列,本视频拍摄过程中目标对象和摄像机均有大幅度移动,目标形状、大小与人体姿态都有明显变化,视频图像序列分别为第1、55、78、95、115、160、195、285帧图像。从实验结果可见,当目标因为前后运动产生目标大小形变时,此时目标的颜色特征并没有显著变化,三种算法均能够检测正确;当目标发生旋转时,目标颜色分布特征会发生明显改变,使用单独颜色特征当作表达模型进行目标图像分割过程时,时常会失败并导致目标丢失。本方法应用AVLSVM与视觉显著性整体特征进行融合,比应用单个特征方法的系统鲁棒性要高。经过相关测试,本方法在目标发生旋转、形状变化、颜色改变时,都能够获得理想头部目标图像分割效果。
第三组实验针对在包含光照强度变化、目标姿态变化、目标遮挡下的视频序列中进行人脸目标图像分割(视频文件为movie_sb,500帧),视频图像序列分别为第1、24、55、78、175、181、238、342帧图像。本视频图像序列中包括光照强度变化、目标遮挡、目标姿态变化综合情况。在只考虑单个特征的检测算法中,当光照变化情况较大、目标姿态变化情况较大情况下,目标图像分割结果会出现相关误差;而本申请因为融入视觉显著性特征与颜色特征,使得能够进行目标准确定位,并保证目标图像分割算法的正确性和高效性。
本申请在传统目标图像分割算法基础上,融入了视觉显著性特征和颜色特征模型共同作为目标图像分割的特征表示模型,以确保目标图像分割算法的有效性,而且本申请没有明显增加算法空间和时间复杂度,并确保了算法的实时性效果。在上述三组目标图像分割实验中,在支持向量数达到500时,本申请的平均耗时为24ms/帧视频图像;传统基于颜色特征的目标图像分割算法的平均耗时为19ms/帧视频图像。因为本申请需要计算相关视觉显著性,所以本申请的耗时会有所增加,为了提高目标图像分割算法效果精度,时间消耗是无法避免的情况,当支持向量数减 少到200个时,本申请的平均耗时为15ms/帧视频图像,而且可以确保较高的目标图像分割精度。因此本申请在提高目标图像分割精度时又确保了目标图像分割实时性。
In the target image segmentation process, the present application combines the overall visual saliency feature with AVLSVM, representing the target by the visual saliency feature and the color feature jointly. The results of the experiments on video image sequences show that the present application achieves good results in both the effectiveness and the real-time performance of target image segmentation.
Embodiments of the present invention also provide a corresponding implementation apparatus for the target image segmentation method, further enhancing the practicality of the method. The target image segmentation apparatus provided by the embodiments of the present invention is introduced below; the apparatus described below and the method described above may be referred to in correspondence with each other.
Referring to FIG. 4, FIG. 4 is a structural diagram of the target image segmentation apparatus provided by an embodiment of the present invention in a specific implementation; the apparatus may include:
The initial target region locating module 401 is configured to locate the initial target region in the image to be processed according to preset initial-target-region setting conditions, and to compute the color histogram and the visual saliency histogram of the initial target region to form the initial target feature model.
The sample point selection module 402 is configured to select a plurality of support-vector target pixels from the image to be processed using a pre-built SVM model, and to compute the color histogram and the visual saliency histogram of each target pixel to form the sample feature models.
The saliency region determination module 403 is configured to determine the saliency region in the image to be processed based on the computed feature distance between each target pixel and the initial target region.
The target region determination module 404 is configured to compute, according to the initial target feature model and each sample feature model, the weight of each pixel in the saliency region, and to delete from the saliency region the pixels whose weights do not satisfy a preset condition to obtain the final region, which is extracted and segmented as the target region.
Optionally, in some implementations of this embodiment, the sample point selection module 402 may be a module that labels each pixel of the image to be processed with a sample tag in advance; collects a plurality of pixels of the image to be processed to form an original sample set, the original sample set containing pixels that are non-support vectors and support vectors, each pixel having the same weight; counts, during SVM training on the original sample set, the number of times each pixel is a non-support vector; deletes the pixels whose non-support-vector count exceeds a preset threshold to obtain a reduced sample set; and selects a plurality of support-vector target pixels from the reduced sample set.
In addition, in an embodiment of the present invention, the sample point selection module 402 may also be a module that computes the feature distance d_F(x_1i, x_2) between each target pixel and the initial target region using the following formula:

d_F(x_1i, x_2) = √( K(x_1i, x_1i) − 2K(x_1i, x_2) + K(x_2, x_2) )

where x_1i is the i-th target pixel, x_2 is the initial target region, K(x_1i, x_1i) is the Euclidean distance between the target pixel x_1i and the target pixel x_1i, K(x_1i, x_2) is the Euclidean distance between the target pixel x_1i and the center of the initial target region, and K(x_2, x_2) is the Euclidean distance between the center of the initial target region and the center of the initial target region.
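A minimal sketch of this kernel-induced feature distance, taking any kernel function `k` (for example the RBF kernel above) as an argument:

```python
import numpy as np

def feature_distance(k, x_1i, x_2):
    """Distance in feature space built from the three kernel terms
    K(x_1i, x_1i), K(x_1i, x_2), and K(x_2, x_2) described above."""
    d2 = k(x_1i, x_1i) - 2.0 * k(x_1i, x_2) + k(x_2, x_2)
    return np.sqrt(max(d2, 0.0))  # clamp tiny negative round-off before the root
```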
In some other implementations, the saliency region determination module 403 may be a module that computes the feature distance between each target pixel and the initial target region; divides the target pixels into a first pixel set and a second pixel set, where the feature distance corresponding to each target pixel in the first pixel set is smaller than the feature distance corresponding to each target pixel in the second pixel set; and replaces the target pixels in the second pixel set with the target pixels in the first pixel set to obtain the saliency region.
Optionally, the target region determination module 404 may also be a module that computes the weight of each pixel in the saliency region using a formula that is rendered only as an image in the original document, in which (with placeholder symbol names standing in for the images): ρ_i^c is the similarity between the color of the i-th pixel and the color of the initial target region; ρ_i^s is the similarity between the visual saliency of the i-th pixel and the visual saliency of the initial target region; ρ̄^c is the arithmetic mean of all ρ_i^c; ρ̄^s is the arithmetic mean of all ρ_i^s; p(x_t | z_{1:t−1}) is the prior probability density at time t; and λ is an adjustment parameter.
In some specific implementations, the initial target region locating module 401 may be a module that converts the image to be processed to the HSV color space, takes the H and S channels as color features and the V channel as the lightness feature, and computes the color histogram feature of the initial target region using the following formula:

q̂_u(x) = C · Σ_{i=1}^{N} k( ‖(x − x_i)/a‖² ) · δ( b(x_i) − u )

where x_i is the i-th pixel in the initial target region, q̂_u(x) is the color distribution model of the region centered at pixel x, C is the normalization factor, k is the kernel function, u is the histogram bin index, a is the size of the initial target region, N is the total number of pixels in the initial target region, b(x_i) assigns the color feature of the i-th pixel to the corresponding part of the color histogram, and δ(·) is the Dirac function.
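A small sketch of this kernel-weighted histogram for one channel; the Epanechnikov profile for k and the 16-bin quantization b(x_i) are assumptions made here for illustration:

```python
import numpy as np

def kernel_color_histogram(values, coords, center, a, n_bins=16):
    """Kernel-weighted color histogram: pixels nearer the region center
    contribute more to their bin b(x_i).

    values: (N,) channel values in 0..255; coords: (N, 2) pixel positions;
    center: (2,) region center x; a: region scale."""
    r2 = np.sum(((coords - center) / a) ** 2, axis=1)  # ||(x - x_i) / a||^2
    k = np.maximum(1.0 - r2, 0.0)                      # Epanechnikov kernel profile
    bins = (values.astype(int) * n_bins) // 256        # b(x_i): bin assignment
    hist = np.bincount(bins, weights=k, minlength=n_bins)
    return hist / max(hist.sum(), 1e-12)               # C: normalization factor
```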
The initial target region locating module 401 may also be a module that converts the HSV color space features of the image to be processed from the spatial domain to the frequency domain to obtain the image amplitude spectrum and the image phase spectrum of the image to be processed; obtains, for each color space feature, the visual saliency value V(x, y) of each pixel of the initial target region using the following formula:

V(x, y) = ‖ (1/(M·N)) Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} e^{jφ(u,v)} · e^{j2π(ux/M + vy/N)} ‖²

where the size of the image to be processed is M*N, x and y are the horizontal and vertical coordinates of the pixel, and φ(u, v) is the phase spectrum obtained after the fast Fourier transform of the image to be processed; and fuses the visual saliency maps of the HSV color features using the following formula to obtain the final visual saliency histogram V:

V = w_H × V_H + w_S × V_S + w_V × V_V

where V_H, V_S, and V_V are the visual saliency maps obtained by visual saliency computation on the respective HSV color space features, and w_H, w_S, and w_V are the feature weights corresponding to the respective color space features.
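A compact sketch of the phase-spectrum saliency and the weighted fusion; the equal channel weights are an assumption, and any Gaussian smoothing often applied afterwards is omitted:

```python
import numpy as np

def phase_saliency(channel):
    """Saliency of one channel: keep only the FFT phase spectrum phi(u, v),
    invert it, and take the squared magnitude, as in the formula above."""
    f = np.fft.fft2(channel.astype(float))
    phase_only = np.exp(1j * np.angle(f))          # e^{j * phi(u, v)}
    return np.abs(np.fft.ifft2(phase_only)) ** 2   # V(x, y)

def fused_saliency(h, s, v, w=(1.0 / 3, 1.0 / 3, 1.0 / 3)):
    """V = w_H * V_H + w_S * V_S + w_V * V_V over the HSV channels."""
    v_h, v_s, v_v = phase_saliency(h), phase_saliency(s), phase_saliency(v)
    return w[0] * v_h + w[1] * v_s + w[2] * v_v
```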
Optionally, in some other implementations of the embodiments of the present invention, referring to FIG. 5, the apparatus may further include a denoising module 405, the denoising module 405 being configured to denoise the image to be processed using the following formula:

X_t = A·X_{t−1} + v_{t−1}

where v_{t−1} is the process noise signal, A is the state transition function, t is time, and X_t is a pixel.
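A very small sketch of autoregressive smoothing across video frames under this state model; the scalar A and the blend factor are illustrative assumptions:

```python
import numpy as np

def ar_denoise(frames, A=0.9, blend=0.5):
    """Predict each frame from the previous one via X_t = A * X_{t-1} and
    blend the prediction with the observed frame to suppress the process
    noise v_{t-1}."""
    out = [np.asarray(frames[0], dtype=float)]
    for x in frames[1:]:
        pred = A * out[-1]                                       # X_t = A * X_{t-1}
        out.append(blend * pred + (1.0 - blend) * np.asarray(x, dtype=float))
    return out
```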
The functions of the functional modules of the target image segmentation apparatus according to the embodiments of the present invention may be specifically implemented according to the methods of the method embodiments above; for the specific implementation process, reference may be made to the relevant description of the method embodiments, which is not repeated here.
As can be seen from the above, the embodiments of the present invention solve the problem of unstable image segmentation caused by using a single color feature in the related art, and also effectively resolve the instability of target image segmentation caused by changes in illumination intensity, target deformation, and similar color distributions, improving the stability and accuracy of target image segmentation as well as the efficiency of image segmentation.
Embodiments of the present invention also provide a target image segmentation device, which may specifically include:
a memory for storing a computer program; and
a processor for executing the computer program to implement the steps of the target image segmentation method according to any of the embodiments above.
The functions of the functional modules of the target image segmentation device according to the embodiments of the present invention may be specifically implemented according to the methods of the method embodiments above; for the specific implementation process, reference may be made to the relevant description of the method embodiments, which is not repeated here.
As can be seen from the above, the embodiments of the present invention solve the problem of unstable image segmentation caused by using a single color feature in the related art, and also effectively resolve the instability of target image segmentation caused by changes in illumination intensity, target deformation, and similar color distributions, improving the stability and accuracy of target image segmentation as well as the efficiency of image segmentation.
Embodiments of the present invention also provide a computer-readable storage medium storing a target image segmentation program which, when executed by a processor, implements the steps of the target image segmentation method according to any of the embodiments above.
The functions of the functional modules of the computer-readable storage medium according to the embodiments of the present invention may be specifically implemented according to the methods of the method embodiments above; for the specific implementation process, reference may be made to the relevant description of the method embodiments, which is not repeated here.
As can be seen from the above, the embodiments of the present invention solve the problem of unstable image segmentation caused by using a single color feature in the related art, and also effectively resolve the instability of target image segmentation caused by changes in illumination intensity, target deformation, and similar color distributions, improving the stability and accuracy of target image segmentation as well as the efficiency of image segmentation.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Since the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art may further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of the examples have been described above generally in terms of their functions. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may implement the described functions in different ways for each particular application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.
The target image segmentation method, apparatus, device, and computer-readable storage medium provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and implementations of the present invention; the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. It should be noted that those of ordinary skill in the art may make several improvements and modifications to the present invention without departing from its principles, and such improvements and modifications also fall within the protection scope of the claims of the present invention.

Claims (10)

  1. A target image segmentation method, comprising:
    locating an initial target region in an image to be processed according to preset initial-target-region setting conditions, and computing a color histogram and a visual saliency histogram of the initial target region to form an initial target feature model;
    selecting a plurality of support-vector target pixels from the image to be processed using a pre-built SVM model, and computing a color histogram and a visual saliency histogram of each target pixel to form sample feature models;
    determining a saliency region in the image to be processed based on the computed feature distance between each target pixel and the initial target region; and
    computing, according to the initial target feature model and each sample feature model, a weight for each pixel in the saliency region, and deleting from the saliency region the pixels whose weights do not satisfy a preset condition to obtain a final region, which is extracted and segmented as the target region.
  2. The target image segmentation method according to claim 1, wherein selecting the plurality of support-vector target pixels from the image to be processed using the pre-built SVM model comprises:
    labeling each pixel of the image to be processed with a sample tag in advance;
    collecting a plurality of pixels of the image to be processed to form an original sample set, the original sample set containing pixels that are non-support vectors and support vectors, each pixel having the same weight;
    counting, during SVM training on the original sample set, the number of times each pixel is a non-support vector;
    deleting the pixels whose non-support-vector count exceeds a preset threshold to obtain a reduced sample set; and
    selecting a plurality of support-vector target pixels from the reduced sample set.
  3. The target image segmentation method according to claim 1, wherein determining the saliency region in the image to be processed based on the computed feature distance between each target pixel and the initial target region comprises:
    computing the feature distance between each target pixel and the initial target region;
    dividing the target pixels into a first pixel set and a second pixel set, wherein the feature distance corresponding to each target pixel in the first pixel set is smaller than the feature distance corresponding to each target pixel in the second pixel set; and
    replacing the target pixels in the second pixel set with the target pixels in the first pixel set to obtain the saliency region.
  4. The target image segmentation method according to claim 3, wherein computing the feature distance between each target pixel and the initial target region comprises:
    computing the feature distance d_F(x_1i, x_2) between each target pixel and the initial target region using the following formula:
    d_F(x_1i, x_2) = √( K(x_1i, x_1i) − 2K(x_1i, x_2) + K(x_2, x_2) )
    where x_1i is the i-th target pixel, x_2 is the initial target region, K(x_1i, x_1i) is the Euclidean distance between the target pixel x_1i and the target pixel x_1i, K(x_1i, x_2) is the Euclidean distance between the target pixel x_1i and the center of the initial target region, and K(x_2, x_2) is the Euclidean distance between the center of the initial target region and the center of the initial target region.
  5. The target image segmentation method according to any one of claims 1 to 4, wherein computing the weight of each pixel in the saliency region according to the initial target feature model and each sample feature model comprises:
    computing the weight of each pixel in the saliency region using a formula that is rendered only as an image in the original document, in which (with placeholder symbol names standing in for the images): ρ_i^c is the similarity between the color of the i-th pixel and the color of the initial target region; ρ_i^s is the similarity between the visual saliency of the i-th pixel and the visual saliency of the initial target region; ρ̄^c is the arithmetic mean of all ρ_i^c; ρ̄^s is the arithmetic mean of all ρ_i^s; p(x_t | z_{1:t−1}) is the prior probability density at time t; and λ is an adjustment parameter.
  6. The target image segmentation method according to any one of claims 1 to 4, wherein computing the color histogram of the initial target region comprises:
    converting the image to be processed to the HSV color space, taking the H and S channels as color features and the V channel as the lightness feature;
    computing the color histogram feature of the initial target region using the following formula:
    q̂_u(x) = C · Σ_{i=1}^{N} k( ‖(x − x_i)/a‖² ) · δ( b(x_i) − u )
    where x_i is the i-th pixel in the initial target region, q̂_u(x) is the color distribution model of the region centered at pixel x, C is the normalization factor, k is the kernel function, u is the histogram bin index, a is the size of the initial target region, N is the total number of pixels in the initial target region, b(x_i) assigns the color feature of the i-th pixel to the corresponding part of the color histogram, and δ(·) is the Dirac function.
  7. The target image segmentation method according to claim 6, wherein computing the visual saliency histogram of the initial target region comprises:
    converting the HSV color space features of the image to be processed from the spatial domain to the frequency domain to obtain the image amplitude spectrum and the image phase spectrum of the image to be processed;
    obtaining, for each color space feature, the visual saliency value V(x, y) of each pixel of the initial target region using the following formula:
    V(x, y) = ‖ (1/(M·N)) Σ_{u=0}^{M−1} Σ_{v=0}^{N−1} e^{jφ(u,v)} · e^{j2π(ux/M + vy/N)} ‖²
    where the size of the image to be processed is M*N, x and y are the horizontal and vertical coordinates of the pixel, and φ(u, v) is the phase spectrum obtained after the fast Fourier transform of the image to be processed; and
    fusing the visual saliency maps of the HSV color features using the following formula to obtain the final visual saliency histogram V:
    V = w_H × V_H + w_S × V_S + w_V × V_V
    where V_H, V_S, and V_V are the visual saliency maps obtained by visual saliency computation on the respective HSV color space features, and w_H, w_S, and w_V are the feature weights corresponding to the respective color space features.
  8. The target image segmentation method according to any one of claims 1 to 4, further comprising, before selecting the plurality of support-vector target pixels from the image to be processed using the pre-built SVM model:
    denoising the image to be processed using the following formula:
    X_t = A·X_{t−1} + v_{t−1}
    where v_{t−1} is the process noise signal, A is the state transition function, t is time, and X_t is a pixel.
  9. A target image segmentation apparatus, comprising:
    an initial target region locating module, configured to locate an initial target region in an image to be processed according to preset initial-target-region setting conditions, and to compute a color histogram and a visual saliency histogram of the initial target region to form an initial target feature model;
    a sample point selection module, configured to select a plurality of support-vector target pixels from the image to be processed using a pre-built SVM model, and to compute a color histogram and a visual saliency histogram of each target pixel to form sample feature models;
    a saliency region determination module, configured to determine a saliency region in the image to be processed based on the computed feature distance between each target pixel and the initial target region; and
    a target region determination module, configured to compute, according to the initial target feature model and each sample feature model, a weight for each pixel in the saliency region, and to delete from the saliency region the pixels whose weights do not satisfy a preset condition to obtain a final region, which is extracted and segmented as the target region.
  10. A target image segmentation device, comprising a processor configured to implement the steps of the target image segmentation method according to any one of claims 1 to 8 when executing a computer program stored in a memory.
PCT/CN2019/075205 2018-11-30 2019-02-15 Target image segmentation method, apparatus and device WO2020107716A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811455474.XA CN109544568A (zh) 2018-11-30 2018-11-30 Target image segmentation method, apparatus and device
CN201811455474.X 2018-11-30

Publications (1)

Publication Number Publication Date
WO2020107716A1 true WO2020107716A1 (zh) 2020-06-04

Family

ID=65851857

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/075205 WO2020107716A1 (zh) 2018-11-30 2019-02-15 Target image segmentation method, apparatus and device

Country Status (2)

Country Link
CN (1) CN109544568A (zh)
WO (1) WO2020107716A1 (zh)


Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110765882B (zh) * 2019-09-25 2023-04-07 腾讯科技(深圳)有限公司 Video tag determination method, apparatus, server, and storage medium
CN110991465B (zh) * 2019-11-15 2023-05-23 泰康保险集团股份有限公司 Object recognition method, apparatus, computing device, and storage medium
CN113096022B (zh) * 2019-12-23 2022-12-30 RealMe重庆移动通信有限公司 Image blurring processing method and apparatus, storage medium, and electronic device
CN113923430A (zh) * 2020-04-15 2022-01-11 深圳市瑞立视多媒体科技有限公司 Real-time matting method, apparatus, device, and storage medium based on high-definition video
CN111583283B (zh) * 2020-05-20 2023-06-20 抖音视界有限公司 Image segmentation method and apparatus, electronic device, and medium
CN111522020A (zh) * 2020-06-23 2020-08-11 山东亦贝数据技术有限公司 Hybrid positioning system and method for activity elements in a park
CN113222941B (zh) * 2021-05-17 2022-11-11 中冶赛迪信息技术(重庆)有限公司 Method, system, device, and medium for determining the cutting state of continuously cast slabs
CN116485819B (zh) * 2023-06-21 2023-09-01 青岛大学附属医院 Ear-nose-throat examination image segmentation method and system


Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102129693B (zh) * 2011-03-15 2012-07-25 清华大学 Image visual saliency computation method based on color histogram and global contrast
CN103208123B (zh) * 2013-04-19 2016-03-02 广东图图搜网络科技有限公司 Image segmentation method and system
CN103810473B (zh) * 2014-01-23 2016-09-07 宁波大学 Target recognition method for human objects based on a hidden Markov model
CN105224914B (zh) * 2015-09-02 2018-10-23 上海大学 Graph-based salient object detection method for unconstrained videos
CN108647703B (zh) * 2018-04-19 2021-11-02 北京联合大学 Saliency-based type determination method for classified image libraries

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101980248A (zh) * 2010-11-09 2011-02-23 西安电子科技大学 Natural scene target detection method based on an improved visual attention model
US20130223740A1 (en) * 2012-02-23 2013-08-29 Microsoft Corporation Salient Object Segmentation
CN105022990A (zh) * 2015-06-29 2015-11-04 华中科技大学 Rapid water-surface target detection method for unmanned surface vehicle applications
CN106997597A (zh) * 2017-03-22 2017-08-01 南京大学 Target tracking method based on supervised saliency detection

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111862128A (zh) * 2020-06-12 2020-10-30 Image segmentation method and device
CN111862128B (zh) * 2020-06-12 2024-04-16 Image segmentation method and device
CN111724396A (zh) * 2020-06-17 2020-09-29 Image segmentation method and apparatus, computer-readable storage medium, and electronic device
CN111798470A (zh) * 2020-07-20 2020-10-20 Crop image entity segmentation method and system applied to intelligent agriculture
CN112818775B (zh) * 2021-01-20 2023-07-25 Method and system for rapid recognition of forest roads based on region-boundary pixel exchange
CN112818775A (zh) * 2021-01-20 2021-05-18 Method and system for rapid recognition of forest roads based on region-boundary pixel exchange
CN113012202A (zh) * 2021-03-31 2021-06-22 Target tracking method, apparatus, device, medium, and program product
CN113159229B (zh) * 2021-05-19 2023-11-07 Image fusion method, electronic device, and related products
CN113159229A (zh) * 2021-05-19 2021-07-23 Image fusion method, electronic device, and related products
CN113763491B (zh) * 2021-08-26 2024-03-12 Visual inspection method for residues in cut-tobacco barrels
CN113763491A (zh) * 2021-08-26 2021-12-07 Visual inspection method for residues in cut-tobacco barrels
CN116051805A (zh) * 2022-12-06 2023-05-02 Method and apparatus for identifying the influence area of a subtropical high, storage medium, and terminal
CN116051805B (zh) * 2022-12-06 2023-08-18 Method and apparatus for identifying the influence area of a subtropical high, storage medium, and terminal
CN116309687B (zh) * 2023-05-26 2023-08-04 Real-time camera tracking and positioning method based on artificial intelligence
CN116309687A (zh) * 2023-05-26 2023-06-23 Real-time camera tracking and positioning method based on artificial intelligence
CN116503741A (zh) * 2023-06-25 2023-07-28 Intelligent crop maturity prediction system
CN116503741B (zh) * 2023-06-25 2023-08-25 Intelligent crop maturity prediction system
CN116778532A (zh) * 2023-08-24 2023-09-19 Target tracking method for underground coal mine personnel
CN116778532B (zh) * 2023-08-24 2023-11-07 Target tracking method for underground coal mine personnel
CN117274293A (zh) * 2023-11-17 2023-12-22 Accurate bacterial colony partition method based on image features
CN117274293B (zh) * 2023-11-17 2024-03-15 Accurate bacterial colony partition method based on image features
CN117541800A (zh) * 2024-01-10 2024-02-09 Laryngeal abnormal region segmentation method based on laryngoscope images
CN117541800B (zh) * 2024-01-10 2024-04-09 Laryngeal abnormal region segmentation method based on laryngoscope images

Also Published As

Publication number Publication date
CN109544568A (zh) 2019-03-29

Similar Documents

Publication Publication Date Title
WO2020107716A1 (zh) Target image segmentation method, apparatus and device
KR102275452B1 (ko) Real-time image tracking method considering both color and shape, and apparatus therefor
US9483835B2 (en) Depth value restoration method and system
CN112926410B (zh) Target tracking method and apparatus, storage medium, and intelligent video system
JP2017531883A (ja) Method and system for extracting the main subject of an image
WO2019071976A1 (zh) Panoramic image saliency detection method based on region growing and an eye movement model
JP6756406B2 (ja) Image processing apparatus, image processing method, and image processing program
Xiao et al. Defocus blur detection based on multiscale SVD fusion in gradient domain
Yadav Efficient method for moving object detection in cluttered background using Gaussian Mixture Model
US20180357212A1 (en) Detecting occlusion of digital ink
JP2019512821A (ja) Image processing apparatus, image processing method, and program
CN113379789B (zh) Moving target tracking method in complex environments
CN105787484B (zh) Object tracking or recognition method and apparatus
Jia et al. Fast and robust image segmentation using an superpixel based FCM algorithm
KR100976584B1 (ko) Color-based object tracking apparatus and method using mean-shift clustering and initial color update
Fida et al. Unsupervised image segmentation using lab color space
Tian et al. Point-cut: Fixation point-based image segmentation using random walk model
Garg et al. A survey on visual saliency detection and computational methods
CN113129332A (zh) Method and apparatus for performing target object tracking
Kerdvibulvech Hybrid model of human hand motion for cybernetics application
US20230169708A1 (en) Image and video matting
Hati et al. Review and improvement areas of mean shift tracking algorithm
CN112802055B (zh) Target ghost detection and edge propagation suppression algorithm
Liu et al. Multi-object tracking and occlusion reasoning based on adaptive weighing particle filter
Jia et al. Image Saliency Detection Based on Low-Level Features and Boundary Prior

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19888493

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19888493

Country of ref document: EP

Kind code of ref document: A1