EP3438929A1 - Foreground and background detection method - Google Patents

Foreground and background detection method Download PDF

Info

Publication number
EP3438929A1
Authority
EP
European Patent Office
Prior art keywords
pixel
foreground
probability
background
belongs
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
EP17184931.8A
Other languages
German (de)
French (fr)
Other versions
EP3438929B1 (en)
Inventor
Marc Van Droogenbroeck
Marc BRAHAM
Sébastien PIERARD
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Universite de Liege
Original Assignee
Universite de Liege
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Universite de Liege filed Critical Universite de Liege
Priority to EP17184931.8A priority Critical patent/EP3438929B1/en
Priority to CN201711402509.9A priority patent/CN109389618B/en
Priority to US15/983,238 priority patent/US10614736B2/en
Priority to US16/267,474 priority patent/US20190251695A1/en
Publication of EP3438929A1 publication Critical patent/EP3438929A1/en
Priority to US16/288,468 priority patent/US10706558B2/en
Application granted granted Critical
Publication of EP3438929B1 publication Critical patent/EP3438929B1/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/215Motion-based segmentation
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09GARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G3/00Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes
    • G09G3/20Control arrangements or circuits, of interest only in connection with visual indicators other than cathode-ray tubes for presentation of an assembly of a number of characters, e.g. a page, by composing the assembly by combination of individual elements arranged in a matrix no fixed position being assigned to or needed to be assigned to the individual characters or partial characters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/136Segmentation; Edge detection involving thresholding
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/174Segmentation; Edge detection involving the use of two or more images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/277Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20076Probabilistic image processing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20212Image combination
    • G06T2207/20224Image subtraction
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09GARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2300/00Aspects of the constitution of display devices
    • G09G2300/04Structural and physical details of display devices
    • G09G2300/0439Pixel structures
    • G09G2300/0443Pixel structures with several sub-pixels for the same colour in a pixel, not specifically used to display gradations
    • GPHYSICS
    • G09EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09GARRANGEMENTS OR CIRCUITS FOR CONTROL OF INDICATING DEVICES USING STATIC MEANS TO PRESENT VARIABLE INFORMATION
    • G09G2300/00Aspects of the constitution of display devices
    • G09G2300/04Structural and physical details of display devices
    • G09G2300/0439Pixel structures
    • G09G2300/0452Details of colour pixel setup, e.g. pixel composed of a red, a blue and two green components

Definitions

  • The disclosure relates to a method for assigning a pixel to one of a foreground and a background pixel set.
  • A major research area in computer vision is the field of motion detection.
  • The aim of motion detection is to classify pixels according to whether they belong to a moving object, filtering out any pixels that may be misclassified, so as to detect moving objects in a scene.
  • This task, which is solved in nature with apparent ease by even rudimentary animal vision systems, has turned out to be complex to replicate in computer vision.
  • An image may be expressed as a plurality of picture elements, or pixels. Each single pixel in an image may have a position x in the image and a pixel value I ( x ).
  • The position x may have any number of dimensions.
  • In three-dimensional images, a picture element is also known as a voxel, for "volume element".
  • The term "pixel" should be understood broadly in the present disclosure as also covering such voxels and any picture element in images having any number of dimensions, including 3D images and/or multispectral images.
  • This position x may be limited to a finite domain, for instance if it is an image captured by a fixed imaging device. However, it may alternatively not be limited to a finite domain, for example if the image is captured by a moving imaging device, such as, for example, a satellite on-board camera.
  • The pixel value I ( x ) may also have any number of dimensions.
  • In monochromatic images, the pixel value I ( x ) may be a scalar luminance value, but in polychromatic images, such as red-green-blue (RGB) component video images or hue-saturation-value (HSV) images, this pixel value I ( x ) may be a multidimensional vector value.
  • Background subtraction algorithms such as that disclosed by O. Barnich and M. Van Droogenbroeck in "ViBe: A universal background subtraction algorithm for video sequences", IEEE Trans. Image Process., vol. 20, no. 6, pp. 1709-1724, June 2011, classify pixels according to color components, whereas the background subtraction algorithms disclosed by V. Jain, B. Kimia, and J. Mundy in "Background modeling based on subpixel edges", IEEE Int. Conf. Image Process. (ICIP), Sept. 2007, vol. 6, pp. 321-324, by S. Zhang, H. Yao, and S. Liu in "Dynamic background modeling and subtraction using spatio-temporal local binary patterns", IEEE Int. Conf. Image Process. (ICIP), Oct. 2008, pp. 1556-1559, and by M. Chen, Q.
  • A foreground object is considered to be "camouflaged" when its corresponding pixel values (e.g. color or luminance) are similar to those of the background.
  • In such cases, background subtraction algorithms may erroneously assign the corresponding foreground pixels to the background, as false negatives. This may for instance take the form of color camouflage on images from color cameras, or of thermal camouflage on images from thermal cameras. Snow cover, for example, may lead to such camouflaging.
  • Ghosting is the phenomenon in which a previously static object, which thus belonged to the background, starts moving. In this situation, not only do the pixel values of the pixels corresponding to the object change, but so do those of the background pixels previously hidden by the object while it was static; these latter background pixels may be erroneously assigned to the foreground, as false positives.
  • Dynamic backgrounds are backgrounds where there may be changes in pixel values, such as, for instance, a windblown leafy tree or a sea wave. In this situation, the corresponding background pixels may be erroneously assigned to the foreground, also as false positives.
  • Similarly, shadows and reflections may lead to background pixels being erroneously assigned to the foreground, as false positives, due to the associated changes in pixel values.
  • A first aspect of the disclosure relates to a method for assigning a pixel to one of a foreground pixel set and a background pixel set more reliably and robustly than background subtraction algorithms that compare a pixel value of the pixel with a pixel value of a corresponding pixel in a background model.
  • The present disclosure seeks to address the abovementioned challenges to background subtraction algorithms.
  • The method according to this first aspect may comprise the steps of: calculating a probability that a pixel of the selected image belongs to a foreground-relevant object according to a semantic segmentation algorithm; assigning the pixel to the background pixel set if the probability that the pixel belongs to a foreground-relevant object does not exceed a first predetermined threshold; and assigning the pixel to the foreground pixel set if the probability that the pixel belongs to a foreground-relevant object exceeds the first predetermined threshold and a difference between this probability and a baseline probability for the pixel equals or exceeds a second predetermined threshold.
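The two semantic assignment rules just described can be sketched as follows. This is an illustrative sketch only, not the claimed implementation: the function and argument names are ours, and a pixel left undecided here is handled by the further rule described later in the disclosure.

```python
def semantic_decision(p_fg, baseline, tau_bg, tau_fg):
    """Apply the two semantic rules to one pixel.

    p_fg     -- probability the pixel belongs to a foreground-relevant object
    baseline -- baseline probability for this pixel
    tau_bg   -- first predetermined threshold
    tau_fg   -- second predetermined threshold
    """
    # First rule: the semantic probability is too low for a mobile object.
    if p_fg <= tau_bg:
        return "BG"
    # Second rule: the probability jumped well above the baseline.
    if p_fg - baseline >= tau_fg:
        return "FG"
    # Neither semantic rule fires; a later rule must decide.
    return "UNDECIDED"
```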
  • Semantic segmentation is also known as scene labeling or scene parsing.
  • The task is difficult and requires the simultaneous detection, localization, and segmentation of semantic objects and regions.
  • The advent of deep neural networks within the computer vision community and the access to large labeled training datasets have dramatically improved the performance of semantic segmentation algorithms, as described by J. Long, E. Shelhamer, and T. Darrell in "Fully convolutional networks for semantic segmentation", IEEE Int. Conf. Comput.
  • Semantic segmentation algorithms have thus begun to be used for specific computer vision tasks, such as optical flow estimation as described by L. Sevilla-Lara, D. Sun, V. Jampani, and M. J. Black in "Optical flow with semantic segmentation and localized layers", IEEE Int. Conf. Comput. Vision and Pattern Recogn. (CVPR), June 2016, pp. 3889-3898 .
  • The method according to this first aspect of the disclosure can provide a more robust and reliable image segmentation into foreground and background than that provided by a background subtraction algorithm merely comparing low-level pixel values with a background model.
  • The semantic level can thus be used to identify foreground-relevant objects, that is, objects belonging to semantic classes that can be expected to move and thus to belong to the foreground. This knowledge is leveraged in the step of assigning the pixel to the background pixel set if the probability that the pixel belongs to a foreground-relevant object does not exceed a first predetermined threshold, so as to prevent false positives, that is, pixels erroneously assigned to the foreground pixel set due to, for example, dynamic backgrounds, ghosting, shadows and/or reflections, camera jitter, panning, tilting and/or zooming, bad weather, gradual or sudden lighting changes, or background displacement, all of which usually affect the performance of conventional background subtraction algorithms.
  • The semantic level can also be used to identify whether the probability that a pixel belongs to such a foreground-relevant object has increased with respect to a baseline probability for that pixel, which may for instance correspond to a corresponding pixel in a semantic background model. This is leveraged in the step of assigning the pixel of the selected image to the foreground pixel set if the probability that the pixel belongs to a foreground-relevant object exceeds the first predetermined threshold and the difference between this probability and the baseline probability for the pixel equals or exceeds a second predetermined threshold, so as to prevent false negatives, that is, pixels erroneously assigned to the background due to camouflage, i.e. when background and foreground share similar pixel values.
  • The abovementioned method may further comprise a baseline updating step, wherein the baseline probability for the pixel is set equal to the probability that the pixel belongs to a foreground-relevant object calculated according to the semantic segmentation algorithm, if the pixel has been assigned to the background pixel set. Consequently, the baseline probability for the pixel can be updated for subsequent use with respect to corresponding pixels in other images, using the information from the semantic level of this image.
  • A conservative updating strategy may be applied, in which the baseline updating step is executed only randomly, according to a predetermined probability of execution, if the pixel has been assigned to the background pixel set.
  • The method may further comprise a step of assigning the pixel to either the foreground pixel set or the background pixel set according to a background subtraction algorithm comparing a pixel value of the pixel with a pixel value of a corresponding pixel in a background model, and in particular a background subtraction algorithm based on at least one low-level image feature, if the probability that the pixel belongs to a foreground-relevant object exceeds the first predetermined threshold and the difference between this probability and the baseline probability for the pixel is lower than the second predetermined threshold.
  • Consequently, any pixel that the abovementioned steps fail to assign to either the foreground pixel set or the background pixel set on the basis of the semantic segmentation algorithm may be assigned using a known background subtraction algorithm comparing a pixel value of the pixel with a pixel value of a corresponding pixel in a background model.
  • The pixel may belong to an image, and the background model may be based on at least one other, related image, such as for instance a previous image in a chronological sequence of images including the image to which the pixel belongs.
  • The pixel may belong to an image of a chronological sequence of images, in which case the baseline probability for the pixel may have been initialized as equal to the probability that a corresponding pixel in an initial image of the sequence belongs to a foreground-relevant object, calculated according to the semantic segmentation algorithm. Consequently, the semantic knowledge from this initial image can be leveraged in at least initially setting the baseline probabilities with which the probabilities of corresponding pixels in subsequent images belonging to foreground-relevant objects are compared when determining whether those pixels are to be assigned to the foreground.
  • The first and second predetermined thresholds may have been predetermined so as to optimize an F score of the method on a test image series.
  • The F score of a detection method may be defined as the harmonic mean of precision and recall, wherein the precision is the ratio of true positives to the sum of true positives and false positives, and the recall is the ratio of true positives to the sum of true positives and false negatives.
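The F score definition above can be written out directly; this is an illustrative helper (the function and argument names are ours), counting pixels assigned to the foreground as the positive class.

```python
def f_score(true_pos, false_pos, false_neg):
    # Precision: fraction of foreground detections that are correct.
    precision = true_pos / (true_pos + false_pos)
    # Recall: fraction of actual foreground pixels that are detected.
    recall = true_pos / (true_pos + false_neg)
    # F score: harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```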
  • In this context, pixels correctly assigned to the foreground are true positives, while pixels incorrectly assigned to the foreground are false positives.
  • The first and second predetermined thresholds may alternatively have been heuristically predetermined based on, respectively, a false foreground detection rate of the background subtraction algorithm and a true foreground detection rate of the background subtraction algorithm. It has indeed been found by the inventors that the first and second predetermined thresholds with which the F score of the method on test image series can be optimized are strongly correlated with, respectively, the false foreground detection rate and the true foreground detection rate of the background subtraction algorithm applied in this method.
  • The present invention also relates to a data processing device programmed so as to carry out the image background recognition method of the invention; to a data storage medium comprising a set of instructions configured to be read by a data processing device to carry out an image background recognition method according to the invention; to a set of signals in magnetic, electromagnetic, electric and/or mechanical form, comprising a set of instructions for a data processing device to carry out an image background recognition method according to the invention; and/or to a process of transmitting, via magnetic, electromagnetic, electric and/or mechanical means, a set of instructions for a data processing device to carry out an image background recognition method according to the invention.
  • By "data storage medium" is understood any physical medium capable of containing data readable by a reading device for at least a certain period of time. Examples of such data storage media are magnetic tapes and discs, optical discs (read-only as well as recordable or re-writable), logical circuit memories, such as read-only memory chips, random-access memory chips and flash memory chips, and even more exotic data storage media, such as chemical, biochemical or mechanical memories.
  • By "electromagnetic" is understood any part of the electromagnetic spectrum, from radio to UV and beyond, including microwave, infrared and visible light, in coherent (LASER, MASER) or incoherent form.
  • By "object" is understood any observable element of the real world, including animals and/or humans.
  • Each image may be formed by a plurality of pixels, each single pixel in an image having a dedicated pixel position x and a pixel value I ( x ).
  • In the illustrated embodiment, the pixel position x is two-dimensional, but it could have any number of dimensions.
  • In 3D images, for instance, the pixel position x may have three dimensions.
  • The pixel value I ( x ) in the illustrated embodiment is a three-dimensional vector, in the form of RGB or HSV triplets for obtaining a polychromatic image. In alternative embodiments, it could however have any other number of dimensions.
  • A set of probabilities p t ( x ∈ c i ) that the pixel at pixel position x and time t belongs to each class c i of the set C may be calculated by applying a softmax function to the corresponding class scores v_t^i ( x ).
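The softmax step above can be sketched as follows, for the vector of class scores of a single pixel. This is an illustration using NumPy; the function name is ours.

```python
import numpy as np

def class_probabilities(scores):
    """Softmax over the per-class scores of one pixel.

    scores -- 1-D array of class scores v_t^i(x), one entry per class in C.
    Returns an array of probabilities summing to one.
    """
    # Subtract the maximum score for numerical stability;
    # the softmax result is invariant to this shift.
    shifted = scores - np.max(scores)
    exp_scores = np.exp(shifted)
    # Normalize so the probabilities over all classes sum to one.
    return exp_scores / exp_scores.sum()
```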
  • A subset R of the set of classes C may correspond to foreground-relevant objects, that is, objects relevant to motion detection.
  • Foreground-relevant objects may comprise potentially mobile objects like vehicles, people and animals, but not typically immobile objects like trees or buildings.
  • The subset R may, for example, include just people and animals as foreground-relevant object classes in the area of a walking path, but also vehicles in the area of a road.
  • This probability p S,t ( x ) that the pixel at pixel position x and time t belongs to a foreground-relevant object according to the semantic segmentation algorithm can be used in a method for assigning pixels to foreground and background pixel sets in each image of the set of images.
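Assuming, as one natural reading of the above, that the probability p S,t ( x ) is obtained by summing the semantic class probabilities over the foreground-relevant subset R (the disclosure does not spell out this formula here), the step can be sketched as:

```python
def foreground_relevant_probability(class_probs, relevant_classes):
    """Probability that a pixel belongs to a foreground-relevant object.

    class_probs      -- dict mapping class name to probability (sums to one)
    relevant_classes -- set of foreground-relevant class names (the subset R)
    """
    # Sum the class probabilities over the foreground-relevant subset R.
    return sum(p for cls, p in class_probs.items() if cls in relevant_classes)
```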
  • Fig. 1 shows a flowchart illustrating a core routine of this method, wherein the pixel at pixel position x and time t is assigned to either the foreground pixel set or the background pixel set.
  • In a first step S100, the probability p S,t ( x ) that the pixel at pixel position x and time t belongs to a foreground-relevant object is calculated using the semantic segmentation algorithm.
  • In a second step S200, it is determined whether this probability p S,t ( x ) is lower than or equal to a first predetermined threshold τ BG . If the result of this comparison is positive, and it is thus determined that the probability p S,t ( x ) that the pixel at pixel position x and time t belongs to a foreground-relevant object according to the semantic segmentation algorithm does not exceed the first predetermined threshold τ BG , it is considered unlikely that the pixel belongs to a potentially mobile object, and the pixel is thus assigned to the background in step S300.
  • This baseline probability M t ( x ) corresponds to a semantic model of the background for pixel position x and time t .
  • Here, α denotes a predetermined probability of execution, which may be set, for example, to 0.00024.
  • The random determination, with predetermined probability α of execution, of whether the baseline probability M t ( x ) for pixel position x is to be updated may be carried out using a random number generator.
  • Alternatively, a pseudorandom number generator with properties similar to those of a true random number generator may be used instead.
  • Another alternative is the use of a large look-up list of previously generated random or pseudorandom numbers.
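The conservative updating strategy described above can be sketched as follows. This is an illustration; the function and argument names are ours, and the `rng` parameter stands in for whichever random or pseudorandom number source is used.

```python
import random

def update_baseline(baseline, p_fg, assigned_to_background,
                    alpha=0.00024, rng=random):
    """Conservative update of the semantic baseline for one pixel.

    baseline -- current baseline probability M_t(x) for the pixel
    p_fg     -- semantic probability p_S,t(x) for the pixel
    alpha    -- predetermined probability of execution of the update
    """
    # Only pixels assigned to the background may update the baseline,
    # and even then only with probability alpha (conservative updating).
    if assigned_to_background and rng.random() < alpha:
        return p_fg      # adopt the current semantic probability
    return baseline      # keep the previous baseline otherwise
```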
  • This second rule can prevent, to a large extent, camouflaged foreground pixels, that is, foreground pixels having pixel values similar to those of the background, from being erroneously assigned to the background, which is also a challenge for conventional background subtraction algorithms.
  • However, semantic segmentation alone may not suffice to distinguish between foreground and background, for instance when a foreground-relevant object (e.g. a moving car) moves in front of a stationary object of the same foreground-relevant semantic class (e.g. a parked car). Since both objects belong to the same foreground-relevant object class, the probability p S,t ( x ) will not significantly increase as the moving object moves in front of the stationary object at pixel position x and time t.
  • In this case, a third rule is applied in the next step S600, assigning the pixel at pixel position x and time t to either the foreground pixel set or the background pixel set according to a conventional background subtraction algorithm comparing a pixel value I ( x ) of the pixel with the pixel value of a corresponding pixel in a background model based on at least one other image of the plurality of related images.
  • The result is then D t ( x ) = B t ( x ), wherein B t ( x ) ∈ {BG, FG} denotes the result from the background subtraction algorithm.
  • Fig. 2 thus illustrates how the three signals S t BG ( x ), S t FG ( x ) and B t ( x ) can be obtained and applied in combination, using the abovementioned three rules, for foreground and background detection. How these signals are combined can also be summarized with the following table, where X denotes a combination that cannot occur in practice:

    Table 1: Foreground and background detection according to the three rules of the method

    B t ( x ) | S t BG ( x ) | S t FG ( x ) | D t ( x )
    BG        | false        | false        | BG
    BG        | false        | true         | FG
    BG        | true         | false        | BG
    BG        | true         | true         | X
    FG        | false        | false        | FG
    FG        | false        | true         | FG
    FG        | true         | false        | BG
    FG        | true         | true         | X
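The combination of the three rules can be sketched as follows. This is an illustrative sketch (names are ours); the boolean arguments indicate whether the first and second semantic rules fire for the pixel.

```python
def combine(b_t, semantic_bg, semantic_fg):
    """Final per-pixel decision D_t(x) from the three rules.

    b_t         -- "BG" or "FG", output of the background subtraction algorithm
    semantic_bg -- True if the first (semantic background) rule fires
    semantic_fg -- True if the second (semantic foreground) rule fires
    """
    if semantic_bg:      # first rule: semantics veto the foreground
        return "BG"
    if semantic_fg:      # second rule: semantics force the foreground
        return "FG"
    return b_t           # third rule: defer to background subtraction
```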
  • Since the first rule only assigns pixels to the background pixel set, raising the first predetermined threshold τ BG so that the first rule is applied more frequently can only decrease the True Positive Rate TPR, that is, the rate at which pixels are correctly assigned to the foreground, and the False Positive Rate FPR, that is, the rate at which pixels are erroneously assigned to the foreground pixel set.
  • Since the second rule only assigns pixels to the foreground pixel set, decreasing the second predetermined threshold τ FG so that the second rule is applied more frequently can only increase the True Positive Rate TPR and the False Positive Rate FPR.
  • The first predetermined threshold τ BG and the second predetermined threshold τ FG are thus to be set at the levels achieving the best compromise between the highest possible True Positive Rate TPR and the lowest possible False Positive Rate FPR.
  • A first alternative for setting the first predetermined threshold τ BG and the second predetermined threshold τ FG is to perform tests on test image sequences using the abovementioned method with various values for these thresholds, and to select the pair of threshold values resulting, for given background subtraction and semantic segmentation algorithms, in the best overall F score, that is, the highest harmonic mean of precision and recall, wherein the precision is the ratio of true positives (pixels correctly assigned to the foreground pixel set) to the sum of true positives and false positives (pixels erroneously assigned to the foreground pixel set), and the recall is the ratio of true positives to the sum of true positives and false negatives (pixels erroneously assigned to the background pixel set).
  • This can be performed as a grid search optimization.
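Such a grid search can be sketched as follows. This is an illustration; `evaluate_f` stands for whatever routine scores a threshold pair on the test sequences, and is an assumption of ours rather than anything defined in the disclosure.

```python
import itertools

def grid_search_thresholds(evaluate_f, tau_bg_values, tau_fg_values):
    """Exhaustively try every (tau_BG, tau_FG) pair on a grid.

    evaluate_f -- callable (tau_bg, tau_fg) -> F score on the test sequences
    Returns the pair with the highest F score.
    """
    return max(itertools.product(tau_bg_values, tau_fg_values),
               key=lambda pair: evaluate_f(*pair))
```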
  • The inventors have carried out such tests on 53 video sequences, organized in 11 categories, of the CDNet dataset presented by Y. Wang, P.-M. Jodoin, F. Porikli, J. Konrad, Y. Benezeth, and P. Ishwar in "CDnet 2014: An expanded change detection benchmark dataset", IEEE Int. Conf. Comput. Vision and Pattern Recogn. Workshops (CVPRW), pages 393-400, Columbus, Ohio, USA, June 2014, applying the presently disclosed method using 34 different background subtraction algorithms and, as semantic segmentation algorithm, the deep architecture PSPNet disclosed by H. Zhao, J. Shi, X. Qi, X. Wang and J. Jia in "Pyramid scene parsing network", CoRR, vol.
  • A second alternative approach is to heuristically set the first predetermined threshold τ BG and the second predetermined threshold τ FG based on, respectively, the False Positive Rate FPR and the True Positive Rate TPR of the background subtraction algorithm to be used in the third rule of the method.
  • For instance, the first predetermined threshold τ BG may be set to half the False Positive Rate FPR of the background subtraction algorithm, and the second predetermined threshold τ FG equal to its True Positive Rate TPR.
  • Since the background subtraction algorithm should, by definition, perform better than a random classifier, its False Positive Rate FPR should be lower than its True Positive Rate TPR, thus ensuring that the first predetermined threshold τ BG is also lower than the second predetermined threshold τ FG .
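The heuristic above amounts to two assignments; a minimal sketch (function and argument names are ours):

```python
def heuristic_thresholds(fpr, tpr):
    """Heuristic thresholds from the background subtraction algorithm's rates.

    fpr -- False Positive Rate of the background subtraction algorithm
    tpr -- True Positive Rate of the background subtraction algorithm
    """
    tau_bg = fpr / 2.0   # first threshold: half the false positive rate
    tau_fg = tpr         # second threshold: the true positive rate
    return tau_bg, tau_fg
```

For an algorithm better than a random classifier (FPR < TPR), this automatically yields tau_bg < tau_fg.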
  • As a third alternative, the first predetermined threshold τ BG and the second predetermined threshold τ FG may be set to default values, corresponding for example to the arithmetic mean of the threshold values resulting in the best overall F score for each of the five best-performing background subtraction algorithms in the 2014 CDNet ranking, with the same semantic segmentation algorithm.
  • Fig. 4 illustrates this improvement, defined as one minus the error rate of the method combining background subtraction with semantic segmentation divided by the error rate of the background subtraction algorithm on its own, for each one of these three approaches. More specifically, Fig. 4 illustrates the mean improvement, measured on the overall CDNet dataset, both for the entire set of 34 background subtraction algorithms, and for only the 5 best-performing background subtraction algorithms. As can be seen on this figure, the first approach offers a very significant improvement, even over the background subtraction algorithms that already performed best, and this improvement is hardly decreased with the second and third alternative approaches.
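The improvement measure defined above, one minus the ratio of the combined method's error rate to that of the background subtraction algorithm alone, can be computed directly (an illustration; names are ours):

```python
def improvement(error_rate_combined, error_rate_bgs):
    """Relative improvement of the combined method over background
    subtraction alone: positive when the combined method makes fewer
    errors, zero when there is no change."""
    return 1.0 - error_rate_combined / error_rate_bgs
```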
  • Fig. 5 illustrates the improvement, with respect to each background subtraction algorithm of the abovementioned set of 34 different background subtraction algorithms, in terms of change in the mean True Positive Rate TPR and False Positive Rate FPR. As can be seen there, the present method tends to reduce significantly the False Positive Rate FPR, while simultaneously increasing the True Positive Rate TPR.
  • Fig. 6 illustrates the mean improvement, both for all 34 different background subtraction algorithms and for the 5 best-performing ones, per category of video sequence, in the 11 categories of the CDNet dataset: "Baseline", "Dynamic background", "Camera jitter", "Intermittent object motion", "Shadow", "Thermal", "Bad weather", "Low framerate", "Night videos", "Pan-Tilt-Zoom Camera" and "Turbulence". Particularly good improvements can be observed for the "Baseline", "Dynamic background", "Shadow" and "Bad weather" categories. With respect to the "Thermal" and "Night videos" categories, it must be noted that the ADE20K dataset used to train the semantic segmentation algorithm did not include images of these types, which may explain the poorer results for those categories.
  • Fig. 7 illustrates the benefits of the method according to the present disclosure in four different scenarios of real-world surveillance tasks. From left to right, the four columns correspond, respectively, to scenarios with dynamic background, risk of ghosts, strong shadows, and camouflage effects. From the top down, the five rows illustrate a frame of the corresponding video sequence, the probability p S,t ( x ) for each pixel, the output of the IUTIS-5 background subtraction algorithm described by S. Bianco, G. Ciocca and R. Schettini in "How far can you get by combining change detection algorithms?", CoRR, vol. abs/1505.02921, 2015, the output of the presently-disclosed method, applying the IUTIS-5 background subtraction algorithm in its third rule, and the ground truth.
  • The presently disclosed method greatly reduces the number of false positive foreground pixel detections caused by dynamic backgrounds, ghosts and strong shadows, while at the same time mitigating camouflage effects.
  • The presently disclosed method may be carried out with the assistance of a data processing device, such as, for example, a programmable computer like the abovementioned NVIDIA® GeForce® GTX Titan X GPU, connected to an imaging device providing a video sequence of successive images.
  • the data processing device may receive instructions for carrying out this method using a data storage medium, or as signals in magnetic, electromagnetic, electric and/or mechanical form.
  • the presently disclosed method may, for example, be applied to video-surveillance, professional and/or consumer digital still and/or video cameras, computer and videogame devices using image capture interfaces, satellite imaging and Earth observation, automatic image analysis and/or medical imaging systems or may be included in a smartphone.
  • Fig. 8 illustrates a possible application of the invention with an imaging device 1 in the particular form of a digital camera with an embedded data processing device 2 programmed to carry out the method of the invention.
  • Fig. 9 illustrates another possible application of the invention with an imaging device 1 connected to a data processing device 2 programmed to carry out the method of the invention.


Abstract

The present invention concerns a method for assigning a pixel to one of a foreground pixel set and a background pixel set. In this method, if a first condition is met the pixel is assigned to the background pixel set, and if the first condition is not met and a second condition is met, the pixel is assigned to the foreground pixel set. The method comprises a step (S100) of calculating a probability that the pixel belongs to a foreground-relevant object according to a semantic segmentation algorithm, the first condition is that this probability that the pixel belongs to a foreground-relevant object does not exceed a first predetermined threshold, and the second condition is that a difference between this probability that the pixel belongs to a foreground-relevant object and a baseline probability for the pixel equals or exceeds a second predetermined threshold.

Description

    TECHNICAL FIELD
  • The disclosure relates to a method for assigning a pixel to one of a foreground pixel set and a background pixel set.
  • BACKGROUND
  • A major research area in computer vision is the field of motion detection. The aim of motion detection is to detect moving objects in a scene by classifying pixels according to whether they belong to a moving object or not, and by filtering out pixels that may be misclassified. This task, which is solved in nature with apparent ease by even rudimentary animal vision systems, has turned out to be complex to replicate in computer vision.
  • In the field of computer vision, an image may be expressed as a plurality of picture elements, or pixels. Each single pixel in an image may have a position x in the image and a pixel value I (x).
  • The position x may have any number of dimensions. For this reason, although the term "voxel" (for "volume element") is sometimes used instead of "pixel" in the field of 3D imaging, the term "pixel" should be understood broadly in the present disclosure as also covering such voxels and any picture element in images having any number of dimensions, including 3D images and/or multispectral images.
  • This position x may be limited to a finite domain, for instance if it is an image captured by a fixed imaging device. However, it may alternatively not be limited to a finite domain, for example if the image is captured by a moving imaging device, such as, for example, a satellite on-board camera.
  • The pixel value I (x) may also have any number of dimensions. For example, in a monochromatic image, the pixel value I (x) may be a scalar luminance value, but in polychromatic images, such as red-green-blue (RGB) component video images or hue saturation value (HSV) images, this pixel value I (x) may be a multidimensional vector value.
  • Over the last two decades, a large number of background subtraction algorithms have been proposed for motion detection. Many of these background subtraction algorithms have been reviewed by P.-M. Jodoin, S. Piérard, Y. Wang, and M. Van Droogenbroeck in "Overview and benchmarking of motion detection methods", Background Modeling and Foreground Detection for Video Surveillance, chapter 24, Chapman and Hall/CRC, July 2014, and by T. Bouwmans in "Traditional and recent approaches in background modeling for foreground detection: An overview", Computer Science Review, vol. 11-12, pp. 31-66, May 2014.
  • Most background subtraction algorithms involve a comparison of low-level features, such as individual pixel values, in each image, with a background model, which may be reduced to an image free of moving objects and possibly adaptive. Pixels with a noticeable difference with respect to the background model may be assumed to belong to moving objects, and may thus be assigned to a set of foreground pixels, while the remainder may be assigned to a set of background pixels. For instance, the background subtraction algorithms disclosed by C. Stauffer and E. Grimson in "Adaptive background mixture models for real-time tracking", IEEE Int. Conf. Comput. Vision and Pattern Recogn. (CVPR), June 1999, vol. 2, pp. 246-252, and by O. Barnich and M. Van Droogenbroeck in "ViBe: A universal background subtraction algorithm for video sequences" in IEEE Trans. Image Process., vol. 20, no. 6, pp. 1709-1724, June 2011, classify pixels according to color components, whereas the background subtraction algorithms disclosed by V. Jain, B. Kimia, and J. Mundy in "Background modeling based on subpixel edges," IEEE Int. Conf. Image Process. (ICIP), Sept. 2007, vol. 6, pp. 321-324, S. Zhang, H. Yao, and S. Liu in "Dynamic background modeling and subtraction using spatio-temporal local binary patterns", IEEE Int. Conf. Image Process. (ICIP), Oct. 2008, pp. 1556-1559, M. Chen, Q. Yang, Q. Li, G. Wang, and M.-H. Yang in "Spatiotemporal background subtraction using minimum spanning tree and optical flow", Eur. Conf. Comput. Vision (ECCV), Sept. 2014, vol. 8695 of Lecture Notes Comp. Sci., pp. 521-534, Springer, and M. Braham, A. Lejeune, and M. Van Droogenbroeck, "A physically motivated pixel-based model for background subtraction in 3D images," in IEEE Int. Conf. 3D Imaging (IC3D), Dec. 2014, pp. 1-8, use, respectively, edges, texture descriptors, optical flow, or depth to assign pixels to the foreground or the background. 
A comprehensive review and classification of features used for background modeling was given by T. Bouwmans, C. Silva, C. Marghes, M. Zitouni, H. Bhaskar, and C. Frelicot in "On the role and the importance of features for background modeling and foreground detection," CoRR, vol. abs/1611.09099, pp. 1-131, Nov. 2016.
  • While most of these low-level features can be computed with a very low computational load, they cannot address simultaneously the numerous challenges arising in real-world video sequences such as illumination changes, camouflage, camera jitter, dynamic backgrounds, shadows, etc. Upper bounds on the performance of pixel-based methods based exclusively on RGB color components were simulated by S. Piérard and M. Van Droogenbroeck in "A perfect estimation of a background image does not lead to a perfect background subtraction: analysis of the upper bound on the performance," in Int. Conf. Image Anal. and Process. (ICIAP), Workshop Scene Background Modeling and Initialization (SBMI). Sept. 2015, vol. 9281 of Lecture Notes Comp. Sci., pp. 527-534, Springer. In particular, it was shown that background subtraction algorithms fail to provide a perfect segmentation in the presence of noise and shadows, even when a perfect background image is available.
  • Among the typical challenges for background subtraction algorithms, we can in particular consider camouflaged foreground objects, "ghosts", dynamic backgrounds, and shadow and/or reflection effects.
  • A foreground object is considered to be "camouflaged" when its corresponding pixel values (e.g. color or luminance) are similar to those of the background. In this situation, background subtraction algorithms may erroneously assign the corresponding foreground pixels to the background, as false negatives. This may for instance take the form of color camouflage on images from color cameras, or of thermal camouflage on images from thermal cameras. Snow cover, for example, may lead to such camouflaging.
  • "Ghosting" is the phenomenon that occurs when a previously static object, which thus belonged to the background, starts moving. In this situation, not only do the pixel values of the pixels corresponding to the object change, but so do those of the background pixels previously hidden by the object when it was static; these latter background pixels may therefore be erroneously assigned to the foreground, as false positives.
  • Dynamic backgrounds are backgrounds where pixel values may change over time, as with, for instance, a windblown leafy tree or a sea wave. In this situation, the corresponding background pixels may be erroneously assigned to the foreground, also as false positives.
  • Similarly, shadows and reflections may lead to background pixels being erroneously assigned to the foreground, as false positives, due to the associated changes in pixel values.
  • Other challenges that may lead background pixels to be erroneously assigned to the foreground as false positives are noisy images (for instance due to compression artifacts), camera jitter, automatic camera adjustments, slow framerates, panning, tilting and/or zooming, bad weather, gradual or sudden lighting changes, motion/insertion of background objects, residual heat stamps on thermal images, persistent background changes, clouds, smoke and highlights due to reflections.
  • Other challenges that may lead foreground pixels to be erroneously assigned to the background are fast moving objects, and foreground objects that become motionless and may thus be erroneously incorporated into the background.
  • SUMMARY
  • A first aspect of the disclosure relates to a method for assigning a pixel to one of a foreground pixel set and a background pixel set, more reliably and robustly than with background subtraction algorithms comparing a pixel value of the pixel with a pixel value of a corresponding pixel in a background model. In particular, according to this first aspect, the present disclosure seeks to address the abovementioned challenges to background subtraction algorithms. For this purpose, the method according to this first aspect may comprise the steps of calculating a probability that a pixel of the selected image belongs to a foreground-relevant object according to a semantic segmentation algorithm, assigning the pixel to the background pixel set if the probability that the pixel belongs to a foreground-relevant object does not exceed a first predetermined threshold, and assigning the pixel to the foreground pixel set if the probability that the pixel belongs to a foreground-relevant object exceeds the first predetermined threshold and a difference between the probability that the pixel belongs to a foreground-relevant object and a baseline probability for the pixel equals or exceeds a second predetermined threshold.
  • Humans can easily delineate relevant moving objects with a high precision because they incorporate knowledge from the semantic level: they know what a car is, recognize shadows, distinguish between object motion and camera motion, etc. The purpose of semantic segmentation (also known as scene labeling or scene parsing) is to provide such information by labeling each pixel of an image with the class of its enclosing object or region. The task is difficult and requires the simultaneous detection, localization, and segmentation of semantic objects and regions. However, the advent of deep neural networks within the computer vision community and the access to large labeled training datasets have dramatically improved the performance of semantic segmentation algorithms, as described by J. Long, E. Shelhamer, and T. Darrell in "Fully convolutional networks for semantic segmentation", IEEE Int. Conf. Comput. Vision and Pattern Recogn. (CVPR), June 2015, pp. 3431-3440, by S. Zheng, S. Jayasumana, B. Romera-Paredes, V. Vineet, Z. Su, D. Du, C. Huang, and P. Torr in "Conditional random fields as recurrent neural networks", IEEE Int. Conf. Comput. Vision (ICCV), Dec. 2015, pp. 1529-1537, and by H. Zhao, J. Shi, X. Qi, X. Wang, and J. Jia, "Pyramid scene parsing network," CoRR, vol. abs/1612.01105, Dec. 2016. Semantic segmentation algorithms have thus begun to be used for specific computer vision tasks, such as optical flow estimation as described by L. Sevilla-Lara, D. Sun, V. Jampani, and M. J. Black in "Optical flow with semantic segmentation and localized layers", IEEE Int. Conf. Comput. Vision and Pattern Recogn. (CVPR), June 2016, pp. 3889-3898.
  • By leveraging information from a higher, semantic level, the method according to this first aspect of the disclosure can provide a more robust, reliable image segmentation into foreground and background than that provided by a background subtraction algorithm merely comparing low-level pixel values with a background model.
  • On one hand, the semantic level can thus be used to identify foreground-relevant objects, that is, objects belonging to semantic classes that can be expected to move, and thus belong to the foreground, and leverage this knowledge in the step of assigning the pixel to the background pixel set if the probability that the pixel belongs to a foreground-relevant object does not exceed a first predetermined threshold, so as to prevent false positives, that is, erroneously assigning pixels to the foreground pixel set due to, for example, dynamic backgrounds, ghosting, shadows and/or reflections, camera jitter, panning, tilting and/or zooming, bad weather, gradual or sudden lighting changes or background displacement, which usually affect the performances of conventional background subtraction algorithms.
  • On the other hand, the semantic level can also be used to identify whether the probability that a pixel belongs to such a foreground-relevant object is increased with respect to a baseline probability for that pixel, that may for instance correspond to a corresponding pixel in a semantic background model, in the step of assigning the pixel of the selected image to the foreground pixel set if the probability that the pixel belongs to a foreground-relevant object exceeds the first predetermined threshold and a difference between the probability that the pixel belongs to a foreground-relevant object and a baseline probability for the pixel equals or exceeds a second predetermined threshold, so as to prevent false negatives, that is, erroneously assigning pixels to the background, due to camouflage, i.e. when background and foreground share similar pixel values.
  • According to a second aspect of the present disclosure, the abovementioned method may further comprise a baseline updating step, wherein the baseline probability for the pixel is made equal to the probability that the pixel belongs to a foreground-relevant object calculated according to the semantic segmentation algorithm, if the pixel has been assigned to the background pixel set. Consequently, the baseline probability for the pixel can be updated for subsequent use with respect to corresponding pixels in other images using the information from the semantic level of this image. However, to avoid corrupting this baseline probability, for instance due to intermittent and slow-moving objects, a conservative updating strategy may be applied in which the baseline updating step is executed only randomly, according to a predetermined probability of execution, if the pixel has been assigned to the background pixel set.
  • According to a third aspect of the present disclosure, the method may further comprise a step of assigning the pixel to either the foreground pixel set or the background pixel set according to a background subtraction algorithm comparing a pixel value of the pixel with a pixel value of a corresponding pixel in a background model, and in particular a background subtraction algorithm based on at least one low-level image feature, if the probability that the pixel belongs to a foreground-relevant object exceeds the first predetermined threshold and the difference between the probability that the pixel belongs to a foreground-relevant object and the baseline probability for the pixel is lower than the second predetermined threshold. Consequently, any pixel that the abovementioned steps fail to assign to either the foreground pixel set or the background pixel set on the basis of the semantic segmentation algorithm may be assigned using a known background subtraction algorithm comparing a pixel value of the pixel with a pixel value of a corresponding pixel in a background model. In particular, the pixel may belong to an image and the background model be based on at least another, related image, such as for instance a previous image in a chronological sequence of images including the image to which the pixel belongs.
  • Indeed, according to a fourth aspect of the present disclosure, the pixel may belong to an image of a chronological sequence of images, in which case the baseline probability for the pixel may have been initialized as equal to a probability that a corresponding pixel in an initial image of the plurality of related images belongs to a foreground-relevant object, calculated according to the semantic segmentation algorithm. Consequently, the semantic knowledge from this initial image can be leveraged in at least initially setting the baseline probabilities with which the probabilities of corresponding pixels in subsequent images belonging to foreground-relevant objects are compared when determining whether those pixels are to be assigned to the foreground.
  • According to a fifth aspect of the present invention, the first and second predetermined thresholds may have been predetermined so as to optimize an F score of the method on a test image series. The F score of a detection method may be defined as the harmonic mean between precision and recall, wherein the precision is a ratio of true positives to the sum of true positives and false positives and the recall is a ratio of true positives to the sum of true positives and false negatives. In the present context, pixels that are correctly assigned to the foreground can be considered as true positives, pixels that are incorrectly assigned to the foreground represent false positives, and pixels that are incorrectly assigned to the background represent false negatives. Consequently, predetermining the first and second predetermined thresholds so as to optimize the F score of the abovementioned method on a test image series can ensure a good compromise between precision and recall when the method is subsequently carried out on the selected image.
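For illustration only, the F score in the sense above can be computed from pixel counts as in the following sketch; the function name and the counts are hypothetical, not taken from the disclosure:

```python
def f_score(tp: int, fp: int, fn: int) -> float:
    """F score as the harmonic mean of precision and recall.

    tp: pixels correctly assigned to the foreground (true positives)
    fp: background pixels erroneously assigned to the foreground
    fn: foreground pixels erroneously assigned to the background
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for one segmented test image:
print(f_score(tp=900, fp=100, fn=300))  # precision 0.9, recall 0.75 -> F ~ 0.818
```

Because it is a harmonic mean, the F score penalizes whichever of precision or recall is lower, which is why optimizing it yields a compromise between the two.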
  • However, in an alternative sixth aspect of the present invention, the first and second predetermined thresholds may have been heuristically predetermined based on, respectively, a false foreground detection rate of the background subtraction algorithm and a true foreground detection rate of the background subtraction algorithm. It has indeed been found by the inventors that the first and second predetermined thresholds with which the F score of the method on test image series can be optimized are strongly correlated with, respectively, the false foreground detection rate and the true foreground detection rate of the background subtraction algorithm applied in this method. Consequently, if those rates are known from earlier tests of the background subtraction algorithm, it becomes possible to ensure a good compromise between precision and recall when the method is carried out on the selected image, even without carrying out a time- and resource-consuming optimization of the F score of the method applying both the background subtraction algorithm and the semantic segmentation algorithm.
  • The present invention relates also to a data processing device programmed so as to carry out the image background recognition method of the invention; to a data storage medium comprising a set of instructions configured to be read by a data processing device to carry out an image background recognition method according to the invention; to a set of signals in magnetic, electromagnetic, electric and/or mechanical form, comprising a set of instructions for a data processing device to carry out an image background recognition method according to the invention; and/or to a process of transmitting, via magnetic, electromagnetic, electric and/or mechanical means, a set of instructions for a data processing device to carry out an image background recognition method according to the invention.
  • As "data storage medium" may be understood any physical medium capable of containing data readable by a reading device for at least a certain period of time. Examples of such data storage media are magnetic tapes and discs, optical discs (read-only as well as recordable or re-writable), logical circuit memories, such as read-only memory chips, random-access memory chips and flash memory chips, and even more exotic data storage media, such as chemical, biochemical or mechanical memories.
  • As "electromagnetic" any part of the electromagnetic spectrum is understood, from radio to UV and beyond, including microwave, infrared and visible light, in coherent (LASER, MASER) or incoherent form.
  • As "object" is understood any observable element of the real world, including animals and/or humans.
  • The above summary of some aspects of the invention is not intended to describe each disclosed embodiment or every implementation of the invention. In particular, selected features of any illustrative embodiment within this specification may be incorporated into an additional embodiment unless clearly stated to the contrary.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • The invention may be more completely understood in consideration of the following detailed description of various embodiments in connection with the accompanying drawings, in which :
    • FIG. 1 is a flowchart illustrating a core routine of a method according to an aspect of the present disclosure ;
    • FIG. 2 is a functional scheme illustrating how the results of a semantic segmentation algorithm and a background subtraction algorithm are combined in the core routine of FIG. 1 ;
    • FIGS. 3A and 3B are graphs charting the positive correlations between the False Positive Rate FPR and True Positive Rate TPR of the background subtraction algorithm and the optimum values for, respectively, a first predetermined threshold τ BG and a second predetermined threshold τ FG in the method of FIG. 2 ;
    • FIGS. 4 to 6 are graphs charting the improvement achieved by the method of FIG. 1 over a background subtraction algorithm used therein ;
    • FIG. 7 illustrates the outputs of a semantic segmentation algorithm, a background subtraction algorithm and a method combining both for various video sequences in difficult scenarios ; and
    • FIGS. 8 and 9 illustrate potential embodiments of video systems applying the method of FIG. 1.
  • While the invention is amenable to various modifications and alternative forms, specifics thereof have been shown by way of example in the drawings and will be described in detail. It should be understood, however, that the intention is not to limit aspects of the invention to the particular embodiments described. On the contrary, the intention is to cover all modifications, equivalents, and alternatives falling within the scope of the invention.
  • DETAILED DESCRIPTION
  • For the following defined terms, these definitions shall be applied, unless a different definition is given in the claims or elsewhere in this specification.
  • All numeric values are herein assumed to be preceded by the term "about", whether or not explicitly indicated. The term "about" generally refers to a range of numbers that one of skill in the art would consider equivalent to the recited value (i.e. having the same function or result). In many instances, the term "about" may be indicative as including numbers that are rounded to the nearest significant figure.
  • As used in this specification and the appended claims, the singular forms "a", "an", and "the" include plural referents unless the content clearly dictates otherwise. As used in this specification and the appended claims, the term "or" is generally employed in its sense including "and/or" unless the content clearly dictates otherwise.
  • The following detailed description should be read with reference to the drawings in which similar elements in different drawings are numbered the same. The detailed description and the drawings, which are not necessarily to scale, depict illustrative embodiments and are not intended to limit the scope of the invention. The illustrative embodiments depicted are intended only as exemplary. Selected features of any illustrative embodiment may be incorporated into an additional embodiment unless clearly stated to the contrary.
  • In a set of images that may have been, for example, successively captured by an imaging device at times t following a time series, each image may be formed by a plurality of pixels, each single pixel in an image having a dedicated pixel position x and a pixel value I (x). For ease of understanding, in the accompanying drawings, the pixel position x is shown as two-dimensional, but it could have any number of dimensions. For 3D images, for instance, the pixel position x may have three dimensions. The pixel value I (x) in the illustrated embodiment is a three-dimensional vector, in the form of RGB- or HSV-triplets for obtaining a polychromatic image. In alternative embodiments, it could however have any other number of dimensions.
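As an illustration of this representation, an image can be stored as an array whose last axis holds the pixel value I (x); the sizes and values below are arbitrary examples, not part of the disclosure:

```python
import numpy as np

# A small 4x6 polychromatic image: the pixel position x is a (row, column)
# pair, and the pixel value I(x) is a three-dimensional RGB triplet.
height, width = 4, 6
image = np.zeros((height, width, 3), dtype=np.uint8)
image[1, 2] = (255, 0, 0)   # a pure-red pixel at position x = (1, 2)

x = (1, 2)
I_x = image[x]              # the three-component pixel value I(x)
print(I_x.tolist())         # [255, 0, 0]
```

For a 3D image, the position x would simply gain a third index, and for a monochromatic image the last axis would collapse to a scalar luminance value.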
  • A semantic segmentation algorithm may be applied to each image in order to calculate, for each pixel position x and time t, a real-valued vector vt (x) = ( vt 1 (x), vt 2 (x), ..., vt N (x) ), where vt i (x) denotes a score for each class ci of a set C = {c1 , c2 , ..., cN } of N disjoint classes of objects. A set of probabilities pt (x ∈ ci ) that the pixel at pixel position x and time t belongs to each class ci of the set C may be calculated by applying a softmax function to the scores vt i (x).
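The softmax mapping from the real-valued class scores to a probability distribution over the N classes can be sketched as follows; the score values are made up for illustration and are not produced by any particular segmentation network:

```python
import numpy as np

def softmax(scores: np.ndarray) -> np.ndarray:
    """Turn a vector of real-valued class scores into probabilities."""
    shifted = scores - scores.max()   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Made-up scores for N = 4 classes at one pixel position x and time t:
v = np.array([2.0, 0.5, -1.0, 0.0])
p = softmax(v)
print(round(float(p.sum()), 6))  # 1.0: the class probabilities sum to one
```

The highest score receives the highest probability, and the probabilities over the N disjoint classes always sum to one, as a probability distribution requires.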
  • Among the N disjoint classes of objects of set C, a subset R may correspond to foreground-relevant objects, that is, objects relevant to motion detection. For instance, if the images relate to a street scene, these foreground-relevant objects may comprise potentially mobile objects like vehicles, people and animals, but not typically immobile objects like trees or buildings. Using the semantic segmentation algorithm, it is thus possible to calculate an aggregated probability pS,t (x) = pt (x ∈ R) = Σci ∈R pt (x ∈ ci ) that the pixel at pixel position x and time t belongs to a foreground-relevant object. It may be possible to consider different subsets R, possibly with different numbers of disjoint classes of foreground-relevant objects, for different areas of an image. For instance, when the image shows both a road and a walking path, the subset R may include just people and animals as foreground-relevant object classes in the area of the walking path, but also vehicles in the area of the road.
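Given the per-class probabilities, aggregating over the foreground-relevant subset R reduces to a sum. A minimal sketch, with illustrative class names and probability values that are not taken from the disclosure:

```python
# Per-class probabilities pt(x in ci) at one pixel (illustrative values only):
p_t = {"road": 0.05, "building": 0.10, "person": 0.60, "car": 0.20, "tree": 0.05}

# Foreground-relevant subset R: classes that can be expected to move.
R = {"person", "car"}

# Aggregated probability pS,t(x) = sum of pt(x in ci) over all ci in R.
p_S_t = sum(p for c, p in p_t.items() if c in R)
print(round(p_S_t, 6))  # 0.8
```

For the walking-path area of the same scene, R could be restricted to {"person"} and the sum recomputed, implementing the per-area subsets mentioned above.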
  • This probability pS,t (x) that the pixel at pixel position x and time t belongs to a foreground-relevant object according to the semantic segmentation algorithm can be used in a method for assigning pixels to foreground and background pixel sets in each image of the set of images. Fig. 1 shows a flowchart illustrating a core routine of this method, wherein the pixel at pixel position x and time t is assigned to either the foreground pixel set or the background pixel set. In a first step S100, the probability pS,t (x) that the pixel at pixel position x and time t belongs to a foreground-relevant object is calculated using the semantic segmentation algorithm. In a second step S200, it is determined whether this probability pS,t (x) is lower than or equal to a first predetermined threshold τBG . If the result of this comparison is positive, and it is thus determined that the probability pS,t (x) that the pixel at pixel position x and time t belongs to a foreground-relevant object according to the semantic segmentation algorithm does not exceed the first predetermined threshold τBG , it is considered unlikely that the pixel at pixel position x and time t belongs to a potentially mobile object, and the pixel at pixel position x and time t is thus assigned to the background in step S300. Using a binary variable D ∈ {BG, FG}, wherein the value BG indicates a background pixel and the value FG indicates a foreground pixel, this can be expressed as a first rule: St BG (x) ≤ τBG ⟹ Dt (x) = BG, wherein St BG (x) denotes a signal that equals the probability pS,t (x), and Dt (x) denotes the value of the binary variable D for the pixel at pixel position x and time t. This first rule provides a simple way to address the challenges of illumination changes, dynamic backgrounds, ghosts and strong shadows, which severely affect the performances of conventional background subtraction algorithms by erroneously assigning background pixels to the foreground pixel set.
  • On the other hand, if in step S200 it is determined that the probability pS,t (x) is not lower than or equal to the first predetermined threshold τBG , in the next step S400 it is determined whether a difference St FG (x) = pS,t (x) − Mt (x) is at least equal to a second predetermined threshold τFG , wherein Mt (x) denotes a baseline probability for pixel position x and time t. This baseline probability Mt (x) corresponds to a semantic model of the background for pixel position x and time t. It may have been initialized as equal to the probability pS,0 (x) that a corresponding pixel at pixel position x and time 0, that is, in an initial image of the set of related images, belongs to a foreground-relevant object according to the semantic segmentation algorithm. It may then have been updated according to the following update strategy at each subsequent time step: Dt (x) = FG ⟹ Mt+1 (x) = Mt (x); Dt (x) = BG ⟹ Mt+1 (x) = pS,t (x) with a predetermined probability α of execution, and Mt+1 (x) = Mt (x) otherwise. The probability α may be set, for example, to 0.00024. Therefore, the value of the baseline probability Mt+1 (x) for pixel position x and the next time step t+1 is maintained equal to the baseline probability Mt (x) for a corresponding pixel at time step t, and only updated randomly, according to the predetermined probability of execution α, with the value of the probability pS,t (x), if Dt (x) = BG, that is, if the pixel at pixel position x and time t has been assigned to the background pixel set.
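The conservative update strategy of the semantic background model can be sketched as follows; the value of α is the example value given in the text, while the function and variable names are illustrative:

```python
import random

ALPHA = 0.00024  # example predetermined probability of execution from the text

def update_baseline(M_t: float, p_S_t: float, D_t: str, rng) -> float:
    """Return the baseline probability M_{t+1}(x) for the next time step.

    The baseline is never updated for foreground pixels, and is replaced
    by pS,t(x) only with probability ALPHA for background pixels.
    """
    if D_t == "FG":
        return M_t                 # foreground pixel: keep the baseline
    if rng.random() < ALPHA:
        return p_S_t               # background pixel: rare random update
    return M_t                     # background pixel: otherwise keep

rng = random.Random(42)
print(update_baseline(0.1, 0.9, "FG", rng))  # 0.1: unchanged for foreground
```

Because ALPHA is tiny, a background pixel's baseline changes only about once every few thousand frames, which is what protects the semantic model from being corrupted by intermittent and slow-moving objects.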
  • The random determination, with predetermined probability α of execution, of whether the baseline probability Mt (x) for pixel position x is to be updated, may be carried out using a random number generator. However, since such random numbers cannot be provided by a deterministic computer, a pseudorandom number generator may be used instead with properties similar to those of a true random number generator. Another alternative is the use of a large look-up list of previously generated random or pseudorandom numbers.
  • If the result of the comparison at step S400 is positive and it is thus determined that the difference StFG (x) is indeed equal to or higher than this second predetermined threshold τFG , it is considered that there has been a significant increase in the probability pS,t (x) for the pixel at pixel position x and time t with respect to that to be expected according to the semantic model, and in the next step S500 the pixel at pixel position x and time t is thus assigned to the foreground pixel set. This can be expressed as a second rule:
    StFG (x) ≥ τFG ⇒ Dt (x) = FG
  • This second rule largely prevents foreground pixels that are camouflaged, that is, that have pixel values similar to the background, from being erroneously assigned to the background, which is also a challenge for conventional background subtraction algorithms.
  • However, semantic segmentation alone may not suffice to distinguish between foreground and background, for instance in the case in which a foreground-relevant object (e.g. a moving car) moves in front of a stationary object of the same semantic, foreground-relevant object class (e.g. a parked car). Since both objects belong to the same foreground-relevant object class, the probability pS,t (x) will not significantly increase as the moving object moves in front of the stationary object at pixel position x and time t.
  • To address such a situation, if the result of the comparison at step S400 is negative, that is, if the probability StBG (x) = pS,t (x) exceeds the first predetermined threshold τBG and the difference StFG (x) is lower than the second predetermined threshold τFG , a third rule is applied in the next step S600, assigning the pixel at pixel position x and time t to either the foreground pixel set or the background pixel set according to a conventional background subtraction algorithm comparing a pixel value I (x) of the pixel at pixel position x and time t with a pixel value of a corresponding pixel in a background model based on at least another image of the plurality of related images. This can be expressed as a third rule:
    Dt (x) = Bt (x)
    wherein Bt (x)∈{BG,FG} denotes the result from the background subtraction algorithm.
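Taken together, the three rules form a simple per-pixel decision cascade. A minimal Python sketch, with hypothetical function and argument names, could look as follows:

```python
def classify_pixel(p_s_t, m_t, b_t, tau_bg, tau_fg):
    """Assign one pixel to "BG" or "FG" using the three rules.

    p_s_t  : probability pS,t(x) from the semantic segmentation algorithm
    m_t    : baseline probability Mt(x) of the semantic model
    b_t    : result of the background subtraction algorithm, "BG" or "FG"
    tau_bg : first predetermined threshold
    tau_fg : second predetermined threshold
    """
    # First rule (step S300): low semantic probability means background.
    if p_s_t <= tau_bg:
        return "BG"
    # Second rule (step S500): large increase over the semantic model
    # means foreground, even if the pixel is camouflaged.
    if p_s_t - m_t >= tau_fg:
        return "FG"
    # Third rule (step S600): otherwise defer to background subtraction.
    return b_t
```

Only the third branch consults the background subtraction result, so the semantic signals act as a correction layer on top of any conventional algorithm.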
  • Fig. 2 thus illustrates how the three signals StBG (x), StFG (x) and Bt (x) can be obtained and applied in combination, using the abovementioned three rules, for foreground and background detection. How these signals are combined can also be summarized with the following table:
    Table 1: Foreground and background detection according to the three rules of the method
    Bt (x) | StBG (x) ≤ τBG | StFG (x) ≥ τFG | Dt (x)
    BG     | false          | false          | BG
    BG     | false          | true           | FG
    BG     | true           | false          | BG
    BG     | true           | true           | X
    FG     | false          | false          | FG
    FG     | false          | true           | FG
    FG     | true           | false          | BG
    FG     | true           | true           | X
  • If the first predetermined threshold τBG is set lower than the second predetermined threshold τFG , the two situations denoted with "X" in Table 1 above cannot be encountered in practice: since the baseline probability Mt (x) is non-negative, pS,t (x) ≤ τBG implies StFG (x) ≤ τBG < τFG .
  • Because the first rule only assigns pixels to the background pixel set, raising the first predetermined threshold τBG so that the first rule is applied more frequently can only decrease the True Positive Rate TPR, that is the rate at which pixels are correctly assigned to the foreground, and the False Positive Rate FPR, that is the rate at which pixels are erroneously assigned to the foreground pixel set. On the other hand, because the second rule only assigns pixels to the foreground pixel set, decreasing the second predetermined threshold τFG so that the second rule is applied more frequently can only increase the True Positive Rate TPR and the False Positive Rate FPR. Ideally, the first predetermined threshold τBG and second predetermined threshold τFG are thus to be set at the level that achieves the best compromise between the highest possible True Positive Rate TPR and the lowest possible False Positive Rate FPR.
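For reference, the True Positive Rate and False Positive Rate discussed above can be computed from per-pixel counts as in the following sketch; the helper name is an assumption:

```python
def rates(tp, fp, tn, fn):
    """True Positive Rate and False Positive Rate of a pixel classifier.

    TPR: fraction of ground-truth foreground pixels correctly assigned
         to the foreground pixel set.
    FPR: fraction of ground-truth background pixels erroneously assigned
         to the foreground pixel set.
    """
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr
```

Raising τBG can only lower both numbers (more pixels forced to the background), while lowering τFG can only raise both (more pixels forced to the foreground), which is why the two thresholds trade off against each other.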
  • A first alternative for setting the first predetermined threshold τBG and second predetermined threshold τFG is to perform tests on test image sequences using the abovementioned method with various different values for these two thresholds, and to select the pair of values resulting, for given background subtraction and semantic segmentation algorithms, in the best overall F score, that is, the highest harmonic mean of precision and recall. Here, the precision is the ratio of true positives (instances of pixels correctly assigned to the foreground pixel set) to the sum of true positives and false positives (instances of pixels erroneously assigned to the foreground pixel set), and the recall is the ratio of true positives to the sum of true positives and false negatives (instances of pixels erroneously assigned to the background pixel set). This can be performed as a grid search optimization.
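A grid search of the kind described above might be sketched as follows; the evaluate callback, which would run the method over the test sequences and count foreground detections, is hypothetical:

```python
import itertools

def f_score(tp, fp, fn):
    """Harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def grid_search(evaluate, bg_grid, fg_grid):
    """Return the (tau_bg, tau_fg) pair with the best overall F score.

    evaluate(tau_bg, tau_fg) must run the method on the test sequences and
    return the counts (tp, fp, fn) of foreground pixel detections.
    """
    best = None
    for tau_bg, tau_fg in itertools.product(bg_grid, fg_grid):
        score = f_score(*evaluate(tau_bg, tau_fg))
        if best is None or score > best[0]:
            best = (score, tau_bg, tau_fg)
    return best[1], best[2]
```

In practice the evaluation is by far the dominant cost, since each grid point requires re-running the combined method over every test sequence.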
  • The inventors have carried out such tests on 53 video sequences, organized in 11 categories, of the CDNet dataset presented by Y. Wang, P.-M. Jodoin, F. Porikli, J. Konrad, Y. Benezeth, and P. Ishwar in "CDnet 2014: An expanded change detection benchmark dataset", IEEE Int. Conf. Comput. Vision and Pattern Recogn. Workshops (CVPRW), pages 393-400, Columbus, Ohio, USA, June 2014, applying the presently disclosed method using 34 different background subtraction algorithms and, as semantic segmentation algorithm, the deep architecture PSPNet disclosed by H. Zhao, J. Shi, X. Qi, X. Wang and J. Jia in "Pyramid scene parsing network", CoRR, vol. abs/1612.01105, trained on the ADE20K dataset presented by B. Zhou, H. Zhao, X. Puig, S. Fidler, A. Barriuso and A. Torralba in "Semantic understanding of scenes through the ADE20K dataset", CoRR, vol. abs/1608.05442, Aug. 2016, to extract semantics, using the PSPNet50 ADE20K model made publicly available by H. Zhao, running at approximately 7 fps at a 473 x 473 pixel image resolution on an NVIDIA® GeForce® GTX Titan X GPU. The last layer of this PSPNet architecture assigns to each pixel a score for each class ci of a set C={c 1,c 2,...,c N} of N=150 disjoint object classes. In these tests, the selected subset of foreground-relevant object classes is R={person, car, cushion, box, book, boat, bus, truck, bottle, van, bag, bicycle}, corresponding to the semantics of CDNet foreground objects.
  • During these tests, it was found that there is a close correlation between the False Positive Rate FPR and True Positive Rate TPR of the background subtraction algorithm used in the third rule of the abovementioned method and, respectively, the first predetermined threshold τBG and second predetermined threshold τFG that achieve the best overall F score when applied in the first and second rules of the same method, as shown in Figs. 3A and 3B.
  • Consequently, a second alternative approach is to heuristically set the first predetermined threshold τBG and second predetermined threshold τFG based on, respectively, the False Positive Rate FPR and True Positive Rate TPR of the background subtraction algorithm to be used in the third rule of the method. For instance, the first predetermined threshold τBG may be set as half the False Positive Rate FPR of the background subtraction algorithm, and the second predetermined threshold τFG as equal to the True Positive Rate TPR of the background subtraction algorithm. Since the background subtraction algorithm should, by definition, perform better than a random classifier, its False Positive Rate FPR should be lower than its True Positive Rate TPR, thus ensuring that the first predetermined threshold τBG is also lower than the second predetermined threshold τ FG .
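The heuristic of the second alternative approach reduces to two assignments; as a sketch, with an assumed function name:

```python
def heuristic_thresholds(fpr, tpr):
    """Set the thresholds from the rates of the background subtraction algorithm.

    tau_bg is taken as half its False Positive Rate, tau_fg as equal to its
    True Positive Rate. A useful algorithm has FPR < TPR (it beats a random
    classifier), which guarantees tau_bg < tau_fg.
    """
    tau_bg = fpr / 2.0
    tau_fg = tpr
    assert tau_bg < tau_fg, "background subtraction should beat a random classifier"
    return tau_bg, tau_fg
```

This avoids any grid search: the thresholds follow directly from the published or measured operating point of the chosen background subtraction algorithm.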
  • According to a third alternative approach, the first predetermined threshold τBG and second predetermined threshold τFG may be set to default values, corresponding for example to the arithmetic mean of the values for these thresholds resulting in the best overall F score for each of the best-performing five background subtraction algorithms in the 2014 CDNet ranking, with the same semantic segmentation algorithm.
  • Each one of these three alternative approaches has been tested and found to provide very significant improvements over the results of the underlying background subtraction algorithm on its own. Fig. 4 illustrates this improvement, defined as one minus the error rate of the method combining background subtraction with semantic segmentation divided by the error rate of the background subtraction algorithm on its own, for each one of these three approaches. More specifically, Fig. 4 illustrates the mean improvement, measured on the overall CDNet dataset, both for the entire set of 34 background subtraction algorithms, and for only the 5 best-performing background subtraction algorithms. As can be seen on this figure, the first approach offers a very significant improvement, even over the background subtraction algorithms that already performed best, and this improvement is hardly decreased with the second and third alternative approaches.
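The improvement figure used here, one minus the ratio of the two error rates, can be expressed directly:

```python
def improvement(error_rate_combined, error_rate_bgs):
    """Improvement of the combined method over background subtraction alone.

    Defined as 1 - ER_combined / ER_bgs: positive values mean the combination
    reduces the error rate (e.g. 0.75 means the error rate was divided by 4).
    """
    return 1.0 - error_rate_combined / error_rate_bgs
```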
  • Fig. 5 illustrates the improvement, with respect to each background subtraction algorithm of the abovementioned set of 34 different background subtraction algorithms, in terms of change in the mean True Positive Rate TPR and False Positive Rate FPR. As can be seen there, the present method tends to reduce significantly the False Positive Rate FPR, while simultaneously increasing the True Positive Rate TPR.
  • Fig. 6 illustrates the mean improvement, both for all 34 different background subtraction algorithms and for the 5 best-performing, per category of video sequence, in the 11 categories of the CDNet dataset: "Baseline", "Dynamic background", "Camera jitter", "Intermittent object motion", "Shadow", "Thermal", "Bad weather", "Low framerate", "Night videos", "Pan-Tilt-Zoom Camera" and "Turbulence". Particularly good improvements can be observed for the "Baseline", "Dynamic background", "Shadow" and "Bad weather" categories. With respect to the "Thermal" and "Night videos" categories, it must be noted that the ADE20K dataset used to train the semantic segmentation algorithm did not include images of these types, which may explain the poorer results for those categories.
  • Fig. 7 illustrates the benefits of the method according to the present disclosure in four different scenarios of real-world surveillance tasks. From left to right, the four columns correspond, respectively, to scenarios with dynamic background, risk of ghosts, strong shadows, and camouflage effects. From the top down, the five rows illustrate a frame of the corresponding video sequence, the probability pS,t (x) for each pixel, the output of the IUTIS-5 background subtraction algorithm described by S. Bianco, G. Ciocca and R. Schettini in "How far can you get by combining change detection algorithms?", CoRR, vol. abs/1505.02921, 2015, the output of the presently-disclosed method, applying the IUTIS-5 background subtraction algorithm in its third rule, and the ground truth. As can be seen, with respect to the IUTIS-5 background subtraction algorithm on its own, the presently-disclosed method greatly reduces the number of false positive foreground pixel detections caused by dynamic backgrounds, ghosts and strong shadows, while at the same time mitigating camouflage effects.
  • The presently disclosed method may be carried out with assistance of a data processing device, such as, for example, a programmable computer like the abovementioned NVIDIA® GeForce® GTX Titan X GPU, connected to an imaging device providing a video sequence of successive images. In such a case, the data processing device may receive instructions for carrying out this method using a data storage medium, or as signals in magnetic, electromagnetic, electric and/or mechanical form.
  • The presently disclosed method may, for example, be applied to video-surveillance, professional and/or consumer digital still and/or video cameras, computer and videogame devices using image capture interfaces, satellite imaging and Earth observation, automatic image analysis and/or medical imaging systems or may be included in a smartphone.
  • Fig. 8 illustrates a possible application of the invention with an imaging device 1 in the particular form of a digital camera with an embedded data processing device 2 programmed to carry out the method of the invention. Fig. 9 illustrates another possible application of the invention with an imaging device 1 connected to a data processing device 2 programmed to carry out the method of the invention.
  • Those skilled in the art will recognize that the present invention may be manifested in a variety of forms other than the specific embodiments described and contemplated herein. Accordingly, departure in form and detail may be made without departing from the scope of the present invention as described in the appended claims.

Claims (11)

  1. A method for assigning a pixel to one of a foreground pixel set and a background pixel set, comprising the steps of :
    calculating (S100) a probability that the pixel belongs to a foreground-relevant object according to a semantic segmentation algorithm ;
    assigning (S300) the pixel to the background pixel set if the probability that the pixel belongs to a foreground-relevant object does not exceed a first predetermined threshold ; and
    assigning (S500) the pixel to the foreground pixel set if the probability that the pixel belongs to a foreground-relevant object exceeds the first predetermined threshold and a difference between the probability that the pixel belongs to a foreground-relevant object and a baseline probability for the pixel equals or exceeds a second predetermined threshold.
  2. The method according to claim 1, further comprising a baseline updating step, wherein the baseline probability for the pixel is made equal to the probability that the pixel belongs to a foreground-relevant object calculated according to the semantic segmentation algorithm, if the pixel has been assigned to the background pixel set.
  3. The method according to claim 2, wherein the baseline updating step is executed only randomly, according to a predetermined probability of execution, if the pixel has been assigned to the background pixel set.
  4. The method according to any preceding claim, further comprising a step of assigning the pixel to either the foreground pixel set or the background pixel set according to a background subtraction algorithm comparing a pixel value of the pixel with a pixel value of a corresponding pixel in a background model, if the probability that the pixel belongs to a foreground-relevant object exceeds the first predetermined threshold and the difference between the probability that the pixel belongs to a foreground-relevant object and the baseline probability for the pixel is lower than the second predetermined threshold.
  5. The method according to claim 4 wherein the pixel belongs to an image and the background model is based on at least another, related image.
  6. The method according to any one of the previous claims, wherein the pixel belongs to an image of a chronological sequence of images.
  7. The method according to claim 6, wherein the baseline probability for the pixel has been initialized as equal to a probability that a corresponding pixel in an initial image of the chronological sequence of images belongs to a foreground-relevant object, calculated according to the semantic segmentation algorithm.
  8. The method according to any preceding claim, wherein the first and second predetermined thresholds have been predetermined so as to optimize an F score of the method on a test image series.
  9. The method according to any one of claims 1-7, wherein the first and second predetermined thresholds have been heuristically predetermined based on, respectively, a false foreground detection rate of the background subtraction algorithm and a true foreground detection rate of the background subtraction algorithm.
  10. A data processing device (2) programmed so as to carry out a method according to any one of claims 1 to 9.
  11. A data storage medium comprising a set of instructions configured to be read by a data processing device (2) to carry out a method according to any one of claims 1 to 9.
EP17184931.8A 2017-08-04 2017-08-04 Foreground and background detection method Active EP3438929B1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
EP17184931.8A EP3438929B1 (en) 2017-08-04 2017-08-04 Foreground and background detection method
CN201711402509.9A CN109389618B (en) 2017-08-04 2017-12-22 Foreground and background detection method
US15/983,238 US10614736B2 (en) 2017-08-04 2018-05-18 Foreground and background detection method
US16/267,474 US20190251695A1 (en) 2017-08-04 2019-02-05 Foreground and background detection method
US16/288,468 US10706558B2 (en) 2017-08-04 2019-02-28 Foreground and background detection method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
EP17184931.8A EP3438929B1 (en) 2017-08-04 2017-08-04 Foreground and background detection method

Publications (2)

Publication Number Publication Date
EP3438929A1 true EP3438929A1 (en) 2019-02-06
EP3438929B1 EP3438929B1 (en) 2020-07-08

Family

ID=59702525

Family Applications (1)

Application Number Title Priority Date Filing Date
EP17184931.8A Active EP3438929B1 (en) 2017-08-04 2017-08-04 Foreground and background detection method

Country Status (3)

Country Link
US (1) US10614736B2 (en)
EP (1) EP3438929B1 (en)
CN (1) CN109389618B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10747811B2 (en) * 2018-05-22 2020-08-18 Adobe Inc. Compositing aware digital image search
EP3582181B1 (en) * 2018-06-14 2020-04-01 Axis AB Method, device and system for determining whether pixel positions in an image frame belong to a background or a foreground
CN114463367A (en) * 2019-04-30 2022-05-10 腾讯科技(深圳)有限公司 Image processing method and device
US10984558B2 (en) * 2019-05-09 2021-04-20 Disney Enterprises, Inc. Learning-based sampling for image matting
EP3800615A1 (en) * 2019-10-01 2021-04-07 Axis AB Method and device for image analysis
US11109586B2 (en) 2019-11-13 2021-09-07 Bird Control Group, Bv System and methods for automated wildlife detection, monitoring and control
CN111488882B (en) * 2020-04-10 2020-12-25 视研智能科技(广州)有限公司 High-precision image semantic segmentation method for industrial part measurement
JP7475959B2 (en) * 2020-05-20 2024-04-30 キヤノン株式会社 Image processing device, image processing method, and program
CN111815673A (en) * 2020-06-23 2020-10-23 四川虹美智能科技有限公司 Moving object detection method, device and readable medium

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050002572A1 (en) * 2003-07-03 2005-01-06 General Electric Company Methods and systems for detecting objects of interest in spatio-temporal signals

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7639840B2 (en) * 2004-07-28 2009-12-29 Sarnoff Corporation Method and apparatus for improved video surveillance through classification of detected objects
US7783075B2 (en) * 2006-06-07 2010-08-24 Microsoft Corp. Background blurring for video conferencing
EP2463821A1 (en) * 2010-12-08 2012-06-13 Alcatel Lucent Method and system for segmenting an image
US9153031B2 (en) * 2011-06-22 2015-10-06 Microsoft Technology Licensing, Llc Modifying video regions using mobile device input
CN103325112B (en) * 2013-06-07 2016-03-23 中国民航大学 Moving target method for quick in dynamic scene
US9323991B2 (en) * 2013-11-26 2016-04-26 Xerox Corporation Method and system for video-based vehicle tracking adaptable to traffic conditions
US9414016B2 (en) * 2013-12-31 2016-08-09 Personify, Inc. System and methods for persona identification using combined probability maps
CN104751484B (en) * 2015-03-20 2017-08-25 西安理工大学 A kind of moving target detecting method and the detecting system for realizing moving target detecting method
CN106875406B (en) * 2017-01-24 2020-04-14 北京航空航天大学 Image-guided video semantic object segmentation method and device

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050002572A1 (en) * 2003-07-03 2005-01-06 General Electric Company Methods and systems for detecting objects of interest in spatio-temporal signals

Non-Patent Citations (20)

* Cited by examiner, † Cited by third party
Title
B. ZHOU; H. ZHAO; X. PUIG; S. FIDLER; A. BARRIUSO; A. TORRALBA: "Semantic understanding of scenes through the ADE20K dataset", CORR, vol. abs/1608, August 2016 (2016-08-01)
C. STAUFFER; E. GRIMSON: "Adaptive background mixture models for real-time tracking", IEEE INT. CONF. COMPUT. VISION AND PATTERN RECOGN. (CVPR, vol. 2, June 1999 (1999-06-01), pages 246 - 252
H. ZHAO; J. SHI; X. QI; X. WANG; J. JIA: "Pyramid scene parsing network", CORR, vol. abs/1612, December 2016 (2016-12-01)
J. LONG; E. SHELHAMER; T. DARRELL: "Fully convolutional networks for semantic segmentation", IEEE INT. CONF. COMPUT. VISION AND PATTERN RECOGN. (CVPR, June 2015 (2015-06-01), pages 3431 - 3440, XP032793793, DOI: doi:10.1109/CVPR.2015.7298965
L. SEVILLA-LARA; D. SUN; V. JAMPANI; M. J. BLACK: "Optical flow with semantic segmentation and localized layers", IEEE INT. CONF. COMPUT. VISION AND PATTERN RECOGN. (CVPR, June 2016 (2016-06-01), pages 3889 - 3898, XP033021574, DOI: doi:10.1109/CVPR.2016.422
M. BRAHAM; A. LEJEUNE; M. VAN DROOGENBROECK: "A physically motivated pixel-based model for background subtraction in 3D images", IEEE INT. CONF. 3D IMAGING (IC3D, December 2014 (2014-12-01), pages 1 - 8
M. CHEN; Q. YANG; Q. LI; G. WANG; M.-H. YANG: "Eur. Conf. Comput. Vision (ECCV", vol. 8695, September 2014, SPRINGER, article "Spatiotemporal background subtraction using minimum spanning tree and optical flow", pages: 521 - 534
O. BARNICH; M. VAN DROOGENBROECK: "ViBe: A universal background subtraction algorithm for video sequence", IEEE TRANS. IMAGE PROCESS., vol. 20, no. 6, June 2011 (2011-06-01), pages 1709 - 1724, XP011411821, DOI: doi:10.1109/TIP.2010.2101613
P.-M. JODOIN; S. PIERARD; Y WANG; M. VAN DROOGENBROECK: "Background Modeling and Foreground Detection for Video Surveillance", July 2014, CHAPMAN AND HALL/CRC, article "Overview and benchmarking of motion detection methods"
S. PIERARD; M. VAN DROOGENBROECK: "A perfect estimation of a background image does not lead to a perfect background subtraction: analysis of the upper bound on the performance", INT. CONF. IMAGE ANAL. AND PROCESS. (ICIAP), WORKSHOP SCENE BACKGROUND MODELING AND INITIALIZATION (SBMI, vol. 9281, September 2015 (2015-09-01), pages 527 - 534
S. ZHANG; H. YAO; S. LIU: "Dynamic background modeling and subtraction using spatio-temporal local binary patterns", IEEE INT. CONF. IMAGE PROCESS. (ICIP, October 2008 (2008-10-01), pages 1556 - 1559, XP031374312, DOI: doi:10.1109/ICIP.2008.4712065
S. ZHENG; S. JAYASUMANA; B. ROMERA-PAREDES; V. VINEET; Z. SU; D. DU; C. HUANG; P. TORR: "Conditional random fields as recurrent neural networks", IEEE INT. CONF. COMPUT. VISION (ICCV, December 2015 (2015-12-01), pages 1529 - 1537, XP032866501, DOI: doi:10.1109/ICCV.2015.179
SERGI CAELLES ET AL: "Semantically-Guided Video Object Segmentation", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 6 April 2017 (2017-04-06), XP080761260 *
SEVILLA-LARA LAURA ET AL: "Optical Flow with Semantic Segmentation and Localized Layers", 2016 IEEE CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR), IEEE, 27 June 2016 (2016-06-27), pages 3889 - 3898, XP033021574, DOI: 10.1109/CVPR.2016.422 *
T. BOUWMANS: "Traditional and recent approaches in background modeling for foreground detection: An overview", COMPUTER SCIENCE REVIEW, vol. 11-12, May 2014 (2014-05-01), pages 31 - 66
T. BOUWMANS; C. SILVA; C. MARGHES; M. ZITOUNI; H. BHASKAR; C. FRELICOT: "On the role and the importance of features for background modeling and foreground detection", CORR, vol. abs/1611, November 2016 (2016-11-01), pages 1 - 131
V. JAIN; B. KIMIA; J. MUNDY: "Background modeling based on subpixel edges", IEEE INT. CONF. IMAGE PROCESS. (ICIP, vol. 6, September 2007 (2007-09-01), pages 321 - 324
VIKAS REDDY ET AL: "Improved Foreground Detection via Block-Based Classifier Cascade With Probabilistic Decision Integration", IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS FOR VIDEO TECHNOLOGY, INSTITUTE OF ELECTRICAL AND ELECTRONICS ENGINEERS, USA, vol. 23, no. 1, 1 January 2013 (2013-01-01), pages 83 - 93, XP011486355, ISSN: 1051-8215, DOI: 10.1109/TCSVT.2012.2203199 *
Y. WANG; P.-M. JODOIN; F. PORIKLI; J. KONRAD; Y. BENEZETH; P. ISHWAR: "CDnet 2014: An expanded change detection benchmark dataset", IEEE INT. CONF. COMPUT. VISION AND PATTERN RECOGN. WORKSHOPS (CVPRW, June 2014 (2014-06-01), pages 393 - 400, XP032649709, DOI: doi:10.1109/CVPRW.2014.126

Also Published As

Publication number Publication date
CN109389618B (en) 2022-03-01
US10614736B2 (en) 2020-04-07
US20190043403A1 (en) 2019-02-07
CN109389618A (en) 2019-02-26
EP3438929B1 (en) 2020-07-08

Similar Documents

Publication Publication Date Title
EP3438929B1 (en) Foreground and background detection method
CN109584248B (en) Infrared target instance segmentation method based on feature fusion and dense connection network
US10706558B2 (en) Foreground and background detection method
Bouwmans et al. Scene background initialization: A taxonomy
US8571261B2 (en) System and method for motion detection in a surveillance video
Nonaka et al. Evaluation report of integrated background modeling based on spatio-temporal features
Krig et al. Image pre-processing
US20140063275A1 (en) Visual saliency estimation for images and video
CN109460764B (en) Satellite video ship monitoring method combining brightness characteristics and improved interframe difference method
KR101436369B1 (en) Apparatus and method for detecting multiple object using adaptive block partitioning
US11887346B2 (en) Systems and methods for image feature extraction
Russell et al. An evaluation of moving shadow detection techniques
CN108765460B (en) Hyperspectral image-based space-time joint anomaly detection method and electronic equipment
US20190251695A1 (en) Foreground and background detection method
Chen et al. Visual depth guided image rain streaks removal via sparse coding
CN114581318A (en) Low-illumination image enhancement method and system
Moghimi et al. Shadow detection based on combinations of HSV color space and orthogonal transformation in surveillance videos
CN111815529B (en) Low-quality image classification enhancement method based on model fusion and data enhancement
CN114842235B (en) Infrared dim and small target identification method based on shape prior segmentation and multi-scale feature aggregation
Lee et al. Global illumination invariant object detection with level set based bimodal segmentation
CN113592801A (en) Method and device for detecting stripe interference of video image
CN112487994A (en) Smoke and fire detection method and system, storage medium and terminal
Qi et al. Fast detection of small infrared objects in maritime scenes using local minimum patterns
Thinh et al. Depth-aware salient object segmentation
CN116110076B (en) Power transmission aerial work personnel identity re-identification method and system based on mixed granularity network

Legal Events

Date Code Title Description
PUAI Public reference made under article 153(3) epc to a published international application that has entered the european phase

Free format text: ORIGINAL CODE: 0009012

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE APPLICATION HAS BEEN PUBLISHED

AK Designated contracting states

Kind code of ref document: A1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

AX Request for extension of the european patent

Extension state: BA ME

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: REQUEST FOR EXAMINATION WAS MADE

17P Request for examination filed

Effective date: 20190806

RBV Designated contracting states (corrected)

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

GRAP Despatch of communication of intention to grant a patent

Free format text: ORIGINAL CODE: EPIDOSNIGR1

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: GRANT OF PATENT IS INTENDED

GRAS Grant fee paid

Free format text: ORIGINAL CODE: EPIDOSNIGR3

GRAA (expected) grant

Free format text: ORIGINAL CODE: 0009210

STAA Information on the status of an ep patent application or granted ep patent

Free format text: STATUS: THE PATENT HAS BEEN GRANTED

INTG Intention to grant announced

Effective date: 20200515

AK Designated contracting states

Kind code of ref document: B1

Designated state(s): AL AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HR HU IE IS IT LI LT LU LV MC MK MT NL NO PL PT RO RS SE SI SK SM TR

REG Reference to a national code

Ref country code: CH

Ref legal event code: EP

Ref country code: AT

Ref legal event code: REF

Ref document number: 1289276

Country of ref document: AT

Kind code of ref document: T

Effective date: 20200715

REG Reference to a national code

Ref country code: DE

Ref legal event code: R096

Ref document number: 602017019264

Country of ref document: DE

REG Reference to a national code

Ref country code: IE

Ref legal event code: FG4D

REG Reference to a national code

Ref country code: LT

Ref legal event code: MG4D

REG Reference to a national code

Ref country code: AT

Ref legal event code: MK05

Ref document number: 1289276

Country of ref document: AT

Kind code of ref document: T

Effective date: 20200708

REG Reference to a national code

Ref country code: NL

Ref legal event code: MP

Effective date: 20200708

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
Ref country code: GR; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20201009
Ref country code: ES; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: AT; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: BG; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20201008
Ref country code: NO; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20201008
Ref country code: PT; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20201109
Ref country code: LT; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: FI; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: SE; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: HR; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
Ref country code: LV; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: PL; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: RS; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: IS; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20201108

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
Ref country code: NL; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708

REG Reference to a national code
Ref country code: CH; Ref legal event code: PL

REG Reference to a national code
Ref country code: DE; Ref legal event code: R097; Ref document number: 602017019264; Country of ref document: DE

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
Ref country code: LU; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20200804
Ref country code: LI; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20200831
Ref country code: RO; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: CZ; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: DK; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: CH; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20200831
Ref country code: SM; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: EE; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: IT; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708

PLBE No opposition filed within time limit
Free format text: ORIGINAL CODE: 0009261

STAA Information on the status of an ep patent application or granted ep patent
Free format text: STATUS: NO OPPOSITION FILED WITHIN TIME LIMIT

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
Ref country code: AL; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708

26N No opposition filed
Effective date: 20210409

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
Ref country code: SK; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
Ref country code: SI; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: IE; Free format text: LAPSE BECAUSE OF NON-PAYMENT OF DUE FEES; Effective date: 20200804

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
Ref country code: TR; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: MT; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: CY; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708

PG25 Lapsed in a contracting state [announced via postgrant information from national office to epo]
Ref country code: MK; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708
Ref country code: MC; Free format text: LAPSE BECAUSE OF FAILURE TO SUBMIT A TRANSLATION OF THE DESCRIPTION OR TO PAY THE FEE WITHIN THE PRESCRIBED TIME-LIMIT; Effective date: 20200708

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]
Ref country code: GB; Payment date: 20230822; Year of fee payment: 7

PGFP Annual fee paid to national office [announced via postgrant information from national office to epo]
Ref country code: FR; Payment date: 20230828; Year of fee payment: 7
Ref country code: DE; Payment date: 20230821; Year of fee payment: 7
Ref country code: BE; Payment date: 20230821; Year of fee payment: 7