US11354808B2 - Image processing apparatus and method and monitoring system for classifying visual elements as foreground or background - Google Patents

Image processing apparatus and method and monitoring system for classifying visual elements as foreground or background

Info

Publication number
US11354808B2
Authority
US
United States
Prior art keywords
visual
background
visual elements
image
current image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US16/140,279
Other languages
English (en)
Other versions
US20190102887A1 (en)
Inventor
Qin Yang
Tsewei Chen
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Inc filed Critical Canon Inc
Assigned to CANON KABUSHIKI KAISHA. Assignment of assignors interest (see document for details). Assignors: YANG, QIN; CHEN, TSEWEI
Publication of US20190102887A1
Application granted
Publication of US11354808B2
Legal status: Active (current)
Adjusted expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/254Analysis of motion involving subtraction of images
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/194Segmentation; Edge detection involving foreground-background segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/20Image preprocessing
    • G06V10/22Image preprocessing by selection of a specific region containing or referencing a pattern; Locating or processing of specific regions to guide the detection or recognition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/764Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/772Determining representative reference patterns, e.g. averaging or distorting patterns; Generating dictionaries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20048Transform domain processing
    • G06T2207/20052Discrete cosine transform [DCT]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30232Surveillance

Definitions

  • the present invention relates to image processing, and especially to, for example, foreground detection and monitoring.
  • an image is generally made up of visual elements that are visible characteristics contributing to the appearance of the image.
  • one visual element could be, for example, a pixel, a Discrete Cosine Transform (DCT) block which represents a group of pixels, or a super-pixel which represents a group of pixels with similar attributes (e.g. similar texture, similar color, similar luminance).
  • existing background subtraction techniques are generally used to classify the visual elements in a current image of a video as "foreground" or "background" by comparing them with a background image which is obtained based on the images of the video within a certain duration preceding the current image.
  • the “foreground” refers to transient objects that appear in a scene captured on a video. Such transient objects may include, for example, moving humans or moving cars. The remaining part of the scene is considered to be the “background”.
  • one exemplary technique is disclosed in "Moving Objects Detection and Segmentation Based on Background Subtraction and Image Over-Segmentation" (Yun-fang Zhu, Journal of Software, Vol. 6, No. 7, July 2011).
  • as for one visual element in a current image, this exemplary technique detects the visual element as the foreground or the background according to a background confidence of the visual element. More specifically, in case the background confidence of the visual element is larger than a threshold, the visual element will be detected as the background.
  • the background confidence of the visual element is a ratio of a first number to a second number, where the first number denotes the number of visual elements in the current image which neighbour the visual element and have the same color as it, and the second number denotes the number of visual elements in the current image which neighbour the visual element and are detected as the foreground.
  • however, in case the background itself includes movements, both the first number and the second number will be larger, which makes the background confidence of the visual element to be detected smaller and less than the threshold. Therefore, such false foreground detection cannot be eliminated effectively, which makes the foreground detection less accurate than desired.
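  • for illustration only, the following is a minimal Python sketch of the prior-art background confidence described above; the 8-neighbourhood, the quantized color indices and the handling of the zero-foreground case are assumptions of this sketch, not details taken from the cited paper:

```python
import numpy as np

def background_confidence(foreground, colors, x, y):
    """Prior-art style background confidence of the visual element at (x, y).

    foreground: 2D bool array, True where an element is detected as foreground.
    colors: 2D array of quantized color indices, one per visual element.
    Returns first_number / second_number as described above.
    """
    h, w = foreground.shape
    first_number = 0   # neighbours with the same color as (x, y)
    second_number = 0  # neighbours detected as foreground
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            if dx == 0 and dy == 0:
                continue
            nx, ny = x + dx, y + dy
            if 0 <= nx < h and 0 <= ny < w:
                if colors[nx, ny] == colors[x, y]:
                    first_number += 1
                if foreground[nx, ny]:
                    second_number += 1
    if second_number == 0:
        return float("inf")  # no foreground neighbours: treated as background
    return first_number / second_number
```

  • when the background moves (e.g. rippling water), both counters grow for a background element, the ratio drops below the threshold, and the element is falsely kept as foreground; this is the failure mode the present disclosure addresses.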
  • the present disclosure aims to solve at least one point of the issues as described above.
  • an image processing apparatus comprising: an acquisition unit configured to acquire a current image from an inputted video and a background model which comprises a background image and classification information of visual elements, wherein the classification information of the visual elements comprises foreground and background; a similarity measure determination unit configured to determine first similarity measures between visual elements in the current image and the visual elements in the background model; and a classification unit configured to classify the visual elements in the current image as the foreground or the background according to the current image, the background image in the background model and the first similarity measures determined by the similarity measure determination unit.
  • wherein the visual elements in the background model are the visual elements whose classification information is the background and which neighbour the corresponding portions of the visual elements in the current image.
  • the background image in the background model is obtained according to at least one previous image of the current image.
  • classification information of the visual elements in the background model is obtained according to the visual elements which are classified as the foreground or the background in at least one previous image of the current image.
  • FIG. 1 is a block diagram schematically showing the hardware configuration that can implement the techniques according to the embodiments of the present invention.
  • FIG. 2 is a block diagram illustrating the configuration of an image processing apparatus according to the embodiment of the present invention.
  • FIGS. 3A to 3H schematically show the foreground/background classification results of the visual elements in the previous images of the current image and the foreground/background classification information of the visual elements in the background model according to the present invention.
  • FIG. 4 schematically shows a flowchart of the image processing according to the embodiment of the present invention.
  • FIGS. 5A and 5B schematically show a current image and the foreground/background classification information of the visual elements in the background model according to the present invention.
  • FIG. 6 schematically shows sub-units of the classification unit 230 as shown in FIG. 2 according to the present invention.
  • FIG. 7 schematically shows a flowchart of step S430 as shown in FIG. 4 according to the present invention.
  • FIG. 8 illustrates the arrangement of an exemplary monitor according to the present invention.
  • a real object (i.e. the foreground) will have a moving trajectory in the video. Therefore, in case one visual element in a current image of the video is the background, the visual elements in at least one previous image of the current image which neighbour the corresponding portion of this visual element will generally also be the background.
  • the inventors found that, in foreground detection, as for one visual element in a current image of a video, the visual elements which are classified as the background in the previous images of the current image and which neighbour the corresponding portions of this visual element in those previous images could be regarded as a reference for determining whether to classify this visual element as the foreground or the background.
  • that is, when classifying one visual element in a current image of a video, besides the differences (i.e. the changes) between this visual element and the corresponding visual element in the previous images, the present disclosure also takes into consideration the similarities between this visual element and the visual elements which are classified as the background in the previous images and which neighbour the corresponding portions of this visual element in the previous images. For example, in case this visual element is similar to those background visual elements (e.g. the texture/color/luminance of these visual elements are similar), the probability that this visual element is the background is high.
  • according to the present disclosure, even if the background includes movements (e.g. water ripples or leaves moving in the wind) in certain images of the video, or even if a graph segmentation algorithm with low accuracy is used to obtain the visual elements which are used for foreground detection, the false foreground detection could be eliminated efficiently, since the foreground/background classification results which are obtained in the previous processing will be used as a reference for the subsequent processing. Thus, the accuracy of the foreground detection will be improved.
  • the hardware configuration 100 includes Central Processing Unit (CPU) 110, Random Access Memory (RAM) 120, Read Only Memory (ROM) 130, Hard Disk 140, Input Device 150, Output Device 160, Network Interface 170 and System Bus 180.
  • in one implementation, the hardware configuration 100 could be implemented by a computer, such as a tablet computer, laptop, desktop or other suitable electronic device.
  • in another implementation, the hardware configuration 100 could be implemented by a monitor, such as a digital camera, video camera, network camera or other suitable electronic device.
  • in the latter case, the hardware configuration 100 further includes, for example, Optical System 190.
  • the image processing according to the present disclosure is configured by hardware or firmware and may function as a module or component of the hardware configuration 100 .
  • the image processing apparatus 200 which will be described in detail hereinafter with reference to FIG. 2 may function as a module or component of the hardware configuration 100 .
  • the image processing according to the present disclosure is configured by software, stored in the ROM 130 or the Hard Disk 140 , and executed by the CPU 110 .
  • the procedure 400 which will be described in detail hereinafter with reference to FIG. 4 may function as a program stored in the ROM 130 or the Hard Disk 140 .
  • the CPU 110 is any suitable programmable control device (such as a processor) and could execute a variety of functions, to be described hereinafter, by executing a variety of application programs that are stored in the ROM 130 or the Hard Disk 140 (such as memories).
  • the RAM 120 is used to temporarily store the program or the data that are loaded from the ROM 130 or the Hard Disk 140 , and is also used as a space wherein the CPU 110 executes the variety of procedures, such as carrying out the techniques which will be described in detail hereinafter with reference to FIG. 4 , as well as other available functions.
  • the Hard Disk 140 stores many kinds of information, such as an operating system (OS), the various applications, a control program, a video, processing results for each image of a video, and/or, pre-defined data (e.g. Thresholds (THs)).
  • the Input Device 150 is used to allow the user to interact with the hardware configuration 100 .
  • the user could input images/videos/data through the Input Device 150 .
  • the user could trigger the corresponding processing of the present disclosure through the Input Device 150 .
  • the Input Device 150 can take a variety of forms, such as a button, a keypad or a touch screen.
  • the Input Device 150 is used to receive images/videos which are outputted from special electronic devices, such as the digital cameras, the video cameras and/or the network cameras.
  • the optical system 190 in the hardware configuration 100 will capture images/videos of a monitoring place directly.
  • the Output Device 160 is used to display the processing results (such as the foreground) to the user.
  • the Output Device 160 can take a variety of forms, such as a Cathode Ray Tube (CRT) or a liquid crystal display.
  • alternatively, the Output Device 160 is used to output the processing results to the subsequent processing, such as a monitoring analysis of whether or not to give an alarm to the user, and so on.
  • the Network Interface 170 provides an interface for connecting the hardware configuration 100 to the network.
  • the hardware configuration 100 could perform, via the Network Interface 170 , data communication with other electronic device connected via the network.
  • a wireless interface may be provided for the hardware configuration 100 to perform wireless data communication.
  • the system bus 180 may provide a data transfer path for transferring data between the CPU 110, the RAM 120, the ROM 130, the Hard Disk 140, the Input Device 150, the Output Device 160, the Network Interface 170 and the like.
  • the system bus 180 is not limited to any specific data transfer technology.
  • the above described hardware configuration 100 is merely illustrative and is in no way intended to limit the invention, its application, or uses. And for the sake of simplicity, only one hardware configuration is shown in FIG. 1 . However, a plurality of hardware configurations can also be used as needed.
  • FIG. 2 is a block diagram illustrating the configuration of the image processing apparatus 200 according to the embodiment of the present disclosure, wherein some or all of the blocks shown in FIG. 2 could be implemented by dedicated hardware. As shown in FIG. 2, the image processing apparatus 200 comprises an acquisition unit 210, a similarity measure determination unit 220 and a classification unit 230.
  • a storage device 240 shown in FIG. 2 stores videos and the processing results (i.e. foreground/background classification results of the visual elements) for each image of the videos.
  • the videos are inputted by the user, or outputted from the special electronic device (e.g. the camera), or captured by the optical system as described in FIG. 1 .
  • the videos and the processing results could be stored in different storage devices.
  • the storage device 240 is the ROM 130 or the Hard Disk 140 shown in FIG. 1 .
  • the storage device 240 is a server or an external storage device which is connected with the image processing apparatus 200 via the network (not shown).
  • in one implementation, the Input Device 150 first receives a video which is outputted from the special electronic device (e.g. the camera) or is input by the user. Second, the Input Device 150 transfers the received video to the image processing apparatus 200 via the system bus 180. In another implementation, for example, in case the hardware configuration 100 is implemented by the monitor, the image processing apparatus 200 directly receives a video which is captured by the optical system 190.
  • the acquisition unit 210 acquires a current image from the received video (i.e. the inputted video).
  • the current image is the t-th image, wherein t is a natural number with 2 ≤ t ≤ T, and T is the total number of images of the inputted video.
  • the acquisition unit 210 acquires a background model from the storage device 240 .
  • the background model comprises a background image and classification information of the visual elements.
  • the classification information of the visual elements comprises the foreground and the background.
  • hereinafter, the "classification information of the visual elements" will be referred to as the "foreground/background classification information of the visual elements".
  • the background image in the background model is obtained according to at least one previous image of the t-th image. That is, the background image is obtained according to at least one image of the video within a certain duration preceding the t-th image, where the duration is not limited and is set based on experimental statistics and/or experience.
  • in one instance, the background image is an average image of the previous images of the t-th image.
  • in another instance, the background image is any one of the previous images of the t-th image.
  • in yet another instance, the background image is obtained on the fly according to models which are generated for each pixel based on, for example, Gaussian Models. However, it is readily apparent that it is not necessarily limited thereto.
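  • as a concrete illustration of the first instance above, the following is a minimal sketch that obtains the background image as the pixel-wise average of the previous frames; stacking the frames into a single NumPy array is an assumption of this sketch:

```python
import numpy as np

def average_background(previous_frames):
    """previous_frames: array-like of shape (N, H, W) or (N, H, W, C),
    holding the N images of the video preceding the t-th image.
    Returns their pixel-wise average as the background image."""
    return np.asarray(previous_frames, dtype=np.float64).mean(axis=0)
```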
  • the foreground/background classification information of the visual elements in the background model is obtained according to the visual elements which are classified as the foreground or the background in at least one previous image of the t-th image.
  • in one instance, the foreground/background classification information of the visual elements is obtained by averaging the foreground/background classification results of the visual elements in the previous images of the t-th image.
  • in another instance, the foreground/background classification information of the visual elements is the foreground/background classification results of the visual elements in any one of the previous images of the t-th image.
  • in yet another instance, the foreground/background classification information of the visual elements is obtained on the fly according to models which are generated for each visual element based on, for example, Gaussian Models. However, it is readily apparent that it is not necessarily limited thereto.
  • assuming that the visual elements are super-pixels and that the foreground/background classification information of the visual elements in the background model is obtained according to the foreground/background classification results of the visual elements in three previous images of the t-th image, the three previous images are, for example, the (t−3)-th image shown in FIG. 3A, the (t−2)-th image shown in FIG. 3B and the (t−1)-th image shown in FIG. 3C, wherein one block in the images shown in FIGS. 3A to 3C represents one visual element, and "B" or "F" in each block represents that this visual element is classified as "background" or "foreground". Hence, in case the averaging operation is executed, the obtained foreground/background classification information of the visual elements in the background model is as shown in FIG. 3D, for example.
  • in case the visual elements are pixels, the foreground/background classification information of the visual elements in the background model could be obtained in the same manner, wherein FIGS. 3E to 3G show the three previous images of the t-th image and FIG. 3H shows the obtained foreground/background classification information of the visual elements in the background model in case the averaging operation is executed.
  • note that the previous images used to obtain the background image and the previous images used to obtain the foreground/background classification information could be the same images or different images.
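  • the averaging instance illustrated by FIGS. 3A to 3D can be sketched as follows; representing each previous classification result as a 2D array with 1 for "F" and 0 for "B", and binarizing the average with a 0.5 cut-off, are assumptions of this sketch:

```python
import numpy as np

def averaged_classification_info(previous_maps, cutoff=0.5):
    """previous_maps: list of 2D arrays of the same shape, one entry per
    visual element (1 = foreground "F", 0 = background "B"), one map per
    previous image. Returns the averaged classification information."""
    mean = np.mean(np.stack(previous_maps), axis=0)
    return np.where(mean > cutoff, "F", "B")

# e.g. three previous maps in the spirit of FIGS. 3A-3C:
m1 = np.array([[0, 0], [1, 0]])
m2 = np.array([[0, 1], [1, 0]])
m3 = np.array([[0, 1], [1, 0]])
print(averaged_classification_info([m1, m2, m3]))  # [['B' 'F'] ['F' 'B']]
```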
  • the similarity measure determination unit 220 determines first similarity measures, wherein the first similarity measures are similarity measures between the visual elements in the t-th image and the visual elements in the background model.
  • as described above, the visual elements in the background model used here are the visual elements whose classification information is the background and which neighbour the corresponding portions of the visual elements in the t-th image.
  • as for one visual element in the t-th image, the corresponding portion of this visual element is a portion whose position in the background model is the same as the position of this visual element in the t-th image.
  • the larger the first similarity measure corresponding to this visual element, the higher the probability that this visual element is the background.
  • the classification unit 230 classifies the visual elements in the t-th image as the foreground or the background according to the current image, the background image in the background model and the determined first similarity measures.
  • the classification unit 230 then transfers the foreground/background classification results of the visual elements in the t-th image to the storage device 240, so that the corresponding information stored in the storage device 240 could be updated and the background model which will be used for the next image (e.g. the (t+1)-th image) could be acquired according to the updated information.
  • in addition, the classification unit 230 transfers the foreground/background classification results of the visual elements in the t-th image to the Output Device 160 shown in FIG. 1 via the system bus 180, for displaying the foreground in the t-th image to the user or for outputting the foreground in the t-th image to the subsequent processing, such as monitoring analysis, and so on.
  • the visual elements in the 1st image of the inputted video will be regarded as the background by default.
  • the flowchart 400 shown in FIG. 4 is the corresponding procedure of the image processing apparatus 200 shown in FIG. 2.
  • in acquisition step S410, the acquisition unit 210 acquires the t-th image from the inputted video and acquires the background model, which comprises a background image and foreground/background classification information of visual elements, from the storage device 240.
  • in similarity measure determination step S420, the similarity measure determination unit 220 determines the first similarity measures corresponding to the visual elements in the t-th image.
  • taking the visual element 510 in the t-th image as an example, the similarity measure determination unit 220 determines the first similarity measure corresponding to the visual element 510 as follows. The corresponding portion of the visual element 510 is the visual element 520 shown in FIG. 5B, and the visual elements whose classification information is the background in the background model and which neighbour the visual element 520 are, for example, the visual elements 530-580 shown in FIG. 5B.
  • the similarity measure determination unit 220 determines a similarity measure between the visual element 510 and the visual element 530 according to feature values of these two visual elements. For example, an absolute difference between the feature values of these two visual elements is regarded as the corresponding similarity measure. It is readily apparent that it is not necessarily limited thereto.
  • the feature value of one visual element in one image could be determined according to channel features of this visual element in the image. For example, in case the image is in YCbCr color space, one visual element includes Y (Luminance) channel feature, Cb (Blue) channel feature and Cr (Red) channel feature.
  • the feature value of the visual element 510 is determined according to its channel features in the t-th image.
  • the feature value of the visual element 530 is determined according to the feature values of the visual elements in the previous images of the t-th image whose positions in those previous images are the same as the position of the visual element 530 and whose foreground/background classification results were used to determine the foreground/background classification information of the visual element 530.
  • after the similarity measure determination unit 220 determines the similarity measure between the visual element 510 and the visual element 530 (e.g. regarded as Sim1), the similarity measure between the visual element 510 and the visual element 540 (e.g. regarded as Sim2), ..., and the similarity measure between the visual element 510 and the visual element 580 (e.g. regarded as Sim6), it determines the first similarity measure corresponding to the visual element 510 according to the determined similarity measures (i.e. Sim1, Sim2, ..., Sim6). In one instance, the average value of Sim1 to Sim6 is determined as the first similarity measure corresponding to the visual element 510. In another instance, the maximal value among Sim1 to Sim6 is determined as the first similarity measure corresponding to the visual element 510. However, it is readily apparent that it is not necessarily limited thereto.
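  • a minimal sketch of this step is given below; deriving each pairwise similarity from the absolute difference of feature values and mapping a difference d to 1/(1+d), so that a larger value means "more similar", is an assumption of this sketch:

```python
import numpy as np

def first_similarity_measure(feature_510, background_neighbour_features,
                             use_max=False):
    """feature_510: feature value of a visual element in the t-th image,
    e.g. a (Y, Cb, Cr) channel-feature vector.
    background_neighbour_features: feature values of the neighbouring
    background visual elements in the background model (530-580 above).
    Returns the first similarity measure (average by default, maximum
    if use_max is True, matching the two instances described above)."""
    sims = []
    for feature in background_neighbour_features:
        d = np.abs(np.asarray(feature_510, dtype=np.float64)
                   - np.asarray(feature, dtype=np.float64)).sum()
        sims.append(1.0 / (1.0 + d))  # Sim1, Sim2, ..., Sim6
    return max(sims) if use_max else sum(sims) / len(sims)
```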
  • in foreground/background classification step S430, the classification unit 230 classifies the visual elements in the t-th image as the foreground or the background according to the t-th image, the background image in the background model and the first similarity measures determined in step S420.
  • in one implementation, the classification unit 230 classifies the visual elements in the t-th image as the foreground or the background according to second similarity measures which are adjusted according to the first similarity measures determined in step S420, wherein the second similarity measures are similarity measures between the visual elements in the t-th image and the corresponding visual elements in the background image.
  • hereinafter, the similarity measures between the visual elements in the t-th image and the corresponding visual elements in the background image (i.e. the second similarity measures) will be referred to as the "visual distances between the visual elements in the t-th image and the corresponding visual elements in the background image".
  • the classification unit 230 classifies the visual elements in the t-th image as the foreground or the background as described below with reference to FIG. 6 and FIG. 7.
  • the classification unit 230 comprises a calculation unit 231, an adjustment unit 232 and a determination unit 233.
  • the corresponding units shown in FIG. 6 execute the corresponding operations as follows.
  • in step S431, as for each of the visual elements in the t-th image, the calculation unit 231 calculates the visual distance between this visual element and the corresponding visual element in the background image.
  • wherein, the corresponding visual element is a visual element whose position in the background image is the same as the position of this visual element in the t-th image.
  • in one implementation, the visual distance between this visual element and the corresponding visual element in the background image is calculated according to the feature values of these two visual elements. For example, an absolute difference between the feature values of these two visual elements is regarded as the corresponding visual distance. However, it is readily apparent that it is not necessarily limited thereto.
  • in step S432, the adjustment unit 232 adjusts the visual distances calculated in step S431 according to the first similarity measures determined in the similarity measure determination step S420 shown in FIG. 4.
  • in one implementation, each of the visual elements in the t-th image could be regarded as a processing part.
  • taking one visual element in the t-th image for example, the adjustment unit 232 adjusts the corresponding visual distance calculated in step S431 according to a predefined first threshold (e.g. TH1) and the first similarity measure corresponding to this visual element determined in the similarity measure determination step S420.
  • more specifically, in case the first similarity measure corresponding to this visual element is larger than TH1, the adjustment unit 232 decreases the corresponding visual distance by subtracting a value, by multiplying a coefficient which is between [0, 1], or by setting the corresponding visual distance to 0 directly. In other words, as for each of the visual elements in the t-th image, the adjustment unit 232 adjusts the corresponding visual distance as follows:
  • $\mathrm{VisualDistance} \leftarrow \begin{cases} \mathrm{decrease}(\mathrm{VisualDistance}), & \mathrm{FirstSimilarityMeasure} > \mathrm{TH1} \\ \mathrm{VisualDistance}, & \mathrm{FirstSimilarityMeasure} \le \mathrm{TH1} \end{cases}$
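  • a minimal sketch of steps S431 and S432 for this element-wise case, together with the final determination of step S433 described further below, is given next; using scalar per-element feature values and choosing "set to 0" as the decrease operation are assumptions of this sketch:

```python
import numpy as np

def classify_elements(current, background, first_sims, th1, th3):
    """current, background: 2D arrays of per-element feature values for the
    t-th image and the background image (same positions correspond).
    first_sims: 2D array of first similarity measures from step S420.
    Returns a bool map, True where an element is classified as foreground."""
    # S431: visual distance = absolute difference of feature values
    distances = np.abs(current.astype(np.float64) - background)
    # S432: where the first similarity measure exceeds TH1, decrease the
    # visual distance (here: set it to 0, one of the options named above)
    distances[first_sims > th1] = 0.0
    # S433: adjusted distance larger than TH3 -> foreground, else background
    return distances > th3
```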
  • in another implementation, each group of the visual elements in the t-th image could also be regarded as a processing part.
  • the groups in the t-th image could be determined in any manner, for example set by the user, or determined by clustering the visual elements, etc.
  • as for each group, the adjustment unit 232 adjusts the corresponding visual distances calculated in step S431 according to a predefined second threshold (e.g. TH2) and a possibility measure for this group.
  • wherein, the possibility measure for this group represents the probability that the visual elements in this group are classified as the background. Taking one group in the t-th image for example, the adjustment unit 232 adjusts the corresponding visual distances as follows.
  • first, as for each of the visual elements in this group, the adjustment unit 232 determines a possibility measure for this visual element according to the first similarity measure corresponding to this visual element determined in the similarity measure determination step S420, by using, for example, a Gaussian distribution or a Bayesian distribution.
  • wherein, the possibility measure for this visual element represents the probability that this visual element is classified as the background.
  • then, the adjustment unit 232 determines the possibility measure for this group according to the possibility measures for the visual elements in this group by using mathematical calculations.
  • in one instance, the possibility measure for this group is a product of the possibility measures for the visual elements in this group.
  • finally, in case the possibility measure for this group is larger than TH2, the adjustment unit 232 decreases the corresponding visual distance calculated in step S431 by subtracting a value, by multiplying a coefficient which is between [0, 1], or by setting the corresponding visual distance to 0 directly. In other words, as for each of the visual elements in this group, the adjustment unit 232 adjusts the corresponding visual distance as follows:
  • $\mathrm{VisualDistance} \leftarrow \begin{cases} \mathrm{decrease}(\mathrm{VisualDistance}), & \mathrm{PossibilityMeasure} > \mathrm{TH2} \\ \mathrm{VisualDistance}, & \mathrm{PossibilityMeasure} \le \mathrm{TH2} \end{cases}$
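  • a minimal sketch of this group-wise variant is given below; the Gaussian mapping from first similarity measure to per-element possibility measure (and its parameters mu and sigma) is an assumption of this sketch, while the product combination is the instance mentioned above:

```python
import numpy as np

def adjust_group_distances(distances, first_sims, group_mask, th2,
                           mu=1.0, sigma=0.5):
    """distances: 2D array of visual distances from step S431 (adjusted in
    place). first_sims: 2D array of first similarity measures from S420.
    group_mask: 2D bool array selecting the visual elements of one group."""
    # per-element possibility measure via an (illustrative) Gaussian of the
    # first similarity measure; larger similarity -> closer to mu -> larger p
    p = np.exp(-((first_sims[group_mask] - mu) ** 2) / (2.0 * sigma ** 2))
    # group possibility measure: product over the elements of the group
    group_possibility = np.prod(p)
    # decrease the distances of the whole group when it exceeds TH2
    # (here: set them to 0, one of the options named above)
    if group_possibility > th2:
        distances[group_mask] = 0.0
    return distances
```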
  • in determination step S433, the determination unit 233 determines the visual elements in the t-th image as the foreground or the background according to a predefined threshold (e.g. TH3) and the adjusted visual distances which are obtained from step S432. Taking one visual element in the t-th image for example, in case the corresponding visual distance adjusted in step S432 is larger than TH3, the determination unit 233 determines this visual element as the foreground. Otherwise, this visual element will be determined as the background.
  • the classification unit 230 transfers the foreground/background classification results of the visual elements in the t-th image to the storage device 240 shown in FIG. 2 or to the Output Device 160 shown in FIG. 1.
  • as described above, since the foreground/background classification results which are obtained in the previous processing will be used as a reference for the subsequent processing, the false foreground detection could be eliminated efficiently.
  • thus, the accuracy of the foreground detection will be improved.
  • as described above, the present invention could be implemented by a computer (e.g. tablet computers, laptops or desktops) or could be implemented by a monitor (e.g. digital cameras, video cameras or network cameras).
  • taking a network camera as an example, after detecting the foreground, the network camera could output the corresponding processing results (i.e. the foreground) to the subsequent processing, such as a monitoring analysis of whether or not to give an alarm to the user. Therefore, as an exemplary application of the present invention, an exemplary monitor (e.g. a network camera) will be described next with reference to FIG. 8.
  • FIG. 8 illustrates the arrangement of an exemplary monitor 800 according to the present invention.
  • the monitor 800 comprises an optical system 810 and the image processing apparatus 200 as described above.
  • a storage device 820 shown in FIG. 8 stores captured videos and the processing results (i.e. foreground/background classification results of the visual elements) for each image of the captured videos.
  • the storage device 820 is an internal storage device of the monitor 800 .
  • the storage device 820 is a server or an external storage device which is connected with the monitor 800 via the network (not shown).
  • the optical system 810 continuously captures a video of a monitoring place (e.g. an illegal parking area) and stores the captured video to the storage device 820 .
  • the image processing apparatus 200 classifies visual elements in images of the captured video as foreground or background with reference to FIG. 2 to FIG. 7 and stores the foreground/background classification results of the visual elements to the storage device 820 .
  • the monitor 800 outputs the detected foreground to a processor which executes, for example, a monitoring analysis. Assume that the monitoring place is an illegal parking area and that the pre-defined alarming rule is to give an alarm to the user in case cars or other objects are parked in the illegal parking area; that is to say, the illegal parking area is the background, and the cars or other objects that appear in the illegal parking area are the foreground.
  • the monitor 800 will continuously capture the video of the illegal parking area and execute the foreground detection on the captured video with reference to FIG. 8 .
  • in case a car is parked in the illegal parking area, the monitor 800 will detect the car as the foreground and output the car to the processor, so that the processor could give an alarm to the user.
  • in addition, even if there are moving leaves in the captured video, the monitor 800 will not falsely detect the moving leaves as the foreground; thus, the processor will not give a wrong alarm to the user.
  • All of the units described above are exemplary and/or preferable modules for implementing the processes described in the present disclosure.
  • These units can be hardware units (such as a Field Programmable Gate Array (FPGA), a digital signal processor, an application specific integrated circuit or the like) and/or software modules (such as computer readable program).
  • the units for implementing the various steps are not described exhaustively above. However, where there is a step of performing a certain process, there may be a corresponding functional module or unit (implemented by hardware and/or software) for implementing the same process.
  • Technical solutions by all combinations of steps described and units corresponding to these steps are included in the disclosure of the present application, as long as the technical solutions they constitute are complete and applicable.
  • the present invention may also be embodied as programs recorded in a recording medium, including machine-readable instructions for implementing the method according to the present invention.
  • the present invention also covers the recording medium which stores the program for implementing the method according to the present invention.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Multimedia (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Closed-Circuit Television Systems (AREA)
US16/140,279 2017-09-30 2018-09-24 Image processing apparatus and method and monitoring system for classifying visual elements as foreground or background Active 2038-11-22 US11354808B2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710915056.3A CN109598741A (zh) 2017-09-30 2017-09-30 Image processing apparatus and method and monitoring system
CN201710915056.3 2017-09-30

Publications (2)

Publication Number Publication Date
US20190102887A1 (en) 2019-04-04
US11354808B2 (en) 2022-06-07

Family

ID=65897834

Family Applications (1)

Application Number Title Priority Date Filing Date
US16/140,279 Active 2038-11-22 US11354808B2 (en) 2017-09-30 2018-09-24 Image processing apparatus and method and monitoring system for classifying visual elements as foreground or background

Country Status (3)

Country Link
US (1) US11354808B2 (ja)
JP (1) JP6598943B2 (ja)
CN (1) CN109598741A (ja)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114037623A (zh) * 2021-10-25 2022-02-11 Yangzhou University Water ripple pre-elimination method based on fusion of ripple and environmental characteristics and iterative restoration

Family Cites Families (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2010241260B2 (en) * 2010-10-29 2013-12-19 Canon Kabushiki Kaisha Foreground background separation in a scene with unstable textures
AU2010238543B2 (en) * 2010-10-29 2013-10-31 Canon Kabushiki Kaisha Method for video object detection
AU2011203219B2 (en) * 2011-06-30 2013-08-29 Canon Kabushiki Kaisha Mode removal for improved multi-modal background subtraction
AU2011265429B2 (en) * 2011-12-21 2015-08-13 Canon Kabushiki Kaisha Method and system for robust scene modelling in an image sequence
JP6445775B2 (ja) * 2014-04-01 2018-12-26 Canon Kabushiki Kaisha Image processing apparatus and image processing method
CN105205830A (zh) * 2014-06-17 2015-12-30 Canon Kabushiki Kaisha Method and apparatus for updating scene model and video surveillance
AU2014271236A1 (en) * 2014-12-02 2016-06-16 Canon Kabushiki Kaisha Video segmentation method
AU2014280948A1 (en) * 2014-12-24 2016-07-14 Canon Kabushiki Kaisha Video segmentation method
CN106408554B (zh) * 2015-07-31 2019-07-09 Fujitsu Limited Abandoned object detection apparatus, method and system
CN106611417B (zh) * 2015-10-20 2020-03-31 Canon Kabushiki Kaisha Method and apparatus for classifying visual elements as foreground or background

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120197642A1 (en) * 2009-10-15 2012-08-02 Huawei Technologies Co., Ltd. Signal processing method, device, and system
US20130129144A1 (en) * 2011-11-23 2013-05-23 Seoul National University Industry Foundation Apparatus and method for detecting object using ptz camera
US20130170557A1 (en) * 2011-12-29 2013-07-04 Pelco, Inc. Method and System for Video Coding with Noise Filtering
US20140003713A1 (en) * 2012-06-29 2014-01-02 Behavioral Recognition Systems, Inc. Automatic gain control filter in a video analysis system
US20150278616A1 (en) * 2014-03-27 2015-10-01 Xerox Corporation Feature- and classifier-based vehicle headlight/shadow removal in video
US20150310615A1 (en) * 2014-04-24 2015-10-29 Xerox Corporation Method and system for automated sequencing of vehicles in side-by-side drive-thru configurations via appearance-based classification
US20160125245A1 (en) * 2014-10-29 2016-05-05 Behavioral Recognition Systems, Inc. Foreground detector for video analytics system
US20160125621A1 (en) * 2014-10-29 2016-05-05 Behavioral Recognition Systems, Inc. Incremental update for background model thresholds
US20160125255A1 (en) * 2014-10-29 2016-05-05 Behavioral Recognition Systems, Inc. Dynamic absorption window for foreground background detector
US20190102888A1 (en) * 2017-09-30 2019-04-04 Canon Kabushiki Kaisha Image processing apparatus and method and monitoring system

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Yun-Fang Zhu, "Moving Objects Detection and Segmentation Based on Background Subtraction and Image Over-Segmentation", Journal of Software, vol. 6, No. 7, Jul. 2011.

Also Published As

Publication number Publication date
JP6598943B2 (ja) 2019-10-30
CN109598741A (zh) 2019-04-09
US20190102887A1 (en) 2019-04-04
JP2019067369A (ja) 2019-04-25

Similar Documents

Publication Publication Date Title
US10872262B2 (en) Information processing apparatus and information processing method for detecting position of object
US20190304102A1 (en) Memory efficient blob based object classification in video analytics
US10445590B2 (en) Image processing apparatus and method and monitoring system
RU2607774C2 Control method in image capturing system, control device and computer-readable storage medium
JP6494253B2 Object detection apparatus, object detection method, image recognition apparatus and computer program
US9098748B2 (en) Object detection apparatus, object detection method, monitoring camera system and storage medium
US20190199898A1 (en) Image capturing apparatus, image processing apparatus, control method, and storage medium
EP2806373A2 (en) Image processing system and method of improving human face recognition
US20090310822A1 (en) Feedback object detection method and system
US20180047271A1 (en) Fire detection method, fire detection apparatus and electronic equipment
US20140029855A1 (en) Image processing apparatus, image processing method, and program
US12087036B2 (en) Information processing device, information processing method, and program recording medium
Hu et al. A novel approach for crowd video monitoring of subway platforms
JP5088279B2 Target tracking device
US10455144B2 (en) Information processing apparatus, information processing method, system, and non-transitory computer-readable storage medium
US10916016B2 (en) Image processing apparatus and method and monitoring system
KR20160037480A (ko) 지능형 영상 분석을 위한 관심 영역 설정 방법 및 이에 따른 영상 분석 장치
CN115423795A Still-frame detection method, electronic device and storage medium
US11354808B2 (en) Image processing apparatus and method and monitoring system for classifying visual elements as foreground or background
JP2009182624A Target tracking device
WO2018179119A1 Video analysis device, video analysis method, and recording medium
CN109040673A Video image processing method and apparatus, and apparatus with storage function
CN114550060A Perimeter intrusion identification method and system, and electronic device
JP5599228B2 Congestion detection system and congestion detection program
JP6244221B2 Human detection device

Legal Events

Date Code Title Description
FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

AS Assignment

Owner name: CANON KABUSHIKI KAISHA, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:YANG, QIN;CHEN, TSEWEI;SIGNING DATES FROM 20181024 TO 20181025;REEL/FRAME:047452/0950

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION

STPP Information on status: patent application and granting procedure in general

Free format text: NON FINAL ACTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER

STPP Information on status: patent application and granting procedure in general

Free format text: FINAL REJECTION MAILED

STPP Information on status: patent application and granting procedure in general

Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS

STPP Information on status: patent application and granting procedure in general

Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED

STCF Information on status: patent grant

Free format text: PATENTED CASE