WO2016106595A1 - Moving object detection in videos - Google Patents
- Publication number
- WO2016106595A1 (PCT Application No. PCT/CN2014/095643)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- frames
- moving object
- background
- objective function
- dimensional image
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/262—Analysis of motion using transform domain methods, e.g. Fourier domain methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/10—Segmentation; Edge detection
- G06T7/194—Segmentation; Edge detection involving foreground-background segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/215—Motion-based segmentation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20076—Probabilistic image processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30232—Surveillance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
Definitions
- the present disclosure generally relates to video processing, and more specifically, to moving object detection in videos.
- Detecting moving objects such as persons, automobiles and the like in the video plays an important role in video analysis such as intelligent video surveillance, traffic monitoring, vehicle navigation, and human-machine interaction.
- in the process of video analysis, the outcome of moving object detection can be input into modules like object recognition, object tracking, behavior analysis or the like for further processing. Therefore, high performance of moving object detection is a key to successful video analysis.
- the detection of the background is a fundamental problem.
- the detection accuracy is limited due to the changing background. More specifically, if the background of the video scene includes water ripples or waving trees, the detection of moving objects is prone to error.
- illumination variation, camera motion, and/or other kinds of noise in the background may also adversely affect moving object detection. Due to the changes of the background, in conventional solutions, parts of the background might be classified as moving objects, while parts of the foreground might be classified as background.
- embodiments of the present invention provide a solution for moving object detection in the videos.
- one embodiment of the present invention provides a computer-implemented method.
- the method comprises: transforming a plurality of frames in a video from an initial image space to a high dimensional image space in a non-linear way; modeling background of the plurality of frames in the high dimensional image space; and detecting a moving object in the plurality of frames based on the modeling of the background of the plurality of frames in the high dimensional image space.
- one embodiment of the present invention provides a computer-implemented apparatus.
- the apparatus comprises: an image transformer configured to transform a plurality of frames in a video from an initial image space to a high dimensional image space in a non-linear way; a modeler configured to model background of the plurality of frames in the high dimensional image space; and a moving object detector configured to detect a moving object in the plurality of frames based on the modeling of the background of the plurality of frames in the high dimensional image space.
- the frames in the videos may be transformed into a very high dimensional image space.
- by use of a non-linear model, which is more powerful for describing complex factors such as changing background, illumination variation, camera motion, noise and the like, embodiments of the present invention are more robust and accurate in detecting moving objects under complex situations. Additionally, embodiments of the present invention achieve fewer false alarms and a higher detection rate.
- FIG. 1 shows a flowchart of a method of detecting moving objects in a video according to one embodiment of the present invention
- FIGs. 2A-2C show the results of moving object detection obtained by a conventional approach and one embodiment of the present invention
- FIG. 3 shows a block diagram of an apparatus of detecting moving objects in a video according to one embodiment of the present invention.
- FIG. 4 shows a block diagram of an example computer system suitable for implementing example embodiments of the present invention.
- the term “includes” and its variants are to be read as open terms that mean “includes, but is not limited to.”
- the term “or” is to be read as “and/or” unless the context clearly indicates otherwise.
- the term “based on” is to be read as “based at least in part on.”
- the terms “one implementation” and “an implementation” are to be read as “at least one implementation.”
- the term “another implementation” is to be read as “at least one other implementation.”
- the terms “first,” “second,” “third” and the like may be used to refer to different or same objects. Other definitions, explicit and implicit, may be included below.
- Example embodiments of the present invention model the background of the frames in the videos using a non-linear model.
- by using a nonlinear model, which is better than a linear one at describing the complex factors, the accuracy and performance of moving object detection in videos can be improved.
- the non-linear modeling of the background is achieved by transforming or mapping the original frames or images of the video being processed into a higher dimensional space.
- the non-linear modeling of the initial background can be done effectively and efficiently.
- the input of the moving object detection is a sequence of frames or images in the video, denoted as [x_{t-T}, …, x_{t-1}, x_t], where x_i ∈ ℝ^n represents a vectorized image, n represents the number of pixels in a frame, and T represents the number of frames being taken into consideration.
- the terms “image” and “frame” can be used interchangeably.
- the goal is to find the positions of a moving object (or objects), i.e., the foreground, in the frame x_t.
- the terms “foreground” and “moving object” can be used interchangeably.
- the position of the foreground is represented by a foreground-indicator vector s ∈ {0,1}^n.
- the pixel value of the foreground can be determined according to the foreground-indicator vector:
- P s represents a foreground-extract operator.
- the foreground-extract operator can be expressed accordingly. The pixel value of the background can also be determined according to the foreground-indicator vector:
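As a minimal sketch of how the foreground-indicator vector and the two extract operators might be realized, assuming frames are vectorized numpy arrays (the variable names and toy values are illustrative, not taken from the patent):

```python
import numpy as np

# Hypothetical illustration: a vectorized frame with n = 6 pixels.
x_t = np.array([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])

# Foreground-indicator vector s in {0,1}^n: s_i = 1 marks pixel i as foreground.
s = np.array([0, 1, 1, 0, 0, 0])

# The foreground-extract operator P_s keeps foreground pixels (zeroing the rest);
# the complementary operator keeps the background pixels.
foreground = s * x_t          # P_s applied to x_t
background = (1 - s) * x_t    # complementary (background-extract) operator

print(foreground)  # foreground -> [0., 20., 30., 0., 0., 0.]
print(background)  # background -> [10., 0., 0., 40., 50., 60.]
```

Element-wise masking is one simple way to realize the operators; the patent's own formulation (e.g., as selection matrices) may differ.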
- FIG. 1 shows the flowchart of a method 100 of detecting moving object in a video.
- the video may be of any suitable format.
- the video may be compressed or encoded by any suitable technologies, either currently known or to be developed in the future.
- the method 100 is entered at step 110, where a plurality of frames [x_{t-T}, x_{t-T+1}, …, x_{t-2}, x_{t-1}, x_t] in the video are transformed into a high dimensional image space in a non-linear way.
- the dimension m of the high dimensional image space can be very high. Theoretically, the dimension can be even infinite.
- the value of m can be selected such that m is much greater than the number of pixels in each frame. In this way, the non-linear correlations among the frames in the low dimensional image space can be better characterized and modeled.
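To make the dimension growth concrete, the sketch below shows one simple explicit non-linear map; the degree-2 polynomial feature map is an illustrative choice for exposition, not the mapping used by the patent:

```python
import numpy as np
from itertools import combinations_with_replacement

def phi(x):
    """Explicit degree-2 polynomial feature map: an example of a non-linear
    transform into a higher-dimensional space. An n-pixel frame maps to
    n + n(n+1)/2 features (the original pixels plus all pairwise products)."""
    pairs = [x[i] * x[j] for i, j in combinations_with_replacement(range(len(x)), 2)]
    return np.concatenate([x, np.array(pairs)])

x = np.array([1.0, 2.0, 3.0, 4.0])   # a toy 4-"pixel" frame
print(len(x), len(phi(x)))            # prints: 4 14
```

Even for this modest map the dimension grows quadratically in n; richer maps (or the Gaussian kernel's implicit map) reach far higher, even infinite, dimension, which is why the explicit transform is usually avoided via kernels, as discussed below.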
- the mapping function is denoted as φ.
- any suitable mapping function can be used in connection with embodiments of the present invention.
- the mapping function satisfying the Mercer’s theorem can be used to guarantee the compactness and convergence of the transform.
- the frames in the initial image space are transformed into the high dimensional image space, thereby obtaining a plurality of transformed frames [φ(x_{t-T}), …, φ(x_{t-1}), φ(x_t)].
- the transformed frames [φ(x_{t-T}), …, φ(x_{t-1}), φ(x_t)] may be linear and can thus be more easily described, which will be discussed below.
- the transformed frames [φ(x_{t-T}), …, φ(x_{t-1}), φ(x_t)] are not necessarily linear in the high dimensional image space.
- the scope of the invention is not limited in this regard.
- the frames can be transformed into the high dimensional image space without explicitly defining the mapping function.
- the transformed frames and the modeling thereof can be described by use of proper kernel functions. Example embodiments in this regard will be discussed below.
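The kernel trick referred to above can be illustrated with a toy degree-2 polynomial kernel: the kernel value equals an inner product in the higher-dimensional space without ever computing the map explicitly. The specific kernel and explicit map here are illustrative assumptions, not the patent's own functions:

```python
import numpy as np

def k_poly2(x, y):
    """Degree-2 polynomial kernel: equals <phi(x), phi(y)> in the
    higher-dimensional space without computing phi explicitly."""
    return np.dot(x, y) ** 2

def phi2(x):
    """The corresponding explicit map for 2-D inputs (for verification only):
    (x1^2, x2^2, sqrt(2) * x1 * x2)."""
    x1, x2 = x
    return np.array([x1 * x1, x2 * x2, np.sqrt(2.0) * x1 * x2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 1.0])
# Both sides equal (1*3 + 2*1)^2 = 25.
assert np.isclose(k_poly2(x, y), phi2(x) @ phi2(y))
```

This is why the frames can be "transformed" without explicitly defining the mapping function: every quantity needed later can be written in terms of kernel evaluations between pairs of frames.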
- at step 120, the background of the plurality of frames [x_{t-T}, x_{t-T+1}, …, x_{t-2}, x_{t-1}, x_t] is modeled in the high dimensional image space.
- the background of the frames is assumed to follow the Gaussian distribution and is therefore modeled by a linear transformation matrix U = [u_1, …, u_d], where d represents the number of bases and u_i is the i-th base vector.
- the initial frames are transformed into the high dimensional image space at step 110 and modeled in the image space with a very high dimension at step 120.
- the non-linear modeling of the background of the initial frames is achieved.
- the correlations of the frames can be better characterized to thereby identify the background and foreground (moving objects) more accurately.
- the transformed frames may be linear in the high dimensional image space in one embodiment, as described above.
- the base vector u j may be calculated as a linear sum of the background of the transformed frames as follows:
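A plausible form of this linear expansion, using illustrative coefficient symbols (the patent's own equation is not reproduced in this text):

```latex
u_j \;=\; \sum_{i=1}^{T} \alpha_{ij}\, \phi(\tilde{x}_i), \qquad j = 1, \dots, d,
```

where \(\tilde{x}_i\) denotes the background part of the i-th frame and the coefficients \(\alpha_{ij}\) are to be determined. Expressing each base vector as a linear sum of transformed frames is what later allows everything to be written in terms of kernel evaluations.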
- the non-linear modeling of the background of the frames is achieved by modeling or approximating the background of the transformed frames using a linear model in the high dimensional image space.
- modeling the background of the transformed frames using a linear model in the high dimensional image space is beneficial in terms of operational efficiency and computational complexity. However, this is not necessarily required.
- the background of the transformed frames can be approximated using any non-linear model in the high dimensional image space.
- at step 130, one or more moving objects (foreground) are detected based on the modeling of the background of the frames in the high dimensional image space.
- an objective function can be defined based on the modeling at step 120. More specifically, the objective function at least characterizes the error in the modeling or approximation of background of the frames.
- the objective function may be defined as follows:
- the area of the foreground may be taken into consideration.
- the area of the moving object in each frame should be below a predefined threshold, because an overly large moving object would probably indicate inaccurate detection.
- the area term can be given by:
- the connectivity of the moving object across the plurality of frames can be considered. It would be appreciated that the trajectory of a moving object is usually continuous between two consecutive frames.
- the connectivity may be defined as follows:
- N (i) is the set of neighbors of the pixel i.
- the modeling error, foreground area and the connectivity can be combined together to define the objective function as follows:
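One plausible combined form, using trade-off weights β and γ that are assumptions here rather than the patent's own symbols:

```latex
L(s, U, y) \;=\; L_{\text{background}}
\;+\; \beta \sum_{i=1}^{n} s_i
\;+\; \gamma \sum_{i}\,\sum_{j \in N(i)} \lvert s_i - s_j \rvert
```

where the first term is the modeling error, the second penalizes the foreground area, and the third encourages connectivity by penalizing neighboring pixels that receive different labels (N(i) being the neighbor set defined above).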
- the background of the frames can be detected by minimizing the objective function.
- the goal is to find the foreground-indicator vector s, the coefficients, and the low-dimensional representation y_i that minimize the objective function L.
- the kernel functions associated with the high dimensional image space can be used to solve this optimization problem.
- Kernel Principal Component Analysis (KPCA)
- the kernel function k(x_i, x_j) can be in any form as long as the resulting kernel matrix K is positive semi-definite.
- an example of the kernel function is shown as follows:
- the kernel parameter can be selected empirically. It is to be understood that the kernel function shown in equation (15) is given merely for the purpose of illustration, without suggesting any limitation as to the scope of the invention. In other embodiments, any suitable kernel function, such as the Gaussian kernel function, a radial basis function, and the like, can be used as well.
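For instance, a Gaussian (RBF) kernel, one of the suitable choices named above, can be computed and checked for positive semi-definiteness as follows; the helper name, the toy data, and the parameter value are illustrative:

```python
import numpy as np

def rbf_kernel_matrix(X, sigma=1.0):
    """Gaussian (RBF) kernel matrix with K[i, j] = exp(-||x_i - x_j||^2 / (2 sigma^2));
    sigma is the empirically chosen parameter mentioned in the text."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    return np.exp(-d2 / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 3))      # 5 toy frames with 3 pixels each
K = rbf_kernel_matrix(X)

# A valid kernel matrix must be symmetric positive semi-definite.
assert np.allclose(K, K.T)
assert np.min(np.linalg.eigvalsh(K)) > -1e-10
```

The PSD check mirrors the requirement stated above: any kernel satisfying it induces a valid high dimensional image space.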
- the optimization of the objective function can be achieved by solving the following eigen-decomposition problem:
- λ and ν represent the eigenvalue and eigenvector, respectively. It would be appreciated that there are d eigenvalues λ_1, …, λ_d in total. In one embodiment, the eigenvalues may be sorted in descending order, such that λ_1 ≥ λ_2 ≥ … ≥ λ_d.
- the eigenvector ν_i corresponds to the eigenvalue λ_i.
- the j-th entry of ν_i is
- y i is the background of the initial frames in the low dimensional image space.
- y i is expressed as follows:
- Equation (19) can be calculated by the kernel function because
- equation (19) can be formulated as terms in the form of kernel function as follows:
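A minimal KPCA sketch in that spirit, assuming a precomputed kernel matrix; the function name, the centering scheme, and the normalization details are illustrative, not taken from the patent:

```python
import numpy as np

def kpca(K, d):
    """Minimal KPCA sketch: double-center the kernel matrix, eigendecompose it,
    and return the top-d projections (low-dimensional representations y_i)."""
    T = K.shape[0]
    one = np.full((T, T), 1.0 / T)
    Kc = K - one @ K - K @ one + one @ K @ one       # center in feature space
    vals, vecs = np.linalg.eigh(Kc)                  # eigh returns ascending order
    vals = np.clip(vals[::-1][:d], 1e-12, None)      # keep d largest eigenvalues
    vecs = vecs[:, ::-1][:, :d]
    alphas = vecs / np.sqrt(vals)                    # normalize base vectors in feature space
    return Kc @ alphas                               # projections of the training frames

rng = np.random.default_rng(1)
X = rng.standard_normal((6, 4))                      # 6 toy frames, 4 pixels each
d2 = np.sum((X[:, None] - X[None, :]) ** 2, axis=-1)
K = np.exp(-d2)                                      # an RBF kernel matrix
Y = kpca(K, d=2)
print(Y.shape)                                       # prints: (6, 2)
```

Everything here operates on K alone, matching the point above that the transformed frames never need to be computed explicitly.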
- the foreground and background parts in the frames can be identified or indicated by the foreground indicator s which is defined in equation (1) , for example.
- at least a part of the objective function can be expressed by the foreground indicator. That is, by means of the kernel functions, the objective function can be associated with the foreground indicator related to each pixel in each of the plurality of frames, where the foreground indicator indicates whether the related pixel belongs to the moving object (foreground).
- the kernel function is in the form of equation (15) .
- the kernel function can be approximated by:
- L_background as defined in equation (19) can be expressed as follows:
- the objective function is in the form of equation (13) . That is, in addition to L background , the objective function also includes the terms related to the area and connectivity of the moving object (s) . Based on equations (25) and (13) , the objective function L can be written as:
- equation (26) is in a standard form of graph cuts.
- the optimal solution s can be efficiently obtained.
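As a toy illustration of why the minimization is tractable: when the connectivity term is dropped, the minimization over s decouples per pixel, and a pixel is labeled foreground exactly when its modeling error exceeds the area penalty. This simplified stand-in replaces the full graph-cut step; the threshold beta and all names are illustrative assumptions:

```python
import numpy as np

def detect_foreground(frame, bg_estimate, beta=1.0):
    """Toy stand-in for the graph-cut step: with the connectivity term dropped,
    minimizing (background modeling error + beta * foreground area) decouples
    per pixel, so s_i = 1 exactly when the squared error at pixel i exceeds
    the area penalty beta. beta is an illustrative assumption."""
    err = (frame - bg_estimate) ** 2
    return (err > beta).astype(int)

frame = np.array([5.0, 5.2, 9.0, 4.9])   # one "moving object" pixel at index 2
bg = np.full(4, 5.0)                     # background model prediction
print(detect_foreground(frame, bg))      # prints: [0 0 1 0]
```

With the connectivity term restored, the per-pixel decisions couple across neighbors, which is precisely what the standard graph-cut formulation referenced above solves efficiently.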
- the method 100 can be implemented by the pseudo code shown in the following table.
- embodiments of the present invention are more robust and accurate in detecting moving objects under complex situations.
- the proposed approach achieves fewer false alarms and a higher detection rate.
- FIGs. 2A-2C show an example of moving object detection.
- FIG. 2A shows a frame from a video that contains dynamic rain.
- FIG. 2B is the result of a conventional approach to moving object detection. It can be seen that in FIG. 2B, the spring is incorrectly classified as a moving object. In contrast, in the result obtained by one embodiment of the present invention as shown in FIG. 2C, the spring is removed from the foreground and the moving person is correctly detected.
- FIG. 3 shows a block diagram of a computer-implemented apparatus for moving object detection according to one embodiment of the present invention.
- the apparatus 300 comprises an image transformer 310 configured to transform a plurality of frames in a video from an initial image space to a high dimensional image space in a non-linear way; a modeler 320 configured to model background of the plurality of frames in the high dimensional image space; and a moving object detector 330 configured to detect a moving object in the plurality of frames based on the modeling of the background of the plurality of frames in the high dimensional image space.
- the dimension of the high dimensional image space is greater than the number of pixels in each of the plurality of frames.
- the modeler 320 may comprise a non-linear modeler 325 configured to model background of a plurality of transformed frames using a linear model in the high dimensional image space, the plurality of transformed frames obtained by transforming the plurality of frames in the non-linear way.
- the apparatus 300 may further comprise an objective function controller 340 configured to determine an objective function characterizing an error of the modeling of the background of the plurality of frames.
- the moving object detector 330 is configured to detect the moving object based on the objective function.
- the objective function may further characterize at least one of: areas of the moving object in the plurality of frames, and connectivity of the moving object across the plurality of frames.
- the apparatus 300 may further comprise a kernel function controller 350 configured to determine a set of kernel functions associated with the high dimensional image space.
- the objective function controller 340 is configured to associate at least a part of the objective function and the background of the plurality of frames using the set of kernel functions, and the moving object detector 330 is configured to detect the moving object by minimizing the objective function.
- the objective function controller 340 is configured to associate the objective function with a foreground indicator related to each pixel in each of the plurality of frames using the set of kernel functions, where the foreground indicator indicates whether the related pixel belongs to the moving object.
- FIG. 4 shows a block diagram of an example computer system 400 suitable for implementing example embodiments of the present invention.
- the computer system 400 can be a fixed type machine such as a desktop personal computer (PC) , a server, a mainframe, or the like.
- the computer system 400 can be a mobile machine such as a mobile phone, tablet PC, laptop, smartphone, personal digital assistant (PDA), or the like.
- the computer system 400 comprises a processor such as a central processing unit (CPU) 401 which is capable of performing various processes in accordance with a program stored in a read only memory (ROM) 402 or a program loaded from a storage unit 408 to a random access memory (RAM) 403.
- data required when the CPU 401 performs the various processes or the like is also stored as required.
- the CPU 401, the ROM 402 and the RAM 403 are connected to one another via a bus 404.
- An input/output (I/O) interface 405 is also connected to the bus 404.
- the following components are connected to the I/O interface 405: an input unit 406 including a keyboard, a mouse, or the like; an output unit 407 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a loudspeaker or the like; the storage unit 408 including a hard disk or the like; and a communication unit 409 including a network interface card such as a LAN card, a modem, or the like. The communication unit 409 performs communication processes via a network such as the Internet.
- a drive 410 is also connected to the I/O interface 405 as required.
- a removable medium 411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 410 as required, so that a computer program read therefrom is installed into the storage unit 408 as required.
- embodiments of the present invention comprise a computer program product including a computer program tangibly embodied on a machine readable medium, the computer program including program code for performing the method 100 and/or the pseudo code shown in Table 1.
- the computer program may be downloaded and mounted from the network via the communication unit 409, and/or installed from the removable medium 411.
- the functionality described herein can be performed, at least in part, by one or more hardware logic components.
- illustrative types of hardware logic components include Field-programmable Gate Arrays (FPGAs) , Application-specific Integrated Circuits (ASICs) , Application-specific Standard Products (ASSPs) , System-on-a-chip systems (SOCs) , Complex Programmable Logic Devices (CPLDs) , and the like.
- Various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. Some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device. While various aspects of embodiments of the present invention are illustrated and described as block diagrams, flowcharts, or using some other pictorial representation, it will be appreciated that the blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
- embodiments of the present invention can be described in the general context of machine-executable instructions, such as those included in program modules, being executed in a device on a target real or virtual processor.
- program modules include routines, programs, libraries, objects, classes, components, data structures, or the like that perform particular tasks or implement particular abstract data types.
- the functionality of the program modules may be combined or split between program modules as desired in various implementations.
- Machine-executable instructions for program modules may be executed within a local or distributed device. In a distributed device, program modules may be located in both local and remote storage media.
- Program code for carrying out methods of the invention may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
- the program code may execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
- a machine readable medium may be any tangible medium that may contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
- the machine readable medium may be a machine readable signal medium or a machine readable storage medium.
- a machine readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
- more specific examples of the machine readable storage medium would include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Abstract
The present disclosure relates to moving object detection in videos. In one embodiment, a plurality of frames in a video are transformed to a high dimensional image space in a non-linear way. Then the background of the plurality of frames can be modeled in the high dimensional image space. The foreground or moving object can be detected in the plurality of frames based on the modeling of the background in the high dimensional image space. By use of a non-linear model, which is more powerful for describing complex factors such as changing background, illumination variation, camera motion, noise and the like, embodiments of the present invention are more robust and accurate in detecting moving objects under complex situations.
Description
The present disclosure generally relates to video processing, and more specifically, to moving object detection in videos.
Detecting moving objects such as persons, automobiles and the like in the video plays an important role in video analysis such as intelligent video surveillance, traffic monitoring, vehicle navigation, and human-machine interaction. In the process of video analysis, the outcome of moving object detection can be input into the modules like object recognition, object tracking, behavior analysis or the like for further processing. Therefore, high performance of moving object detection is a key for successful video analysis.
In moving object detection, the detection of the background is a fundamental problem. In many conventional approaches to moving object detection in videos, the detection accuracy is limited due to the changing background. More specifically, if the background of the video scene includes water ripples or waving trees, the detection of moving objects is prone to error. In addition, illumination variation, camera motion, and/or other kinds of noise in the background may also adversely affect moving object detection. Due to the changes of the background, in conventional solutions, parts of the background might be classified as moving objects, while parts of the foreground might be classified as background.
SUMMARY
In general, embodiments of the present invention provide a solution for moving object detection in the videos.
In one aspect, one embodiment of the present invention provides a computer-implemented method. The method comprises: transforming a plurality of frames in a video from an initial image space to a high dimensional image space in a non-linear way; modeling background of the plurality of frames in the high dimensional image space; and detecting a moving object in the plurality of frames based on the modeling of the background of the plurality of frames in the high dimensional image space.
In another aspect, one embodiment of the present invention provides a computer-implemented apparatus. The apparatus comprises: an image transformer configured to transform a plurality of frames in a video from an initial image space to a high dimensional image space in a non-linear way; a modeler configured to model background of the plurality of frames in the high dimensional image space; and a moving object detector configured to detect a moving object in the plurality of frames based on the modeling of the background of the plurality of frames in the high dimensional image space.
Through the following description, it would be appreciated that in accordance with example embodiments of the present invention, the frames in the videos may be transformed into a very high dimensional image space. By use of a non-linear model, which is more powerful for describing complex factors such as changing background, illumination variation, camera motion, noise and the like, embodiments of the present invention are more robust and accurate in detecting moving objects under complex situations. Additionally, embodiments of the present invention achieve fewer false alarms and a higher detection rate.
FIG. 1 shows a flowchart of a method of detecting moving objects in a video according to one embodiment of the present invention;
FIGs. 2A-2C show the results of moving object detection obtained by a conventional approach and one embodiment of the present invention;
FIG. 3 shows a block diagram of an apparatus of detecting moving objects in a video according to one embodiment of the present invention; and
FIG. 4 shows a block diagram of an example computer system suitable for implementing example embodiments of the present invention.
Throughout the drawings, the same or corresponding reference symbols refer to the same or corresponding parts.
DETAILED DESCRIPTION OF EXAMPLE EMBODIMENTS
Example embodiments of the present invention will now be discussed with reference to several example implementations. It should be understood these implementations are discussed only for the purpose of enabling those skilled persons in the art to better understand and thus implement embodiments of the invention, rather than suggesting any limitations on the scope of the invention.
As used herein, the term “includes” and its variants are to be read as open terms that mean “includes, but is not limited to.” The term “or” is to be read as “and/or” unless the context clearly indicates otherwise. The term “based on” is to be read as “based at least in part on.” The terms “one implementation” and “an implementation” are to be read as “at least one implementation.” The term “another implementation” is to be read as “at least one other implementation.” The terms “first,” “second,” “third” and the like may be used to refer to different or same objects. Other definitions, explicit and implicit, may be included below.
Traditionally, the background modeling in the moving object detection is done using a linear model. The underlying assumption of the linear model is that the background follows a Gaussian distribution. However, the inventors have found that this is usually not the case in practice. Therefore, the linear model is unable to fully describe the complex factors of changing background, illumination, camera motion, noise, and the like. Example embodiments of the present invention model the background of the frames in the videos using a non-linear model. By using a nonlinear model which is better than the linear one in the sense of describing the complex factors, the accuracy and performance of moving object detection in the videos can be improved.
In general, the non-linear modeling of the background is achieved by transforming or mapping the original frames or images of the video being processed into a higher dimensional space. By modeling the background of the transformed frame in that high dimensional image space, the non-linear modeling of the initial background can be done effectively and efficiently.
For the sake of discussion, a number of notations are defined as follows. The input of the moving object detection is a sequence of frames or images in the video, denoted as [x_{t-T}, …, x_{t-1}, x_t], where x_i ∈ ℝ^n represents a vectorized image, n represents the number of pixels in a frame, and T represents the number of frames being taken into consideration. In the following, the terms “image” and “frame” can be used interchangeably.
The goal is to find the positions of a moving object (or objects) , i.e., the foreground, in the frame xt. In the context of the present disclosure, the terms “foreground” and “moving object” are used interchangeably. In one embodiment, the foreground locations are represented by a foreground-indicator vector s ∈ {0, 1} n. The i-th element of s, denoted si, equals either zero or one, where si=1 means that the i-th pixel in the frame xt belongs to the foreground, while si=0 means that the i-th pixel belongs to the background. That is,
The pixel value of the foreground can be determined according to the foreground-indicator vector:
where Ps represents a foreground-extract operator. For the sake of discussion, the foreground-extract operator can be expressed in closed form. The pixel value of the background can also be determined according to the foreground-indicator vector:
where the corresponding operator represents a background-extract operator. For the sake of discussion, the background-extract operator can likewise be expressed in closed form.
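As an illustration, the extract operators above act as element-wise masks on a vectorized frame. The sketch below assumes the common formulation in which the foreground-extract operator multiplies each pixel by si and the background-extract operator multiplies it by (1−si) ; the exact operator expressions are not reproduced in this text, so this is a hedged stand-in rather than the patent's own definition.

```python
# Assumed formulation: P_s keeps pixels where s_i = 1, the background
# operator keeps pixels where s_i = 0.  Pure Python, no dependencies.

def extract_foreground(s, x):
    """Keep pixel values where the indicator is 1, zero elsewhere."""
    return [si * xi for si, xi in zip(s, x)]

def extract_background(s, x):
    """Keep pixel values where the indicator is 0, zero elsewhere."""
    return [(1 - si) * xi for si, xi in zip(s, x)]

frame = [10, 20, 30, 40]        # hypothetical 4-pixel vectorized frame
indicator = [0, 1, 1, 0]        # pixels 1 and 2 belong to the moving object
print(extract_foreground(indicator, frame))  # [0, 20, 30, 0]
print(extract_background(indicator, frame))  # [10, 0, 0, 40]
```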
Now some example embodiments of the present invention will be discussed. Reference is first made to FIG. 1, which shows a flowchart of a method 100 of detecting a moving object in a video. According to embodiments of the present invention, the video
may be of any suitable format. The video may be compressed or encoded by any suitable technologies, either currently known or to be developed in the future.
As shown, the method 100 is entered at step 110, where a plurality of frames [xt-T, xt-T+1, ... , xt-2, xt-1, xt] in the video are transformed into a high dimensional image space in a non-linear way. According to embodiments of the present invention, the dimension m of the high dimensional image space can be very high; theoretically, it may even be infinite. For example, in one embodiment, the value of m can be selected such that m is much greater than the number of pixels in each frame. In this way, the non-linear correlations among the frames in the low dimensional image space can be better characterized and modeled.
In one embodiment, a non-linear transformation or mapping function, denoted as φ, may be used to transform the frames. Any suitable mapping function can be used in connection with embodiments of the present invention. Specifically, in one embodiment, a mapping function satisfying Mercer’s theorem can be used to guarantee the compactness and convergence of the transform.
By applying the mapping function, the frames in the initial image space are transformed into the high dimensional image space, thereby obtaining a plurality of transformed frames [φ (xt-T) , … , φ (xt-1) , φ (xt) ] . Specifically, in one embodiment, by selecting proper parameters for the mapping function φ (x) , the transformed frames [φ (xt-T) , … , φ (xt-1) , φ (xt) ] may be linear and can thus be more easily described, as will be discussed below. However, it is to be understood that the transformed frames [φ (xt-T) , … , φ (xt-1) , φ (xt) ] are not necessarily linear in the high dimensional image space. The scope of the invention is not limited in this regard.
Specifically, in one embodiment, the frames can be transformed into the high dimensional image space without explicitly defining the mapping function. For example, in one embodiment, the transformed frames and the modeling thereof can be described by use of proper kernel functions. Example embodiments in this regard will be discussed below.
The method 100 then proceeds to step 120, where the background of the plurality of frames [xt-T, xt-T+1, ... , xt-2, xt-1, xt] is modeled in the high dimensional image space.
Traditionally, the background of the frames is assumed to follow a Gaussian distribution and is therefore modeled by a linear transformation matrix U= [u1, … , ud] , where d represents the number of bases and ui is the i-th base vector. In such conventional approaches, the representation of the background is given by:
where the left-hand side represents the low-dimensional representation of the background and U′ represents the transpose of U. As a result, the background is approximated as follows:
It can be seen from equations (4) and (5) that the relationship between the background and the base vectors is always linear. However, the frames [xt-T, xt-T+1, ... , xt-2, xt-1, xt] may not be linear, for example, when there is changing background, illumination variation, camera motion, noise, or the like. The inventors’ experiments have shown that the conventional linear model is not robust enough to describe such complex factors. Inaccurate background modeling in turn degrades the detection rate of moving objects in the foreground.
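For illustration, the conventional linear (PCA-style) model of equations (4) and (5) can be sketched as follows. The tiny matrices and frame values are hypothetical, and pure-Python helpers stand in for a linear algebra library; this is a sketch of the baseline being criticized, not of the invention itself.

```python
def matvec(M, v):
    """Multiply matrix M (a list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

# n = 3 pixels, d = 1 base vector (the columns of U are the bases u_i)
U = [[1.0], [0.0], [0.0]]    # hypothetical single base along the first pixel
x = [2.0, 5.0, 7.0]          # hypothetical vectorized background frame

y = matvec(transpose(U), x)  # low-dimensional representation U'x  (cf. eq. (4))
b = matvec(U, y)             # linear reconstruction U y           (cf. eq. (5))
print(y)  # [2.0]
print(b)  # [2.0, 0.0, 0.0]
```

Because b = UU′x is always a linear function of x, this model cannot capture non-linear background variation, which is the limitation the high dimensional mapping addresses.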
In contrast, according to embodiments of the present invention, the initial frames are transformed into the high dimensional image space at step 110 and modeled in that very-high-dimensional space at step 120. As such, non-linear modeling of the background of the initial frames is achieved. Along this line, the correlations of the frames can be better characterized, thereby identifying the background and foreground (moving objects) more accurately.
Specifically, the transformed frames may be linear in the high dimensional image space in one embodiment, as described above. In this embodiment, the base vector uj may be calculated as a linear combination of the backgrounds of the transformed frames as follows:
where the first term represents the background part of the transformed frames in the high dimensional image space, and the second term represents a coefficient. For the sake of discussion, a shorthand for the background part is defined. That is, in this embodiment, the non-linear modeling of the background of the frames is achieved by modeling or approximating the background of the transformed frames using a linear model in the high dimensional image space.
These base vectors uj together form a linear transformation matrix. Thus, the background of each transformed frame in the high dimensional image space can be represented as follows:
It is to be understood that modeling the background of the transformed frames using a linear model in the high dimensional image space is beneficial in terms of operational efficiency and computational complexity. However, this is not necessarily required. In an alternative embodiment, the background of the transformed frames can be approximated using any non-linear model in the high dimensional image space.
Still with reference to FIG. 1, the method 100 proceeds to step 130, where one or more moving objects (foreground) are detected based on the modeling of the background of the frames in the high dimensional image space.
In one embodiment, at step 130, an objective function can be defined based on the modeling at step 120. More specifically, the objective function at least characterizes the error in the modeling or approximation of the background of the frames. By way of example, in the embodiment where the background of the transformed frames is modeled in a linear way, the objective function may be defined as follows:
Substituting equation (6) into equation (8) yields:
In some embodiments, one or more other relevant factors can be used in the objective function. For example, in one embodiment, the area of the foreground (moving object) may be taken into consideration. In general, it is desired that the area of the moving object in each frame is below a predefined threshold, because an excessively large moving object area likely indicates inaccurate detection. In one embodiment, the area term can be given by:
where ||·||1 represents the one-norm operator.
Additionally or alternatively, in one embodiment, the connectivity of the moving object across the plurality of frames can be considered. It would be appreciated that the trajectory of a moving object is usually continuous between two consecutive frames. In order to measure the object connectivity, in one embodiment, the connectivity may be defined as follows:
where N (i) is the set of neighbors of the pixel i.
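A minimal sketch of the area term of equation (10) together with a connectivity term follows. Since equation (11) is not reproduced in this text, the pairwise form summing |si − sj| over neighbors N (i) is an assumption standing in for it; the 4-pixel frame and chain neighborhood are hypothetical.

```python
def area_term(s):
    """L_area = ||s||_1, i.e. the number of foreground pixels (equation (10))."""
    return sum(s)

def connectivity_term(s, neighbors):
    """Assumed pairwise smoothness: count label disagreements between each
    pixel i and its neighbors N(i); each boundary is counted from both sides."""
    return sum(abs(s[i] - s[j]) for i in range(len(s)) for j in neighbors[i])

s = [0, 1, 1, 0]                               # hypothetical 4-pixel frame
nbrs = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # chain neighborhood N(i)
print(area_term(s))                # 2
print(connectivity_term(s, nbrs))  # 4  (two label changes, each seen twice)
```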
In one embodiment, the modeling error, foreground area and the connectivity can be combined together to define the objective function as follows:
L=Lbackground+βLarea+γLconnectivity, (12)
where β and γ represent weights and can be set depending on specific requirements and use cases. By substituting equations (9) , (10) and (11) into equation (12) , the objective function is expressed as:
It is to be understood that the objective function shown in equation (13) is discussed merely for the purpose of illustration, without suggesting any limitations to the scope of the invention. In other embodiments, any additional or alternative factors may be used in the objective function. Moreover, as described above, it is possible to simply use the approximation error Lbackground as the objective function.
In one embodiment, the moving object can be detected by minimizing the objective function. To this end, in one embodiment, it is possible to directly solve for the foreground-indicator vector s, the coefficients, and the low-dimensional representation yi that minimize the objective function L. In practice, however, it is sometimes difficult to find the optimal solutions directly. In order to improve efficiency and reduce computational complexity, in one embodiment, the kernel functions associated with the high dimensional image space can be used to solve this optimization problem.
More specifically, given an objective function such as the one shown in equation (13) , the goal is to solve for the coefficients when s and yi are fixed. In one embodiment, Kernel Principal Component Analysis (KPCA) can be used to accomplish this task. A T×T kernel matrix K is defined, in which the ij-th element is given by a kernel function
kij=k (xi, xj) . (14)
The kernel function k (xi, xj) can be in any form as long as the resulting kernel matrix K is positive semi-definite. In one embodiment, an example of the kernel function is shown as follows:
where σ is a parameter which can be selected empirically. It is to be understood that the kernel function shown in equation (15) is given merely for the purpose of illustration, without suggesting any limitations as to the scope of the invention. In other embodiments, any suitable kernel function, such as a Gaussian kernel function, a radial basis function, or the like, can be used as well.
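Building the T×T kernel matrix of equation (14) can be sketched as follows. Since equation (15) itself is not reproduced in this text, a Gaussian-type kernel exp (−‖xi−xj‖²/σ²) is assumed, and the frame vectors are hypothetical.

```python
import math

def gaussian_kernel(xi, xj, sigma=1.0):
    """Assumed Gaussian/RBF-style kernel; equation (15) is elided in the text."""
    d2 = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-d2 / (sigma ** 2))

def kernel_matrix(frames, sigma=1.0):
    """T x T matrix K with K[i][j] = k(x_i, x_j), as in equation (14)."""
    T = len(frames)
    return [[gaussian_kernel(frames[i], frames[j], sigma) for j in range(T)]
            for i in range(T)]

frames = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]  # hypothetical 2-pixel frames
K = kernel_matrix(frames)
print(K[0][0])             # 1.0 -- k(x, x) = 1 for this kernel
print(K[0][1] == K[1][0])  # True -- K is symmetric
```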
In one embodiment, the optimization of the objective function can be achieved by solving the following eigen-decomposition problem:
Kα=λα, (16)
where λ and α represent the eigenvalue and eigenvector, respectively. It would be appreciated that there are d eigenvalues λ1, ... , λd in total. In one embodiment, the eigenvalues may be sorted in descending order, such that λ1>λ2>…>λd. The eigenvector αi corresponds to the eigenvalue λi, and the entries of αi provide the coefficients used in equation (6) .
Given s and the coefficients, the low-dimensional version of the background, denoted as yi, can be solved. That is, yi is the background of the initial frames represented in the low dimensional image space. In one embodiment, yi is expressed as follows:
Substituting equation (6) into equation (17) yields:
It can be seen from equation (18) that yi can be determined by using the kernel function, without explicitly defining or applying the mapping function.
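To illustrate how yi can be obtained from kernel evaluations alone, without the mapping function ever appearing (the idea behind equation (18) ), the following hedged sketch assumes the eigenvector entries `alphas` and a kernel `k` as inputs; the toy frames and the linear kernel are hypothetical, not the patent's own choices.

```python
# j-th entry of y_i computed as a kernel-weighted sum over all frames;
# phi is never evaluated explicitly.

def low_dim_representation(i, frames, alphas, k):
    """y_i with j-th entry sum_l alphas[j][l] * k(x_l, x_i)."""
    return [sum(a_l * k(xl, frames[i]) for a_l, xl in zip(alpha_j, frames))
            for alpha_j in alphas]

# toy data: three 1-pixel "frames", two eigenvectors, linear kernel k(a,b)=a.b
frames = [[1.0], [2.0], [3.0]]
alphas = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
k = lambda a, b: a[0] * b[0]
print(low_dim_representation(2, frames, alphas, k))  # [3.0, 6.0]
```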
Next, in the case that yi and the coefficients are fixed, s can be solved. More specifically, it can be determined from equation (13) that the j-th element of yi takes the form given in equation (18) . Therefore, in one embodiment, the background approximation error Lbackground is written as:
where cij represents a coefficient and C represents a constant independent of the data. As such, at least a part of the objective function (that is, the approximation error Lbackground) and the background of the frames are associated using the kernel functions.
In one embodiment, as described above, the foreground and background parts in the frames can be identified or indicated by the foreground indicator s which is defined in equation (1) , for example. Based on equation (21) , in one embodiment, it is possible
to express the objective function at least in part by the foreground indicator. That is, by means of the kernel functions, the objective function can be associated with the foreground indicator related to each pixel in each of the plurality of frames, where the foreground indicator indicates whether the related pixel belongs to the moving object (foreground) .
For the sake of discussion, suppose that the kernel function is in the form of equation (15) . By use of the Taylor expansion, the kernel function can be approximated by:
where xiz represents the value of pixel z in frame i. By substituting equation (22) into equation (19) , Lbackground can be expressed as follows:
where C′ collects the constant terms. It would be appreciated that C′ is a constant and can thus be removed from equation (24) . As a result, Lbackground is expressed as:
For the sake of discussion, suppose that the objective function is in the form of equation (13) . That is, in addition to Lbackground, the objective function also includes the terms related to the area and connectivity of the moving object (s) . Based on equations (25) and (13) , the objective function L can be written as:
where sz is the z-th entry of s. It can be seen that equation (26) is in the standard form of graph cuts. In one embodiment, by use of the well-known graph cuts algorithm, the optimal solution s can be obtained efficiently.
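To illustrate minimizing an energy of the form of equation (26) , the sketch below brute-forces all labelings of a tiny hypothetical example; a real implementation would minimize the same energy with a max-flow/min-cut graph cuts solver (e.g., the Boykov–Kolmogorov algorithm) rather than enumeration. The unary costs, weights, and neighborhood here are assumptions for illustration only.

```python
from itertools import product

def total_energy(s, unary, beta, gamma, neighbors):
    """Energy in graph-cut form: data terms + beta*area + gamma*connectivity."""
    data = sum(unary[i][si] for i, si in enumerate(s))
    area = beta * sum(s)
    conn = gamma * sum(abs(s[i] - s[j])
                       for i in range(len(s)) for j in neighbors[i])
    return data + area + conn

def brute_force_cut(unary, beta, gamma, neighbors):
    """Enumerate all labelings -- feasible only for toy sizes; real systems
    use a max-flow graph cuts solver to minimize this energy."""
    n = len(unary)
    return min(product((0, 1), repeat=n),
               key=lambda s: total_energy(s, unary, beta, gamma, neighbors))

# hypothetical costs: pixel 1 prefers foreground (label 1), pixels 0, 2 background
unary = [[0.0, 5.0], [5.0, 0.0], [0.0, 5.0]]   # unary[i][label]
nbrs = {0: [1], 1: [0, 2], 2: [1]}
print(brute_force_cut(unary, beta=0.1, gamma=0.5, neighbors=nbrs))  # (0, 1, 0)
```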
In one embodiment where the kernel functions are used, the method 100 can be implemented by the pseudo code shown in the following table.
Table 1
It is to be understood that the pseudo code in Table 1 is given only for the purpose of illustration, without suggesting any limitations as to the scope of the invention. Various modifications or variations are possible in practice.
By use of the non-linear model, which is more powerful for describing complex factors such as changing background (e.g., water ripples and waving trees) , illumination variation, camera motion, noise, and the like, embodiments of the present invention detect moving objects more robustly and accurately under complex conditions. The proposed approach achieves fewer false alarms and a higher detection rate.
FIGs. 2A-2C show an example of moving object detection. FIG. 2A shows a frame in a video containing dynamic rain. FIG. 2B is the result of a conventional approach to moving object detection. It can be seen that in FIG. 2B, the spring is incorrectly classified as a moving object. In contrast, in the result obtained by one embodiment of the present invention as shown in FIG. 2C, the spring is removed from the foreground and the moving person is correctly detected.
FIG. 3 shows a block diagram of a computer-implemented apparatus for moving object detection according to one embodiment of the present invention. As shown, the apparatus 300 comprises an image transformer 310 configured to transform a plurality of frames in a video from an initial image space to a high dimensional image space in a non-linear way; a modeler 320 configured to model background of the plurality of frames in the high dimensional image space; and a moving object detector 330 configured to detect a moving object in the plurality of frames based on the modeling of the background of the plurality of frames in the high dimensional image space.
In one embodiment, the dimension of the high dimensional image space is greater than the number of pixels in each of the plurality of frames.
In one embodiment, the modeler 320 may comprise a non-linear modeler 325 configured to model background of a plurality of transformed frames using a linear model in the high dimensional image space, the plurality of transformed frames obtained by transforming the plurality of frames in the non-linear way.
In one embodiment, the apparatus 300 may further comprise an objective function controller 340 configured to determine an objective function characterizing an error of the modeling of the background of the plurality of frames. In this embodiment, the moving object detector 330 is configured to detect the moving object based on the objective function.
In one embodiment, the objective function may further characterize at least one of: areas of the moving object in the plurality of frames, and connectivity of the moving object across the plurality of frames.
In one embodiment, the apparatus 300 may further comprise a kernel function controller 350 configured to determine a set of kernel functions associated with the high dimensional image space. In this embodiment, the objective function controller 340 is configured to associate at least a part of the objective function and the background of the plurality of frames using the set of kernel functions, and the moving object detector 330 is configured to detect the moving object by minimizing the objective function.
In one embodiment, the objective function controller 340 is configured to associate the objective function with a foreground indicator related to each pixel in each of the plurality of frames using the set of kernel functions, where the foreground indicator
indicates whether the related pixel belongs to the moving object.
FIG. 4 shows a block diagram of an example computer system 400 suitable for implementing example embodiments of the present invention. The computer system 400 can be a fixed-type machine such as a desktop personal computer (PC) , a server, a mainframe, or the like. Alternatively, the computer system 400 can be a mobile-type machine such as a mobile phone, tablet PC, laptop, smart phone, personal digital assistant (PDA) , or the like.
As shown, the computer system 400 comprises a processor such as a central processing unit (CPU) 401 which is capable of performing various processes in accordance with a program stored in a read only memory (ROM) 402 or a program loaded from a storage unit 408 to a random access memory (RAM) 403. In the RAM 403, data required when the CPU 401 performs the various processes or the like is also stored as required. The CPU 401, the ROM 402 and the RAM 403 are connected to one another via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
The following components are connected to the I/O interface 405: an input unit 406 including a keyboard, a mouse, or the like; an output unit 407 including a display such as a cathode ray tube (CRT) or a liquid crystal display (LCD) , and a loudspeaker or the like; the storage unit 408 including a hard disk or the like; and a communication unit 409 including a network interface card such as a LAN card, a modem, or the like. The communication unit 409 performs communication processes via a network such as the Internet. A drive 410 is also connected to the I/O interface 405 as required. A removable medium 411, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 410 as required, so that a computer program read therefrom is installed into the storage unit 408 as required.
Specifically, in accordance with example embodiments of the present invention, the processes described above with reference to FIG. 1 and Table 1 may be implemented by computer program. For example, embodiments of the present invention comprise a computer program product including a computer program tangibly embodied on a machine readable medium, the computer program including program code for performing the method 100 and/or the pseudo code shown in Table 1. In such embodiments, the computer program may be downloaded and mounted from the network via the
communication unit 409, and/or installed from the removable medium 411.
The functionality described herein can be performed, at least in part, by one or more hardware logic components. For example, and without limitation, illustrative types of hardware logic components that can be used include Field-Programmable Gate Arrays (FPGAs) , Application-Specific Integrated Circuits (ASICs) , Application-Specific Standard Products (ASSPs) , System-on-a-Chip systems (SOCs) , Complex Programmable Logic Devices (CPLDs) , and the like.
Various embodiments of the invention may be implemented in hardware or special purpose circuits, software, logic or any combination thereof. Some aspects may be implemented in hardware, while other aspects may be implemented in firmware or software which may be executed by a controller, microprocessor or other computing device. While various aspects of embodiments of the present invention are illustrated and described as block diagrams, flowcharts, or using some other pictorial representation, it will be appreciated that the blocks, apparatus, systems, techniques or methods described herein may be implemented in, as non-limiting examples, hardware, software, firmware, special purpose circuits or logic, general purpose hardware or controller or other computing devices, or some combination thereof.
By way of example, embodiments of the present invention can be described in the general context of machine-executable instructions, such as those included in program modules, being executed in a device on a target real or virtual processor. Generally, program modules include routines, programs, libraries, objects, classes, components, data structures, or the like that perform particular tasks or implement particular abstract data types. The functionality of the program modules may be combined or split between program modules as desired in various implementations. Machine-executable instructions for program modules may be executed within a local or distributed device. In a distributed device, program modules may be located in both local and remote storage media.
Program code for carrying out methods of the invention may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes,
when executed by the processor or controller, cause the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine readable medium may be any tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine readable medium may be a machine readable signal medium or a machine readable storage medium. A machine readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of the machine readable storage medium would include an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM) , a read-only memory (ROM) , an erasable programmable read-only memory (EPROM or Flash memory) , an optical fiber, a portable compact disc read-only memory (CD-ROM) , an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
Further, while operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desirable results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are contained in the above discussions, these should not be construed as limitations on the scope of the present invention, but rather as descriptions of features that may be specific to particular implementations. Certain features that are described in the context of separate implementations may also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation may also be implemented in multiple implementations separately or in any suitable sub-combination.
Although the invention has been described in language specific to structural features and/or methodological acts, it is to be understood that the invention defined in the appended claims is not necessarily limited to the specific features or acts described above.
Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.
Claims (16)
- A computer-implemented method comprising: transforming a plurality of frames in a video to a high dimensional image space in a non-linear way; modeling background of the plurality of frames in the high dimensional image space; and detecting a moving object in the plurality of frames based on the modeling of the background of the plurality of frames in the high dimensional image space.
- The method of claim 1, wherein a dimension of the high dimensional image space is greater than the number of pixels in each of the plurality of frames.
- The method of claim 1, wherein the modeling background of the plurality of frames in the high dimensional image space comprises: modeling background of a plurality of transformed frames using a linear model in the high dimensional image space, the plurality of transformed frames obtained by transforming the plurality of frames in the non-linear way.
- The method of claim 1, wherein the detecting a moving object in the plurality of frames comprises: determining an objective function characterizing an error of the modeling of the background of the plurality of frames; and detecting the moving object based on the objective function.
- The method of claim 4, wherein the objective function further characterizes at least one of: areas of the moving object in the plurality of frames, and connectivity of the moving object across the plurality of frames.
- The method of claim 4, wherein the detecting the moving object based on the objective function comprises: determining a set of kernel functions associated with the high dimensional image space; associating at least a part of the objective function and the background of the plurality of frames using the set of kernel functions; and detecting the moving object by minimizing the objective function.
- The method of claim 6, wherein the associating at least a part of the objective function and the background of the plurality of frames using the set of kernel functions comprises: associating the objective function with a foreground indicator related to each pixel in each of the plurality of frames using the set of kernel functions, the foreground indicator indicating whether the related pixel belongs to the moving object.
- A computer-implemented apparatus comprising: an image transformer configured to transform a plurality of frames in a video to a high dimensional image space in a non-linear way; a modeler configured to model background of the plurality of frames in the high dimensional image space; and a moving object detector configured to detect a moving object in the plurality of frames based on the modeling of the background of the plurality of frames in the high dimensional image space.
- The apparatus of claim 8, wherein a dimension of the high dimensional image space is greater than the number of pixels in each of the plurality of frames.
- The apparatus of claim 8, wherein the modeler comprises: a non-linear modeler configured to model background of a plurality of transformed frames using a linear model in the high dimensional image space, the plurality of transformed frames obtained by transforming the plurality of frames in the non-linear way.
- The apparatus of claim 8, further comprising: an objective function controller configured to determine an objective function characterizing an error of the modeling of the background of the plurality of frames, wherein the moving object detector is configured to detect the moving object based on the objective function.
- The apparatus of claim 11, wherein the objective function further characterizes at least one of: areas of the moving object in the plurality of frames, and connectivity of the moving object across the plurality of frames.
- The apparatus of claim 11, further comprising: a kernel function controller configured to determine a set of kernel functions associated with the high dimensional image space, wherein the objective function controller is configured to associate at least a part of the objective function and the background of the plurality of frames using the set of kernel functions, and wherein the moving object detector is configured to detect the moving object by minimizing the objective function.
- The apparatus of claim 13, wherein the objective function controller is configured to associate the objective function with a foreground indicator related to each pixel in each of the plurality of frames using the set of kernel functions, the foreground indicator indicating whether the related pixel belongs to the moving object.
- A device comprising: a processor; and a memory including computer-executable instructions which, when executed by the processor, cause the device to carry out the method of any one of claims 1 to 7.
- A computer program product tangibly stored on a non-transient computer-readable medium and comprising machine-executable instructions which, when executed, cause the machine to perform the steps of the method according to any one of claims 1 to 7.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2014/095643 WO2016106595A1 (en) | 2014-12-30 | 2014-12-30 | Moving object detection in videos |
EP14909406.2A EP3241185A4 (en) | 2014-12-30 | 2014-12-30 | Moving object detection in videos |
CN201480084456.9A CN107209941A (en) | 2014-12-30 | 2014-12-30 | Mobile object detection in video |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016106595A1 true WO2016106595A1 (en) | 2016-07-07 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109859302A (en) * | 2017-11-29 | 2019-06-07 | 西门子保健有限责任公司 | The compression of optical transport matrix senses |
CN113591840A (en) * | 2021-06-30 | 2021-11-02 | 北京旷视科技有限公司 | Target detection method, device, equipment and storage medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101303763A (en) * | 2007-12-26 | 2008-11-12 | 公安部上海消防研究所 | Method for amplifying image based on rarefaction representation |
CN103324955A (en) * | 2013-06-14 | 2013-09-25 | 浙江智尔信息技术有限公司 | Pedestrian detection method based on video processing |
CN103500454A (en) * | 2013-08-27 | 2014-01-08 | 东莞中国科学院云计算产业技术创新与育成中心 | Method for extracting moving target of shaking video |
US20140232862A1 (en) * | 2012-11-29 | 2014-08-21 | Xerox Corporation | Anomaly detection using a kernel-based sparse reconstruction model |
CN104200485A (en) * | 2014-07-10 | 2014-12-10 | 浙江工业大学 | Video-monitoring-oriented human body tracking method |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103489199B (en) * | 2012-06-13 | 2016-08-24 | 通号通信信息集团有限公司 | video image target tracking processing method and system |
CN104113789B (en) * | 2014-07-10 | 2017-04-12 | 杭州电子科技大学 | On-line video abstraction generation method based on depth learning |
2014
- 2014-12-30 WO PCT/CN2014/095643 patent/WO2016106595A1/en active Application Filing
- 2014-12-30 CN CN201480084456.9A patent/CN107209941A/en active Pending
- 2014-12-30 EP EP14909406.2A patent/EP3241185A4/en not_active Withdrawn
Non-Patent Citations (1)
Title |
---|
See also references of EP3241185A4 * |
Also Published As
Publication number | Publication date |
---|---|
EP3241185A1 (en) | 2017-11-08 |
CN107209941A (en) | 2017-09-26 |
EP3241185A4 (en) | 2018-07-25 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hoang et al. | Metaheuristic optimized edge detection for recognition of concrete wall cracks: a comparative study on the performances of Roberts, Prewitt, Canny, and Sobel algorithms | |
CN108230357B (en) | Key point detection method and device, storage medium and electronic equipment | |
US20200074205A1 (en) | Methods and apparatuses for vehicle appearance feature recognition, methods and apparatuses for vehicle retrieval, storage medium, and electronic devices | |
US20190279014A1 (en) | Method and apparatus for detecting object keypoint, and electronic device | |
US20230410493A1 (en) | Image processing system, image processing method, and program storage medium | |
US20230134967A1 (en) | Method for recognizing activities using separate spatial and temporal attention weights | |
US20150235092A1 (en) | Parts based object tracking method and apparatus | |
US11822621B2 (en) | Systems and methods for training a machine-learning-based monocular depth estimator | |
WO2016145591A1 (en) | Moving object detection based on motion blur | |
US9129152B2 (en) | Exemplar-based feature weighting | |
CN109255382B (en) | Neural network system, method and device for picture matching positioning | |
CN110910445B (en) | Object size detection method, device, detection equipment and storage medium | |
CN113469025B (en) | Target detection method and device applied to vehicle-road cooperation, road side equipment and vehicle | |
CN108229494B (en) | Network training method, processing method, device, storage medium and electronic equipment | |
US20150030231A1 (en) | Method for Data Segmentation using Laplacian Graphs | |
CN107992791A (en) | Target following failure weight detecting method and device, storage medium, electronic equipment | |
CN112861940A (en) | Binocular disparity estimation method, model training method and related equipment | |
US20140247996A1 (en) | Object detection via visual search | |
WO2016106595A1 (en) | Moving object detection in videos | |
US20210216829A1 (en) | Object likelihood estimation device, method, and program | |
Andéol et al. | Confident Object Detection via Conformal Prediction and Conformal Risk Control: an Application to Railway Signaling | |
CN113869163A (en) | Target tracking method and device, electronic equipment and storage medium | |
US20190251703A1 (en) | Method of angle detection | |
Shi et al. | The Augmented Lagrange Multiplier for robust visual tracking with sparse representation | |
CN111353464B (en) | Object detection model training and object detection method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application |
Ref document number: 14909406 Country of ref document: EP Kind code of ref document: A1 |
|
REEP | Request for entry into the european phase |
Ref document number: 2014909406 Country of ref document: EP |
|
NENP | Non-entry into the national phase |
Ref country code: DE |