KR101600617B1 - Method for detecting human in image frame - Google Patents

Method for detecting human in image frame

Info

Publication number
KR101600617B1
Authority
KR
South Korea
Prior art keywords
image
background
human body
detecting
region
Prior art date
Application number
KR1020150155389A
Other languages
Korean (ko)
Inventor
누완 산지와 라자수리야
서정원
이용훈
Original Assignee
주식회사 센텍
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 주식회사 센텍 filed Critical 주식회사 센텍
Priority to KR1020150155389A priority Critical patent/KR101600617B1/en
Application granted granted Critical
Publication of KR101600617B1 publication Critical patent/KR101600617B1/en


Classifications

    • G06K9/00362
    • G06K9/00335
    • G06K9/00369
    • G06K9/40
    • G06K9/6202

Abstract

According to an embodiment of the present invention, a method for detecting a human in an image includes: a preprocessing step of converting the RGB color image of an input image frame into a grayscale image; a background difference step of generating a background image frame through a pixel-by-pixel learning procedure over the preprocessed input image frames, and generating a background difference image for each sequentially input image frame by subtracting the background image frame from it; a noise removal step of removing noise from each background difference image; a motion area detection step of detecting a motion area in the noise-removed background difference image; and a human detection step of detecting a human object by using both the aspect ratio and the ellipticity ratio of the motion area.

Description

[0001] The present invention relates to a method for detecting a person in an image, which removes the background and noise from the image and detects only the human figure.

As prior art, an algorithm has been presented (Korean Patent Application No. 10-2006-0055959) in which a face candidate region is first detected from the input image using face color, and a face detector then detects the face in the detected candidate region using AdaBoost.

However, these existing technologies do not consider the human visual processing mechanism at all and, in particular, are not algorithms for detecting people in moving images. In recent years, many CCTV cameras have been installed for security, and there is a great need for an algorithm that can efficiently detect a moving person in real time in video sources such as CCTV.

Accordingly, human detection in video is a subject of great interest in computer vision and pattern recognition. Various approaches to pedestrian detection have been studied, and various techniques have been proposed for detection that is robust to noise while satisfying both performance and speed requirements. However, people are difficult to detect accurately because their appearance, color, and posture vary widely. For this reason, existing person detection research has traced and searched for objects using image-based models from which the background has been removed.

However, since it is difficult to detect humans effectively in complex images containing multiple persons, N. Dalal and B. Triggs proposed the Histograms of Oriented Gradients (HOG) descriptor to address the human detection problem.

HOG-based feature extraction extracts features from the input image for human detection; the feature information is obtained by histogramming the gradient direction information within each block. HOG-based human detection then determines whether a region contains a person using an SVM (Support Vector Machine).

However, because HOG-based feature extraction histograms the gradient direction information when extracting features from the input image, it relies heavily on the gradient magnitude. In particular, the vertical direction information of a standing person dominates the feature values. The feature extraction method is therefore dominated by edge magnitude, and suffers from frequent false detections in which objects with strong vertical edges, such as streetlights, are judged to be people.
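For illustration only, the conventional HOG + SVM pipeline described above can be sketched as follows. This is a minimal sketch assuming scikit-image and scikit-learn (the patent names no library); the 128x64 window, cell/block sizes, and the randomly generated training data are illustrative placeholders, not taken from the patent.

    # Minimal sketch of the conventional HOG + SVM pipeline (prior art).
    # Assumptions: scikit-image / scikit-learn; window size, cell/block
    # sizes, and training data are placeholders.
    import numpy as np
    from skimage.feature import hog
    from sklearn.svm import LinearSVC

    def hog_features(gray_window):
        # Histogram the gradient directions per cell, normalized per block.
        return hog(gray_window, orientations=9, pixels_per_cell=(8, 8),
                   cells_per_block=(2, 2), feature_vector=True)

    # Hypothetical training data: 128x64 grayscale windows, person/non-person labels.
    windows = [np.random.rand(128, 64) for _ in range(20)]
    labels = [1] * 10 + [0] * 10
    clf = LinearSVC().fit([hog_features(w) for w in windows], labels)
    is_person = clf.predict([hog_features(windows[0])])[0] == 1

As the text notes, such a detector depends heavily on gradient magnitudes, which is what makes vertical structures such as streetlights prone to false detection.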

Korean Patent Application Number: 10-2006-0055959

SUMMARY OF THE INVENTION It is an object of the present invention to provide an efficient method of background differencing, noise removal, and human detection in order to accurately detect a human figure in an image.

An embodiment of the present invention includes: a preprocessing process of converting the RGB color image of an input image frame into a grayscale image; a background difference process of generating a background image frame through a pixel-by-pixel learning process over the preprocessed input image frames, and generating a background difference image for each sequentially input image frame by subtracting the background image frame from it; a noise removal process of removing noise from each background difference image; a motion area detection process of detecting a motion area in the noise-removed background difference image; and a human body detection process of detecting a human body by using the aspect ratio and the ellipticity ratio of the motion area together.

In the preprocessing, when f(x, y, z) denotes the grayscale value of pixel [x, y] of the input image frame, and f(x, y, z)_R, f(x, y, z)_G, and f(x, y, z)_B denote the R, G, and B components of pixel (x, y) of the input image frame, the RGB color image is converted into a grayscale image by f(x, y, z) = 0.299 f(x, y, z)_R + 0.587 f(x, y, z)_G + 0.114 f(x, y, z)_B.

In the noise removal process, T_p[k] is an adaptive threshold, N_zp is the number of pixels having a pixel value of '0' in the background difference image, k is the frame sequence number of the background difference image, N is the total number of frames of the background difference image, and μ[k] is the average value of the background difference image for each input image frame. The process includes calculating the adaptive threshold T_p[k] by the equation below, and removing, as noise, pixels whose [x, y] pixel value in the background difference image is lower than the adaptive threshold T_p[k].

[Equation image pat00001: adaptive threshold T_p[k]]

In the motion region detection process, a motion region is detected when a pixel group region (a region of the background difference image in which pixels having values higher than the adaptive threshold T_p[k] are contiguously clustered) is larger than a predetermined basic region.

The human body detection process includes: a first detection process of measuring the aspect ratio of each motion region and detecting, as a primary human body candidate, any motion region whose aspect ratio is higher than a preset reference aspect ratio; a second detection process of detecting a secondary human body candidate using the ellipticity ratio over the entire region of the primary candidate; and a final detection process of finally detecting the human body using the ellipticity ratio of the region protruding upward from the secondary candidate.

The second detection process includes: determining the width a and height b of the primary human body candidate region; extracting a primary ellipse region satisfying the condition

[Equation image pat00002: primary ellipse condition]

and detecting the secondary human body candidate when the ratio r1, computed by

[Equation image pat00003: ratio r1]

of the number of candidate pixels inside the ellipse region to the total number of pixels in the entire primary candidate region, exceeds the reference value.

The final detection process may include defining a length of 60% of the horizontal length a as the corrected horizontal length a1', and defining a length equal to the vertical length b as the corrected vertical length b1',

extracting a secondary ellipse region satisfying the condition

[Equation image pat00004: secondary ellipse condition]

and finally determining the secondary human body candidate to be a human body when the ratio r2, computed by

[Equation image pat00005: ratio r2]

of the number of pixels in the region protruding above the secondary candidate to the number of pixels in the upper quarter of the primary candidate region, satisfies the reference condition.

According to an embodiment of the present invention, a person in an image can be detected accurately by detecting moving objects and moving persons based on background differencing, noise removal, and an elliptical method with various radii.

FIG. 1 is a configuration diagram of an apparatus for detecting a person in an image according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a process of detecting a person in an image according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating a background difference process according to an embodiment of the present invention.
FIG. 4 is a photograph showing a conversion from the RGB color space to the grayscale color space according to an embodiment of the present invention.
FIG. 5 is a diagram illustrating the 26 new quantization levels to which the grayscale value of each pixel is quantized according to an embodiment of the present invention.
FIG. 6 is a diagram illustrating background modeling and updating according to an embodiment of the present invention.
FIG. 7 is a photograph of a current image and its background difference image according to an embodiment of the present invention.
FIG. 8 is a graph illustrating the behavior of the adaptive threshold according to an embodiment of the present invention compared with the conventional method.
FIG. 9 is a graph showing the number of unnecessary pixels remaining after thresholding according to an embodiment of the present invention.
FIG. 10 shows background difference images after applying the adaptive threshold of an embodiment of the present invention and after applying a conventional fixed threshold.
FIG. 11 is a photograph in which a moving object has been detected according to an embodiment of the present invention.
FIG. 12 is a photograph showing human bodies and other objects primarily detected according to the aspect ratio according to an embodiment of the present invention.
FIG. 13 is a diagram for explaining the secondary detection of a human body according to an embodiment of the present invention.
FIG. 14 is a diagram for explaining the final detection of a human body according to an embodiment of the present invention.

BRIEF DESCRIPTION OF THE DRAWINGS The advantages and features of the present invention, and the manner of achieving them, will become apparent from the following detailed description of embodiments taken in conjunction with the accompanying drawings. The present invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the concept of the invention to those skilled in the art. The present invention is defined only by the scope of the claims. In the following description, well-known functions or constructions are not described in detail where they would obscure the invention with unnecessary detail.

FIG. 1 is a configuration diagram of an apparatus for detecting a person in an image according to an embodiment of the present invention.

The apparatus for detecting a person in an image includes a CPU, such as that of a computer, capable of arithmetic processing, and may include a preprocessing unit 100, a background difference processing unit 200, a noise removing unit 300, and a human body detecting unit 400.

The preprocessing unit 100 performs preprocessing that converts the RGB color image of an input image frame into a grayscale image.

The background difference processing unit 200 generates a background image frame through a pixel-by-pixel learning process over the images continuously input from the past, and performs background differencing to generate a background difference image for each input image frame.

The noise removing unit 300 performs noise removal that removes noise from each background difference image.

The human body detection unit 400 detects a moving region in the noise-removed background difference image, and detects a human body by using the aspect ratio of the detected moving region together with its ellipticity ratio.

FIG. 2 is a flowchart illustrating a human detection process according to an exemplary embodiment of the present invention. FIG. 3 is a flowchart illustrating a background difference process according to an exemplary embodiment of the present invention.

The present invention is an algorithm for detecting a moving person in an image. The core of the algorithm is low complexity and high accuracy, implemented in real time. A simple background update method is also applied to update the background. The algorithm ultimately detects the human shape using a new adaptive threshold method for noise removal in binary images and an elliptical method with various radii.

To this end, there is a preprocessing step (S210) of converting the RGB color images of input image frames sequentially received from a CCD camera into grayscale images. In this preprocessing, when f(x, y, z) denotes the grayscale value of pixel [x, y] of the input image frame, and f(x, y, z)_R, f(x, y, z)_G, and f(x, y, z)_B denote the R, G, and B components of pixel (x, y), the RGB color image is converted into a grayscale image by f(x, y, z) = 0.299 f(x, y, z)_R + 0.587 f(x, y, z)_G + 0.114 f(x, y, z)_B.

Describing the preprocessing process (S210) in detail, the CCD camera inherently produces a color image in the RGB color model. The present invention uses brightness images in which each pixel carries only brightness information, whose lowest value is black and highest value is white, processed as the value of a single sample. Thus, the original RGB input image is converted to a grayscale image as preprocessing. When an RGB image is converted to grayscale, the RGB values are reduced to a single value per pixel reflecting the brightness of the pixel. Since the perception of brightness is dominated by the green component, the present invention reflects the brightness of each pixel by applying a weighted-average method.

The color model conversion process of the color image is as follows.

[Equation 1]

f(x, y, z) = 0.299 f(x, y, z)_R + 0.587 f(x, y, z)_G + 0.114 f(x, y, z)_B

Here, f(x, y, z) represents the grayscale value of pixel [x, y] of the input image frame from the CCD camera, and f(x, y, z)_R, f(x, y, z)_G, and f(x, y, z)_B represent the R, G, and B components of pixel (x, y) of the input image frame, respectively. For reference, FIG. 4 shows the grayscale conversion of an RGB color-space image using Equation (1).
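As a concrete illustration of Equation (1), the weighted-average conversion can be written as follows. This is a minimal sketch assuming NumPy and an (H, W, 3) RGB frame; the patent specifies only the equation, not an implementation.

    # Sketch of the Equation (1) preprocessing: weighted-average grayscale
    # conversion of an RGB frame (assumed shape (H, W, 3)).
    import numpy as np

    def to_grayscale(rgb_frame):
        r = rgb_frame[..., 0].astype(np.float32)
        g = rgb_frame[..., 1].astype(np.float32)
        b = rgb_frame[..., 2].astype(np.float32)
        # Green receives the largest weight because it dominates perceived brightness.
        return 0.299 * r + 0.587 * g + 0.114 * b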

Meanwhile, after the preprocessing step (S210) of converting the RGB color images of the input image frames continuously received from the CCD camera into grayscale images, there is a background difference process (S220) of generating background image frames through a pixel-by-pixel learning process and generating background difference images by subtracting the background image frames from the sequentially input image frames.

As shown in FIG. 3, the background difference process (S220) includes: a step (S221) of assigning substitute pixel values of 1 to 26 to the image pixel values of 0 to 255; a step (S222) of replacing the image pixel value of each pixel of the input image frames, sequentially received during a predetermined period, with its substitute pixel value; a step (S223) of calculating, for each pixel, the weighted substitute pixel value, i.e. the substitute pixel value that occurs most frequently among the substituted values of the input image frames; a step (S224) of generating a background image frame having the same pixel size as the input image frames and assigning the weighted substitute pixel value to each pixel of the background image frame; and a step (S225) of calculating a background difference image (D_im) by taking the difference between the background image frame and the input image frame.

This background difference process (S220) will now be described in detail. Background subtraction is the most common method for foreground detection in image processing and computer vision. As is well known, the background difference technique generates a background model through a learning process over images continuously input from the past, and classifies the background and motion regions using the generated background model. It classifies background and foreground by comparing the input image with the generated background in pixel units or area units, and its computational complexity and performance vary depending on the background generation algorithm used. Generating a background means that the image data input from a certain point in the past is accumulated and established according to the specific criterion of each algorithm. In moving object detection (MOD), a pixel-based processing method is applied, and the background is learned by computing, across the frames over time, the pixels at the same position.

In general, moving objects such as people and cars are considered the foreground, while static objects such as buildings, trees, and textures are considered the background. The background difference is therefore widely used as a technique for moving object detection from a static camera image. The background difference is computed between the current image and the background (reference) image.

In order to obtain a reliable background, the background image needs to be updated in real time according to the environment of the monitored area. In the example of the present invention, a simplified Gaussian model is used for background updating, based on the probability of the pixel sequence over a certain period of time. However, besides the background difference technique described below, a variety of known background difference techniques may be used.

In the present invention, for the background difference, image pixel values of 0 to 255 are first assigned substitute pixel values of 1 to 26 for background modeling (S221). That is, as shown in FIG. 5, 26 substitute pixel values, each spanning 10 pixel values, are allocated over the image pixel values 0 to 255. In other words, the gray-level image is quantized by dividing the pixel values 0 to 255 into 26 levels. The example of the present invention assumes that system noise, such as white Gaussian noise, may cause a pixel value to vary by 0 to 10; the span of one substitute pixel value is therefore set to 10.
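A minimal sketch of this quantization step (S221), assuming NumPy, might look as follows; mapping 0-9 to level 1, 10-19 to level 2, and so on up to level 26 keeps a fluctuation of up to 10 within a single substitute level.

    # Sketch of step S221: quantize grayscale values 0..255 into 26 substitute
    # levels of width 10 (values 250..255 fall into the last, narrower level).
    import numpy as np

    def quantize(gray):
        return np.minimum(gray.astype(np.int32) // 10, 25) + 1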

Thereafter, there is a step (S222) of replacing the image pixel value of each pixel of the input image frames, sequentially received during the predetermined period, with its substitute pixel value. That is, the substitute pixel values are assigned to the pixels of the current frame, and this is repeated up to the last (N-th) frame.

Thereafter, the weighted substitute pixel value, which is the substitute pixel value most frequently substituted among the substitute values of each pixel over the input image frames of the predetermined period, is calculated (S223). That is, as shown in FIG. 6, the weighted substitute pixel value, which serves as the most heavily weighted average value for the background pixels of the background image B_img, is computed. This process is performed for each pixel of the input image sequence over a certain time period; the background is finally obtained by taking the average of the values belonging to the most repeated substitute level during that period. The average of the most repeated values is thus assigned as the background pixel value of the corresponding coordinates.

Thereafter, a background image frame having the same pixel size as the input image frames is generated and the weighted substitute pixel value is assigned to each of its pixels (S224), and the background difference image D_im can be generated by taking the difference between the background image frame and the input image frame (S225). That is, the current input image C_im is subtracted from the background image B_im according to Equation (2) below, yielding a background difference image D_im holding the magnitude of the difference.

[Equation 2]

D_im(x, y) = | C_im(x, y) - B_im(x, y) |

FIG. 7 shows the background and the resulting background difference image.
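Steps S222-S225 can be sketched as follows, assuming NumPy. Taking the background value of each pixel as the mean grayscale of the frames belonging to that pixel's most frequent substitute level is one reading of the weighted-substitute description above, not the patent's verbatim formula.

    # Sketch of steps S222-S225: per-pixel background modeling over N quantized
    # frames, followed by background differencing (Equation (2)).
    import numpy as np

    def build_background(gray_frames):
        stack = np.stack(gray_frames).astype(np.float32)           # (N, H, W)
        levels = np.minimum(stack.astype(np.int32) // 10, 25) + 1  # step S222
        # Per-pixel histogram of the 26 substitute levels across the N frames.
        counts = np.stack([(levels == lev).sum(axis=0) for lev in range(1, 27)])
        mode = counts.argmax(axis=0) + 1        # most frequent level per pixel (S223)
        in_mode = levels == mode[None, :, :]    # frames that fall in the mode level
        # Background pixel = mean grayscale of the frames in the mode level (S224).
        return (stack * in_mode).sum(axis=0) / in_mode.sum(axis=0)

    def background_difference(current_gray, background):
        # Equation (2): magnitude of the frame-to-background difference (S225).
        return np.abs(current_gray - background)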

Meanwhile, the background difference image (D_im) contains not only motion areas but also noise such as system noise, shadows, and texture. Therefore, after the background difference process, a noise removal process (S230) for removing noise from each background difference image is performed.

In the noise removal process of the present invention, T_p[k] is an adaptive threshold, N_zp is the number of pixels having a pixel value of '0' in the background difference image, k is the frame sequence number of the background difference image, N is the total number of frames of the background difference image, and μ[k] is the average value of the background difference image for each input image frame. The process consists of calculating the adaptive threshold T_p[k] by the equation below, and of removing, as noise, pixels whose [x, y] pixel value in the background difference image is lower than the adaptive threshold T_p[k].

[Equation image pat00007: adaptive threshold T_p[k]]

The noise removal process will be described in detail.

Most existing systems use a fixed threshold for noise removal. The traditional fixed threshold T_com is computed by combining the standard deviation σ[k] and the average value μ[k] of the background difference image of each frame, as shown in Equation (3).

[Equation 3]

T_com[k] = μ[k] + λ σ[k]

where μ[k] and σ[k] are the mean and standard deviation of the background difference image of frame k.

Here λ is a user-defined constant and N is the total number of frames k. The fixed threshold is calculated for each frame but depends entirely on the predefined λ value, so this method has the disadvantage of requiring a user-defined constant. Since the standard deviation σ[k] and the mean μ[k] of the background difference image become lower as the number of '0' pixels increases, the threshold is lowered and noise pixels survive as foreground. This method can therefore only remove noise, such as white Gaussian noise, when the noise always has a low value. Of course, λ can be increased to remove more noise, but the problem is that λ is fixed in advance and cannot be changed in real-time situations. Moreover, when noise is removed using the fixed threshold, many pixels belonging to objects in the image may be deleted. This conventional method also carries a high computational burden for calculating the standard deviation.
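For comparison, the conventional fixed threshold of Equation (3) can be sketched as follows, assuming NumPy; the default λ = 2.0 is a placeholder, since the patent emphasizes precisely that λ must be chosen by the user.

    # Sketch of the conventional fixed threshold T_com = mu[k] + lambda * sigma[k].
    import numpy as np

    def fixed_threshold(diff_image, lam=2.0):
        mu = diff_image.mean()     # mu[k]: mean of the background difference image
        sigma = diff_image.std()   # sigma[k]: its standard deviation (costly to compute)
        return mu + lam * sigma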

Therefore, the present invention proposes a method that is completely adaptive in real-time situations of the monitored area and does not require the user to set any parameters. All parameter constants are calculated automatically.

In the present invention, an adaptive threshold T_p is applied. With T_p[k] the adaptive threshold, N_zp the number of pixels having a zero pixel value in the background difference image, k the frame sequence number of the background difference image, N the total number of frames of the background difference image, and μ[k] the average value of the background difference image of each input frame, the adaptive threshold T_p[k] is calculated by Equation (4) below.

[Equation 4]

[Equation image pat00011: adaptive threshold T_p[k]]

μ[k] is calculated as described in connection with Equation (3) above, and the number of zero-valued pixels is found by scanning the background difference image. The behavior of the adaptive threshold with respect to these parameters can be described as follows (a code sketch follows the list):

- If all the pixels in the difference image are zero, the camera condition is very good and there is no moving object, so no thresholding is needed to remove noise.

- If the number of zero-valued pixels decreases, there are many moving objects or much noise in the camera. The adaptive threshold should therefore increase.

- If the number of zero-valued pixels increases, there is little object motion or system noise is relatively low. The adaptive threshold should therefore decrease.
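The patent's exact Equation (4) is available only as an equation image; the sketch below therefore uses an assumed form, scaling μ[k] by the ratio of total pixels to zero-valued pixels, chosen only to reproduce the monotone behavior just listed, and it may differ from the actual formula.

    # Hedged sketch of the adaptive threshold behavior (Equation (4) itself is
    # only available as an image; the formula below is an assumption).
    import numpy as np

    def adaptive_threshold(diff_image):
        n_total = diff_image.size
        n_zp = int((diff_image == 0).sum())   # N_zp: zero-valued pixels
        if n_zp == n_total:
            return 0.0                        # still, clean frame: no thresholding needed
        mu = diff_image.mean()                # mu[k]
        return mu * (n_total / max(n_zp, 1))  # fewer zeros -> higher threshold (assumed)

    def remove_noise(diff_image):
        t = adaptive_threshold(diff_image)
        out = diff_image.copy()
        out[out < t] = 0                      # pixels below T_p[k] are treated as noise
        return out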

FIG. 8 is a graph comparing the conventional fixed-threshold method with the adaptive threshold of the present invention. The two thresholds follow the same pattern over the image sequence, but because the proposed threshold varies directly with the number of zero-valued pixels in the difference image, it always stays somewhat higher than the conventional one. The adaptive threshold of the present invention is therefore slightly above the noise level, and can remove noise, shadows, and scattering while remaining below object values. FIG. 9 shows the number of noise pixels remaining after thresholding; the object pixels of the background difference image were removed manually (by the "human eye"), and the graph shows the number of residual noise pixels after thresholding.

As shown in FIG. 10, the number of noise pixels remaining after applying the adaptive threshold of the present invention is very small compared with the conventional fixed threshold. This reduces the algorithm's processing time when object labeling is applied. As shown in FIG. 10(b), unnecessary pixels, shadows, scattering, and the like can be removed. The proposed adaptive threshold is therefore suitable for real-time adaptation within the monitored area, and the user does not need to define any parameters.

Meanwhile, as described above, when the noise removal for each background difference image is completed, a motion area detection process (S240) for detecting motion areas in the noise-removed background difference image is performed. The motion region detection process detects a motion region when a pixel group region (a region of the background difference image in which pixels with values higher than the adaptive threshold T_p[k] are contiguously clustered) is larger than a predetermined basic region.

In other words, the objects obtained from the preceding noise processing usually consist of scattered small regions, so an additional step that separates disordered pixels is needed to obtain the final moving object. To remove labels covering only a small number of pixels, attributable to scattering and noise, a pixel group region in which pixels above the adaptive threshold T_p[k] are contiguously clustered in the background difference image is detected as a motion area only when it is larger than the preset basic area. A rectangle containing the pixels of the same label is thus obtained; the result for a single input image is as shown in FIG. 11.
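This grouping step can be sketched with connected-component labeling, assuming SciPy's ndimage module (the patent names no library); min_area stands in for the preset basic region and is a placeholder value.

    # Sketch of step S240: group surviving pixels into connected components and
    # keep only components larger than the preset basic area.
    import numpy as np
    from scipy import ndimage

    def detect_motion_regions(denoised_diff, min_area=200):
        labeled, num = ndimage.label(denoised_diff > 0)
        regions = []
        for lab, sl in enumerate(ndimage.find_objects(labeled), start=1):
            if (labeled[sl] == lab).sum() >= min_area:  # discard scattered small groups
                regions.append(sl)  # bounding rectangle (row/column slices)
        return regions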

Meanwhile, after the motion area detection (S240), there is a human body detection process (S250) for detecting a human body by using the aspect ratio and the ellipticity ratio of each motion area together. The human body detection process (S250) includes: a first detection process (S251) that measures the aspect ratio of each motion region and detects, as a primary human candidate, any motion region whose aspect ratio is higher than a preset reference aspect ratio; a second detection process (S252) that detects a secondary human body candidate using the ellipticity ratio over the entire region of the primary candidate; and a final detection process (S253) that finally detects the human body using the ellipticity ratio of the region protruding upward from the secondary candidate.

The human body detection process (S250) will now be described in more detail. All motion regions were extracted in the previous step, and the human body must be extracted from among these motion candidate regions. When a general area is monitored, a moving region may be a car, a person, an animal, a swaying tree, and so on, so the person's body must be found among the candidate regions. In the first step, the aspect ratio of each moving region is measured and the primary human body candidate is detected using it (S251). In this primary detection, the aspect ratio of a standing person's body is always greater than 1, whereas the aspect ratio of a car is less than 1; this step removes wide lateral objects such as cars or shaking trees.

The method of calculating the aspect ratio is as shown in the following equation (5).

[Equation 5]

aspect ratio = b / a

where a and b are the horizontal and vertical lengths of the moving region, and a region whose aspect ratio exceeds the reference aspect ratio is kept as a primary candidate.

Therefore, as shown in FIG. 12, the extracted regions are persons or any other objects having an aspect ratio similar to a human body, and the next step is to extract the actual human body. To this end, the present invention proposes a new human candidate region extraction method that improves on the conventional elliptic method.

In the conventional method, 80% of the candidate region had to fall within an ellipse having the radii of the candidate region. This method did not take the detailed shape of the candidate area into account, so false detections increased whenever any object exceeded 80% of the ellipse. The present invention therefore considers not only the elliptical shape when detecting the human body but also the shape of the person's upper body, thereby reducing false detections.

That is, after the primary detection process (S251) using the aspect ratio, the present invention includes a secondary detection process (S252) that detects a secondary human body candidate using the ellipticity ratio over the entire region of the primary candidate, and a final detection process (S253) that finally detects the human body using the ellipticity ratio of the region protruding upward from the secondary candidate.

First, the secondary detection process (S252) will be described. The second detection process includes: determining the horizontal length a and the vertical length b of the primary human body candidate region; extracting a primary ellipse region satisfying the condition

[Equation image pat00014: primary ellipse condition]

and detecting the secondary human body candidate when the ratio r1, computed by

[Equation image pat00015: ratio r1]

of the number of candidate pixels inside the ellipse region to the total number of pixels in the entire primary candidate region, exceeds the reference value.

As shown in FIG. 13, the second detection process (S252) first checks how fully the candidate region fills the ellipse having major-axis radius b and minor-axis radius a.

It is determined whether the ellipse in FIG. 13 satisfies the following equation (6).

[Equation 6]

E1: (x / a)² + (y / b)² ≤ 1 (with x and y measured from the center of the candidate region)

r1 = (number of candidate pixels inside E1) / (total number of pixels in the candidate region)

If r1, the ratio of the candidate pixels counted inside the ellipse to all pixels in the candidate region, exceeds 80%, the candidate region is regarded as the secondary human body candidate, and the process proceeds to the final detection process (S253), which ultimately detects the human body using the ellipticity ratio of the region protruding upward.
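The r1 test can be sketched as follows on the binary mask of a candidate's bounding box, assuming NumPy. Reading a and b as the half-extents of the bounding box (the "minor axis radius a" and "major axis radius b" above) is an interpretation of the text.

    # Sketch of the secondary detection (S252): fraction r1 of candidate pixels
    # inside the ellipse fitted to the bounding box must exceed 80% (Equation (6)).
    import numpy as np

    def ellipse_fill_ratio(mask, a, b):
        h, w = mask.shape
        ys, xs = np.mgrid[0:h, 0:w]
        cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
        inside = ((xs - cx) / a) ** 2 + ((ys - cy) / b) ** 2 <= 1.0
        return (mask & inside).sum() / max(int(mask.sum()), 1)

    def is_secondary_candidate(mask):
        h, w = mask.shape
        return ellipse_fill_ratio(mask, w / 2.0, h / 2.0) > 0.8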

In the final detection step (S253), a length of 60% of the horizontal length a is defined as the corrected horizontal length a1', and a length equal to the vertical length b is defined as the corrected vertical length b1'.

A secondary ellipse region satisfying the condition

[Equation image pat00018: secondary ellipse condition]

is extracted, and the secondary human body candidate is finally determined to be a human body when the ratio r2, computed by

[Equation image pat00019: ratio r2]

of the number of pixels in the region protruding above the secondary candidate to the number of pixels in the upper quarter of the primary candidate region, satisfies the reference condition.

The final detection process (S253) is described in more detail. When a person stands upright, the head is always in the upper part of the body, and the upper body takes a tapered form that fits within part of an ellipse. Therefore, when 80% of the upper body of the person candidate region lies within the basic ellipse, it is finally regarded as a human body. As shown in FIG. 14, the radii of the new ellipse are defined as a1 = a × 0.6 and b1 = b. Each value is defined according to the shape of the human upper body, and the second ellipse must also satisfy the condition of Equation (7) below.

[Equation 7]

E2: (x / a1)² + (y / b1)² ≤ 1, where a1 = 0.6 a and b1 = b

r2 = (number of candidate pixels of the shaded upper-body portion inside E2) / (number of pixels in the upper quarter of the candidate region)

For a candidate region to be determined to be a human body, the ratio r2 of the pixels in the shaded area of the ellipse E2 must exceed 80%; the candidate region is then identified as the final human body.
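The final check can be sketched as follows, again assuming NumPy. The narrowed ellipse E2 uses a1 = 0.6a and b1 = b as stated above; reading the r2 criterion as "at least 80% of the candidate pixels in the upper quarter of the box fall inside E2" is an assumption, since Equation (7) is reproduced only as an image.

    # Hedged sketch of the final detection (S253): head-and-shoulders test
    # against the narrowed ellipse E2 (a1 = 0.6*a, b1 = b).
    import numpy as np

    def final_human_check(mask):
        h, w = mask.shape
        a1, b1 = 0.6 * (w / 2.0), h / 2.0       # corrected radii a1' and b1'
        ys, xs = np.mgrid[0:h, 0:w]
        cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
        inside_e2 = ((xs - cx) / a1) ** 2 + ((ys - cy) / b1) ** 2 <= 1.0
        upper = ys < h / 4.0                    # upper 1/4 of the candidate box
        upper_pixels = int((mask & upper).sum())
        r2 = (mask & upper & inside_e2).sum() / max(upper_pixels, 1)
        return r2 > 0.8                         # assumed reading of Equation (7)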

As a result, the present invention can effectively detect moving objects and moving persons based on background differencing and the elliptical method with various radii. Table 1 below shows the results of human detection experiments on images according to the method of the present invention.


No | Environment | Person frames | Detected | Not detected | False detections | Detection rate (%) | Non-detection rate (%) | False detection rate (%)
1 | Indoor | 1250 | 1221 | 29 | 15 | 97.68 | 2.32 | 1.20
2 | Indoor | 1880 | 1765 | 35 | 19 | 98.06 | 1.94 | 1.06
3 | Outdoor | 3050 | 2991 | 59 | 34 | 98.07 | 1.93 | 1.11
4 | Outdoor | 2521 | 2473 | 48 | 28 | 98.10 | 1.90 | 1.11
5 | Outdoor | 3200 | 3106 | 94 | 39 | 97.06 | 2.94 | 1.22
6 | Outdoor | 1540 | 1511 | 29 | 14 | 98.12 | 1.88 | 0.91
7 | Outdoor | 950 | 922 | 28 | 8 | 97.05 | 2.95 | 0.84
8 | Outdoor | 2100 | 2051 | 48 | 16 | 97.67 | 2.33 | 0.76
9 | Outdoor | 1151 | 1123 | 28 | 7 | 97.57 | 2.43 | 0.61
Total | | 17652 | 17613 | 399 | 180 | 97.73 | 2.27 | 1.02

The proposed adaptive threshold method removes the noise from the background difference image of the moving objects extracted by background differencing. The adaptive threshold is fully adaptive and shows relatively good performance compared with the conventional method, while requiring relatively little computation. The moving objects extracted after the noise thresholding are then distinguished as persons or other objects based on the aspect ratio and the proposed multi-radius elliptic algorithm. The experimental results show that 97.73% of moving persons were detected, with a false detection rate of 1.02% and a non-detection rate of 2.27%. The experiments thus show that the proposed method performs better than the conventional method.

The embodiments of the present invention described above were selected and presented from among various possible examples to aid the understanding of those skilled in the art. The technical idea of the present invention is not necessarily limited to these embodiments, and various changes, modifications, and equivalent embodiments are possible without departing from the spirit of the present invention.

100: preprocessing unit
200: background difference processing unit
300: noise removing unit
400: human body detection unit

Claims (7)

1. A method for detecting a person in an image, the method comprising:
a preprocessing process of converting the RGB color image of an input image frame into a grayscale image;
a background difference process of generating a background image frame through a pixel-by-pixel learning process over the preprocessed input image frames, and generating a background difference image for each sequentially input image frame by subtracting the background image frame from it;
a noise removal process of removing noise from each background difference image;
a motion area detection process of detecting a motion area in the noise-removed background difference image; and
a human body detection process of detecting a human body by using the aspect ratio of the motion area and the ellipticity ratio of the motion area together,
wherein, in the preprocessing process, f(x, y, z) represents the grayscale value of pixel [x, y] of the input image frame, f(x, y, z)_R, f(x, y, z)_G, and f(x, y, z)_B represent the R, G, and B components of pixel (x, y) of the input image frame, respectively, and
the RGB color image is converted into a grayscale image by f(x, y, z) = 0.299 f(x, y, z)_R + 0.587 f(x, y, z)_G + 0.114 f(x, y, z)_B;
wherein the noise removal process includes:
calculating, with T_p[k] the adaptive threshold, N_zp the number of pixels having a pixel value of '0' in the background difference image, k the frame sequence number of the background difference image, N the total number of frames of the background difference image, and μ[k] the average value of the inter-frame background difference image, the adaptive threshold T_p[k] by the equation below; and

[Equation image pat00041: adaptive threshold T_p[k]]

removing, as noise, pixels whose [x, y] pixel value in the background difference image is lower than the adaptive threshold T_p[k];
wherein, in the motion area detection process, a motion area is detected when a pixel group region, in which pixels having values higher than the adaptive threshold T_p[k] are contiguously clustered in the background difference image, is larger than a preset basic region; and
wherein the human body detection process comprises:
a first detection process of measuring the aspect ratio of each motion region and detecting, as a primary human body candidate, a motion region whose aspect ratio is higher than a preset reference aspect ratio;
a second detection process of detecting a secondary human body candidate using the ellipticity ratio over the entire region of the primary candidate; and
a final detection process of finally detecting the human body using the ellipticity ratio of the region protruding upward from the secondary candidate.
2. (Deleted)
3. (Deleted)
4. (Deleted)
5. (Deleted)
6. The method of claim 1, wherein the second detection process comprises:
determining the horizontal length a and the vertical length b of the primary human body candidate region;
extracting a primary ellipse region satisfying the condition

[Equation image pat00023: primary ellipse condition]

and detecting the secondary human body candidate when the ratio r1, computed by

[Equation image pat00024: ratio r1]

of the number of candidate pixels inside the ellipse region to the total number of pixels in the entire primary candidate region, exceeds the reference value.
7. The method of claim 6, wherein the final detection process comprises:
defining a length of 60% of the horizontal length a as a corrected horizontal length a1', and defining a length equal to the vertical length b as a corrected vertical length b1';
extracting a secondary ellipse region satisfying the condition

[Equation image pat00025: secondary ellipse condition]

and finally determining the secondary human body candidate to be a human body when the ratio r2, computed by

[Equation image pat00026: ratio r2]

of the number of pixels in the region protruding above the secondary candidate to the number of pixels in the upper quarter of the primary candidate region, satisfies the reference condition.
KR1020150155389A 2015-11-05 2015-11-05 Method for detecting human in image frame KR101600617B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150155389A KR101600617B1 (en) 2015-11-05 2015-11-05 Method for detecting human in image frame

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
KR1020150155389A KR101600617B1 (en) 2015-11-05 2015-11-05 Method for detecting human in image frame

Publications (1)

Publication Number Publication Date
KR101600617B1 true KR101600617B1 (en) 2016-03-07

Family

ID=55540420

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150155389A KR101600617B1 (en) 2015-11-05 2015-11-05 Method for detecting human in image frame

Country Status (1)

Country Link
KR (1) KR101600617B1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101769741B1 (en) * 2016-09-26 2017-08-21 크루셜텍 (주) Method and apparatus for recognizing iris through pupil detection
KR20200053924A (en) 2018-11-09 2020-05-19 고려대학교 산학협력단 Method for detecting player using curling sheet localization
US10679377B2 (en) 2017-05-04 2020-06-09 Hanwha Techwin Co., Ltd. Object detection system and method, and computer readable recording medium


Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20060055959A (en) 2004-11-19 2006-05-24 주식회사 대우일렉트로닉스 A system time table data structure
KR20100056143A (en) * 2008-11-19 2010-05-27 (주)투미르 Method for detecting moving objects using segmentation process
KR20150074858A (en) * 2013-12-24 2015-07-02 한국전자통신연구원 System and method for user recognition



Legal Events

Date Code Title Description
E902 Notification of reason for refusal
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20191218

Year of fee payment: 5