CN101551853A

CN101551853A - Human ear detection method under complex static color background

Info

Publication number: CN101551853A
Application number: CNA2008102330509A
Authority: CN
Inventors: 刘嘉敏; 朱晟君; 潘英俊; 黄虹溥; 李丽娜
Original assignee: Chongqing University
Current assignee: Chongqing University
Priority date: 2008-11-14
Filing date: 2008-11-14
Publication date: 2009-10-07

Abstract

The invention relates to a human ear detection method with staged optimization for complex static color backgrounds. The method has the following steps: (1) selecting the YCbCr space as the skin color segmentation space, using a Gaussian model as the skin color distribution model, and applying skin color likelihood conversion and dynamic threshold segmentation to the converted image; (2) using morphological methods to optimize each segmented region and screening the skin color regions, so as to exclude skin color regions that do not cover a human side face and to reduce interference; (3) using the wavelet modulus maximum method to detect image edges at different scales and superposing the edge binary images from the different scales, thereby accurately detecting the inner and outer edges of the ear while suppressing noise; and (4) detecting the ear by applying dilation, filling, thinning and reconstruction to the edge binary image. Test results show that the method performs well and may provide a useful reference for developing automatic ear recognition systems.

Description

Human ear detection method under the complex static color background

Technical field

The present invention relates to a human ear detection method, and in particular to a method that detects human ears under complex static color backgrounds using information such as skin color, side-face geometric features, side-face grayscale information, and the internal edges of the ear.

Background technology

In recent years, biometric identification has attracted growing attention from researchers. It plays an important role in applications ranging from identity authentication to security checks at entrances. At present, however, most biometric authentication techniques place strict requirements on their working environment, which limits their scope of application. Researchers are therefore looking for new biometric authentication techniques.

Ear recognition is a relatively new recognition technology, and little related research exists so far, either in China or abroad. Its unique physiological characteristics and viewing angle give it high theoretical research value and practical application prospects. It draws on many fields, including biometric feature extraction, computer vision, image processing, pattern recognition, and identity authentication.

The human ear shares properties with other biometric features and also has some unique ones: its structure is stable, it is not affected by facial expression, its position is fixed, sample collection raises no hygiene concerns, and it does not make subjects nervous, so it is more readily accepted. In addition, although the ear is smaller than the face or palm print, it is larger than the iris, retina, or fingerprint, and is therefore relatively easy to capture. Ear detection and recognition is gradually becoming another focus of the biometric detection and recognition field.

Because research on ear recognition systems is still exploratory, ear detection under complex backgrounds, which is both a prerequisite for and an extension of such systems, has rarely been studied in China or abroad. So far only Zhang Wei et al. of the University of Science and Technology Beijing have introduced the Adaboost algorithm into an ear detection system, proposing an ear detection and tracking method for complex backgrounds.

The Adaboost algorithm originates from the Boosting algorithm proposed by Schapire in 1989, a general method that can boost the accuracy of any given learning algorithm. In 1995 Freund and Schapire improved it, forming the original Adaboost (Adaptive Boosting) algorithm.

Like Adaboost-based face detection methods, the method proposed by Zhang Wei et al. is divided into two stages: offline cascade training and online detection. The method achieves ear detection and tracking under complex backgrounds and confirms the feasibility of ear detection research in such settings. However, its offline training time is long: training the classifier alone took 16 days. Moreover, the AdaBoost method depends heavily on the training samples in the image library; if the sources of the ear images vary widely across backgrounds, the detection results are unsatisfactory.

Summary of the invention

The purpose of the present invention is to provide a human ear detection method for complex static color backgrounds. It uses skin color, region, grayscale, and shape information in a still image of a person to detect the ear, and has advantages such as fast detection, simplicity, and effectiveness.

The detection method of the present invention has the following steps:

(1). Acquire ear images

(2). Skin color detection:

Skin color information in the color image is used for region segmentation, narrowing the ear search range under a complex background to the skin-colored regions. The invention analyzes the properties of existing color spaces and, considering the separation of luminance and chrominance and the complexity of the color space conversion, selects the YCbCr color space.

After YCbCr is selected as the skin color representation space, a Gaussian distribution model is established and the skin color regions are segmented using adaptive threshold segmentation, yielding an initial binary map of candidate skin color regions;

(3). Skin color region screening:

Drawing on the way face detection algorithms use prior knowledge of the frontal face to screen regions, the features of the side face under complex backgrounds are identified and used to screen the segmented skin color regions, narrowing the detection range to the side-face region. The skin color regions are then screened according to the height-to-width ratio of the side face and the fill ratio of side-face skin color in the image, yielding the edge binary image;

(4). Image edge detection:

Using the luminance information of the image, the wavelet modulus maximum method is used to find image edges at four different scales. After the skin color regions of interest have been found, the edge information of the image is studied further in the luminance space. Having analyzed where current edge detection operators are applicable and the edge characteristics of the ear region under complex backgrounds, the wavelet modulus maximum method is finally adopted for image edge detection.

Analysis of 240 side-face edge images shows that, within the side face, only the ear region has dense edges. When the edge binary image undergoes dilation, filling, thinning, erosion, and reconstruction, isolated contour lines are eliminated, whereas the dense edges at the ear merge into a single edge region during dilation and filling and are therefore preserved after reconstruction. The ear is then identified and located with reference to the original image, completing ear detection.

That is, for the edge binary image obtained in step (3), the grayscale of the candidate region is used to extract image edges with the wavelet modulus maximum method; dilation, filling, thinning, and reconstruction are applied; and an edge region search is then performed on the edge image to obtain the ear to be detected.

The adaptive threshold segmentation described in step (2) repeatedly examines the connected regions to generate a threshold suited to the current image.

The skin color distribution model described in step (2) is established as follows:

1. Select side-face skin color regions;

2. Apply low-pass filtering to the sample skin color regions to remove noise.

The impulse response array of the low-pass filter is:

h = \frac{1}{9} [\begin{matrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{matrix}]

3. Convert the R, G, B tristimulus values of each pixel in the filtered skin color region to the YCbCr color space to obtain the chrominance value (Cb, Cr) of each skin pixel, and plot their distribution in the YCbCr space and the corresponding Gaussian distribution model;

4. Determine the unknown parameters of the two-dimensional Gaussian model G (M, C), namely the mean M and the covariance C.
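The parameter estimation in step 4 can be sketched as follows. This is a minimal illustration, not the inventors' code: the function name `fit_skin_gaussian` and the sample values are hypothetical, and the use of the unbiased (n - 1) estimator for the covariance is an assumption, since the text does not say which estimator was used.

```python
def fit_skin_gaussian(samples):
    """Estimate the mean M and 2x2 covariance C of (Cb, Cr) skin samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_cb = sum(cb for cb, _ in samples) / n
    mean_cr = sum(cr for _, cr in samples) / n
    # Unbiased sample covariance (division by n - 1) is an assumption.
    s_bb = sum((cb - mean_cb) ** 2 for cb, _ in samples) / (n - 1)
    s_rr = sum((cr - mean_cr) ** 2 for _, cr in samples) / (n - 1)
    s_br = sum((cb - mean_cb) * (cr - mean_cr) for cb, cr in samples) / (n - 1)
    return (mean_cb, mean_cr), ((s_bb, s_br), (s_br, s_rr))


# Hypothetical chrominance samples, for illustration only.
samples = [(100.0, 140.0), (110.0, 150.0), (105.0, 152.0), (108.0, 146.0)]
M, C = fit_skin_gaussian(samples)
```

In practice the samples would be the (Cb, Cr) values of every pixel in the filtered skin regions, so the list would contain many thousands of entries.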

In step (4), the wavelet modulus maximum method finds the image edges as follows:

The two-dimensional dyadic wavelets used for edge detection are designed as separable products of one-dimensional dyadic wavelets, and their Fourier transforms are expressed as:

\hat{ψ^{x}} (w_{x}, w_{y}) = G (w_{x} / 2) \hat{φ} (w_{x} / 2) \hat{φ} (w_{y} / 2)

\hat{ψ^{y}} (w_{x}, w_{y}) = G (w_{y} / 2) \hat{φ} (w_{x} / 2) \hat{φ} (w_{y} / 2)

where ψ^{x} (x, y) and ψ^{y} (x, y) are the partial derivatives of a two-dimensional smoothing function θ (x, y), and \hat{ψ^{x}} (w_{x}, w_{y}) and \hat{ψ^{y}} (w_{x}, w_{y}) are their respective Fourier transforms. \hat{φ} (w) is a low-pass filter, and

G (w) = - i \sqrt{2} e^{- iw / 2} \sin (w / 2)

is a high-pass digital filter;

Suppose the scaling function satisfies the following two-scale equation:

\hat{φ} (w) = Π_{p = 1}^{+ \infty} \frac{H (2^{- p} w)}{\sqrt{2}} = \frac{1}{\sqrt{2}} H (\frac{w}{2}) \hat{φ} (\frac{w}{2})

If the scaling function is chosen to be a spline of degree m, that is,

\hat{φ} (w) = e^{- \frac{iϵw}{2}} {(\frac{\sin (w / 2)}{w / 2})}^{m + 1},

then

H (w) = \sqrt{2} e^{- iϵw / 2} {[\cos (w / 2)]}^{m + 1}

is the Fourier transform of the low-pass filter;
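As a worked check (not part of the original text), the filter taps can be read off from these expressions. Assuming m = 2 and ϵ = 1, values that are not stated here and are chosen only because they reproduce the four-tap low-pass filter listed in the embodiment, the normalized low-pass response expands as:

\frac{H (w)}{\sqrt{2}} = e^{- iw / 2} {[\cos (w / 2)]}^{3} = \frac{1}{8} e^{- iw / 2} {(e^{iw / 2} + e^{- iw / 2})}^{3} = \frac{1}{8} (e^{iw} + 3 + 3 e^{- iw} + e^{- 2 iw})

and the normalized high-pass response as:

\frac{G (w)}{\sqrt{2}} = - i e^{- iw / 2} \sin (w / 2) = - \frac{1}{2} (1 - e^{- iw})

The taps are therefore 0.125, 0.375, 0.375, 0.125 for the low-pass filter and two taps of magnitude 0.5 with opposite signs for the high-pass filter, which agrees with the filter coefficients given in the embodiment up to the ordering and sign convention.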

If the sampling interval equals 1, the discrete wavelet coefficients are:

d_{j}^{x} (n, m) = w^{x} (2^{j}, n, m), d_{j}^{y} (n, m) = w^{y} (2^{j}, n, m)

Similarly, the original image signal is defined as:

a_{0} (n, m) = \langle f (x, y), φ (x - n) φ (y - m) \rangle

and the smoothed image signal for j ≥ 0 as:

a_{j} (n, m) = \langle f (x, y), φ_{j} (x - n) φ (y - m) \rangle

The à trous algorithm for the two-dimensional discrete dyadic wavelet transform can then be written in the following discrete convolution form:

\left\{\begin{matrix} d_{j + 1}^{x} (n, m) = a_{j} * \bar{g}_{j} δ (n, m) \\ d_{j + 1}^{y} (n, m) = a_{j} * δ \bar{g}_{j} (n, m) \end{matrix}\right.

where a_{j + 1} is the result of low-pass filtering a_{j} along both the horizontal and vertical directions, d_{j + 1}^{x} is the result of high-pass filtering a_{j} horizontally, and d_{j + 1}^{y} is the result of high-pass filtering a_{j} vertically.

For color images with complex backgrounds, the present invention proposes a human ear detection method based on skin color information, side-face statistics, and the internal edge features of the ear, achieving ear detection under complex backgrounds without frontal-face interference.

Through research on ear detection under complex backgrounds, the present invention has found a feasible ear detection method that uses multiple kinds of information about the human body in a still image, such as skin color, region, grayscale, and shape, and has improved each of its stages, completing the ear detection process under complex backgrounds. Experiments show that the training time of this method can be shortened to within 9 days, far less than the 16 days required by the existing ear detection method. In addition, the method does not rely on training samples, which largely removes the constraints that samples place on ear detection.

Description of drawings

Fig. 1 shows side-face skin color samples;

Fig. 2 shows the distribution of skin color in the YCbCr space: the two horizontal axes represent the red chrominance component and the blue chrominance component respectively, and the vertical axis represents the chrominance values of different people's skin color;

Fig. 3 shows the skin color Gaussian model: the two horizontal axes represent the red chrominance component and the blue chrominance component respectively, and the vertical axis represents the value of the skin color Gaussian distribution;

Fig. 4 shows skin color region screening, where (a) is the original image, (b) is target region 1, (c) is target region 2, (d) is target region 3, and (e) is the image after screening;

Fig. 5 shows the minimum bounding rectangle of a skin color region;

Fig. 6 shows the wavelet modulus maximum images at different scales;

Fig. 7 shows the edge binary images at different scales;

Fig. 8 shows the edge binary image after superposition;

Fig. 9 shows the edge binary image with interference removed;

Figure 10 shows the reconstruction result, where (a) is the original image and (b) is the image after reconstruction;

Figure 11 shows dilation and filling, where (a) is the dilation result and (b) is the filling result;

Figure 12 (a) is the thinning result and (b) is the erosion result;

Figure 13 shows the reconstruction result;

Figure 14 shows the ear detection result;

Figure 15 shows the result image of multi-ear detection;

Figure 16 is the system module diagram.

Embodiment

Step 1: Acquisition of ear images

As the statistical basis, 200 images containing ears at different deflection angles were selected from the UMIST (University of Manchester Institute of Science and Technology) face database, together with 90 complex-background images taken in the laboratory with a digital camera. These were used to derive the region screening criteria for filtering out valid skin-like regions.

In addition, to verify the overall performance of the experiment, the images used for final detection come mainly from CEID (Chinese Ear Image Database), a unified, standardized, open ear database captured with a digital camera, which contains ear images of 200 Chinese subjects in total. Two groups totaling 240 images were drawn at random from this database. The first group consists of 200 ear images with a simple background, each containing a single frontal-view ear with no interference from other skin-colored regions and no significant rotation or deflection. The second group consists of 40 color images with complex and varied backgrounds.

Step 2: Skin color detection

The advantages and disadvantages of skin color clustering in different color spaces are analyzed, a suitable skin color representation space is selected, a skin color model is established, and a skin color region localization algorithm based on adaptive threshold segmentation is used to segment the skin color regions of a static color image.

Selection of the skin color representation space: analysis of the color spaces leads to the following conclusions:

1. The RGB color space is not suitable for building a skin color model. In RGB space the R, G, and B components are strongly correlated and luminance cannot be separated out, so it is not suitable for skin color detection.

2. The normalized color space largely eliminates the influence of luminance changes and can be used for skin color detection. Its notable drawback is that at low luminance the nonlinear transformation makes normalized RGB quite noisy.

3. Both the YUV family and the HSI family of color spaces can be used for skin color detection, but the YUV family has a simpler conversion from RGB and has no singularities. Within the YUV family, the YCbCr color space is widely used as the representation in digital video input devices, so when such hardware is connected to a computer, processing images in YCbCr reduces color conversion time and speeds up processing.

Based on this analysis, we select the YCbCr color space as the skin color representation space.

Threshold segmentation proceeds as follows. After the skin color likelihood map of the image is obtained, the bright regions detected are not necessarily skin; they are simply regions whose color is the same as or close to skin color. The skin still has to be distinguished from the background, and threshold segmentation is a common method for separating target regions from background regions.

To separate the target from a complex background and extract its shape completely, the choice of threshold is critical. If the threshold is too high, too many target points are misclassified as background; if it is too low, the opposite occurs. Common approaches include the fixed threshold method and the adaptive threshold method. A fixed threshold applies the same threshold to the whole image and suits images with obvious contrast between background and foreground. Under complex backgrounds, image quality varies greatly because the images are easily affected by differing illumination and various kinds of interference. With a fixed threshold, a value that is too high or too low causes serious confusion between skin and background, degrades the segmentation, and easily leads to missed detections. The adaptive threshold segmentation method instead generates a threshold suited to the current image by repeatedly examining the connected regions, and is less prone to missed detections.

We use the adaptive threshold method to segment the skin color regions. The threshold is decreased from 0.55 to 0.05 in steps of 0.1, the difference in the number of skin color pixels between each pair of consecutive thresholds is computed, and the threshold at which the skin pixel count changes least is taken as the optimal segmentation threshold. This dynamic threshold selection removes, to some extent, the influence of lighting and background on the image and finds the optimal segmentation threshold automatically, without manual judgment or intervention.
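The threshold search described above can be sketched roughly as follows. The function names and the toy likelihood map are hypothetical. The text does not say which of the two consecutive thresholds is returned when their pixel counts differ least, so returning the higher one is an assumption.

```python
def adaptive_threshold(likelihood, start=0.55, stop=0.05, step=0.1):
    """Pick the threshold where the skin pixel count changes least."""
    n = int(round((start - stop) / step)) + 1
    thresholds = [round(start - i * step, 2) for i in range(n)]
    flat = [v for row in likelihood for v in row]
    # Number of pixels classified as skin at each candidate threshold.
    counts = [sum(1 for v in flat if v > t) for t in thresholds]
    # Change in skin pixel count between consecutive thresholds.
    changes = [counts[i + 1] - counts[i] for i in range(n - 1)]
    best = changes.index(min(changes))
    # Assumption: return the higher threshold of the most stable pair.
    return thresholds[best]


def segment(likelihood, threshold):
    """Binary skin map: 1 where the likelihood exceeds the threshold."""
    return [[1 if v > threshold else 0 for v in row] for row in likelihood]


# Hypothetical 2x3 likelihood map with values in [0, 1].
likelihood = [[0.9, 0.8, 0.5], [0.3, 0.02, 0.01]]
t = adaptive_threshold(likelihood)
mask = segment(likelihood, t)
```

For this toy map the pixel counts at thresholds 0.55, 0.45, 0.35, 0.25, 0.15, 0.05 are 2, 3, 3, 4, 4, 4, so the first pair with no change is (0.45, 0.35) and the sketch returns 0.45.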

To build the skin color statistical model, a large number of images containing skin were manually selected under different illumination conditions. Because some ears are occluded by hair or by accessories such as glasses, side-face regions containing the ear were chosen wherever possible when collecting skin samples. The skin color Gaussian model is computed from the samples and its parameters are then determined. The experimental steps are as follows:

1. Manually select side-face skin color regions.

Because the frontal face and the ear are affected differently, and so that the model is better suited to detecting side faces and especially ears, only side-face or ear regions were chosen as experimental samples. For sample diversity, skin samples were selected under both strong and weak light. Fig. 1 shows side-face skin color samples.

2. Apply low-pass filtering to the sample skin color regions. In terms of the signal spectrum, slowly varying parts of a signal belong to the low-frequency part of the frequency domain, and rapidly varying parts belong to the high-frequency part. In images, noise components lie mostly at higher frequencies, so low-pass filtering can be used to remove noise.

The impulse response array of the low-pass filter used in the present invention is:

h = \frac{1}{9} [\begin{matrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{matrix}]

3. Convert the R, G, B tristimulus values of each pixel in the filtered skin color region to the YCbCr color space to obtain the chrominance value (Cb, Cr) of each skin pixel, and plot their distribution in the YCbCr space and the corresponding Gaussian distribution model. Fig. 2 shows the distribution of skin color in the YCbCr space, and Fig. 3 shows the Gaussian distribution model of skin color.

4. Perform statistical calculations on the chrominance values of the skin pixels obtained in step 3 to determine the unknown parameters of the two-dimensional Gaussian model G (M, C), namely the mean M and the covariance C.

In this experiment, skin color was selected from the 90 collected images containing ears under different illumination and statistics were computed. The resulting mean and covariance are:

M = [\begin{matrix} 106.3488 \\ 148.0373 \end{matrix}]

C = [\begin{matrix} 65.7823 & 61.0333 \\ 61.0333 & 102.4386 \end{matrix}]
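With M and C known, a per-pixel skin likelihood can be computed along the following lines. This is a sketch under stated assumptions: the RGB to YCbCr conversion uses the full-range ITU-R BT.601 (JPEG-style) coefficients, which the text does not specify; M is taken to be ordered as (Cb, Cr); and the likelihood is the unnormalized Gaussian exp(-0.5 d^2), where d is the Mahalanobis distance. The function names are hypothetical.

```python
import math

# Mean and covariance reported above, assumed to be ordered as (Cb, Cr).
M = (106.3488, 148.0373)
C = ((65.7823, 61.0333), (61.0333, 102.4386))


def rgb_to_cbcr(r, g, b):
    """Full-range BT.601 chrominance (an assumed conversion)."""
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return cb, cr


def skin_likelihood(cb, cr, mean=M, cov=C):
    """Unnormalized Gaussian likelihood in (0, 1], equal to 1 at the mean."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    dx, dy = cb - mean[0], cr - mean[1]
    # Squared Mahalanobis distance using the closed-form 2x2 inverse.
    dist2 = (d * dx * dx - (b + c) * dx * dy + a * dy * dy) / det
    return math.exp(-0.5 * dist2)


# A neutral grey pixel lies far from the skin cluster.
grey_score = skin_likelihood(*rgb_to_cbcr(128, 128, 128))
```

Applying `skin_likelihood` to every pixel yields the likelihood map that the adaptive threshold then binarizes.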

Step 3: Skin color region screening. Morphological operations are first used to remove small holes and isolated pixel structures in the image. The features of the existing skin color regions are then analyzed, and the candidate regions are optimized using image moment features and geometric features such as region area, height-to-width ratio, and the fill ratio of side-face skin color within the minimum bounding rectangle. Side-face regions that may contain an ear are screened out, yielding the candidate side-face regions.

To compile accurate statistics of these side-face features, 200 images containing ears at different deflection angles from the standard face database built by the University of Manchester Institute of Science and Technology (UMIST), together with 90 complex-background images taken in the laboratory with a digital camera, were used as the statistical basis. The statistical procedure is as follows:

1. For the complex-background images, manually segment the side-face regions containing an ear, then measure the area of each skin color region. For images of the same scale, determine an area threshold: when the side-face skin area is below this threshold, the ear within it is considered too small for detection to be of much use to a subsequent recognition system.

2. Use the side-face images at different deflection angles in the UMIST standard face database to determine the range of side-face aspect ratios, also measure the aspect ratios of the manually segmented side-face images, and determine a threshold usable for screening.

3. Measure the ratio of each skin color region's area to that of its bounding rectangle, to determine the fill ratio of skin color within the minimum bounding rectangle for side faces at different deflection angles under complex backgrounds. Skin-like regions whose area and aspect ratio both meet the requirements are further tested using this fill ratio.

From these statistics on side-face region features, the following regions either do not contain a side face or, if they do, are of little value for ear detection and recognition because the scale is too small or the deflection angle too large:

1. The skin-like area is smaller than the specified threshold. For example, with the 1200 × 1600 images collected in this experiment, a candidate region whose skin-like area is less than 250 pixels is considered meaningless.

2. The ratio of the region's major axis to its minor axis is greater than 4.5. Because under complex backgrounds the side face in some images is connected to the neck or upper body, the statistics do not consider only the proportions of the side face itself.

3. Within the region's minimum bounding rectangle, the fill ratio of skin-like pixels is less than 2/5 or greater than 3/4. For a side face, the bounding rectangle includes non-skin areas such as hair or the area around the neck, so it cannot be entirely filled with skin color, whereas the bounding rectangles of other skin-like objects or other skin-colored body parts may be filled entirely; the skin fill ratio removes such regions.

The complex-background images obtained from skin color segmentation are screened and the skin-like regions that do not meet the conditions are deleted. Fig. 4 is an example of skin color region screening, where (a) is the original image, (b), (c), and (d) are the target regions to be screened in the original image, and (e) is the result image obtained after screening.
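The three rejection rules above can be combined into a single predicate, sketched below. The function name and parameters are hypothetical. As a simplifying assumption, the major-to-minor axis ratio is approximated by the ratio of the longer to the shorter side of an axis-aligned bounding rectangle, which is not necessarily how the original system measured it.

```python
def keep_region(area, width, height, min_area=250,
                max_axis_ratio=4.5, min_fill=2 / 5, max_fill=3 / 4):
    """Return True if a skin-like region passes all three screening rules.

    area   -- number of skin-like pixels in the region
    width  -- width of the region's bounding rectangle, in pixels
    height -- height of the region's bounding rectangle, in pixels
    """
    # Rule 1: reject regions whose skin-like area is too small.
    if area < min_area:
        return False
    # Rule 2: reject very elongated regions (bounding-box approximation).
    long_side, short_side = max(width, height), min(width, height)
    if short_side == 0 or long_side / short_side > max_axis_ratio:
        return False
    # Rule 3: reject regions that fill too little or too much of the box.
    fill = area / (width * height)
    return min_fill <= fill <= max_fill


# Hypothetical candidates: (area, width, height).
candidates = [(6000, 100, 100), (200, 20, 20), (9500, 100, 100), (3000, 500, 10)]
kept = [c for c in candidates if keep_region(*c)]
```

Only the first candidate survives: the second is too small, the third fills 95% of its rectangle, and the fourth is too elongated.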

Step 4: Image edge detection. Using the grayscale information of the candidate region, and having analyzed the applicability of various image edge detection methods, the wavelet modulus maximum method is selected to detect image edges, and the edge images obtained at different scales are superposed. Then, based on the rich edge information inside the ear, the superposed dense edge image is processed and searched for edge regions, so that the ear can be identified and located.

The wavelet modulus maximum method finds the image edges as follows:

The two-dimensional dyadic wavelets used for edge detection can be designed as separable products of one-dimensional dyadic wavelets. Specifically, their Fourier transforms are expressed as:

\hat{ψ^{x}} (w_{x}, w_{y}) = G (w_{x} / 2) \hat{φ} (w_{x} / 2) \hat{φ} (w_{y} / 2)

\hat{ψ^{y}} (w_{x}, w_{y}) = G (w_{y} / 2) \hat{φ} (w_{x} / 2) \hat{φ} (w_{y} / 2)

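where \hat{ψ^{x}} (w_{x}, w_{y}) and \hat{ψ^{y}} (w_{x}, w_{y}) are the Fourier transforms of the wavelets ψ^{x} (x, y) and ψ^{y} (x, y), the partial derivatives of a two-dimensional smoothing function θ (x, y); \hat{φ} (w) is a low-pass filter, and

G (w) = - i \sqrt{2} e^{- iw / 2} \sin (w / 2)

is a high-pass digital filter;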

Suppose the scaling function satisfies the following two-scale equation:

\hat{φ} (w) = Π_{p = 1}^{+ \infty} \frac{H (2^{- p} w)}{\sqrt{2}} = \frac{1}{\sqrt{2}} H (\frac{w}{2}) \hat{φ} (\frac{w}{2})

If the scaling function is chosen to be a spline of degree m, that is,

\hat{φ} (w) = e^{- \frac{iϵw}{2}} {(\frac{\sin (w / 2)}{w / 2})}^{m + 1},

then

H (w) = \sqrt{2} e^{- iϵw / 2} {[\cos (w / 2)]}^{m + 1}

is the Fourier transform of the low-pass filter;

If the sampling interval equals 1, the discrete wavelet coefficients are:

d_{j}^{x} (n, m) = w^{x} (2^{j}, n, m), d_{j}^{y} (n, m) = w^{y} (2^{j}, n, m)

Similarly, the original image signal is defined as:

a_{0} (n, m) = \langle f (x, y), φ (x - n) φ (y - m) \rangle

and the smoothed image signal for j ≥ 0 as:

a_{j} (n, m) = \langle f (x, y), φ_{j} (x - n) φ (y - m) \rangle

The à trous algorithm for the two-dimensional discrete dyadic wavelet transform can then be written in the following discrete convolution form:

\left\{\begin{matrix} d_{j + 1}^{x} (n, m) = a_{j} * \bar{g}_{j} δ (n, m) \\ d_{j + 1}^{y} (n, m) = a_{j} * δ \bar{g}_{j} (n, m) \end{matrix}\right.

In the present invention, to describe the outer contour and inner edges of the ear well, the two-dimensional à trous algorithm is used in the MATLAB experimental environment to detect the wavelet modulus maxima of the image, with the coefficients of the spline wavelet filters set to:

h = [0.125, 0.375, 0.375, 0.125]; g = [0.5, -0.5]; delta = [1, 0, 0]

where h holds the coefficients of the low-pass filter and g holds the coefficients of the high-pass filter.
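A rough pure-Python sketch of the à trous decomposition with these coefficients is given below. It is an illustration only, not the MATLAB code used in the experiments: the function names are hypothetical, border pixels are handled by clamping indices (an assumption), the centering of each filter is a chosen convention, and the identity filter delta is omitted because convolving with it leaves the signal unchanged.

```python
import math

H = [0.125, 0.375, 0.375, 0.125]  # low-pass taps from the text
G = [0.5, -0.5]                   # high-pass taps from the text


def dilate_filter(taps, level):
    """Insert 2**level - 1 zeros between taps (the 'holes' of a trous)."""
    gap = 2 ** level
    out = [0.0] * ((len(taps) - 1) * gap + 1)
    for i, t in enumerate(taps):
        out[i * gap] = t
    return out


def convolve_rows(img, taps):
    """Convolve every row with taps; indices are clamped at the borders."""
    w = len(img[0])
    c = len(taps) // 2
    return [[sum(t * row[min(max(x + c - k, 0), w - 1)]
                 for k, t in enumerate(taps))
             for x in range(w)] for row in img]


def convolve_cols(img, taps):
    """Convolve every column by transposing, filtering rows, transposing back."""
    transposed = [list(col) for col in zip(*img)]
    return [list(col) for col in zip(*convolve_rows(transposed, taps))]


def atrous_modulus(img, levels=4):
    """Return one gradient-modulus image per scale."""
    a = img
    moduli = []
    for j in range(levels):
        hj, gj = dilate_filter(H, j), dilate_filter(G, j)
        dx = convolve_rows(a, gj)   # horizontal high-pass
        dy = convolve_cols(a, gj)   # vertical high-pass
        moduli.append([[math.hypot(u, v) for u, v in zip(ru, rv)]
                       for ru, rv in zip(dx, dy)])
        a = convolve_cols(convolve_rows(a, hj), hj)  # smooth for next scale
    return moduli


# Hypothetical 8x8 test image with a vertical step edge.
step = [[0.0] * 4 + [1.0] * 4 for _ in range(8)]
moduli = atrous_modulus(step)
```

At the finest scale the modulus is nonzero only next to the step, while coarser scales spread the response over a wider band, which is why the text later combines several scales.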

The experimental steps are as follows:

1. Decompose the image over 4 levels to obtain wavelet modulus maximum images at 4 scales. To avoid missing parts of the ear edge because of occlusion or similar problems during skin color segmentation, the minimum bounding rectangle of the candidate region is taken as the target region. To obtain both a good ear contour and fine edges, the system performs wavelet decomposition of the target region at four different scales. Fig. 5 shows an example of the minimum bounding rectangle of a skin color region, and Fig. 6 shows the wavelet modulus maximum images of this example at the four scales.

2. Detect the points in the wavelet transform domain whose modulus is a local maximum and exceeds a predetermined threshold, and convert them into multi-scale binary edge images. Because the captured images are large, the edges at scale (a) are very fine but very noisy, so scale (a) is not used in subsequent processing. Fig. 7 shows the edge binary images obtained from scales (b), (c), and (d) of Fig. 6.

3. Superpose the multi-scale edge binary images. The inner and outer edges of the ear are all retained and some noise points are eliminated. Fig. 8 shows the result of superposing the images in Fig. 7, in which part of the noise has been removed.

4. Use the skin color binary map to remove noise points outside the skin color regions. Fig. 9 shows the image after this step; the noise points outside the skin color regions have been removed effectively.
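Steps 2 to 4 can be sketched as below. Two points are assumptions rather than facts from the text: the local maximum test compares each pixel with its two neighbors along the dominant gradient axis only, a common simplification, and "superposition" is implemented as a vote in which a pixel is kept when it is an edge in at least `min_votes` scales. The text does not specify the exact combination rule, and all function names are hypothetical.

```python
import math


def modulus_maxima(dx, dy, threshold):
    """Binary edge map: modulus above threshold and locally maximal."""
    h, w = len(dx), len(dx[0])
    mod = [[math.hypot(dx[y][x], dy[y][x]) for x in range(w)] for y in range(h)]
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            m = mod[y][x]
            if m <= threshold:
                continue
            # Compare along the dominant gradient axis (simplification).
            if abs(dx[y][x]) >= abs(dy[y][x]):
                n1 = mod[y][x - 1] if x > 0 else 0.0
                n2 = mod[y][x + 1] if x < w - 1 else 0.0
            else:
                n1 = mod[y - 1][x] if y > 0 else 0.0
                n2 = mod[y + 1][x] if y < h - 1 else 0.0
            if m >= n1 and m >= n2:
                edges[y][x] = 1
    return edges


def superpose_edges(edge_maps, skin_mask, min_votes=2):
    """Keep pixels that are edges in enough scales and lie on skin."""
    h, w = len(skin_mask), len(skin_mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            votes = sum(m[y][x] for m in edge_maps)
            if votes >= min_votes and skin_mask[y][x]:
                out[y][x] = 1
    return out


# Hypothetical one-row example with three scales.
maps = [[[0, 0, 1, 0, 1]], [[0, 0, 1, 0, 0]], [[0, 1, 1, 0, 0]]]
combined = superpose_edges(maps, [[1, 1, 1, 1, 1]])
```

With a voting rule, edge pixels that appear at only one scale are treated as noise, which is one way to obtain the noise reduction the text attributes to superposition.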

Step 5: detect people's ear: according to before to people's statistics of edge characteristic in one's ear, learn that pixel or sparse separate edge line isolated in the edge image can not be people's ear regions, so adopt following steps to handle in this paper system, in the hope of obtaining interesting areas in the image.

1. adopt the morphology reconstructed operation to carry out opening operation.For the interference edge in the removal of images, keep former possible people's lug areas not to be eliminated simultaneously, after to the original image corrosion, be reconstructed operation again.Restructuring procedure links to each other because people's ear place edge line is intensive, so can restore to it well.The image that is shown in Figure 10 has restored people's ear place edge line well through the result behind the opening operation, and wherein (a) is original image, (b) is the image after the reconstruct.

2. edge image is carried out expansive working, make originally intensive people in one's ear edge connect into clear zone, an edge.Ear place connected region merges in the image at this moment, the inner cap holes that forms.Hole in the blank map picture makes the background area of ear enclose inside be filled to prospect, and other disturb edge no area in filling process to change.The result who is after this process is handled shown in Figure 11, wherein (a) is expansion results, (b) for filling the result.

3. people's lug areas is judged.The fringe region later that expands is carried out the iterative refinement operation, be refined as the edge that links to each other by single pixel until all edge lines.Corrode operation to image this moment, and trickle edge line is eliminated, if do not have the clear zone in the image this moment, we think no positive dough figurine ear in this image.If still have the clear zone after the corrosion, this zone next step object that will position that is us then.Figure 12 has shown the result of this process, and wherein (a) is the result after the refinement, (b) is the result after the corrosion.

4. Use reconstruction to recover the region retained in sub-step 3. Figure 13 shows the result after reconstruction, which indicates that this process recovers the retained region effectively.

5. Locate the ear region. The minimum bounding rectangle of the edge region obtained in sub-step 4 is marked in the original image as the detection result. Figure 14 shows the final detection result: the ear in the example image is located accurately, and the ear region is outlined with a line 3 pixels wide.
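The localization in sub-step 5 can be illustrated with the following sketch, which computes the axis-aligned minimum bounding rectangle of one binary region and outlines it with a 3-pixel line. The function names, the outline color, and the choice to draw the outline inward from the rectangle boundary are assumptions for illustration.

```python
import numpy as np

def bounding_rectangle(region):
    """Return (top, left, bottom, right), inclusive, or None for an empty region."""
    rows = np.flatnonzero(region.any(axis=1))
    cols = np.flatnonzero(region.any(axis=0))
    if rows.size == 0:
        return None
    return int(rows[0]), int(cols[0]), int(rows[-1]), int(cols[-1])

def draw_rectangle(image, rect, width=3, color=(255, 0, 0)):
    """Return a copy of an RGB image with the rectangle outlined `width` pixels thick."""
    top, left, bottom, right = rect
    out = image.copy()
    out[top:min(top + width, bottom + 1), left:right + 1] = color          # top side
    out[max(bottom - width + 1, top):bottom + 1, left:right + 1] = color   # bottom side
    out[top:bottom + 1, left:min(left + width, right + 1)] = color         # left side
    out[top:bottom + 1, max(right - width + 1, left):right + 1] = color    # right side
    return out
```

For an image with several ears, the same two functions would be applied once per retained region.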

The ear detection steps above are described using a single-ear image as the example, but the method applies equally to images containing several ears: the same Steps 1 to 5 locate each ear. Figure 15 is a sample result for multiple ears. The image contains two human ears, and detection with Steps 1 to 5 is as effective as for a single ear; that is, ears can be detected under a complex background without frontal-face interference.

Step 6: Implement the human ear detection system.

The human ear detection system comprises four main modules: a skin color segmentation module, a candidate region optimization module, an image edge detection module, and a human ear detection module. Figure 16 is the module diagram of the system. The function of each module is as follows:

The skin color segmentation module extracts the parts of the input color image that satisfy the skin color condition; this presupposes a suitable skin color model. After analyzing different color spaces and skin color models, the invention adopts a simple Gaussian model of skin color in the YCbCr color space. The image is then converted into a skin color likelihood map according to the similarity of each pixel to skin color, and an adaptive threshold is used for skin color segmentation, yielding the initial binary map of candidate skin color regions.
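The conversion to a skin color likelihood map might be sketched as follows. This is a sketch under stated assumptions, not the module itself: the likelihood is taken in the common unnormalized Gaussian form exp(-0.5 dᵀC⁻¹d) over the (Cb, Cr) chrominance pair, the RGB to YCbCr conversion uses the full-range BT.601 coefficients, and the model parameters are estimated from sample pixels supplied by the caller. The function names are hypothetical, and the adaptive thresholding is not shown.

```python
import numpy as np

def rgb_to_cbcr(rgb):
    """Convert RGB values (last axis of length 3, range 0..255) to (Cb, Cr)."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([cb, cr], axis=-1)

def fit_skin_model(skin_rgb_samples):
    """Estimate the mean vector and covariance matrix of skin chrominance."""
    x = rgb_to_cbcr(skin_rgb_samples).reshape(-1, 2)
    return x.mean(axis=0), np.cov(x, rowvar=False)

def skin_likelihood(rgb_image, mean, cov):
    """Return a map in [0, 1]; 1 means the pixel matches the mean skin chrominance."""
    d = rgb_to_cbcr(rgb_image) - mean
    inv_cov = np.linalg.inv(cov)
    # Squared Mahalanobis distance of every pixel from the skin mean.
    dist2 = np.einsum("...i,ij,...j->...", d, inv_cov, d)
    return np.exp(-0.5 * dist2)
```

Because luminance Y is discarded, the likelihood is insensitive to uniform brightness changes, which is a common reason for choosing the YCbCr space for skin modeling.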

The candidate region optimization module first applies morphological filtering to the skin color binary map to remove isolated or small pixel structures. Each region is then screened by the area of the target region, its height-to-width ratio, and the skin color occupancy within its minimum bounding rectangle. The regions that remain are passed to the image edge detection module as candidate regions.
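The three screening criteria can be illustrated with the sketch below, which evaluates a single region given as a binary mask. All threshold values (`min_area`, `aspect_range`, `min_occupancy`) are hypothetical placeholders chosen for illustration; the values used by the system are not given in this section.

```python
import numpy as np

def region_features(mask):
    """Area, height-to-width ratio, and skin occupancy of the bounding rectangle."""
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    height = int(rows[-1] - rows[0] + 1)
    width = int(cols[-1] - cols[0] + 1)
    area = int(mask.sum())
    return {
        "area": area,
        "aspect": height / width,
        "occupancy": area / (height * width),
    }

def is_candidate(mask, min_area=400, aspect_range=(0.8, 2.5), min_occupancy=0.5):
    """Keep a region only if it passes all three tests (thresholds are hypothetical)."""
    if not mask.any():
        return False
    f = region_features(mask)
    return (
        f["area"] >= min_area
        and aspect_range[0] <= f["aspect"] <= aspect_range[1]
        and f["occupancy"] >= min_occupancy
    )
```

In a full pipeline, each connected component of the filtered skin color binary map would be passed through `is_candidate` in turn.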

The image edge detection module was designed after analyzing the strengths and weaknesses of wavelets and of the commonly used edge detection operators for edge detection in complex-background images. It adopts the wavelet modulus maximum method to extract the image edges at different scales and combines them, so as to exclude, as far as possible, interfering edges caused by noise or background factors, in terms of both detail and contour.
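The idea of extracting modulus maxima at several scales and superposing the results can be sketched as below. This is a simplified stand-in rather than the wavelet transform itself: repeated 3×3 box smoothing approximates coarser scales, the gradient of the smoothed image plays the role of the two wavelet components, and superposition is assumed to be a logical OR of the per-scale binary edge maps. The scale values, the relative threshold, and the function names are assumptions.

```python
import numpy as np

def box_smooth(img, passes):
    """Apply a 3x3 mean filter `passes` times (more passes = coarser scale)."""
    out = np.asarray(img, dtype=float)
    h, w = out.shape
    for _ in range(passes):
        p = np.pad(out, 1, mode="edge")
        out = sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return out

def modulus_maxima(img, passes, rel_thresh=0.5):
    """Binary map of pixels whose gradient modulus is a local maximum
    along the gradient direction and exceeds a relative threshold."""
    gy, gx = np.gradient(box_smooth(img, passes))
    mod = np.hypot(gx, gy)
    angle = np.rad2deg(np.arctan2(gy, gx)) % 180.0
    sector = np.round(angle / 45.0).astype(int) % 4   # 0, 45, 90, 135 degrees
    offsets = [(0, 1), (1, 1), (1, 0), (1, -1)]       # (dy, dx) per sector
    h, w = mod.shape
    p = np.pad(mod, 1, mode="constant")
    is_max = np.zeros((h, w), dtype=bool)
    for k, (dy, dx) in enumerate(offsets):
        ahead = p[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
        behind = p[1 - dy:1 - dy + h, 1 - dx:1 - dx + w]
        is_max |= (sector == k) & (mod >= ahead) & (mod >= behind)
    return is_max & (mod > rel_thresh * mod.max())

def multiscale_edges(img, scales=(1, 2, 4)):
    """Superpose (logical OR) the modulus-maxima edge maps of all scales."""
    return np.any([modulus_maxima(img, s) for s in scales], axis=0)
```

Fine scales keep detail such as the inner ear contours, while coarse scales respond mainly to strong outlines, so the superposed map is densest where edges persist across scales.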

The human ear detection module first analyzes the characteristics of the ear contour under a complex background. Because the inner contour of the ear is rich, the modulus maximum edge images extracted at different scales contain more edges at the ear than at the side-face skin, the hair, and so on. After superposition, the edge region at the ear differs markedly from the other skin color regions of the side face. The ear region is judged and located on this basis, which completes the ear detection.

The following table gives a statistical analysis of the detection results for ear images under a simple background and under a complex background.

Table 1. Statistical analysis of human ear detection results

Conclusion:

Drawing on face detection algorithms, the present invention proposes a human ear detection method for color images under a complex background, based on skin color information, side-face statistics, and the internal edge features of the ear, and achieves ear detection under a complex background without frontal-face interference. Experimental results show that the method detects ears well in relatively simple ear images, with a detection rate of 100%. For images with a more complex background, the detection rate reaches 94.5%.

Claims

1. A human ear detection method under a complex static color background, characterized by the following steps:

(1) Acquisition of the human ear image;

(2) Skin color detection:

YCbCr is selected as the skin color representation space, a Gaussian distribution model is established, and a skin color region localization algorithm with adaptive threshold segmentation is used to segment the skin color regions, yielding an initial binary map of candidate skin color regions;

(3) Skin color region screening:

The regions are processed by morphological methods and screened according to the height-to-width ratio of the side face and the area occupancy of the side-face skin color in the image, yielding an edge binary image;

(4) Image edge detection:

For the edge binary image obtained in step (3), the gray values of the candidate region of the edge image are used and the wavelet modulus maximum method is applied to extract the image; dilation, filling, thinning, and reconstruction are performed; the edge image is then searched for edge regions, which yields the human ear to be detected.

2. The human ear detection method under a complex static color background according to claim 1, characterized in that the adaptive threshold segmentation in step (2) uses cyclic detection of connected regions to generate a threshold suited to the current image.

3. The human ear detection method under a complex static color background according to claim 1, characterized in that the skin color distribution model in step (2) is established as follows:

1. select side-face skin color regions;

2. apply low-pass filtering to the sample skin color regions to remove noise;

4. The human ear detection method under a complex static color background according to claim 3, characterized in that the impulse response array of the low-pass filter used for the low-pass filtering in step (2) is:

h = \frac{1}{9} \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
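For illustration, applying the impulse response array h to one image channel amounts to replacing each pixel by the mean of its 3×3 neighborhood. The sketch below assumes edge replication at the image border, which the claim does not specify; the function name is hypothetical.

```python
import numpy as np

# The 3x3 averaging kernel h given above.
H = np.ones((3, 3)) / 9.0

def low_pass(channel, kernel=H):
    """Filter a 2-D channel with a 3x3 kernel (borders handled by edge replication)."""
    channel = np.asarray(channel, dtype=float)
    h, w = channel.shape
    p = np.pad(channel, 1, mode="edge")
    out = np.zeros((h, w))
    # Accumulate the nine shifted, weighted copies of the padded channel.
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * p[i:i + h, j:j + w]
    return out
```

Because h is symmetric, correlation and convolution give the same result here.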

5. The human ear detection method under a complex static color background according to claim 1, characterized in that step (4) applies the wavelet modulus maximum method to the image to obtain wavelet modulus maximum images at different scales, analyzes the edge binary maps of these images, and superposes these edge binary maps to finally obtain an edge binary image free of interference.