JP4126721B2 - Face area extraction method and apparatus

Face area extraction method and apparatus

Info

Publication number
JP4126721B2
JP4126721B2
Authority
JP
Japan
Prior art keywords
face
skin color
area
step
region
Prior art date
Legal status
Expired - Fee Related
Application number
JP2002355017A
Other languages
Japanese (ja)
Other versions
JP2004185555A (en)
Inventor
学 兵藤
Original Assignee
富士フイルム株式会社
Priority date
Filing date
Publication date
Application filed by 富士フイルム株式会社
Priority to JP2002355017A
Publication of JP2004185555A
Application granted
Publication of JP4126721B2

Description

[0001]
BACKGROUND OF THE INVENTION
The present invention relates to a face area extraction method and apparatus, and more particularly to a method and apparatus for extracting an area corresponding to a human face existing in a color image acquired by a digital camera or the like.
[0002]
[Prior art]
When viewing a portrait, the most noticeable part is the person's face. A technique has therefore been proposed for automatically detecting the region corresponding to a human face in an image so that the face is reproduced with appropriate brightness and color (Patent Document 1). In the method disclosed in Patent Document 1, a skin color area is extracted from the original image, edges in the image are detected, and a skin color area surrounded by edges is extracted as the face area.
[0003]
[Patent Document 1]
JP-A-9-101579
[0004]
[Problems to be solved by the invention]
However, in the method disclosed in Patent Document 1, the edge of the face may fail to form a closed curve depending on the amount of light at the time of shooting, so the face area may be erroneously detected. Another problem in face extraction processing is objects whose hue is similar to skin color (for example, sand, the ground, wood, or brick). Furthermore, when shooting under a tungsten light source, the white balance is not achieved completely, because auto white balance (AWB) is set to follow the light source only partially and to leave some of the light source color (a setting that deliberately preserves the atmosphere of the light source rather than correcting it away). Therefore, when a white object is photographed under a tungsten light source, it takes on a reddish-yellow cast similar to skin color, which hinders face area detection.
[0005]
The present invention has been made in view of these circumstances, and its object is to provide a face area extraction method and apparatus capable of correctly extracting a face area, even in the shooting situations described above, by eliminating objects whose hue is similar to that of the face.
[0006]
[Means for Solving the Problems]
In order to achieve the above object, a face region extraction method according to the present invention is a method for extracting a region corresponding to a human face from an image, comprising: an information acquisition step of acquiring focal length information at the time of shooting; a maximum value prediction step of obtaining the maximum value assumed for a face region in the image based on the focal length information acquired in the information acquisition step; a skin color region detection step of analyzing the image data and detecting regions having a skin color hue from within the image, the skin color region detection step including a skin color detection step of detecting the skin color hue and a saturation division step of further dividing the skin color portions detected in the skin color detection step according to saturation; and a processing step of comparing, among the skin color regions detected in the skin color region detection step, each skin color region divided by saturation in the saturation division step with a determination reference value set from the maximum value estimated in the maximum value prediction step, and treating regions larger than the determination reference value as having a low possibility of being a face region.
[0007]
According to the present invention, information indicating the focal length of the photographing optical system is acquired, and the maximum size of a human face that could actually be captured at that focal length is estimated. Meanwhile, the skin color regions in the image are detected by analyzing the image data. Each detected skin color region is a face area candidate, but in the present invention, a skin color region larger than the determination reference value derived from the maximum value estimated from the focal length information is treated as being unlikely to be a face area. For example, in one aspect such a region is excluded from the face area candidates as not being a face, and in another aspect the arithmetic weighting coefficient is reduced for regions that are unlikely to be a face.
[0008]
In this way, the appropriate size of a face area is estimated from the focal length information at the time of shooting, and extremely large regions are judged to have a low likelihood of being a face, so the face area can be determined with higher accuracy. In addition, regions having a skin color hue in the image (skin color regions) are further divided into finer regions according to saturation, and their shapes and sizes are recognized, which facilitates the determination of the face region.
[0009]
The focal length information may be acquired from a camera used for shooting, or may be read from attached information (tag information) added to the image data.
[0010]
According to an aspect of the present invention, the processing step includes a face area determination step of excluding, from the face area candidates, those skin color regions detected in the skin color region detection step that are larger than the determination reference value set from the maximum value estimated in the maximum value prediction step, and determining the face area from the regions smaller than the determination reference value.
[0011]
In this aspect, skin color areas larger than the determination reference value are excluded from the face area candidates as not being faces, and the face area is determined from the skin color areas smaller than the determination reference value. With this algorithm configuration, the influence of objects with a hue similar to skin color, such as a background or the ground under a tungsten light source, can be eliminated, and a more accurate face region can be extracted.
[0014]
In order to identify the true face area from among the face area candidates smaller than the determination reference value, it is preferable to narrow down the candidates based on the shape of each region. For example, since a person's face can be regarded as approximately circular or elliptical, an aspect ratio can be specified for determining whether or not a region is a face. For each face area candidate, regions whose aspect ratio differs extremely from the specified value (for example, extremely long and narrow regions) are excluded as non-faces, and regions close to the specified aspect ratio are determined to be faces.
[0015]
The face region extraction method of the present invention may further include a distance information acquisition step of acquiring subject distance information, and a face size prediction step of obtaining the size assumed for a face region in the image based on the distance information acquired in the distance information acquisition step.
[0016]
When subject distance information is obtained in addition to the focal length information, the size of the face actually photographed can be estimated more accurately, so the face area candidates can be narrowed down based on the predicted value and the accuracy of face extraction can be improved.
[0017]
In order to provide an apparatus that embodies the above method invention, a face area extraction apparatus according to the present invention is an apparatus for extracting a region corresponding to a human face from an image, comprising: information acquisition means for acquiring focal length information at the time of shooting; maximum value prediction means for obtaining the maximum value assumed for a face region in the image based on the focal length information acquired through the information acquisition means; skin color area detection means for analyzing the image data and detecting regions having a skin color hue from within the image, the skin color area detection means including skin color detection means for detecting the skin color hue and saturation division means for further dividing the detected skin color portions according to saturation; and face area determination means for comparing, among the skin color regions detected by the skin color area detection means, each skin color region divided by saturation by the saturation division means with a determination reference value set from the maximum value estimated by the maximum value prediction means, deleting regions larger than the determination reference value from the face area candidates, and determining the face area from the regions smaller than the determination reference value.
[0018]
The face area extraction apparatus of the present invention can be incorporated into the signal processing unit of an electronic photographing apparatus (electronic camera) such as a digital camera or video camera, and can also be incorporated into an image processing apparatus or the like that reproduces, displays, or prints out image data recorded by such an electronic camera.
[0019]
The face area extraction apparatus of the present invention can also be realized by a computer: a program that causes a computer to execute the steps of the face area extraction method described above can be recorded on a CD-ROM, magnetic disk, or other recording medium and provided to third parties through the recording medium, or offered through a program download service over a communication line such as the Internet.
[0020]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, preferred embodiments of a face area extracting method and apparatus according to the present invention will be described in detail with reference to the accompanying drawings.
[0021]
FIG. 1 is a block diagram showing the configuration of an electronic camera according to an embodiment of the present invention. This camera 10 is a digital camera that converts an optical image of a subject into digital image data and records it on a recording medium 12; the face area extraction device of the present invention is used as part of the signal processing means that processes the image signals obtained by shooting.
[0022]
The overall operation of the camera 10 is centrally controlled by a central processing unit (CPU) 14 built into the camera. The CPU 14 functions as control means for controlling the camera system according to predetermined programs, and also functions as calculation means for performing various calculations such as automatic exposure (AE) calculation, automatic focus adjustment (AF) calculation, auto white balance (AWB) control, and face area extraction calculation.
[0023]
The CPU 14 is connected to a ROM 20 and a memory (RAM) 22 via a bus 16. The ROM 20 stores a program executed by the CPU 14 and various data necessary for control. The memory 22 is used as a program development area and a calculation work area for the CPU 14, and is also used as a temporary storage area for image data.
[0024]
Further, an EEPROM 24 is connected to the CPU 14. The EEPROM 24 stores table data necessary for face area extraction processing (tables such as skin color data and maximum face size data), data necessary for AE, AF, and AWB control, customization information set by the user, and the like; it is non-volatile storage means that retains its contents even when the power is turned off. The CPU 14 performs calculations and the like with reference to the data in the EEPROM 24 as necessary. The ROM 20 may be non-rewritable, or may be rewritable like an EEPROM.
[0025]
The camera 10 is provided with an operation unit 30 for a user to input various commands. The operation unit 30 includes various operation units such as a macro button 31, a shutter button 32, and a zoom switch 33.
[0026]
The macro button 31 is an operation means for setting (ON) / releasing (OFF) a macro mode suitable for short-distance shooting. Macro mode allows you to take close-up photos with a relatively shallow depth of field and a beautifully blurred background. When the camera 10 is set to the macro mode by pressing the macro button 31, focus control suitable for short-distance shooting is performed, and shooting is possible within a subject distance range of about 20 cm to 80 cm.
[0027]
The shutter button 32 is an operation means for inputting an instruction to start photographing, and is composed of a two-stroke switch having an S1 switch that is turned on when half-pressed and an S2 switch that is turned on when fully pressed. When S1 is on, AE and AF processing are performed, and when S2 is on, recording exposure is performed. The zoom switch 33 is an operation means for changing the photographing magnification and the reproduction magnification.
[0028]
Although not shown, the operation unit 30 also includes mode selection means for switching between shooting mode and playback mode, a menu button for displaying a menu screen on the liquid crystal monitor 40, a cross button (cursor movement operation means) for selecting a desired item from the menu screen, an OK button for confirming a selected item and executing a process, and a cancel button for canceling a selection, canceling instructed contents, or returning to the previous operation state. The operation unit 30 is not limited to push-type switch members, dial members, lever switches, and the like; it also includes means realized as a user interface for selecting a desired item from a menu screen.
[0029]
A signal from the operation unit 30 is input to the CPU 14. Based on the input signal from the operation unit 30, the CPU 14 controls each circuit of the camera 10, performing, for example, lens drive control, shooting operation control, image processing control, image data recording/playback control, and display control of the liquid crystal monitor 40.
[0030]
The liquid crystal monitor 40 can be used as an electronic viewfinder for checking the angle of view at the time of shooting, and is used as a means for reproducing and displaying a recorded image. The liquid crystal monitor 40 is also used as a user interface display screen, and displays information such as menu information, selection items, and setting contents as necessary. Instead of a liquid crystal display, other types of display devices (display means) such as an organic EL can be used.
[0031]
Next, the shooting function of the camera 10 will be described.
[0032]
The camera 10 includes a photographing lens 42 as a photographing optical system and a CCD solid-state image sensor (hereinafter referred to as a CCD) 44. Instead of the CCD 44, other types of image sensors, such as MOS solid-state image sensors, can be used. The photographic lens 42 is an electric zoom lens; although its detailed optical configuration is not shown in the drawing, it includes a magnifying lens group and a correction lens group that mainly provide the effect of changing magnification (variable focal length), and a focus lens that contributes to focus adjustment.
[0033]
When the photographer operates the zoom switch 33, a control signal corresponding to the switch operation is output from the CPU 14 to the zoom drive unit 46. The zoom drive unit 46 is an electric drive unit including a motor (zoom motor) serving as the power source and its drive circuit. The motor drive circuit of the zoom drive unit 46 generates a lens drive signal based on the control signal from the CPU 14 and supplies it to the zoom motor. Thus, the zoom motor operates on the motor drive voltage output from the motor drive circuit, and the zoom lens group and correction lens group in the photographic lens 42 move back and forth along the optical axis, so that the focal length (optical zoom magnification) of the photographic lens 42 is changed.
[0034]
In this example, it is assumed that the focal length of the photographing lens 42 can be varied in 10 steps within the zoom operation range from the wide (wide angle) end to the tele (telephoto) end. The photographer can select a desired focal length according to the shooting purpose and perform shooting.
[0035]
The zoom position of the photographic lens 42 (corresponding to the focal length) is detected by the zoom position detection sensor 48, and the detection signal is reported to the CPU 14. From the signal from the zoom position detection sensor 48, the CPU 14 can grasp the current zoom position (that is, the focal length). The zoom position detection sensor 48 may be a circuit that generates pulses as the zoom motor rotates, or may be configured with a position-detection encoder plate disposed on the outer periphery of the lens barrel; its form is not particularly limited.
[0036]
The light that has passed through the photographing lens 42 enters the CCD 44 after its amount is adjusted by a diaphragm mechanism (not shown). A large number of photosensors (light receiving elements) are arranged in a plane on the light receiving surface of the CCD 44, and red (R), green (G), and blue (B) primary color filters are arranged in a predetermined structure (Bayer, G-stripe, etc.) corresponding to the photosensors.
[0037]
The subject image formed on the light receiving surface of the CCD 44 is converted into a signal charge of an amount corresponding to the amount of incident light by each photosensor. The CCD 44 has an electronic shutter function that controls the charge accumulation time (shutter speed) of each photosensor according to the timing of the shutter gate pulse.
[0038]
The signal charges accumulated in the photosensors of the CCD 44 are sequentially read out as voltage signals (image signals) corresponding to the signal charges, based on pulses given from the CCD driver 50. The image signal output from the CCD 44 is sent to the analog processing unit 52, a front-end processing unit including a CDS (correlated double sampling) circuit and a gain adjustment circuit. In the analog processing unit 52, sampling processing and color separation processing are performed on each of the R, G, and B color signals, and the signal level of each color signal is adjusted (pre-white balance processing).
[0039]
The image signal output from the analog processing unit 52 is converted into a digital signal by the A/D converter 54 and then stored in the memory 22 via the signal processing unit 56. The image data stored in the memory 22 at this point is the A/D-converted output of the image signal from the CCD 44, recorded as-is (unprocessed); it has not yet undergone signal processing such as gamma conversion or synchronization. Hereafter, this is called CCDRAW data. However, "raw data" does not exclude all signal processing: for example, image data that has undergone defective pixel correction, which interpolates the data of defective pixels (scratches) of the image sensor, is still included in the concept of CCDRAW data in that it has not been developed into a general-purpose format.
[0040]
A timing generator (TG) 58 provides timing signals to the CCD driver 50, the analog processing unit 52, and the A/D converter 54 in accordance with instructions from the CPU 14, and these circuits are synchronized by the timing signals.
[0041]
The signal processing unit 56 is a digital signal processing block that also serves as a memory controller controlling reads and writes of the memory 22. It includes an auto calculation unit that performs AE/AF/AWB processing, a synchronization circuit (a processing circuit that calculates the color at each point by interpolating the spatial offsets of the color signals associated with the color filter array of the single-chip CCD), a white balance circuit, a gamma conversion circuit, a luminance/color difference signal generation circuit, a contour correction circuit, a contrast correction circuit, and the like, and processes image signals using the memory 22 in accordance with commands from the CPU 14.
[0042]
The CCDRAW data stored in the memory 22 is sent to the signal processing unit 56 via the bus 16. The image data input to the signal processing unit 56 undergoes predetermined signal processing such as white balance adjustment, gamma conversion, and conversion into luminance signals (Y signals) and color difference signals (Cr, Cb signals) (YC processing), and is then stored in the memory 22.
[0043]
When the captured image is output to the monitor, the image data is read from the memory 22 and transferred to the display circuit 60. The image data sent to the display circuit 60 is converted into a predetermined display signal (for example, an NTSC color composite video signal) and then output to the liquid crystal monitor 40. The image data in the memory 22 is periodically rewritten by the image signal output from the CCD 44, and the video signal generated from that image data is supplied to the liquid crystal monitor 40, so that the video being captured (the through image) is displayed on the liquid crystal monitor 40 in real time. The photographer can check the angle of view (composition) from this video (the so-called through movie) displayed on the liquid crystal monitor 40.
[0044]
When the photographer determines the angle of view and presses the shutter button 32, the CPU 14 detects this, performs AE processing and AF processing in response to half-pressing of the shutter button 32 (S1 ON), and starts CCD exposure and readout control for capturing the image for recording in response to full-pressing (S2 ON).
[0045]
For AF control, the camera 10 applies, for example, contrast AF, which moves the focus lens (the movable lens contributing to focus adjustment among the lens optical systems constituting the photographing lens 42) so that the high-frequency component of the G signal of the video signal is maximized. That is, the AF calculation unit is composed of a high-pass filter that passes only the high-frequency component of the G signal, an absolute value processing unit, an AF area extraction unit that cuts out the signal within a preset focus target area in the screen (for example, the center of the screen), and an integration unit that integrates the absolute value data within the AF area.
[0046]
The integrated value data obtained by the AF calculation unit is reported to the CPU 14. The CPU 14 controls the focus driving unit 62, which includes the AF motor, to move the focus lens while computing a focus evaluation value (AF evaluation value) at a plurality of AF detection points, and determines the lens position at which the AF evaluation value is maximized as the in-focus position. The focus driving unit 62 is then controlled to move the focus lens to the obtained in-focus position.
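As an illustration of this evaluation-value computation, the following Python sketch integrates the absolute high-frequency content of the G signal within the AF area; the pixel-difference high-pass filter and the rectangular AF area coordinates are illustrative assumptions, not details specified in this description.

import numpy as np

def af_evaluation_value(g_plane: np.ndarray, af_area: tuple) -> float:
    """Contrast-AF focus evaluation value: integrated absolute
    high-frequency content of the G signal inside the AF area."""
    y0, y1, x0, x1 = af_area
    roi = g_plane[y0:y1, x0:x1].astype(np.float64)
    highpass = np.diff(roi, axis=1)  # simple neighboring-pixel high-pass
    return float(np.abs(highpass).sum())

# Hill climb: evaluate this value at each AF detection point (focus lens
# position) and take the position with the maximum value as in-focus.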
[0047]
In relation to AE control, the AE calculation unit includes a circuit that divides one screen into a plurality of areas (for example, 8 × 8) and integrates the RGB signals for each divided area, and provides the integrated values to the CPU 14. The CPU 14 detects the brightness of the subject (subject brightness) based on the integrated values obtained from the AE calculation unit and calculates an exposure value (shooting EV value) suitable for shooting. An aperture value and shutter speed are determined according to the obtained exposure value and a predetermined program diagram, and a driving unit (not shown) including an iris motor, together with the electronic shutter of the CCD 44, is controlled to obtain the optimum exposure amount.
[0048]
The image data captured in response to the full press of the shutter button 32 (S2 ON) undergoes YC processing and other predetermined signal processing in the signal processing unit 56, and is then compressed in a predetermined compression format (for example, JPEG) by the compression/expansion circuit 64. The compressed image data is recorded on the recording medium 12 via the media interface unit 66. The compression format is not limited to JPEG; MPEG or other methods may be adopted.
[0049]
As means for storing image data, various media can be used, such as semiconductor memory cards represented by SmartMedia (trademark) and CompactFlash (trademark), magnetic disks, optical disks, and magneto-optical disks. The medium is not limited to removable media; a recording medium (internal memory) built into the camera 10 may also be used.
[0050]
When the playback mode is selected by the mode selection means, the last image file recorded on the recording medium 12 (the most recently recorded file) is read. The image file data read from the recording medium 12 is decompressed by the compression/expansion circuit 64 and output to the liquid crystal monitor 40 via the display circuit 60.
[0051]
By operating the cross button during single-frame playback in the playback mode, the frame can be advanced in the forward or reverse direction; the next file after the frame advance is read from the recording medium 12, and the image is reproduced in the same manner as described above.
[0052]
FIG. 2 is a block diagram of the principal parts related to face extraction processing in the camera 10 of this example. Elements in FIG. 2 that are the same as those described in FIG. 1 are denoted by the same reference numerals, and their description is omitted.
[0053]
In FIG. 2, an integration circuit 70 and a white balance circuit 72 in the upper stage are processing systems used for normal auto white balance processing. Further, the integration circuit 74, the white balance circuit 76, and the synchronization circuit 78 shown in the lower stage are processing units (hereinafter referred to as a preprocessing system) for performing preprocessing of the face extraction algorithm.
[0054]
The CCDRAW data converted into a digital signal by the A/D converter 54 is stored in the memory 22 and sent from the memory 22 to the integration circuit 70. The integration circuit 70 divides one screen into a plurality of areas (for example, 64 blocks of 8 × 8) and calculates an average integrated value for each color of the RGB signals in each area; the calculation results are sent to the white balance circuit 72, which assumes a reference light source (daylight). The gain-adjusted signal from the white balance circuit 72 is provided to the CPU 14.
[0055]
The CPU 14 obtains the integrated values of R, B, and G, computes the ratios R/G and B/G, and performs scene discrimination (light source type discrimination) based on these R/G and B/G values and the shooting EV value information obtained by AE calculation. It then controls the amplifier gains of the white balance circuit 80 in the signal processing unit 56 and of the preprocessing white balance circuit 76 to predetermined white balance adjustment values suited to the scene (settings that leave the light source atmosphere), thereby correcting the signal of each color channel. The method disclosed in Japanese Patent Application Laid-Open No. 2000-224608 and elsewhere can be used for the light source type discrimination and for white balance control that leaves the light source atmosphere. In scene discrimination, color temperature information such as R-Y and B-Y may be used instead of the R/G and B/G values.
[0056]
Thereafter, a recording image is captured in response to the full press of the shutter button 32 (S2 ON). The CCDRAW data acquired in response to S2 ON is temporarily stored in the memory 22 and then sent to the signal processing unit 56 and the pre-processing system integrating circuit 74 for face extraction.
[0057]
The preprocessing integration circuit 74 divides one screen into, for example, 130 × 190 areas, and calculates an integrated value for each area. The calculation result of the integration circuit 74 is sent to the white balance circuit 76, where white balance processing reflecting AWB is performed. Data with AWB applied is sent to the synchronization circuit 78, where data of the same number of pixels (3-plane data) is generated for each of R, G, and B.
[0058]
Based on the RGB three-plane data generated by the synchronization circuit 78, the CPU 14 executes skin color detection, saturation division, face candidate extraction, and face area specification processing based on the shape of the face candidate area. Details of the face extraction algorithm will be described later.
[0059]
On the other hand, the CCDRAW data sent from the memory 22 to the signal processing unit 56 is subjected to processing reflecting AWB by the white balance circuit 80 and then sent to the gamma conversion circuit 82. The gamma conversion circuit 82 changes the input / output characteristics so that the RGB signal subjected to white balance adjustment has a desired gamma characteristic, and outputs it to the luminance / color difference signal generation circuit 84.
[0060]
The luminance/chrominance signal generation circuit 84 creates a luminance signal Y and chroma signals Cr and Cb from the gamma-corrected R, G, and B signals. The luminance signal Y and chroma signals Cr and Cb (the YC signals) are compressed by the compression/expansion circuit 64 in a predetermined format such as JPEG and recorded on the recording medium 12.
[0061]
Next, the face extraction function installed in the camera 10 according to the present embodiment will be described.
[0062]
FIG. 3 is a diagram showing the relationship between a subject and its image in the photographing optical system. As shown in the figure, the ratio of the shooting distance D to the viewing angle L (the entire range that can be photographed at that focal length) is equal to the ratio of the focal length DF of the photographic lens 42 to the CCD size H.
[0063]
[Expression 1]
Shooting distance D : viewing angle L = focal length DF : CCD size H … (1)
That is, the following relationship is established between the actual size A of the human face included in the viewing angle L and the image size a on the CCD light receiving surface.
[0064]
[Expression 2]
Shooting distance D : actual size A = focal length DF : image size a … (2)
Here, the shooting distance D is taken to be the shortest distance (closest distance) at which the camera 10 can shoot. When the macro mode is OFF, the shortest distance the camera 10 can shoot is about 60 cm; when the macro mode is ON, it is about 20 cm. However, when the macro mode is ON, the possibility that a person is being photographed is low, so face extraction processing is not performed.
[0065]
Although the actual size A of a person's face varies somewhat, a prescribed value can be set on the assumption that face sizes fall within a certain range.
[0066]
Therefore, the image size on the CCD 44 surface (the number of pixels occupied when a human face is imaged on the CCD 44 surface) can be determined from the relationship in equation (2) above. In this way, the maximum face size at each focal length can be obtained.
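As a concrete illustration of equation (2), the following Python sketch builds a per-focal-length maximum face size table like the one stored in the EEPROM 24; the sensor dimensions, the assumed upper bound on a real face width, the zoom range, and the nearest shooting distance are all illustrative assumptions, not figures given in this description.

# Equation (2): D : A = DF : a, hence a = A * DF / D.
# All numeric constants below are illustrative assumptions.
SENSOR_WIDTH_MM = 7.2        # assumed CCD width H
SENSOR_WIDTH_PX = 2048       # assumed horizontal pixel count
MAX_FACE_WIDTH_MM = 250.0    # assumed upper bound on the real face width A
NEAREST_DISTANCE_MM = 600.0  # shortest shooting distance D with macro OFF

def max_face_size_px(focal_length_mm: float) -> float:
    """Largest face image (in pixels) expected at this focal length,
    assuming the face sits at the camera's shortest shooting distance."""
    image_size_mm = MAX_FACE_WIDTH_MM * focal_length_mm / NEAREST_DISTANCE_MM
    return image_size_mm * SENSOR_WIDTH_PX / SENSOR_WIDTH_MM

# A 10-entry table, one value per selectable focal length (assumed range):
focal_lengths_mm = [6.0 + i * (18.0 - 6.0) / 9 for i in range(10)]
max_size_table = {round(f, 1): max_face_size_px(f) for f in focal_lengths_mm}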
[0067]
Since the camera 10 of this example is configured so that ten focal length steps can be selected from the wide end to the tele end, the maximum face size on the CCD 44 surface is stored in the EEPROM 24 (data storage means) as table data for each of the ten focal lengths.
[0068]
FIG. 4 is a flowchart showing a sequence in face extraction.
[0069]
First, CCDRAW data is recorded in the memory 22 through S1 = ON and S2 = ON of the shutter button 32 (step S110). Thereafter, the CPU 14 obtains focal length information and macro ON / OFF information at the time of photographing (step S112). When the macro mode is ON, face extraction processing is not performed.
[0070]
On the other hand, when the macro mode is OFF, processing by the preprocessing system (74, 76, 78) described with reference to FIG. 2 is performed: integration processing, white balance processing, and synchronization processing are applied to the recorded CCDRAW data (step S114 in FIG. 4).
[0071]
Next, the CPU 14 acquires a skin color table from the EEPROM 24 (step S116). The skin color table is data that defines a range of hues recognized as “skin color” in a predetermined color space. The CPU 14 detects the hue in the skin color table based on the RGB three-plane data (linear data) acquired from the preprocessing system synchronization circuit 78 (step S118).
[0072]
FIG. 5 is a diagram illustrating the hue range to be extracted as skin color (the skin color extraction area). The illustrated color space is a linear system before gamma is applied, with R/G on the horizontal axis and B/G on the vertical axis.
[0073]
In FIG. 5, a range surrounded by a rectangular frame denoted by reference numeral 88 is set as a skin color extraction area. That is, a range in which R is slightly larger than G and B is slightly lower than G is defined as “skin color”, and a color within this range is determined as a skin color.
[0074]
Further, the skin color extraction area 88 is further divided into a plurality of areas depending on the saturation. In FIG. 5, “saturation” is represented by the distance from the origin O (1, 1), and the saturation increases as the distance from the origin O increases. In the example of the figure, the skin color detection area 88 is divided into six regions by a concentric boundary line (illustrated by a broken line) centered on the origin.
[0075]
After detecting the skin color areas from the RGB three-plane data acquired from the preprocessing system synchronization circuit 78, the CPU 14 further divides the detected skin color areas based on saturation (step S120 in FIG. 4).
[0076]
FIG. 6 shows an example in which the detected skin color area is further divided by saturation. Different objects in the image have different saturations; in general, the skin color of a person's face tends to have higher saturation than white objects or wooden objects such as desks photographed under a tungsten light source. Therefore, by finely dividing the detected skin color area according to saturation and grasping the shape of each region, it becomes easier to discriminate face regions from non-face regions.
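To make the two-stage detection concrete, the following NumPy sketch gates pixels with a rectangular skin color area in the (R/G, B/G) plane and then splits the gated pixels into six saturation rings measured from the origin O(1, 1); the gate bounds and ring radii are invented for illustration, since the actual values reside in the skin color table in the EEPROM 24.

import numpy as np

def skin_color_saturation_labels(r, g, b):
    """Stage 1: mark pixels inside a rectangular skin color gate in the
    (R/G, B/G) plane (area 88). Stage 2: split the gated pixels into six
    bands by saturation = distance from the origin O(1, 1)."""
    rg = r / np.maximum(g, 1e-6)
    bg = b / np.maximum(g, 1e-6)
    # R slightly larger than G, B slightly lower than G (bounds assumed).
    skin = (rg > 1.05) & (rg < 1.8) & (bg > 0.4) & (bg < 0.95)
    sat = np.hypot(rg - 1.0, bg - 1.0)  # radial distance from O(1, 1)
    edges = np.linspace(0.0, sat[skin].max() if skin.any() else 1.0, 7)
    labels = np.digitize(sat, edges[1:-1])  # 0..5: six saturation bands
    labels[~skin] = -1  # non-skin pixels
    return labels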
[0077]
After performing the area division by saturation in step S120 of FIG. 4, the process proceeds to step S122. In step S122, the face area maximum value table is read from the EEPROM 24 according to the focal length information obtained in step S112, the predicted maximum value of the face area at that focal length is obtained, and this maximum value is set as the determination reference value for face area determination. Of course, considering the possibility of shooting near the shortest distance at which the camera 10 can shoot, it is preferable to set the determination reference value by adding a predetermined margin to the predicted maximum value, or to allow some margin in the determination itself.
[0078]
Then, each skin color area divided by saturation is compared with the predicted maximum value; an area extremely larger than the maximum value is determined not to be a face and is excluded from the face area candidates, and the remaining areas are extracted as face area candidates (step S124). This eliminates large regions that cannot actually be "faces".
[0079]
From the face area candidates narrowed down in step S124, shape detection is further performed for each area, and the face area is specified from the shape (step S126). That is, the shape of each face candidate is compared with the aspect ratio determined from a model shape appropriate for a face (an ellipse or circle), and a region whose detected aspect ratio deviates greatly from the specified value is determined to be something other than a face. The face candidates are further narrowed down by this shape determination, and regions whose shape has the predetermined aspect ratio are determined to be "faces".
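The two pruning steps S124 and S126 could be sketched as follows; treating the predicted maximum as a linear size whose square bounds the region area, and the tolerated aspect ratio band, are both illustrative assumptions (the description only requires that extremely large and extremely elongated regions be rejected).

def select_face_regions(regions, max_size_px, aspect_band=(0.6, 1.7)):
    """regions: list of dicts with 'area' (pixel count), 'width', 'height'.
    Step S124: drop regions far larger than the predicted maximum.
    Step S126: drop regions far from a circular/elliptical aspect ratio."""
    candidates = [r for r in regions if r["area"] <= max_size_px ** 2]
    faces = []
    for r in candidates:
        aspect = r["width"] / max(r["height"], 1)
        if aspect_band[0] <= aspect <= aspect_band[1]:
            faces.append(r)
    return faces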
[0080]
The face area is extracted in this way, and the extraction result is used for brightness correction, white balance correction, color correction that brings the skin color close to the best color (target value), red-eye correction, and the like.
[0081]
According to the embodiment described above, the size of the face area is estimated using the focal length information, which is camera information, while the skin color areas detected in the image are divided by saturation, and any area that is extremely large compared to the estimated maximum value is excluded from the face area candidates, so the candidates can be narrowed down. The shapes of the remaining areas are then recognized to make the final face determination, so the correct face area can be extracted with high accuracy.
[0082]
For example, the scene shown in FIG. 7 is a photograph of a person 94 against a white cloth 92 under the illumination of a light bulb 90. By removing objects whose hue is similar to skin color, such as the background cloth 92 and the wooden desk 96, the face of the person 94 can be accurately extracted.
[0083]
[Modification 1]
In the above-described embodiment, an example in which the face area is completely specified from the captured image has been described; however, when implementing the present invention, an aspect in which the face area is not finally specified is also possible.
[0084]
For example, when performing brightness correction that emphasizes a person's face, there is an aspect in which, instead of or in combination with narrowing down the face area candidates, the weighting coefficients used in the brightness calculation for the skin color areas extracted by skin color detection are changed according to the focal length information.
[0085]
As shown in FIG. 8, when a plurality of skin color areas are detected in the screen by skin color detection, weights wi (i = 1, 2, 3, ...) are set according to how likely each area is to actually be a face, based on the focal length information, and the brightness Y is calculated according to the following equation (3).
[0086]
[Equation 3]
Y = Σ (wi × Yi) / Σwi (3)
Yi indicates the brightness of each face area.
[0087]
As for the weighting wi, by reducing the weight of areas unlikely to be a face (to "0" or a value close to "0"), the information of areas likely to be a face is strongly reflected in the resulting value. Correction processing is then performed so that the brightness Y thus obtained approaches a predetermined target value.
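A minimal sketch of the weighted brightness of equation (3); the particular weight values in the example are placeholders for whatever face-likelihood mapping is derived from the focal length information.

def weighted_brightness(areas):
    """areas: list of (w_i, Y_i) pairs, where w_i is the face-likelihood
    weight of a skin color area and Y_i its brightness.
    Implements Y = sum(w_i * Y_i) / sum(w_i)  ... equation (3)."""
    num = sum(w * y for w, y in areas)
    den = sum(w for w, _ in areas)
    return num / den if den > 0 else 0.0

# Example: area 1 is very likely a face; areas 2 and 3 are unlikely.
Y = weighted_brightness([(1.0, 120.0), (0.1, 200.0), (0.0, 90.0)])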
[0088]
[Modification 2]
There is also an aspect in which subject distance information is used instead of, or in combination with, the aspect described above that uses focal length information.
[0089]
FIG. 9 shows a block diagram of an electronic camera according to another embodiment of the present invention. In FIG. 9, parts that are the same as or similar to those in FIG. 1 are given the same reference numerals, and their descriptions are omitted.
[0090]
The camera 10 shown in FIG. 9 includes a distance measuring sensor 102 as means for measuring the subject distance (shooting distance). A signal obtained from the distance measuring sensor 102 is input to the CPU 14, and the CPU 14 acquires distance information of the subject.
[0091]
As the means for detecting the subject distance, a well-known mechanism can be used, such as an AF mechanism based on a ranging method, represented by the active method or the passive method using the principle of triangulation, or an AF mechanism using a phase-difference method. Even in a camera without the distance measuring sensor 102 (the configuration shown in FIG. 1), when the focus lens has been moved to the in-focus position by contrast AF or the like, the position information of the focus lens (focus position information) can be obtained from a focus position detection unit, and the distance between the camera 10 and the subject (subject distance) can be calculated based on this information.
[0092]
By acquiring the subject distance information in this way, it is possible to estimate the size of a face actually photographed at that distance. For calculating the face size, table data indicating the face size corresponding to the subject distance may be stored in the EEPROM 24, or the size may be calculated using an arithmetic expression.
[0093]
By excluding from the face candidates the skin color areas that are extremely larger or smaller than the face size calculated from the subject distance information, the face area can be extracted more accurately.
[0094]
FIG. 10 shows a face extraction sequence using distance information. In the figure, steps that are the same as or similar to those in the flowchart of FIG. 4 are denoted by the same step numbers, and their description is omitted.
[0095]
In the flowchart shown in FIG. 10, a process of reading information from the distance measuring sensor 102 and obtaining subject distance information is added to step S112.
[0096]
By obtaining subject distance information in addition to focal length information, the face size can be predicted more accurately, so a reasonable numerical range allowed for a face area can be set based on that prediction, and the extraction accuracy is improved.
[0097]
Specifically, in step S122 of FIG. 10, the maximum value table is referenced based on the focal length, the minimum value table of the face area is referenced based on the subject distance, and the face area candidates are narrowed down based on the numerical ranges defined in these tables (step S124). Alternatively, in step S122, a determination reference value (or determination reference range) that takes into account both the maximum and minimum values of the face area is determined based on the focal length and the subject distance, and the face area candidates are narrowed down according to this determination reference value (step S124).
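A sketch of the combined determination reference range under these modes; the real face width bounds and the pixels-per-millimeter factor are illustrative assumptions, and the distance-based minimum could equally come from a table in the EEPROM 24, as noted above.

def face_size_range_px(focal_length_mm, subject_distance_mm,
                       min_face_mm=120.0, max_face_mm=250.0,
                       px_per_mm=285.0):
    """Determination reference range (min, max) in pixels from
    a = A * DF / D, evaluated at the measured subject distance."""
    scale = focal_length_mm / subject_distance_mm * px_per_mm
    return min_face_mm * scale, max_face_mm * scale

def within_range(region_width_px, lo, hi):
    # Step S124: keep only candidates whose size falls inside the range.
    return lo <= region_width_px <= hi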
[0098]
In FIGS. 9 and 10, the subject distance information is acquired from the distance measuring sensor 102; however, acquisition is not limited to the camera used for shooting, and an aspect in which the information is read from attached information (tag information) added to the image data is also possible.
[0099]
In the above-described embodiment, a digital camera having an optical zoom function has been described; however, for a camera using a fixed focal length lens, the face size may be determined using the focal length value of that lens. For a camera equipped with an electronic zoom (digital zoom) function, which electronically processes the image signal to obtain an enlarged image, the same technique as in the above-described embodiment can be applied by capturing the image over the entire angle of view and detecting the face area from the entire image data before the electronic zoom processing.
[0100]
In the above-described embodiment, a digital camera has been exemplified, but the scope of application of the present invention is not limited to this; the present invention can also be applied to other information devices having an electronic imaging function, such as camera-equipped mobile phones, camera-equipped PDAs, and camera-equipped mobile personal computers. In such cases, the imaging unit may be of a detachable (external) type separable from a main body such as a mobile phone.
[0101]
Further, the present invention can also be applied to an image reproducing apparatus that reproduces and displays image data recorded by the device with the electronic imaging function as described above, or a printing apparatus that performs print output.
[0102]
[Effects of the Invention]
According to the present invention, the maximum size of a person's face is predicted from the focal length information of the photographing optical system, and a skin color region larger than the determination reference value set from this maximum value is treated as having a low possibility of being a face region; therefore, the face area can be determined with higher accuracy.
[0103]
Further, by acquiring subject distance information in addition to focal length information, it is possible to more accurately estimate the size of a person's face that is actually photographed, and to improve face extraction accuracy.
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of an electronic camera according to an embodiment of the present invention.
FIG. 2 is a principal block diagram related to face extraction processing in the camera of this example.
FIG. 3 is a diagram showing the relationship between a subject and its image in a photographic optical system.
FIG. 4 is a flowchart showing a sequence of face extraction processing.
FIG. 5 is a diagram illustrating an example of a hue range to be extracted as skin color (skin color extraction area) and its division by saturation.
FIG. 6 is a diagram showing an example in which a region where skin color is detected is further divided by saturation.
FIG. 7 is a diagram showing an example of a shooting scene.
FIG. 8 is a diagram showing an example of brightness calculation when there are a plurality of skin color areas in the screen.
FIG. 9 is a block diagram of an electronic camera according to another embodiment of the present invention.
FIG. 10 is a flowchart showing a sequence of face extraction processing using subject distance information.
[Explanation of symbols]
DESCRIPTION OF SYMBOLS 10 ... Camera, 14 ... CPU, 24 ... EEPROM, 42 ... Shooting lens, 44 ... CCD, 48 ... Zoom position detection sensor, 74 ... Integration circuit, 76 ... White balance circuit, 78 ... Synchronization circuit, 88 ... Skin color extraction area , 102 ... Ranging sensor

Claims (5)

  1. A method for extracting an area corresponding to a human face from an image, comprising:
    an information acquisition step of acquiring focal length information at the time of shooting;
    a maximum value prediction step of obtaining a maximum value assumed for a face region in the image based on the focal length information acquired in the information acquisition step;
    a skin color region detection step of analyzing image data to detect regions having a skin color hue from within the image, the skin color region detection step including a skin color detection step of detecting the skin color hue, and a saturation division step of further dividing the skin color portions detected in the skin color detection step according to saturation; and
    a processing step of comparing, among the skin color regions detected in the skin color region detection step, each skin color region divided by saturation in the saturation division step with a determination reference value set from the maximum value estimated in the maximum value prediction step, and treating a region larger than the determination reference value as having a low possibility of being a face region.
  2. The face area extraction method according to claim 1, wherein the processing step includes a face area determination step of excluding, from the face area candidates, a skin color region detected in the skin color region detection step that is larger than the determination reference value set from the maximum value estimated in the maximum value prediction step, and determining the face area from regions smaller than the determination reference value.
  3. The face area extraction method according to claim 2, wherein the face area determination step determines the face area based on the shape of the skin color regions that are face area candidates.
  4. The face area extraction method according to any one of claims 1 to 3, further comprising:
    a distance information acquisition step of acquiring subject distance information; and
    a face size prediction step of obtaining a size assumed for a face region in the image based on the distance information acquired in the distance information acquisition step.
  5. An apparatus for extracting an area corresponding to a human face from an image, comprising:
    information acquisition means for acquiring focal length information at the time of shooting;
    maximum value prediction means for obtaining a maximum value assumed for a face region in the image based on the focal length information acquired via the information acquisition means;
    skin color area detection means for analyzing image data to detect regions having a skin color hue from within the image, the skin color area detection means including skin color detection means for detecting the skin color hue, and saturation division means for further dividing the skin color portions detected by the skin color detection means according to saturation; and
    face area determination means for comparing, among the skin color regions detected by the skin color area detection means, each skin color region divided by saturation by the saturation division means with a determination reference value set from the maximum value estimated by the maximum value prediction means, deleting a region larger than the determination reference value from the face area candidates, and determining the face area from regions smaller than the determination reference value.
JP2002355017A 2002-12-06 2002-12-06 Face area extraction method and apparatus Expired - Fee Related JP4126721B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2002355017A JP4126721B2 (en) 2002-12-06 2002-12-06 Face area extraction method and apparatus


Publications (2)

Publication Number Publication Date
JP2004185555A JP2004185555A (en) 2004-07-02
JP4126721B2 true JP4126721B2 (en) 2008-07-30

Family

ID=32755830

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2002355017A Expired - Fee Related JP4126721B2 (en) 2002-12-06 2002-12-06 Face area extraction method and apparatus

Country Status (1)

Country Link
JP (1) JP4126721B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9824261B2 (en) 2014-12-24 2017-11-21 Samsung Electronics Co., Ltd. Method of face detection, method of image processing, face detection device and electronic system including the same

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4284448B2 (en) 2005-01-28 2009-06-24 富士フイルム株式会社 Image processing apparatus and method
US20060182433A1 (en) 2005-02-15 2006-08-17 Nikon Corporation Electronic camera
JP4581730B2 (en) * 2005-02-15 2010-11-17 株式会社ニコン Digital camera
JP4745724B2 (en) * 2005-06-08 2011-08-10 キヤノン株式会社 Image processing method and image processing apparatus
JP4217698B2 (en) 2005-06-20 2009-02-04 キヤノン株式会社 Imaging apparatus and image processing method
RU2006104311A (en) * 2006-02-14 2007-09-20 Самсунг Электроникс Ко. Face detection method on digital images
JP4971785B2 (en) * 2006-12-28 2012-07-11 キヤノン株式会社 Image processing apparatus and method, and imaging apparatus
JP4571617B2 (en) 2006-12-28 2010-10-27 三星デジタルイメージング株式会社 Imaging apparatus and imaging method
JP4894661B2 (en) 2007-07-24 2012-03-14 株式会社ニコン Imaging device
JP5315666B2 (en) * 2007-10-31 2013-10-16 株式会社ニコン Focus detection device, camera
JP5003529B2 (en) * 2008-02-25 2012-08-15 株式会社ニコン Imaging apparatus and object detection method
JP2009223523A (en) * 2008-03-14 2009-10-01 Seiko Epson Corp Image processor, image processing method, and computer program by image processing
JP2009223524A (en) * 2008-03-14 2009-10-01 Seiko Epson Corp Image processor, image processing method, and computer program for image processing
JP2009239368A (en) * 2008-03-25 2009-10-15 Seiko Epson Corp Image processing method, image processor, image processing program, and printing device
JP2010055468A (en) * 2008-08-29 2010-03-11 Nikon Corp Image processing apparatus, and camera
JP4726251B2 (en) * 2008-09-18 2011-07-20 キヤノン株式会社 Imaging apparatus and image processing method
JP5359150B2 (en) * 2008-09-22 2013-12-04 株式会社ニコン Imaging device
JP2011008716A (en) * 2009-06-29 2011-01-13 Noritsu Koki Co Ltd Image processing apparatus, image processing method and image processing program
EP2602692A1 (en) * 2011-12-05 2013-06-12 Alcatel Lucent Method for recognizing gestures and gesture detector
JP6331566B2 (en) * 2014-03-27 2018-05-30 株式会社リコー Human head detection device and posture estimation device


Also Published As

Publication number Publication date
JP2004185555A (en) 2004-07-02

Similar Documents

Publication Publication Date Title
US8514297B2 (en) Image sensing apparatus and image processing method
US7995116B2 (en) Varying camera self-determination based on subject motion
KR101071625B1 (en) Camera, storage medium having stored therein camera control program, and camera control method
US8462228B2 (en) Image processing method, apparatus and computer program product, and imaging apparatus, method and computer program product
US7903168B2 (en) Camera and method with additional evaluation image capture based on scene brightness changes
JP3541820B2 (en) Imaging device and imaging method
US8659619B2 (en) Display device and method for determining an area of importance in an original image
JP4198449B2 (en) Digital camera
JP4177750B2 (en) Imaging apparatus and method for determining important regions in archive images
JP4518131B2 (en) Imaging method and apparatus
US7822336B2 (en) Image capture device with automatic focusing function
US7634186B2 (en) Imaging apparatus, image storage apparatus, imaging method, storage method, recording medium recording imaging program, and recording medium recording storage program
US7151564B2 (en) Image recording apparatus and method
JP4826028B2 (en) Electronic camera
US8159560B2 (en) Image sensing apparatus having a delete function of image data and control method thereof
CN1992820B (en) Digital camera with face detection function for facilitating exposure compensation
JP4761146B2 (en) Imaging apparatus and program thereof
JP3473552B2 (en) Digital still camera
US7791668B2 (en) Digital camera
JP4340806B2 (en) Image processing apparatus, method, and program
US7509042B2 (en) Digital camera, image capture method, and image capture control program
JP4135100B2 (en) Imaging device
US7706674B2 (en) Device and method for controlling flash
JP4655054B2 (en) Imaging device
JP4153444B2 (en) Digital camera

Legal Events

Date Code Title Description
A621 Written request for application examination

Free format text: JAPANESE INTERMEDIATE CODE: A621

Effective date: 20050318

A711 Notification of change in applicant

Free format text: JAPANESE INTERMEDIATE CODE: A712

Effective date: 20061211

A131 Notification of reasons for refusal

Free format text: JAPANESE INTERMEDIATE CODE: A131

Effective date: 20080128

A521 Written amendment

Free format text: JAPANESE INTERMEDIATE CODE: A523

Effective date: 20080327

TRDD Decision of grant or rejection written
A01 Written decision to grant a patent or to grant a registration (utility model)

Free format text: JAPANESE INTERMEDIATE CODE: A01

Effective date: 20080421


A61 First payment of annual fees (during grant procedure)

Free format text: JAPANESE INTERMEDIATE CODE: A61

Effective date: 20080504

R150 Certificate of patent or registration of utility model

Free format text: JAPANESE INTERMEDIATE CODE: R150

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20110523

Year of fee payment: 3


FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20120523

Year of fee payment: 4

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20130523

Year of fee payment: 5

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20140523

Year of fee payment: 6

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

R250 Receipt of annual fees

Free format text: JAPANESE INTERMEDIATE CODE: R250

LAPS Cancellation because of no payment of annual fees