JP2008299430A - Image processing device, method, and program


Info

Publication number
JP2008299430A
JP2008299430A (application JP2007142329A)
Authority
JP
Japan
Prior art keywords
face
subject
information
face information
means
Prior art date
Legal status
Withdrawn
Application number
JP2007142329A
Other languages
Japanese (ja)
Inventor
Satoshi Aoyama
聡 青山
Original Assignee
Canon Inc
キヤノン株式会社
Priority date
Filing date
Publication date
Application filed by Canon Inc
Priority to JP2007142329A
Publication of JP2008299430A
Application status is Withdrawn

Abstract

PROBLEM TO BE SOLVED: To provide an image processing device, a facial expression estimation method, and a facial expression estimation program that allow a user to grasp whether or not the facial expression of a subject has reached a target expression.

SOLUTION: The image processing device includes: an acquisition means for acquiring image data obtained by imaging a subject at a plurality of different timings; a face region detection means for detecting the face region of the subject in each of the plurality of image data acquired by the acquisition means; a face information extraction means for extracting, from each of the plurality of face regions detected by the face region detection means, face information relating to the shapes of the components of the subject's face; a face information selection means for selecting, from the plurality of pieces of face information extracted by the face information extraction means, reference face information serving as the basis for estimating the facial expression of the subject; and a notification means for notifying, in association with the image of the subject, change amount information indicating the amount of change of the other face information with respect to the reference face information and threshold information indicating the threshold of the change amount at which the expression to be detected is determined to have been reached.

COPYRIGHT: (C)2009,JPO&INPIT

Description

  The present invention relates to an image processing apparatus, an image processing method, and an image processing program.

  In recent years, techniques for detecting facial expressions, particularly human facial expressions, have been developed. In general, when a person is photographed with a camera, it is often desired that the person who is the subject be captured with a good expression such as a smile. Therefore, the application of facial expression detection technology to digital cameras is being studied.

Patent Document 1 discloses a technique for evaluating and scoring the expression of a subject included in photographed image data from the viewpoints of smile level and neatness.
JP 2004-46591 A

  Patent Document 1, however, does not disclose a technique that allows the user to know whether or not the facial expression of the subject has reached the target expression while the facial expression is being detected. As a result, it is difficult for the user to confirm whether the subject of interest can be photographed, and photographing may not be performed at the timing when the facial expression of the subject reaches the target expression.

  An object of the present invention is to provide an image processing apparatus, an image processing method, and an image processing program that allow a user to grasp whether or not a facial expression of a subject has reached a target facial expression.

  The image processing apparatus according to the first aspect of the present invention comprises: acquisition means for acquiring image data obtained by imaging a subject at a plurality of different timings; face area detection means for detecting the face area of the subject in each of the plurality of image data acquired by the acquisition means; face information extraction means for extracting face information relating to the shapes of the components of the subject's face from each of the plurality of face areas detected by the face area detection means; face information selection means for selecting, from the plurality of pieces of face information extracted by the face information extraction means, reference face information serving as a reference for estimating the facial expression of the subject; and notification means for notifying, in association with the image of the subject, change amount information indicating the amount of change of the other face information with respect to the reference face information and threshold information indicating the threshold of the change amount at which it is determined that the facial expression to be detected has been reached.

  The image processing method according to the second aspect of the present invention comprises: an acquisition step of acquiring image data obtained by imaging a subject at a plurality of different timings; a face area detection step of detecting the face area of the subject in each of the plurality of image data acquired in the acquisition step; a face information extraction step of extracting face information relating to the shapes of the components of the subject's face from each of the plurality of face areas detected in the face area detection step; a face information selection step of selecting, from the plurality of pieces of face information extracted in the face information extraction step, reference face information serving as a reference for estimating the facial expression of the subject; and a notification step of notifying, in association with the image of the subject, change amount information indicating the amount of change of the other face information with respect to the reference face information and threshold information indicating the threshold of the change amount at which it is determined that the facial expression to be detected has been reached.

  An image processing program according to the third aspect of the present invention causes an image processing apparatus to function as: acquisition means for acquiring image data obtained by imaging a subject at a plurality of different timings; face area detection means for detecting the face area of the subject in each of the plurality of image data acquired by the acquisition means; face information extraction means for extracting face information relating to the shapes of the components of the subject's face from each of the plurality of face areas detected by the face area detection means; face information selection means for selecting, from the plurality of pieces of face information extracted by the face information extraction means, reference face information serving as a reference for estimating the facial expression of the subject; and notification means for notifying, in association with the image of the subject, change amount information indicating the amount of change of the other face information with respect to the reference face information and threshold information indicating the threshold of the change amount at which it is determined that the facial expression to be detected has been reached.

  According to the present invention, the user can be made aware of whether the facial expression of the subject has reached the target facial expression.

  An image processing apparatus 100 according to the first embodiment of the present invention will be described with reference to FIG. 1. FIG. 1 is a configuration diagram of the image processing apparatus 100 according to the first embodiment of the present invention.

  The image processing apparatus 100 is a digital camera, for example. The image processing apparatus 100 includes the following components.

  The protection means 102 is, for example, a barrier. The protection unit 102 is disposed between the outside and the photographing lens 10 to protect the photographing lens 10. The shutter 12 is disposed between the photographing lens 10 and an image sensor 14 described later, and has a diaphragm function for reducing the amount of light guided from the photographing lens 10 to the image sensor 14.

  The acquisition unit 17 acquires image data obtained by imaging the subject at a plurality of different timings. The acquisition means 17 includes an image sensor 14 that photoelectrically converts an optical image to generate an image signal, and an A/D converter 16 that generates image data by A/D converting the image signal received from the image sensor 14. That is, the acquisition unit 17 captures the subject at a plurality of different timings and acquires a plurality of image data of the subject.

  The timing generator 18 is controlled by the memory controller 22 and the system controller 50 to supply a clock signal and a control signal to the image sensor 14, the A/D converter 16, and a D/A converter 26 described later.

  The image processing unit 20 performs predetermined pixel interpolation processing and color conversion processing on the data from the A/D converter 16 or the data from the memory control unit 22.

  The image processing unit 20 performs predetermined calculation processing using the image data received from the A/D converter 16 and the like, and supplies the obtained calculation result to the system control unit 50. Based on this result, the system control unit 50 controls the exposure control means 40 and the distance measurement control means 42, and performs TTL (through-the-lens) AF (autofocus) processing, AE (automatic exposure) processing, and EF (flash pre-emission) processing.

  Further, the image processing unit 20 performs predetermined calculation processing using the captured image data, and also performs TTL AWB (auto white balance) processing based on the obtained calculation result.

  The image processing unit 20 includes a face area detection unit 20a and a face information extraction unit 20b. The face area detection unit 20a detects the face area of the subject in each of the plurality of image data acquired by the acquisition unit 17. For example, shapes corresponding to facial components such as the mouth and eyes are extracted from the image data, and a face area is detected based on the positions of these components. The face information extraction unit 20b extracts face information relating to the shapes of the components of the subject's face from each of the plurality of face areas detected by the face area detection unit 20a. The face information includes, for example, information on the specific shapes and coordinates of the mouth, eyes, and eyebrows, or information on the shapes and coordinates of the nose end points, the nostril center points, the ear end points, and the like. The face information can be extracted by calculating, from the input face image, end points, center points, or the dark (pupil) regions of the eyes using a method such as a neural network or edge detection with a spatial filter. When the face area detection unit 20a extracts a face area, it only determines whether or not there is a shape that satisfies the conditions of a facial component such as a mouth or eyes; the specific shape of the mouth does not matter, whether it is a wide-open mouth during laughter or a closed mouth during silence, as long as it satisfies the condition of being a mouth. In contrast, when the face information extraction unit 20b extracts face information, it extracts, even for the same mouth, information on the shape of the mouth corners, the shape of the mouth, and their coordinates within the face area.
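
  As a concrete illustration of the face information described above, the following minimal Python sketch derives shape-related measurements from facial component landmarks that are assumed to have already been located (for example by a neural network or by edge detection with a spatial filter); the landmark names and the derived measures are illustrative assumptions, not the exact face information defined by this disclosure.

    def extract_face_information(landmarks):
        """Turn raw landmark coordinates (x, y) into shape-related face information."""
        left_mouth, right_mouth = landmarks["mouth_left"], landmarks["mouth_right"]
        upper_lid, lower_lid = landmarks["left_eye_upper"], landmarks["left_eye_lower"]
        return {
            "mouth_width": right_mouth[0] - left_mouth[0],            # horizontal mouth extent
            "mouth_corner_y": (left_mouth[1] + right_mouth[1]) / 2.0, # mean mouth-corner height
            "eye_openness": lower_lid[1] - upper_lid[1],              # vertical eye opening
        }

    # Example landmark set for one detected face area (pixel coordinates).
    example_landmarks = {
        "mouth_left": (120, 210), "mouth_right": (170, 208),
        "left_eye_upper": (130, 150), "left_eye_lower": (130, 158),
    }
    print(extract_face_information(example_landmarks))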

  The memory control circuit 22 controls the A/D converter 16, the timing generation unit 18, the image processing unit 20, the image display memory 24, the D/A converter 26, the memory 30, and the compression/decompression unit 32. The memory control circuit 22 writes the data received from the A/D converter 16 and the image processing unit 20 into the image display memory 24 or the memory 30.

  The image display memory 24 stores image data for display. The D/A converter 26 receives display image data from the image display memory 24 via the memory control unit 22 and D/A converts the display image data into an analog signal.

  The first notification means 28 displays an image corresponding to the analog signal received from the D/A converter 26. The first notification means 28 includes, for example, a TFT-LCD. The first notification means 28 can display images sequentially to realize an electronic viewfinder function.

  Here, the first notification means 28 can turn its display on and off as instructed by the system control unit 50; when the display is turned off, the power consumption of the digital camera 100 can be greatly reduced.

  The first notifying unit 28 notifies extraction state information indicating whether or not face information has been extracted by the face information extraction unit 20b, in association with the image of the subject. The extraction state information includes, for example, a solid-line square frame 508 indicating that face information has not been extracted and a broken-line square frame 506 indicating that face information has been extracted (see FIGS. 7 and 8).

  Alternatively, the first notification unit 28 notifies selection state information indicating whether or not reference face information has been selected by a face information selection unit 50a described later, in association with the image of the subject (see FIGS. 7 to 9). The selection state information includes, for example, a broken-line square frame 506 indicating that the reference face information has not been selected and a solid-line round frame 507 indicating that the reference face information has been selected (see FIGS. 8 and 9). As will be described later, the reference face information is, for example, face information for an expressionless, that is, neutral, facial expression.

  Alternatively, the first notification means 28 notifies, in association with the image of the subject, change amount information 602 indicating the amount of change of other face information with respect to the reference face information and threshold information 603 indicating the change amount threshold for determining that the facial expression to be detected has been reached (see FIG. 10). The first notification means 28 also notifies, in association with the image of the subject, the change amount information and the threshold information as changed by a changing means 50c described later. Here, at least when notifying the change amount information, the first notification means 28 notifies it so that whether or not the change amount exceeds the threshold can be identified, based on the result estimated by a facial expression estimation means 50b described later. The first notification means 28 may further notify an image 601 corresponding to the reference face information in association with the image of the subject (see FIG. 10).

  The memory 30 stores captured still images and moving images. The memory 30 has a storage capacity sufficient to store a predetermined number of still images and a moving image for a predetermined time. Thereby, even in the case of continuous shooting or panoramic shooting in which a plurality of still images are continuously shot, it is possible to write a large amount of images to the memory 30 at high speed. The memory 30 can also be used as a work area for the system control unit 50.

  The compression/decompression unit 32 compresses and decompresses image data by adaptive discrete cosine transform (ADCT) or the like. The compression/decompression unit 32 reads an image stored in the memory 30, performs compression processing or decompression processing, and writes the processed data to the memory 30.

  The exposure control means 40 controls the shutter 12 having an aperture function. The exposure control means 40 can realize a flash light control function by cooperating with the flash 48.

  The distance measurement control means 42 controls the focusing of the taking lens 10. The zoom control unit 44 controls zooming of the taking lens 10. The barrier control unit 46 controls the operation of the protection unit 102. The flash 48 has an AF auxiliary light projecting function and a flash light control function.

  The system control unit 50 controls the entire digital camera 100. For example, the system control unit 50 controls the exposure control unit 40 and the distance measurement control unit 42 based on the calculation result calculated by the image processing unit 20 on the image data acquired by the acquisition unit 17.

  The system control unit 50 includes a face information selection unit 50a, a facial expression estimation unit 50b, and a changing unit 50c. The face information selection unit 50a selects, from the plurality of pieces of face information extracted by the face information extraction unit 20b, reference face information that serves as a reference for estimating the facial expression of the subject. The reference face information is, for example, face information for an expressionless, that is, neutral, facial expression. Here, the face information selection unit 50a determines that the expression is in an expressionless state when, for example, the amount of change in the shape of the mouth does not exceed a predetermined threshold for a predetermined period, and selects the face information corresponding to that determination result as the reference face information.

  The face information selection unit 50a may use the shape of a component other than the mouth when determining the expressionless state. For example, the face information selection means 50a may further use the open/closed state of the eyes as face information and, using the change in eye size, select the reference face information at a timing when the eyes are open. The face information selection unit 50a may also learn pupil images with a neural network and determine the open/closed state of the eyes depending on whether the number of neurons responding to the pupil exceeds a fixed threshold.
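
  A minimal sketch of this selection step, assuming the face information of each frame carries a single mouth-shape measure: the reference (expressionless) face information is taken from a frame once the mouth shape has varied by less than a threshold for a fixed number of consecutive frames. The window length and threshold values are illustrative assumptions.

    def select_reference_face_info(face_info_sequence, stable_frames=10, mouth_threshold=2.0):
        """face_info_sequence: time-ordered list of dicts containing 'mouth_width'."""
        stable_count = 0
        for previous, current in zip(face_info_sequence, face_info_sequence[1:]):
            change = abs(current["mouth_width"] - previous["mouth_width"])
            stable_count = stable_count + 1 if change <= mouth_threshold else 0
            if stable_count >= stable_frames:
                return current      # treat this frame as the expressionless reference
        return None                 # no sufficiently stable period observed yet

    # With a short stability window, a sequence whose mouth width settles around 50
    # yields that settled frame as the reference face information.
    frames = [{"mouth_width": w} for w in (60, 55, 51, 50, 50, 50)]
    print(select_reference_face_info(frames, stable_frames=3, mouth_threshold=2.0))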

  The facial expression estimation means 50b estimates the facial expression of the subject corresponding to the other face information by comparing the reference face information with the other face information. That is, the facial expression estimation means 50b receives the reference face information from the face information selection means 50a, and receives other face information from the face information extraction means 20b of the image processing unit 20. The facial expression estimation means 50b calculates the difference between the reference face information and other face information, and generates change amount information indicating the change amount of the other face information with respect to the reference face information. The facial expression estimation means 50b receives threshold information indicating a change amount threshold for determining that the facial expression to be detected has been reached from the memory 52 described later. The facial expression estimation means 50b determines whether or not the change amount exceeds the threshold value based on the change amount information and the threshold value information. Thereby, the facial expression estimation means 50b determines whether or not the face of the target subject has reached the target facial expression.

  The changing unit 50c receives a change instruction from the input unit 75 described later. The change instruction is an instruction for changing a change amount threshold value for determining that the facial expression to be detected has been reached. The changing unit 50c changes the threshold information according to the change instruction.

  The memory 52 stores constants and variables for the operation of the system control unit 50. Further, the memory 52 stores threshold information received from the input means 75 described later as setting information in advance.

  The second notification unit 54 notifies the operation state, the message, and the like using characters, images, sounds, and the like according to the execution of the program in the system control unit 50. One or a plurality of second informing means 54 are installed at positions in the vicinity of an operation unit 70 (to be described later) of the digital camera 100 that are easily visible. The second notification means 54 is configured by a combination of a liquid crystal display (LCD), an LED, a sounding element (speaker), and the like, for example.

  The second notification means 54 is partially installed in the optical viewfinder 104.

  For example, the second notification unit 54 displays, on an LCD or the like, a single shot/continuous shooting display, a self-timer display, a compression ratio display, a recording pixel number display, a recording number display, a remaining image number display, a shutter speed display, an aperture value display, an exposure compensation display, and the like. The second notification unit 54 also displays, for example, a flash display, a red-eye reduction display, a macro shooting display, a buzzer setting display, a clock battery remaining amount display, a battery remaining amount display, an error display, and a multi-digit numerical information display on the LCD. Further, the second notification means 54 displays, for example, an attachment/detachment state display for the external recording medium 120, a communication I/F operation display, a date/time display, and the like on the LCD.

  Further, the second notification means 54 displays in-focus display, camera shake warning display, flash charge display, shutter speed display, aperture value display, exposure correction display, and the like in the optical viewfinder 104.

  The nonvolatile memory 56 is an electrically erasable / recordable memory, and stores a program such as an image processing program. As the nonvolatile memory 56, for example, an EEPROM or the like is used.

  The input means 75 accepts an extraction target instruction that specifies a subject from which face information is to be extracted from a plurality of subjects. Alternatively, the input means 75 accepts a facial expression detection instruction for detecting the facial expression of the subject. Alternatively, the input means 75 accepts a change instruction for changing the change amount threshold value for determining that the facial expression to be detected has been reached.

  The input means 75 also accepts various operation instructions for the system control unit 50. Instructions are received through one of, or a combination of, a switch, a dial, a touch panel, pointing by line-of-sight detection, a voice recognition device, and the like. The input means 75 includes a mode dial switch 60, a shutter button 61, a first shutter switch 62, a second shutter switch 64, an image display ON/OFF switch 66, and an operation unit 70.

  The mode dial switch 60 receives instructions for switching and setting each function mode such as power-off, automatic shooting mode, shooting mode, panoramic shooting mode, playback mode, multi-screen playback / erase mode, and PC connection mode.

  The shutter button 61 receives an instruction for taking a still image from the user. For example, when the shutter button 61 is pressed halfway, a first instruction for starting AF (autofocus) processing, AE (automatic exposure) processing, AWB (auto white balance) processing, EF (flash pre-flash) processing, and the like is accepted. When the shutter button 61 is fully pressed, a second instruction for taking a still image is accepted.

  When the first shutter switch (SW1) 62 receives the first instruction from the shutter button 61, it is turned on and supplies information indicating the ON state to the system control unit 50. In response to the first shutter switch 62 being turned on, the system control unit 50 instructs each part to start operations such as AF (autofocus) processing, AE (automatic exposure) processing, AWB (auto white balance) processing, and EF (flash pre-flash) processing.

  When the second shutter switch (SW2) 64 receives the second instruction from the shutter button 61, it is turned on and supplies information indicating the ON state to the system control unit 50. In response, the system control unit 50 instructs the start of a series of photographing processing operations. In this series of photographing processes, exposure processing is performed in which a signal read from the image sensor 14 is written into the memory 30 via the A/D converter 16 and the memory control unit 22, and development processing is performed using computation by the image processing unit 20 and the memory control unit 22. In addition, recording processing is performed in which image data is read from the memory 30, compressed by the compression/decompression unit 32, and written to the recording medium 200 or 210.

  The image display ON/OFF switch 66 receives an instruction for setting ON/OFF of the first notification means 28. By this instruction, the system control unit 50 can cut power supply to the first notifying unit 28 when taking an image using the optical viewfinder 104, thereby saving power.

  The quick review ON/OFF switch 68 accepts an instruction for setting a quick review function that automatically reproduces photographed image data immediately after shooting. In particular, it is assumed that the quick review function can be set for use when the first notification unit 28 is turned off.

  The operation unit 70 includes various buttons and a touch panel. The operation unit 70 includes a menu button, a macro button, a multi-screen playback page break button, a flash setting button, a single shooting/continuous shooting/self-timer switching button, a menu movement + (plus) button, and a menu movement − (minus) button. The operation unit 70 also includes a playback image movement + (plus) button, a playback image movement − (minus) button, a shooting image quality selection button, an exposure correction button, a date/time setting button, a bracket mode selection button, and the like.

  The power supply control means 80 includes a battery detection circuit, a DC-DC converter, a switch circuit that switches a block to be energized, and the like, and detects whether or not a battery is attached, the type of battery, and the remaining battery level. The power control unit 80 controls the DC-DC converter based on the detection result and an instruction from the system control unit 50, and supplies a necessary voltage to each unit including the external recording medium 120 for a necessary period.

  The connector 82 is connected to the power supply control means 80. The connector 84 is connected to the power source 86. The power source 86 is, for example, a primary battery such as an alkaline battery or a lithium battery, a secondary battery such as a NiCd battery, a NiMH battery, or a Li battery, or an AC adapter.

  The card controller 90 transmits/receives data to/from an external recording medium such as a memory card. The interface 91 functions as an interface between the external recording medium 120 such as a memory card and the card controller 90. The connector 92 is connected to an external recording medium 120 such as a memory card. The recording medium attachment/detachment detection means 98 detects whether or not the external recording medium 120 is attached to the connector 92.

  Note that the number of interfaces and connectors for attaching the recording medium may be two or more. Further, interfaces and connectors having different standards may be combined. The interface and the connector may be configured using a semiconductor memory card or the like that conforms to the standard. In this case, by connecting various communication cards, image data and management information attached to the image data can be transferred to and from other computers and peripheral devices such as a printer. The various communication cards are, for example, LAN cards, modem cards, USB cards, IEEE 1394 cards, P1284 cards, SCSI cards, PHS communication cards, and the like.

  The optical viewfinder 104 is used for confirming the subject when shooting. If the optical viewfinder 104 is used, shooting can be performed without using the electronic viewfinder function of the first notification means 28. The optical viewfinder 104 also displays part of the information displayed by the second notification means 54, for example, in-focus display, camera shake warning display, flash charge display, shutter speed display, aperture value display, exposure correction display, and the like.

  The external recording medium 120 is detachably connected to the connector 92. The external recording medium 120 is, for example, a memory card.

  Next, the flow of processing when the image processing apparatus 100 estimates the facial expression of the subject (smiling shooting mode processing) will be described with reference to the flowchart shown in FIG. FIG. 2 is a flowchart showing a flow of processing when the image processing apparatus 100 estimates the facial expression of the subject (smiling shooting mode processing).

  In step S1, the input means 75 accepts an instruction for selecting a smile shooting mode. This smile shooting mode is a shooting mode in which shooting is automatically performed when a smile on a specific subject is detected. The acquisition unit 17 acquires image data obtained by imaging the subject at a plurality of different timings.

  For example, at this time, the first notification means 28 displays two subjects O1 and O2 on the display screen 501 (see FIG. 5).

  In step S2, the input means 75 receives an instruction for performing face area detection from the user and supplies the instruction to the face area detection means 20a. In response to the instruction, the face area detection unit 20a detects the face area of the subject in each of the plurality of image data acquired by the acquisition unit 17. The face area detection unit 20a supplies information on the detected face areas to the system control unit 50. The system control unit 50 controls the first notification unit 28 according to the information on the detected face areas. The first notification unit 28 displays, on the display screen, a face frame indicating that a face area has been detected over each subject whose face area has been detected.

  For example, at this time, the first notification unit 28 displays face frames 503 and 504 indicating that a face area has been detected for each of the two subjects O1 and O2 (see FIG. 6).

  In step S3, a preparation process is performed. Details of the preparation process will be described later.

  In step S4, the system control unit 50 determines whether or not an expression should be detected.

  For example, the system control unit 50 determines that a facial expression should be detected when a facial expression detection instruction for detecting the facial expression of the subject is received from the input means 75. The system control unit 50 determines that the facial expression should not be detected when the facial expression detection instruction for detecting the facial expression of the subject is not received from the input unit 75.

  Alternatively, for example, the system control unit 50 determines that the facial expression should be detected when the reference face information is selected for all of the target subjects. The system control unit 50 determines that the facial expression should not be detected when there is a subject for which the reference face information is not selected among the subject subjects.

  If the system control unit 50 determines that the facial expression should be detected, the process proceeds to step S5. If the system control unit 50 determines that the facial expression should not be detected, the process proceeds to step S1.

  In step S5, the face area detection means 20a detects the face area of the subject in each of the plurality of image data acquired by the acquisition means 17. The face area detection unit 20a supplies information on the detected face areas to the system control unit 50. The system control unit 50 controls the first notification unit 28 according to the information on the detected face areas. The first notification unit 28 displays, on the display screen, a face frame indicating that a face area has been detected over each subject whose face area has been detected.

  In step S6, a detection process is performed. Details of the detection process will be described later.

  In step S7, the system control unit 50 determines whether or not shooting should be performed.

  For example, the system control unit 50 determines that shooting should be performed when the number of subjects whose facial expression flag is turned on is equal to or greater than a predetermined number, and determines that shooting should not be performed when that number is less than the predetermined number.

  Alternatively, for example, the system control unit 50 determines that shooting should be performed when it receives from the second shutter switch 64 the information indicating the ON state, and determines that shooting should not be performed when it does not receive that information.

  If the system control unit 50 determines that the image should be taken, the process proceeds to step S8. If the system control unit 50 determines that the image should not be taken, the process proceeds to step S9.

  In step S8, the system control unit 50 instructs the start of a series of shooting processing operations. In this series of photographing processes, exposure processing is performed in which a signal read from the image sensor 14 is written into the memory 30 via the A/D converter 16 and the memory control unit 22, and development processing is performed using computation by the image processing unit 20 and the memory control unit 22. In addition, recording processing is performed in which image data is read from the memory 30, compressed by the compression/decompression unit 32, and written to the recording medium 200 or 210.

  In step S9, the system control unit 50 determines whether or not the smile shooting mode process should be terminated.

  For example, when the system control unit 50 receives an instruction to change the shooting mode to a shooting mode other than the smile shooting mode, the system control unit 50 determines that the process of the smile shooting mode should be terminated. When the system control unit 50 does not receive an instruction to change the shooting mode to a shooting mode other than the smile shooting mode, the system control unit 50 determines that the process of the smile shooting mode should not be terminated.

  Alternatively, for example, when receiving an instruction to end the smile shooting mode, the system control unit 50 determines that the process of the smile shooting mode should be ended. When the system control unit 50 does not receive an instruction to end the smile shooting mode, the system control unit 50 determines that the process of the smile shooting mode should not be ended.
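
  To make the overall control flow of FIG. 2 concrete, the following is a self-contained Python sketch that drives steps S1 through S9 with a pre-recorded list of per-frame smile degrees instead of a live sensor; the simulation, the function name, and the reduction of the preparation process to a single step are assumptions made for illustration only.

    def smile_shooting_mode(smile_degrees_per_frame, threshold=60, min_smiling_subjects=1):
        """Return the indices of frames in which shooting would be performed."""
        captured_frames = []
        reference_selected = False
        for frame_index, smile_degrees in enumerate(smile_degrees_per_frame):  # S1: new timing
            if not reference_selected:
                reference_selected = True          # S2/S3: face detection and preparation process
                continue
            # S4-S6: facial expressions are detected; smile_degrees holds one value per subject
            smiling = sum(1 for degree in smile_degrees if degree > threshold)  # S7
            if smiling >= min_smiling_subjects:
                captured_frames.append(frame_index)                             # S8: shooting
        return captured_frames                     # S9: the mode ends when frames run out

    # Two subjects over five frames: frames 2 and 3 trigger shooting because at
    # least one subject's smile degree exceeds the threshold of 60.
    print(smile_shooting_mode([[10, 5], [20, 15], [70, 30], [40, 90], [15, 10]]))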

  Next, the flow of the preparation process (step S3) will be described with reference to FIG. 3. FIG. 3 is a flowchart showing the flow of the preparation process for each subject. FIG. 3 shows the processing when attention is paid to a specific subject (for example, the subject O2 shown in FIG. 7); that is, the preparation process is performed independently and in parallel for each subject. For example, FIG. 7 shows an example in which different face frames 508 and 505 are displayed for the subjects O1 and O2, for which the preparation processing proceeds independently and in parallel.

  In step S11, the face information extraction unit 20b determines whether or not the specific subject is a target subject.

  For example, when the face information extraction unit 20b receives an extraction target instruction for a specific subject from the input unit 75, the face information extraction unit 20b determines that the specific subject is a target subject. When the face information extraction unit 20b does not receive an extraction target instruction for a specific subject from the input unit 75, the face information extraction unit 20b determines that the specific subject is not the target subject.

  When it is determined that the specific subject is the target subject, the face information extraction unit 20b advances the process to step S12, and when it is determined that the specific subject is not the target subject, the processing ends.

  For example, at this time, the first notification means 28 notifies a solid-line square frame 508 indicating that face information has not been extracted, in association with the image of the subject O2 (see FIG. 7).

  In step S12, the face information extraction unit 20b extracts face information relating to the shapes of the components of the subject's face from each of the plurality of face areas detected by the face area detection unit 20a. For example, the face information extraction unit 20b normalizes the size and orientation of the selected face in order to increase calculation accuracy in subsequent steps, and extracts the shapes of end points of the mouth, eyes, eyebrows, and the like from the normalized face.

  Here, the face information includes, for example, information on the specific shapes and coordinates of the mouth, eyes, and eyebrows, or information on the shapes of the nose end points, the nostril center points, the ear end points, and the like. The face information can be extracted by calculating, from the input face image, end points, center points, or the dark (pupil) regions of the eyes using a method such as a neural network or edge detection with a spatial filter.
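
  The normalization mentioned in step S12 can be sketched as follows, here using the two eye centers as anchors so that the face is rotated upright and scaled to a fixed interocular distance before end points are measured; the eye-based method and the target distance are assumptions, since the disclosure only states that size and orientation are normalized.

    import math

    def normalize_face_points(points, left_eye, right_eye, target_eye_distance=60.0):
        """Rotate and scale landmark points so the eyes lie on a horizontal line a fixed distance apart."""
        dx, dy = right_eye[0] - left_eye[0], right_eye[1] - left_eye[1]
        angle = math.atan2(dy, dx)                        # current orientation of the eye line
        scale = target_eye_distance / math.hypot(dx, dy)  # current interocular distance -> target
        cos_a, sin_a = math.cos(-angle), math.sin(-angle)
        normalized = []
        for x, y in points:
            x0, y0 = x - left_eye[0], y - left_eye[1]     # translate so the left eye is the origin
            normalized.append((scale * (x0 * cos_a - y0 * sin_a),
                               scale * (x0 * sin_a + y0 * cos_a)))
        return normalized

    # A mouth-corner point expressed in the normalized face coordinate system.
    print(normalize_face_points([(120, 210)], left_eye=(100, 150), right_eye=(160, 156)))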

  In step S13, the face information extraction unit 20b determines whether or not the face information has been successfully extracted. If it is determined that the face information extraction unit 20b has succeeded in extracting the face information, the process proceeds to step S14. If it is determined that the face information extraction has not been successful, the process ends.

  In step S14, the face information extraction unit 20b supplies information indicating that the face information has been successfully extracted to the system control unit 50. The system control unit 50 controls the first notification unit 28 according to information indicating that the face information has been successfully extracted. As a result, the first notification means 28 notifies the extraction state information indicating whether or not the face information is extracted by the face information extraction means 20b in association with the subject image.

  For example, the first notifying unit 28 notifies a dashed square frame 506 indicating that face information is extracted in association with the image of the subject O2 (see FIG. 8).

  In step S15, the face information selection unit 50a selects, from the plurality of pieces of face information extracted by the face information extraction unit 20b, reference face information serving as a reference for estimating the facial expression of the subject. The reference face information is, for example, face information for an expressionless, that is, neutral, facial expression. Here, the face information selection unit 50a determines that the expression is in an expressionless state when, for example, the amount of change in the shape of the mouth does not exceed a predetermined threshold for a predetermined period, and selects the face information corresponding to that determination result as the reference face information.

  The face information selection unit 50a may use the shape of a component other than the mouth when determining the expressionless state. For example, the face information selection means 50a may further use the open/closed state of the eyes as face information and, using the change in eye size, select the reference face information at a timing when the eyes are open. The face information selection unit 50a may also learn pupil images with a neural network and determine the open/closed state of the eyes depending on whether the number of neurons responding to the pupil exceeds a fixed threshold.

  In step S16, the face information selection unit 50a determines whether or not the reference face information has been successfully selected. When it is determined that the reference face information has been successfully selected, the process proceeds to step S17; when it is determined that the selection of the reference face information has not been successful, the processing ends.

  In step S17, the first notification means 28 notifies the selection state information indicating whether or not the reference face information is selected by the face information selection means 50a in association with the subject image.

  For example, at this time, the first notification unit 28 notifies a solid-line round frame 507 indicating that the reference face information has been selected, in association with the image of the subject O2 (see FIG. 9).

  As described above, since the selection state information is notified in association with the image of the subject, the user viewing the screen can grasp, before the facial expression of the subject is detected, whether or not the facial expression of the subject can be detected.

  The first notification means 28 may notify the extraction state information and the selection state information in forms different from those shown in the figures. For example, the first notification unit 28 may notify the extraction state information and the selection state information by changing the color, size, or the like of the frame instead of, or in addition to, its shape.

  Next, the flow of the detection process (step S6) will be described with reference to FIG. 4. FIG. 4 is a flowchart showing the flow of the detection process for each subject. That is, the detection process is performed independently and in parallel for each subject. FIG. 4 shows the processing when attention is paid to a specific subject.

  In step S21, the system control unit 50 determines whether or not a change instruction is input by the user. The change instruction is an instruction for changing a change amount threshold value for determining that the facial expression to be detected has been reached.

  For example, the input unit 75 (a set button, a cross key, etc. of the operation unit 70) can accept a change instruction. For example, it is possible to reset the threshold value higher than before by using the + (plus) button of the cross key of the operation unit 70, and conversely, to reset the threshold value lower by using the − (minus) button. In response to receiving the change instruction from the input unit 75, the system control unit 50 determines that the change instruction has been input by the user. In response to not receiving the change instruction from the input means 75, the system control unit 50 determines that no change instruction has been input by the user.

  If the system control unit 50 determines that the change instruction is input by the user, the process proceeds to step S22. If the system control unit 50 determines that the change instruction is not input by the user, the process proceeds to step S23.

  In step S22, the changing unit 50c changes the threshold information according to the change instruction. That is, in response to receiving the change instruction from the input unit 75, the changing unit 50c accesses the memory 52 and rewrites the threshold information stored in the memory 52 with threshold information corresponding to the threshold indicated by the change instruction.

  In general, there are individual differences in human facial expressions: some people have intense, rich movements of the mouth and eyes, while others show little change in expression. In such a situation, even if a uniform reference (threshold) is used, it is difficult to accurately determine that the facial expression to be detected has been reached. For example, if whether or not a subject is smiling is determined based on only a single threshold, there may be the adverse effect that smiles are easily detected for some subjects but not for others.

  In contrast, in this embodiment, as shown in steps S21 and S22, the user can change the change amount threshold for determining that the facial expression to be detected has been reached. The threshold can thus be set in consideration of the individuality of the subject, and it can be accurately determined, in accordance with the individuality of the subject, that the target facial expression has been reached.
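
  A minimal sketch of the change instruction handling of steps S21 and S22, with per-subject thresholds kept in a dictionary standing in for the setting information in the memory 52; the step size, bounds, and default value are illustrative assumptions.

    def apply_change_instruction(thresholds, subject_id, button, step=5):
        """Raise or lower the change amount threshold for one subject (S22)."""
        current = thresholds.get(subject_id, 50)   # assumed default threshold
        if button == "+":
            current = min(100, current + step)     # a larger change is needed to detect the smile
        elif button == "-":
            current = max(0, current - step)       # a smaller change already counts as the smile
        thresholds[subject_id] = current           # rewrite the stored threshold information
        return thresholds

    # Pressing the minus button once makes detection more sensitive for subject O2.
    thresholds = {}
    print(apply_change_instruction(thresholds, subject_id="O2", button="-"))   # {'O2': 45}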

  In step S23, the face information extraction unit 20b extracts face information related to the shape of the face component of the subject from each of the plurality of face regions detected by the face region detection unit 20a. The details of step S23 are the same as step S12 described above.

  In step S24, the face information extraction unit 20b determines whether or not the face information has been successfully extracted. If it is determined that the face information extraction unit 20b has succeeded in extracting the face information, the process proceeds to step S25. If it is determined that the face information extraction has not been successful, the process ends.

  In step S25, the facial expression estimation means 50b receives the reference face information from the face information selection means 50a, and receives other face information from the face information extraction means 20b of the image processing unit 20. The facial expression estimation means 50b calculates the difference between the reference face information and other face information, and generates change amount information indicating the change amount of the other face information with respect to the reference face information.

  For example, the facial expression estimation means 50b calculates, for each component of the face, the difference between the reference face information and the other face information, and calculates the smile expression level, that is, the smile degree, using the following Equation 1. When the differences between the reference face information and the other face information for the components of the subject's face are v1, v2, v3, ..., the facial expression estimation means 50b calculates

    SumScore = Σ_{i=1..N} Score_i = Σ_{i=1..N} g(w_i, v_i)    ... (Equation 1)

where N is the number of components of the face (i identifies each component), w_i is the weight of each component, and g is a score calculation function. That is, the facial expression estimation means 50b generates this smile degree SumScore as the change amount information. The smile degree SumScore is expressed, for example, as a numerical value from 0 to 100.
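
  Equation 1 can be transcribed directly into the following minimal Python sketch; the particular weights, the per-component score function g, and the clipping to the 0-100 range are assumptions chosen for illustration, since the disclosure only states that g is a score calculation function and that SumScore may be expressed as a value such as 0 to 100.

    def smile_degree(differences, weights):
        """SumScore = sum over components i of g(w_i, v_i), where v_i is the
        difference of the current face information from the reference face information."""
        def g(w, v):
            return w * abs(v)                      # assumed score calculation function
        total = sum(g(w, v) for w, v in zip(weights, differences))
        return max(0.0, min(100.0, total))         # keep SumScore on a 0-100 scale

    # Three components, e.g. mouth width, mouth-corner height, and eye openness.
    print(smile_degree(differences=[12.0, 6.5, 1.2], weights=[3.0, 4.0, 2.0]))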

  The facial expression estimation means 50b may calculate a ratio between the reference face information and other face information, and generate change amount information indicating a change amount of the other face information with respect to the reference face information.

  In step S26, the first notification unit 28 receives the change amount information from the facial expression estimation unit 50b and receives the threshold information from the memory 52. The first notification means 28 notifies, in association with the image of the subject, change amount information 602 indicating the change amount of the other face information with respect to the reference face information and threshold information 603 indicating the change amount threshold for determining that the facial expression to be detected has been reached (see FIG. 10). Alternatively, the first notification unit 28 notifies the change amount information 602 and the threshold information 603 as changed by the changing unit 50c, in association with the image of the subject.

  For example, when the smile degree calculated using Equation 1 is expressed as a numerical value from 0 to 100, the first notification unit 28 displays the indicator 602 with all of its scale marks filled if the smile degree is 100 (see FIG. 10), and displays the indicator 602 with none of its scale marks filled if the smile degree is 0. The first notification means 28 also displays a figure 603 indicating the threshold, such as a triangular mark, at the position corresponding to the threshold beside the indicator 602 (see FIG. 10).
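
  The indicator 602 and the threshold mark 603 of FIG. 10 can be sketched in text form as follows; the character-based rendering is an assumption made only to show how the smile degree fills the scale and where the threshold mark sits, whereas the device itself draws these on the LCD.

    def render_indicator(smile_degree, threshold, width=20):
        """Render the smile-degree scale (indicator 602) and the threshold mark (603)."""
        filled = round(width * smile_degree / 100)
        bar = "#" * filled + "-" * (width - filled)           # indicator 602
        marker = " " * round(width * threshold / 100) + "^"   # threshold mark 603
        return "[{}] {:3.0f}\n {} threshold {}".format(bar, smile_degree, marker, threshold)

    print(render_indicator(smile_degree=64, threshold=60))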

  As described above, since the change amount information and the threshold information are notified for each subject, the user can be made aware of whether or not the facial expression of the subject has reached the target facial expression.

  In step S27, the facial expression estimation means 50b determines whether or not the facial expression to be detected has been reached.

  For example, the facial expression estimation means 50b receives from the memory 52 threshold information indicating the change amount threshold for determining that the facial expression to be detected has been reached. The facial expression estimation means 50b determines whether or not the change amount exceeds the threshold based on the change amount information and the threshold information. When it is determined that the change amount exceeds the threshold, the facial expression estimation means 50b determines that the target subject has reached the facial expression to be detected. When it is determined that the change amount does not exceed the threshold, the facial expression estimation means 50b determines that the target subject has not reached the facial expression to be detected.

  When it is determined that the facial expression to be detected has been reached, the facial expression estimation means 50b proceeds to step S28, and when it is determined that the facial expression to be detected has not been reached, the processing is terminated.

  In step S28, the facial expression estimation unit 50b supplies information indicating that the target facial expression has been reached to the first notification unit 28. At least when notifying the change amount information, the first notification unit 28 notifies it so that whether or not the change amount exceeds the threshold can be identified, based on the result estimated by the facial expression estimation unit 50b.

  For example, the first notification unit 28 changes the display color of the display frame 600 and the indicator 602, or blinks the display frame 600 and the indicator 602, so that the state in which the change amount exceeds the threshold can be distinguished from the state in which it does not, thereby notifying that the subject has reached the target facial expression.

  In this way, since the change amount information and the threshold information are notified so that whether or not the change amount exceeds the threshold can be identified, the user can more easily grasp whether or not the facial expression of the subject has reached the target facial expression.

  In step S29, the facial expression estimation means 50b accesses the memory 52 and rewrites the facial expression flag of the subject that has reached the target facial expression from the OFF state to the ON state.

  For example, in the example of FIG. 9, the facial expression estimation means 50b rewrites the facial expression flag of the subject O2 from the OFF state to the ON state.

  As described above, the user can grasp whether or not the facial expression of the subject can be detected, and can grasp whether or not the facial expression of the subject has reached the target facial expression. This allows the user to confirm whether or not the subject of interest can be photographed, and prompts the user to perform shooting when the facial expression of the subject reaches the target facial expression.

  The first notification unit 28 may further notify maximum value information indicating the maximum value of the change amount in association with the image of the subject. Thereby, the user can grasp the personality of the facial expression of the subject.

  Further, instead of notifying the change amount information and the threshold information for a single subject (see FIG. 10), the first notification unit 28 may notify them for a plurality of subjects. That is, as shown in FIG. 11, the first notification unit 28 notifies the change amount information and the threshold information in association with the images of the plurality of subjects so that the correspondence between the change amount information and each subject can be understood. For example, the first notification unit 28 displays the face frame 600 of the subject O1, its change amount information 602, and threshold information 603 with solid lines, and displays the face frame 606 of the subject O2, its change amount information 604, and threshold information 605 with broken lines. Alternatively, the first notification unit 28 displays the face frame 600 of the subject O1, its change amount information 602, and threshold information 603 in a first color, and displays the face frame 606 of the subject O2, its change amount information 604, and threshold information 605 in a second color.

  Next, an image processing apparatus 200 according to the second embodiment of the present invention will be described with reference to FIG. 12. FIG. 12 is a configuration diagram of the image processing apparatus 200 according to the second embodiment of the present invention. The following description centers on the parts that differ from the first embodiment, and description of the same parts is omitted.

  The image processing apparatus 200 differs from the first embodiment in that it includes an image processing unit 220. The image processing unit 220 includes a face area detection unit 20a and a face information extraction unit 220b. The face information extraction unit 220b selects a target subject from a plurality of subjects according to priority information indicating the priority with which the facial expressions of the subjects are to be estimated, and extracts face information for the target subject.

  For example, the face information extraction unit 220b selects, as the target subject, the subject corresponding to the face area closest to the center of the angle of view among the plurality of face areas detected by the face area detection unit 20a. The face information extraction unit 220b extracts face information for the selected target subject. Next, among the face areas corresponding to the subjects not yet selected as the target subject, the face information extraction unit 220b selects the subject corresponding to the face area closest to the center of the angle of view as the next target subject, and extracts face information for it.

  Alternatively, the subject may be selected using information from the previous image, for example by selecting the face to the right of the face for which face information was calculated in the previous image, or by preferentially selecting the face that was selected in the previous image.
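  A minimal Python sketch of such a priority ordering is shown below, assuming that detected faces are given as rectangles. The names (FaceRegion, order_by_priority, and so on) are hypothetical, and the combination of rules (prefer a face overlapping the one selected in the previous image, otherwise nearest to the center of the angle of view) is just one way to realize the priority information described above, not the patent's prescribed implementation.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class FaceRegion:
    x: int      # top-left corner of the detected face rectangle
    y: int
    w: int
    h: int

def center_distance(face: FaceRegion, image_w: int, image_h: int) -> float:
    """Squared distance between the face-rectangle center and the image center."""
    fx = face.x + face.w / 2.0
    fy = face.y + face.h / 2.0
    return (fx - image_w / 2.0) ** 2 + (fy - image_h / 2.0) ** 2

def order_by_priority(faces: List[FaceRegion],
                      image_w: int,
                      image_h: int,
                      previous: Optional[FaceRegion] = None) -> List[FaceRegion]:
    """Return faces in the order in which face information would be extracted.

    Priority 1: a face overlapping the face selected in the previous image.
    Priority 2: remaining faces, nearest to the center of the angle of view first.
    """
    def overlaps(a: FaceRegion, b: FaceRegion) -> bool:
        return not (a.x + a.w < b.x or b.x + b.w < a.x or
                    a.y + a.h < b.y or b.y + b.h < a.y)

    ranked = sorted(faces, key=lambda f: center_distance(f, image_w, image_h))
    if previous is not None:
        # Stable sort keeps the center-distance order within each group.
        ranked.sort(key=lambda f: 0 if overlaps(f, previous) else 1)
    return ranked
```

  The first element of the returned list corresponds to the target subject selected first; iterating over the list reproduces the behavior of repeatedly picking the next-closest face among the subjects not yet selected.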

  Further, as shown in FIG. 13, the flow of processing when the image processing apparatus 200 estimates the facial expression of the subject (smile shooting mode processing) differs from the first embodiment in the following points.

  In step S32, the face information extraction unit 220b selects a target subject from the plurality of subjects according to the priority information indicating the priority for estimating the facial expression of the subject, and extracts face information for the target subject.

  For example, the face information extraction unit 220b selects, as the target subject, the subject corresponding to the face region closest to the center of the angle of view among the plurality of face regions detected by the face area detection unit 20a, and extracts face information for that subject. Next, among the face regions corresponding to the subjects that have not yet been selected as the target subject, the face information extraction unit 220b selects the subject corresponding to the face region closest to the center of the angle of view as the next target subject, and extracts face information for the selected target subject.

  In step S33, a preparation process is performed. As shown in FIG. 14, the flow of this preparation process (step S33) differs from the first embodiment in the following points.

  In step S41, the face information extraction unit 220b determines whether or not the specific subject is the target subject.

  For example, when the specific subject is the subject selected in step S32, the face information extraction unit 220b determines that the specific subject is the target subject. When the specific subject is not the subject selected in step S32, the face information extraction unit 220b determines that the specific subject is not the target subject.

  When it is determined that the specific subject is the target subject, the face information extraction unit 220b advances the process to step S12; when it is determined that the specific subject is not the target subject, the face information extraction unit 220b ends the process.
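  The branch in step S41 can be pictured with the short sketch below. The names (preparation_process, target_subjects, prepare) are hypothetical, and prepare() simply stands in for the per-subject preparation processing that continues from step S12.

```python
def preparation_process(specific_subject, target_subjects, prepare):
    """Run the per-subject preparation only for subjects selected in step S32 (step S41 check)."""
    if specific_subject not in target_subjects:   # not a target subject -> end the process
        return
    prepare(specific_subject)                     # continue from step S12 for this subject
```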

  As described above, since the target subject is selected according to the priority information, the user is saved the trouble of designating a subject when the preparation process is performed.

FIG. 1 is a configuration diagram of an image processing apparatus according to the first embodiment.
FIG. 2 is a flowchart showing the flow of processing when the image processing apparatus estimates the facial expression of a subject (smile shooting mode processing).
FIG. 3 is a flowchart showing the flow of the preparation process for each subject.
FIG. 4 is a flowchart showing the flow of the detection process for each subject.
FIG. 5 is a diagram showing a display screen.
FIG. 6 is a diagram showing a display screen.
FIG. 7 is a diagram showing a display screen.
FIG. 8 is a diagram showing a display screen.
FIG. 9 is a diagram showing a display screen.
FIG. 10 is a diagram showing a display screen.
FIG. 11 is a diagram showing a display screen (modification).
FIG. 12 is a configuration diagram of an image processing apparatus according to the second embodiment.
FIG. 13 is a flowchart showing the flow of processing when the image processing apparatus estimates the facial expression of a subject (smile shooting mode processing).
FIG. 14 is a flowchart showing the flow of the preparation process for each subject.

Explanation of symbols

17 Acquisition means
20a Face area detection means
20b, 220b Face information extraction means
28 First notification means
50a Face information selection means
50b Facial expression estimation means
50c Changing means
75 Input means
100, 200 Image processing apparatus

Claims (11)

  1. Acquisition means for acquiring image data obtained by imaging a subject at a plurality of different timings;
    A face area detecting means for detecting a face area of the subject in each of the plurality of image data acquired by the acquiring means;
    Face information extraction means for extracting face information relating to the shape of the face component of the subject from each of the plurality of face areas detected by the face area detection means;
    Face information selecting means for selecting reference face information serving as a reference for estimating the facial expression of the subject from the plurality of face information extracted by the face information extracting means;
    Notification means for notifying, in association with the image of the subject, change amount information indicating a change amount of other face information with respect to the reference face information, and threshold information indicating a threshold value of the change amount for determining that the facial expression to be detected has been reached;
    An image processing apparatus comprising:
  2. The image processing apparatus according to claim 1, further comprising facial expression estimation means for comparing the reference face information with the other face information and estimating the facial expression of the subject corresponding to the other face information.
  3. The image processing apparatus according to claim 2, wherein, at least when notifying the change amount information, the notification means performs the notification, based on the result estimated by the facial expression estimation means, so that whether or not the change amount exceeds the threshold value can be identified.
  4. The image processing apparatus according to claim 1, wherein the notification means further notifies an image corresponding to the reference face information in association with the image of the subject.
  5. The image processing apparatus according to claim 1, further comprising:
    input means for receiving a change instruction for changing the threshold value of the change amount; and
    changing means for changing the threshold information in response to the change instruction,
    wherein the notification means notifies the change amount information and the threshold information changed by the changing means in association with the image of the subject.
  6. The image processing device according to claim 1, wherein the notification unit further notifies maximum value information indicating a maximum value of the change amount in association with the image of the subject.
  7. The image processing apparatus according to any one of claims 1 to 6, wherein, when there are a plurality of subjects from which the face areas are detected,
    the notification means notifies the change amount information and the threshold information in association with each of the plurality of subjects so that the correspondence between the change amount information and each subject can be understood.
  8. The image processing apparatus according to any one of claims 1 to 7, wherein the notification means notifies priority information indicating a priority for estimating the facial expression of the subject in association with each of the plurality of subjects.
  9. The image processing apparatus according to claim 1, wherein the acquisition unit captures the subject at a plurality of different timings to acquire a plurality of the image data of the subject.
  10. An acquisition step of acquiring image data obtained by imaging a subject at a plurality of different timings;
    A face area detecting step of detecting a face area of the subject in each of the plurality of image data acquired in the acquiring step;
    A face information extraction step of extracting face information related to the shape of the constituent element of the face of the subject from each of the plurality of face regions detected in the face region detection step;
    A face information selection step of selecting reference face information as a reference for estimating the facial expression of the subject from the plurality of face information extracted in the face information extraction step;
    A notification step of notifying, in association with the image of the subject, change amount information indicating a change amount of other face information with respect to the reference face information, and threshold information indicating a threshold value of the change amount for determining that the facial expression to be detected has been reached;
    An image processing method comprising:
  11. The image processing device
    Acquisition means for acquiring image data obtained by imaging a subject at a plurality of different timings;
    A face area detecting means for detecting a face area of the subject in each of the plurality of image data acquired by the acquiring means;
    Face information extracting means for extracting face information relating to the shape of the constituent elements of the face of the subject from each of the plurality of face areas detected by the face area detecting means;
    Face information selection means for selecting reference face information serving as a reference for estimating the facial expression of the subject from the plurality of face information extracted by the face information extraction means;
    Notification means for notifying, in association with the image of the subject, change amount information indicating a change amount of other face information with respect to the reference face information, and threshold information indicating a threshold value of the change amount for determining that the facial expression to be detected has been reached,
    An image processing program that causes the image processing device to function as each of the above means.

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2007142329A JP2008299430A (en) 2007-05-29 2007-05-29 Image processing device, method, and program

Publications (1)

Publication Number Publication Date
JP2008299430A (en) 2008-12-11

Family

ID=40172930

Family Applications (1)

Application Number Title Priority Date Filing Date
JP2007142329A Withdrawn JP2008299430A (en) 2007-05-29 2007-05-29 Image processing device, method, and program

Country Status (1)

Country Link
JP (1) JP2008299430A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009207119A (en) * 2007-12-28 2009-09-10 Casio Comput Co Ltd Imaging apparatus and program
US8723976B2 (en) 2007-12-28 2014-05-13 Casio Computer Co., Ltd. Imaging device and storage medium
JP2014116957A (en) * 2007-12-28 2014-06-26 Casio Comput Co Ltd Imaging apparatus and program
JP2010177991A (en) * 2009-01-29 2010-08-12 Nikon Corp Digital camera

Legal Events

Date Code Title Description
A300 Withdrawal of application because of no request for examination

Free format text: JAPANESE INTERMEDIATE CODE: A300

Effective date: 20100803