US8831282B2 - Imaging device including a face detector - Google Patents
- Publication number
- US8831282B2 (Application No. US 13/441,252)
- Authority
- US
- United States
- Prior art keywords
- rotational motion
- motion
- image data
- rotational
- coordinate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related, expires
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/223—Analysis of motion using block-matching
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/10—Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
- G06V40/16—Human faces, e.g. facial parts, sketches or expressions
- G06V40/161—Detection; Localisation; Normalisation
- G06V40/164—Detection; Localisation; Normalisation using holistic features
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V40/00—Recognition of biometric, human-related or animal-related patterns in image or video data
- G06V40/20—Movements or behaviour, e.g. gesture recognition
- G06V40/28—Recognition of hand or arm movements, e.g. recognition of deaf sign language
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/61—Control of cameras or camera modules based on recognised objects
- H04N23/611—Control of cameras or camera modules based on recognised objects where the recognised objects include parts of the human body
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/68—Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
- H04N23/681—Motion detection
- H04N23/6811—Motion detection based on the image signal
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N23/00—Cameras or camera modules comprising electronic image sensors; Control thereof
- H04N23/60—Control of cameras or camera modules
- H04N23/68—Control of cameras or camera modules for stable pick-up of the scene, e.g. compensating for camera body vibrations
- H04N23/682—Vibration or motion blur correction
- H04N23/683—Vibration or motion blur correction performed by a processor, e.g. controlling the readout of an image memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30196—Human being; Person
- G06T2207/30201—Face
Definitions
- The present invention relates to an imaging device, an imaging method, and a program for recognizing a subject and controlling imaging operation on the basis of the motion of the subject.
- An imaging device such as a digital camera or a digital video camera with an auto shutter function has become widespread.
- The auto shutter function shoots a subject automatically, either at arbitrary timing or a predetermined time after a full press of the shutter button.
- A problem arises when a user shoots himself or herself after setting the camera and pressing the shutter button: the shooting may be completed before the user is ready to be shot, or the user has to wait for the shutter to click after he or she is ready.
- an imaging device which can release the shutter with a remote control has been developed.
- a user may forget to carry a remote control or find it troublesome to carry it with him or her all the time.
- Japanese Patent Application Publication No. 2010-74735 discloses an imaging device having an automatic shutter control function in accordance with the blinking of the eyes of a subject, for example.
- This imaging device sets the order of priority when capturing multiple subjects or faces to recognize the motion of a subject's face with a high priority or to output a control signal in accordance with a combination of the opening and closing of the eyes of the high-priority subject.
- The above technique faces the problem that it is difficult to recognize the motion of a part of a small face when the subject is at a long distance from the imaging device.
- A group photograph is typically taken at a relatively long distance with a wide-angle zoom, so the individual captured faces tend to be small.
- Even if the faces are accurately detected, it is very hard to accurately recognize a change in a part of the face, such as blinking eyes, making automatic shooting control unfeasible.
- To avoid this, the subjects need to be close to the imaging device or their faces need to be imaged at a certain size or larger, using telephoto zooming.
- Japanese Patent Application Publication No. 2011-78009 discloses an imaging device configured to detect a subject's face or gesture from an image. This imaging device detects a facial image from image data and detects a hand image in association with the facial image to control imaging operation in accordance with the shape or motion of the detected hand image.
- the imaging device requires an enormous amount of information as data on the position of a body part, color data, size data, and texture data to accurately recognize a subject's gesture. Also, it takes a huge amount of time and load on the device to process the enormous amount of data. Moreover, since the ways of gesturing, the color of skin, and the shapes and sizes of a body part are different depending on an individual, even the enormous amount of data may not be sufficient to accurately recognize someone's gesture.
- An object of the present invention is to provide an imaging device, method, and program able to recognize a subject's gesture stably, accurately, and at high speed on the basis of the subject's face area detected from an image, the coordinate of the center of the subject's rotational motion, and the rotational angle, and to control shooting operation on the basis of the detected gesture.
- an imaging device includes an image input section which sequentially inputs image data with a predetermined time interval, a face detector configured to detect a face area of a subject from the image data, a rotational motion detector configured to detect a rotational motion between two frames of image data input with the predetermined time interval, and a controller configured to control the imaging device to execute a predetermined operation when the rotational motion is detected by the rotational motion detector, wherein the rotational motion detector is configured to detect at least one candidate of rotational motion between the two frames of image data and calculate a coordinate of a rotation center and a rotational angle of the at least one candidate, and determine whether or not the at least one candidate is the rotational motion on the basis of a central coordinate of the face area detected by the face detector, the coordinate of the rotation center and the rotational angle.
- FIGS. 1A to 1C are a top view, a front view, and a back view of a digital camera as an example of an imaging device according to one embodiment of the present invention, respectively;
- FIG. 2 is a function block diagram of the imaging device in FIG. 1 ;
- FIG. 3 is a block diagram of automatic shooting control of the imaging device
- FIG. 4 is a flowchart for automatic shooting process
- FIG. 5 shows an example of image data frames input in time series
- FIG. 6 is a flowchart for rotational motion detection
- FIG. 7A shows an example of an image data frame before an arm is moved while FIG. 7B shows the same after the arm is moved;
- FIG. 8 shows a motion vector search area and motion vectors
- FIG. 9A shows an example of an image data frame divided into blocks before an arm is moved while FIG. 9B shows the same after the arm is moved;
- FIG. 10 shows how to calculate the coordinate of rotation center and rotational angle
- FIG. 11 is a graph showing the discrimination of an arm gesture by support vector machine (SVM).
- An imaging device comprises an image input section 101 which sequentially inputs image data Fa with a predetermined time interval, a face detector 102 configured to detect a face area 50 of a subject 40 from the image data, a rotational motion detector 103 configured to detect a rotational motion between two frames Fb, Fc of image data input with the predetermined time interval, and a controller as a shutter controller 105 configured to control the imaging device to execute a predetermined operation when the rotational motion is detected by the rotational motion detector, wherein the rotational motion detector is configured to detect at least one candidate of rotational motion as a motion vector between the two frames of image data, calculate a coordinate of a rotation center O′ and a rotational angle θ of the at least one candidate, and determine whether or not the at least one candidate is the rotational motion on the basis of a central coordinate O of the face area detected by the face detector, the coordinate of the rotation center, and the rotational angle.
- the present embodiment describes a digital camera as an example of the imaging device.
- the imaging device should not be limited to a digital camera and can be any imaging device with an automatic shooting function.
- FIGS. 1A to 1C are a top view, a front view, and a back view of the exterior of a digital camera, respectively.
- the digital camera includes, on the top face, a sub LCD 1 , a shutter button SW 1 , and a mode dial SW 2 .
- As shown in FIG. 1B, it includes, on the front face, a stroboscopic portion 3, a ranging unit 5, a remote control light receiving portion 6, a lens unit 7, and an optical finder 11.
- A memory card slot 23 into which a memory card 34 such as an SD card is inserted is provided on a side of the camera body.
- As shown in FIG. 1C, the digital camera includes, on the back face, an autofocus light emitting diode (LED) 8, a stroboscopic LED 9, an LCD 10, the optical finder 11, a telescopic zoom switch SW 4, a power switch SW 13, a wide-angle zoom switch SW 3, a self-timer set/reset switch SW 6, a menu switch SW 5, an OK switch SW 12, a leftward/image check switch SW 11, a downward/macro switch SW 10, an upward/strobe switch SW 7, a rightward switch SW 8, and a display switch SW 9.
- FIG. 2 is a function block diagram of a control system of the digital camera in FIG. 1 . It comprises a CCD 121 as a solid image sensor, a front end IC 120 to convert an electric signal from the CCD 121 to a digital signal, a signal processor IC 110 to process the digital signal from the front end IC 120 , an SDRAM 33 to temporarily store data, an ROM 30 in which control programs are stored, and a motor driver 32 .
- the lens unit 7 includes a zoom lens, a focus lens and a mechanical shutter and is driven by the motor driver 32 which is controlled by a CPU 111 included in the signal processor IC 110 .
- the CCD 121 on which pixels with RGB filters are arranged is to photo-electrically convert optical images and output analog RGB image signals.
- the front end (F/E) IC 120 includes a correlated double sampling (CDS) 122 to sample analog image data from the CCD 121 , an automatic gain controller (AGC) 123 to adjust the gain of the sampled imaged data, and an analog-digital (A/D) converter 124 , and a timing generator (TG) supplied with a vertical synchronous signal (VD) and a horizontal synchronous signal (HD) from a CCD I/F 112 to generate drive timing signals for the CCD 121 and front end IC 120 .
- a not-shown clock generator supplies clocks to the system clock of the signal processor IC 110 , the timing generator 125 and else.
- the timing generator 125 supplies clocks to the CCD I/F 112 in the signal processor IC 110 for pixel synchronization.
- the digital signals input to the signal processor IC 110 from the front end IC 120 are temporarily stored as RGB data (RAW-RGB) in the SDRAM 33 by a memory controller 115 .
- the signal processor IC 110 comprises the CPU 111 , CCDI/F 112 , a resizing unit 113 , the memory controller 115 , a display output controller 116 , a compressor/decompressor 117 , a media I/F 118 , and a YUV converter 119 .
- The CCD I/F 112 outputs the VD and HD synchronous signals to the timing generator 125 and captures digital RGB signals from the A/D converter 124 in line with the synchronous signals, to write RGB data to the SDRAM 33 via the memory controller 115.
- the display output controller 116 transmits display data from the SDRAM 33 to the display unit to display a captured image. It can transmit display data to the LCD 10 or output it as a TV video signal to an external device.
- the display data refers to YCbCr data as natural images and on-screen display (OSD) data to display shooting mode icons and else. Both data are read by the memory controller 115 from the SDRAM 33 to the display output controller 116 which synthesizes the data as video data for output.
- the media interface (I/F) 118 performs image data read/write from/to the memory card 34 under the control of the CPU 111 .
- the YUV converter 119 converts RGB data stored in the SDRAM 33 into YUV data based on image process parameters set by the CPU 111 and writes it to the SDRAM 33 .
- the resizing unit 113 reads the YUV data and changes the size of it for display, recording, or thumbnail image display.
- the CPU 111 is a controller of the entire system. Upon turn-on of the power-on switch, it has the control programs loaded onto the SDRAM 33 from the ROM 30 , for example, to control the operations of the respective elements according to the control programs.
- the CPU 111 controls imaging operation, sets image process parameters, and controls the memories, display, and else according to instructions from an operation unit 31 with keys and buttons, a remote control, or an external terminal such as a personal computer.
- the operation unit 31 is for a user to give instructions to the digital camera.
- a predetermined instruction signal is input to the controller.
- The digital camera in FIG. 1, for example, comprises the shutter button 2 and various buttons such as the zoom buttons 12, 14 to set the magnification of optical or electronic zoom.
- Upon detection of the turning-on of the power switch SW 13 with the operation unit 31, the CPU 111 makes predetermined settings to the respective elements. An image generated on the CCD 121 via the lens unit 7 is converted into a digital video signal and input to the signal processor IC 110.
- The digital video signal is then input to the CCD I/F 112.
- The CCD I/F 112 subjects the signal to a black level adjustment and else and temporarily stores it in the SDRAM 33.
- the YUV converter 119 reads the RAW-RGB image data from the SDRAM 33 and subjects it to gamma conversion, white balance adjustment, edge enhancement, and YUV conversion to generate YUV image data and write it to the SDRAM 33 .
- the YUV image data is read by the display output controller 116 and changed in size vertically and horizontally by the resizing unit 113 for output to a destination, for example, an NTSC system TV. Thereby, by changing the size of the data in synchronization with the VD signal, still image preview display is enabled.
- When the user turns on the power switch SW 13 and sets a still image shooting mode with the mode dial SW 2, the digital camera is activated in a recording mode. Detecting this, the CPU 111 outputs a control signal to the motor driver 32 to move the lens unit 7 to a photographable position and activates the CCD 121, F/E IC 120, signal processor IC 110, SDRAM 33, ROM 30, and LCD 10.
- An image of a subject captured via the optical system of the lens unit 7 is formed on the pixels of the CCD 121 and analog RGB image signals corresponding to the image are input to the A/D converter 124 via the CDS 122 , AGC 123 and converted into 12-bit RAW-RGB data.
- the RAW-RGB data is captured into the CCD I/F 112 of the signal processor IC 110 and stored in the SDRAM 33 via the memory controller 115 .
- the YUV converter 119 converts the RAW-RGB data to displayable YUV data and stores it in the SDRAM 33 via the memory controller 115 .
- the YUV data is transmitted from the SDRAM 33 to the LCD 10 via the display output controller 116 for display of the captured image (video).
- the number of pixels of the image is thinned out by the CCD I/F 112 to read one image frame in 1/30 second.
- the CCD I/F 112 of the signal processor IC 110 calculates an AF evaluation value, an AE evaluation value and an AWB (auto white balance) evaluation value from the RAW-RGB data.
- the AF evaluation value is calculated from, for example, an integral of outputs of a high-pass filter or an integral of differences in brightness among neighboring pixels.
- In a focused state, the edge portion of a subject is distinctive, with the highest frequency components.
- In AF operation, the AF evaluation value is found at each focus lens position to determine the point with the maximal evaluation value as the detected focus position.
- the AE and AWB evaluation values are calculated from an integral of each of the RGB values of the RAW-RGB data. For example, an image area associated with all the pixels of the CCD 121 is equally divided into 256 blocks (16 by 16) to calculate an RGB integral value in each block.
- The CPU 111 reads the RGB integral value to calculate the brightness in each block and determine a proper exposure amount from the brightness distribution in AE operation. It sets exposure conditions such as the electronic shutter speed, the f-value of the aperture diaphragm, and the opening/closing of the ND filter.
- the AWB control value is determined from the RGB distribution in accordance with the color of a light source of a subject.
- the YUV converter 119 performs YUV data conversion with white balance adjusted.
- the AE and AWB operations are continuously executed.
- the controller instructs the motor driver 32 to move the focus lens of the optical system to execute a so-called hill climb (contrast evaluation) AF operation.
- In the AF area, the focus lens is moved to each focus position from infinity to the nearest point, or from the nearest point to infinity.
- the controller reads the AF evaluation value calculated at each focus position by the CCD I/F 112 .
- the focus lens is moved to the position with a maximal AF evaluation value and placed into focus.
- the controller instructs the motor driver 32 to close the mechanical shutter and the CCD 121 to output an analog RGB image signal for a still image.
- the A/D converter 124 of the F/E 120 converts it to the RAW-RGB data as in the preview.
- the RAW-RGB data is transmitted to the CCD I/F 112 of the signal processor IC 110 , converted into the YUV data by the YUV converter 119 and stored in the SDRAM 33 via the memory controller 115 .
- the YUV data is changed in size by the resizing unit 113 in line with the number of pixels for recording and compressed into image data in JPEG form by the compressor/decompressor 117 .
- the compressed image data is written to the SDRAM 33 , read therefrom via the memory controller 115 and stored in the memory card 34 via the media I/F 118 .
- the present embodiment describes an example in which the imaging device is controlled to execute shutter control and shooting operation on the basis of face detection and rotational motion detection.
- the controller controls the power supply of the imaging device to turn off according to face detection and rotational motion detection, for example.
- FIG. 3 is a block diagram for automatic shooting control of the imaging device and FIG. 4 is a flowchart for the automatic shooting control executed by the control elements of the imaging device.
- Frames of continuous image data in time series are input to an image input section 101 for display on the LCD 10 with a predetermined time interval Δt, for example, 33 msec, as shown in FIG. 5 (step S 101).
- In FIG. 5, a second frame of image data is input a predetermined time Δt after a first frame of image data Fa.
- image data in time series are input and stored in the SDRAM 33 .
- a face detector 102 detects a subject's face from one frame of the image data input in sequence, for example, data frame Fa in step S 102 .
- the face detector 102 is configured to detect a face area 50 equivalent to a face 41 of a subject 40 .
- the face area 50 is surrounded by vertexes A to D with the central coordinate at O(x0, y0) in FIG. 5 .
- the algorithm for the face detection in step S 102 can be any known or novel one such as pattern matching. Further, the subject does not have to be a person and can be an animal instead. In this case a process in which the face of an animal is recognized is executed.
- a later-described arm gesture detector 103 can be configured to detect the arm or leg (rotating body part) of the animal in question using data on the pre-learning results of a learning model.
- step S 103 a determination is made on whether or not the face detector 102 has detected a face.
- the flow returns to the image input in step S 101 and next image data is processed.
- When a face is detected (Yes in step S 103), a rotational motion detector 103 performs rotational motion detection in step S 104.
- the rotational motion detection is described in detail with reference to FIG. 6 .
- the present embodiment describes an example in which the rotational motion of a hand or palm 44 of the subject is detected while the subject is moving an arm 43 horizontally around an elbow 42 as the center of rotation.
- this rotational motion is referred to as arm gesture.
- FIG. 7A shows image data before the arm gesture and FIG. 7B shows the same after the arm gesture. Note that an area indicated by a broken line in FIG. 7B is added to clarify how the arm has moved.
- the arm gesture is not limited to the rotation of the hand 44 around the elbow 42 . Instead, it can be the motion of the entire arm with a shoulder 45 as the center of rotation. Further, the rotary direction of the arm 43 should not be limited to the one in FIG. 7B , and the arm 43 can be held down from the elbow 42 . The arm motion vertical to the image can be detected as long as a rotational angle between the frames occurs.
- Gestures other than the arm gesture can be used as long as they are rotational motions around a base point.
- If the subject 40 holds a tool such as a rod or a flag and rotates it, the center of the rotation and the top end of the rotating tool can be detected.
- For detecting a gesture, relative motion data needs to be learned in advance by a learning model with a teacher to calculate discriminant formulas. Moreover, it is preferable to detect a plurality of kinds of gestures and allow the user to select one to control the imaging device.
- In step S 201, image data is input to the image input section 101.
- a first frame Fa of image data in which the face detector performs face detection and second and third frames Fb, Fc of image data in which the rotational motion detector 103 performs rotational motion detection are considered different image data.
- Image data Fb is defined to be image data input M-frames after the first image data Fa in FIG. 5 .
- The value M is a small value equal to or greater than 1.
- Image data is input at a frame rate of 33 msec, for example.
- Only one of the face detector 102 and the rotational motion detector 103 is preferably operated within a one-frame processing time.
- When image processing is performed by an image processing chip having a specific function, operating only one of them at a time can reduce power consumption.
- the image data Fa, Fb can be considered to be in the same frame.
- the arm gesture in the two image frames Fb, Fc input with a predetermined frame interval N is detected in FIG. 5 .
- the value of the frame interval N is preset as a parameter.
- the frame interval N can be set selectable and the minimal and maximal values thereof can be limited.
- the interval N needs to be properly set in accordance with the speed of rotational motion of a subject to be detected.
- If it is not, the rotational angle of an arm rotating at a certain angular velocity between the frames becomes too large, or that of an arm in reciprocating rotation becomes too small.
- The frame interval N is determined on the basis of the assumed angular velocity of the arm and the frame rate of 33 msec so that the rotational angle θ of the arm falls within 45 to 90 degrees. For example, supposing that the arm rotates through 45 to 90 degrees in about 0.5 second, the frame interval N will be 15.
- the frame interval N can be arbitrarily set to 1 to detect the arm gesture between continuous frames or to about several frames, for example.
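The arithmetic above can be sketched as follows; the 0.5-second sweep duration and 33 msec frame period are the example values from the embodiment, not fixed by the invention.

```python
# Sketch: derive the frame interval N from an assumed gesture speed.
# The 33 ms frame period and the ~0.5 s sweep over 45-90 degrees are the
# example values given in the embodiment above.
def frame_interval(gesture_duration_s=0.5, frame_period_s=0.033):
    """Number of frames spanned by one arm sweep."""
    return round(gesture_duration_s / frame_period_s)

print(frame_interval())  # 15
```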
- a search area setter 201 sets a motion vector search area 51 in which an arm motion vector as a candidate of rotational motion is detected, according to the detected face area 50 .
- the motion vector search area 51 is set in a movable area of the arm 43 in the frame, for example, in a predetermined pixel area around the face area 50 or the central coordinate O(x0, y0), as shown in FIG. 8 . It is possible to detect a candidate of rotational motion over the entire image by a later-described processing instead of setting the motion vector search area 51 . However, detection in the preset limited area is preferable to reduce throughput.
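As an illustration of deriving the search area from the detected face, the following sketch builds a margin-scaled box around the face center and clips it to the frame; the `margin` factor and the (x, y, w, h) box convention are assumptions, since the patent only specifies "a predetermined pixel area" around the face area or its central coordinate O(x0, y0).

```python
def search_area(face_box, image_size, margin=2.0):
    """Motion-vector search area around a detected face.

    face_box is (x, y, w, h) of the face area, image_size is (W, H).
    The box extends `margin` face-widths/heights from the face center
    O(x0, y0) and is clipped to the frame; `margin` is an assumed
    parameter, not specified in the patent.
    """
    (x, y, w, h), (W, H) = face_box, image_size
    cx, cy = x + w / 2, y + h / 2  # central coordinate O(x0, y0)
    return (max(0, int(cx - w * margin)), max(0, int(cy - h * margin)),
            min(W, int(cx + w * margin)), min(H, int(cy + h * margin)))
```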
- a block matching element 202 detects, as a motion vector, an area in which a motion has been detected between the two frames.
- the image data frames Fb, Fc are each divided into blocks 52 with a predetermined number of pixels (n by n) to find a motion amount or motion vector HH′ between the same portions of the two frames by block matching.
- Block matching can be conducted by any known or novel technique.
- the size of each block is a parameter which is properly set in accordance with a subject of rotational motion. In the present embodiment it is set so that the motion of the arm 43 is distinguishable. For example, it can be decided from the size of a face on the basis of a ratio of general face size and arm size. With the face area being 20 by 20 pixels, the block size can be 5 by 5 pixels.
- the starting and ending points of the motion vector are in the center block of the 5 by 5 pixel area by way of example.
- the block size can be arbitrarily set to an optimal value since the size of a subject changes in accordance with a focal length and a distance to the subject.
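A minimal sum-of-absolute-differences block matcher over n-by-n blocks might look like the following; the SAD criterion and the search radius are assumptions of this sketch, since the text allows "any known or novel technique".

```python
import numpy as np

def block_match(fb, fc, block=5, search=4):
    """Minimal SAD block matching between two grayscale frames Fb, Fc.

    Each frame is divided into block-by-block tiles; for every tile of Fb
    the best-matching tile of Fc within +/-search pixels is found by the
    sum of absolute differences (SAD). Returns {(y, x): (dy, dx)}.
    """
    h, w = fb.shape
    vectors = {}
    for y in range(0, h - block + 1, block):
        for x in range(0, w - block + 1, block):
            ref = fb[y:y + block, x:x + block].astype(int)
            best, best_v = None, (0, 0)
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy <= h - block and 0 <= xx <= w - block:
                        sad = np.abs(ref - fc[yy:yy + block, xx:xx + block].astype(int)).sum()
                        if best is None or sad < best:
                            best, best_v = sad, (dy, dx)
            vectors[(y, x)] = best_v
    return vectors
```

With the 5-by-5 block size suggested above, a bright patch shifted by two rows and one column between the frames yields the motion vector (2, 1) for its block.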
- a motion remover 203 removes a motion or a blur having occurred in the entire image.
- When the imaging device is fixed, for example on a tripod, a motion or a blur in the entire image hardly occurs. Otherwise, it is preferable to detect and remove the amount of motion in the entire image. This can improve the accuracy of the detected motion vectors.
- the motion vector detected by block matching is affected by the movement so that it needs to be canceled.
- the motion vector in the entire image can be calculated using the motion vectors Va, Vb in areas other than the motion vector search area 51 .
- the motion vector (Vx0, Vy0) in the entire image between the frames Fb, Fc at arbitrary coordinate (x, y) can be found by the above formula.
- The corrected motion vector (Vx − Vx0, Vy − Vy0) can be obtained by subtracting the motion vector (Vx0, Vy0) in the entire image from the motion vector (Vx, Vy) at the coordinate (x, y).
- the present embodiment describes an example where the motion vector in the entire image is found according to the motion vector in the area other than the motion vector search area 51 . However, it can be obtained from the motion vector in the motion vector search area 51 . Further, the motion vector in the entire image can be calculated by sampling a certain number of blocks 52 .
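The subtraction described above can be sketched as a purely translational correction (the affine and projective refinements apply when the camera rotates or tilts); the array layout of the vectors and the boolean background mask are assumptions of this sketch.

```python
import numpy as np

def remove_global_motion(vectors, background_mask):
    """Cancel whole-image motion (e.g. hand shake) from block motion vectors.

    vectors: (N, 2) array-like of (Vx, Vy) per block.
    background_mask: True for blocks outside the motion vector search area 51.
    The global motion (Vx0, Vy0) is estimated as the mean background vector
    and subtracted from every block, giving (Vx - Vx0, Vy - Vy0).
    """
    vectors = np.asarray(vectors, dtype=float)
    global_v = vectors[np.asarray(background_mask)].mean(axis=0)
    return vectors - global_v
```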
- the present embodiment describes an example where the motion amount in the parallel direction of the imaging device is corrected by affine transformation.
- it can be corrected by any known or novel technique.
- the motion amount in the vertical direction of the image can be corrected by projective transformation.
- a noise vector remover 204 removes an isolated motion vector as a noise vector from the motion vector search area 51 . This is to remove a motion vector considered to be not an actual arm motion in the motion vector search area 51 to prevent an erroneous detection of the arm motion.
- Noise vectors Vc, Vd, Ve are shown in FIG. 8 .
- motion vectors Vf, Vg in an area other than the motion vector search area 51 have to be also removed.
- the radius R can be arbitrarily set, for example, to 20 pixels.
- the pixel as a reference of the vector determination can be any pixel constituting the motion vector in question, for example, a starting point of the motion vector.
- the noise vectors Vc, Vd, Ve as determined above are excluded from the motion vectors.
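The isolated-vector test might be sketched as follows, keeping a vector only if another motion vector's starting point lies within the radius R (20 pixels in the example above); the list-of-pairs representation is an assumption.

```python
import math

def remove_isolated(vectors, radius=20):
    """Drop isolated motion vectors as noise.

    vectors: list of ((x, y), (vx, vy)) pairs; the starting point (x, y)
    is used as the reference pixel of each vector. A vector is kept only
    if at least one other vector starts within `radius` pixels of it.
    """
    kept = []
    for i, ((x, y), v) in enumerate(vectors):
        for j, ((x2, y2), _) in enumerate(vectors):
            if i != j and math.hypot(x - x2, y - y2) <= radius:
                kept.append(((x, y), v))
                break
    return kept
```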
- Through steps S 201 to S 205, the arm motion vectors as candidates of rotational motion are detectable.
- a rotation center/rotational angle calculator 205 calculates the coordinate of rotation center and a rotational angle on the basis of the detected motion vector.
- The coordinate O′(x1, y1) of the rotation center of the arm gesture and the rotational angle θ are calculated on the basis of the motion vectors H1H1′, H2H2′, H3H3′ detected from the frames Fb, Fc.
- Although only three motion vectors H1 to H3 are used for the sake of simple explanation, the actual number Hn of detected motion vectors will be 10 or more, for example.
- The coordinate of the rotation center O′(x1, y1) is defined as the point where the normal lines (indicated by broken lines in FIG. 10) passing perpendicularly through the midpoints of the arm-part motion vectors gather. When the intersection points of the normal lines do not coincide with one another, the coordinate of the rotation center O′(x1, y1) can be set to the average of the coordinates of the intersection points of neighboring normal lines.
- a rotation radius R 1 is a distance between the coordinate of rotation center O′(x1, y1) and each motion vector, for example, a distance between a starting point H 1 and an ending point H 1 ′.
- In the present embodiment, the rotational angle θ is defined to be 90 degrees or less. However, the device can be configured to detect a rotational angle of 90 degrees or more. Further, the rotational angle θ1 can be found from the coordinate of the rotation center O′(x1, y1) and the starting and ending points H1, H1′ of the motion vector.
- the rotary radiuses R 2 , R 3 and lengths L 2 , L 3 of the motion vectors H 2 H 2 ′, H 3 H 3 ′ are found from the motion vectors to calculate the rotational angles ⁇ 2 , ⁇ 3 , respectively.
- the average value of the rotational angles ⁇ 1 , ⁇ 2 , ⁇ 3 is found as the arm rotational angle ⁇ .
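The radius and angle computation can be sketched as below. Treating the motion-vector length Li as an arc length gives θi = Li/Ri in radians, a small-angle approximation of the true rotational angle; the helper name and input format are assumptions, not the patent's code:

```python
import math

def rotation_angle(center, vectors):
    """Average per-vector rotational angle theta_i = L_i / R_i (radians),
    where R_i is the distance from the rotation center O' to the vector's
    starting point and L_i is the motion-vector length."""
    cx, cy = center
    thetas = []
    for (px, py), (qx, qy) in vectors:
        r = math.hypot(px - cx, py - cy)   # rotation radius R_i
        l = math.hypot(qx - px, qy - py)   # motion-vector length L_i
        thetas.append(l / r)               # theta_i = L_i / R_i
    return sum(thetas) / len(thetas)       # arm rotational angle theta
```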
- a rotational motion discriminator 206 determines whether or not the detected candidate of rotational motion is an actual arm gesture, from the relation between the position of the center of the arm rotation and that of the center of the face area, and from the characteristic amount of the rotational angle.
- a positional shift amount is calculated from the central coordinate O(x0, y0) of the face area and the central coordinate O′(x1, y1) of the arm rotation by the following equations:
- dx = x1 − x0
- dy = y1 − y0
- the distances dx, dy and the rotational angle θ are defined to be the characteristic amount for arm gesture detection in step S 207 .
- the characteristic amount is normalized in step S 208 .
- the size of a subject differs depending on the zoom ratio of the imaging device. For example, the entire image is changed in size and normalized so that the average facial size becomes 20 by 20 pixels. Thus, even with a change in a subject size, a common recognition dictionary can be used irrespective of zoom ratio to accurately recognize the arm rotational motion.
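One way to realize this normalization is to rescale the positional features directly by the ratio of a reference face size (20 by 20 pixels in the example above) to the detected face size, which has the same effect on dx and dy as resizing the whole image. The function and parameter names in this sketch are assumptions:

```python
def normalize_features(dx, dy, theta, face_w, face_h, ref=20.0):
    """Rescale the positional shifts as if the detected face were
    ref x ref pixels; the angle theta is scale-invariant and unchanged."""
    sx, sy = ref / face_w, ref / face_h
    return dx * sx, dy * sy, theta
```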
- in step S 209 , a determination is made on whether or not the candidate is an arm gesture, using the characteristic amount and a discriminant formula calculated in advance by machine learning.
- sample data (video data) of arm gestures are collected to create an arm gesture recognition dictionary as data learned by a learning model, using sample data of the positional relation between the center of the face area and the rotation center of the arm gesture, and of the arm rotational angle (O′(x1, y1), dx, dy, and θ).
- the coefficients A, B, and C are calculated by pre-learning. With the value f obtained by the discriminant function f(dx, dy, θ) being over a threshold th, the candidate is recognized as an arm gesture; with the value f being lower than the threshold th, it is recognized as a non-arm gesture.
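The decision step reduces to a one-line linear discriminant, sketched below. Any coefficient and threshold values used with this helper are purely illustrative; in the embodiment they would come from the pre-learned recognition dictionary:

```python
def is_arm_gesture(dx, dy, theta, A, B, C, th):
    """Linear discriminant f = A*dx + B*dy + C*theta compared
    against the threshold th; True means an arm gesture."""
    f = A * dx + B * dy + C * theta
    return f > th
```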
- FIG. 11 is a graph showing distributions of the characteristic amounts of arm gesture and non-arm gesture.
- the two variables dx and θ are used in the graph so that the recognition boundary is a line.
- the points z above the line are recognized as arm gestures while the points x below the line are recognized as non-arm gestures.
- with the three variables dx, dy, and θ, the distribution will be a three-dimensional space and the recognition boundary will be a plane.
- the characteristic amounts above the recognition plane are of arm gestures and those below the recognition plane are of non-arm gestures.
- the present embodiment describes an example where the linear SVM is used as a learning model for creating the recognition dictionary.
- a non-linear discriminant function can be used, or other learning models such as AdaBoost can be used.
- the recognition dictionary is prepared in advance, and the discriminant function calculated from the recognition dictionary is stored in the memory unit of the imaging device.
- with the function value f being over the threshold th (Yes in step S 105 ), the presence of a desired arm gesture is determined. With the function value f being below the threshold th (No in step S 105 ), no desired arm gesture is determined to be present, so that the flow returns to the image input in step S 101 .
- a notifier 104 displays an indicator in step S 106 to notify a user, or a person as a subject, of the detection of the arm gesture.
- the notifier can be configured to notify the detection in various manners. For example, a not-shown LED light or any other type of light provided on the front face of the imaging device can be turned on. Also, text or marks can be displayed on the sub LCD or LCD 10 . Alternatively, the shutter can be released a predetermined time after the detection in step S 107 , instead of the notification by the notifier 104 in step S 106 .
- in step S 107 , a shutter controller 105 controls the shutter to release a predetermined time after the display of the indicator. Then, shooting and image recording are performed in step S 108 . These operations are the same as those triggered by a full press of the shutter button 2 , and the captured image data is stored in the memory card 34 , for instance.
- in shooting plural subjects, their faces are detected in step S 102 . Therefore, it is preferable to decide an order of priority over the subjects in advance so that the arm gesture on which the imaging control is based can be selected from plural arm gestures in accordance with the order of priority of the subjects in question. It can also be configured to detect, in step S 104 , the rotational motion of only a major subject with the highest priority.
- the priorities of subjects can be decided arbitrarily. For example, a face located at the center of the frame or a face in a largest size can be set to a major subject. Moreover, a plurality of priority patterns can be prepared to allow the user to select a desired pattern.
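The priority rules above might be sketched as follows, with faces given as dictionaries of bounding-box fields (a hypothetical representation, not the patent's data structure):

```python
def major_subject(faces, frame_w, frame_h, mode="largest"):
    """Pick the major subject: either the largest face, or the face
    whose center is closest to the frame center."""
    if mode == "largest":
        return max(faces, key=lambda f: f["w"] * f["h"])
    cx, cy = frame_w / 2.0, frame_h / 2.0
    # squared distance from the face center to the frame center
    return min(faces, key=lambda f: (f["x"] + f["w"] / 2 - cx) ** 2
                                    + (f["y"] + f["h"] / 2 - cy) ** 2)
```

Additional priority patterns could be added as further modes and offered to the user for selection.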
- the imaging device can automatically release the shutter at a user's desired timing by using face detection and arm motion detection as a trigger, even when a subject is far from the imaging device or a large number of subjects are shot concurrently so that the subjects' faces appear small.
- detecting the arm gesture of the subject after detecting the face makes it possible to improve the accuracy of the arm motion recognition and reduce erroneous recognition.
- the arm gesture can be found from the positional relation between the arm rotation center and the face center by simpler calculation and at a higher speed than by finding a hand position through pattern recognition and then recognizing an arm gesture.
- with such pattern recognition, the accuracy of detection may be decreased by a change in the shape of a hand.
- hand motion as a parallel shift, rather than an axial rotation, is not a subject of detection in the present embodiment, which can improve the accuracy of arm gesture detection.
- the coordinate of the center of the face area and the coordinate of the rotation center and rotational angle of motion are used as the characteristic amount data. This makes it unnecessary to individually set or record specific parameters such as arm length, arm rotational angle, and the distance between the centers of arm rotation and the face. Moreover, it is possible to accurately detect the hand or arm motion by a discriminant formula based on a dictionary of various arm gesture data, irrespective of the shape of a hand or arm, or a subject's age, gender, or body size. This arm gesture detection is unsusceptible to noise.
- the imaging device includes a notifier to notify the detection of arm gesture to outside before shooting. This makes it possible for a subject person to know the detection of the gesture and prepare his or her facial expression or posture for photo taking.
- the image input section 101 , face detector 102 , arm gesture detector 103 , notifier 104 , shutter controller 105 , search area setter 201 , block matching element 202 , motion remover 203 , noise vector remover 204 , rotation center/rotational angle calculator 205 , and rotational motion discriminator 206 can be realized by software or an imaging program executed by the CPU 111 of the imaging device. Necessary data for execution of the software or program are loaded, for example, on the SDRAM 33 .
- these elements can also be configured as modules, and a program to execute the functions of the face detector 102 and arm gesture detector 103 can be implemented in hardware.
- a non-transitory computer-readable medium storing the imaging program to cause the imaging device to execute the above operations can also be provided.
where (x1, y1) and (x0, y0) are the coordinates of corresponding points in the frames Fb, Fc. With a large number (6 or more) of corresponding points, the coefficients a to e can be calculated.
θ1=L1/R1
where L1 is a length of the motion vector H1H1′.
dx=x1−x0
dy=y1−y0
f=A*dx+B*dy+C*θ
Claims (20)
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2011097935 | 2011-04-26 | ||
JP2011-097935 | 2011-04-26 | ||
JP2012031360A JP6106921B2 (en) | 2011-04-26 | 2012-02-16 | Imaging apparatus, imaging method, and imaging program |
JP2012-031360 | 2012-02-16 |
Publications (2)
Publication Number | Publication Date |
---|---|
US20120275648A1 US20120275648A1 (en) | 2012-11-01 |
US8831282B2 true US8831282B2 (en) | 2014-09-09 |
Family
ID=47056017
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/441,252 Expired - Fee Related US8831282B2 (en) | 2011-04-26 | 2012-04-06 | Imaging device including a face detector |
Country Status (3)
Country | Link |
---|---|
US (1) | US8831282B2 (en) |
JP (1) | JP6106921B2 (en) |
CN (1) | CN102761706B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10404720B2 (en) | 2015-04-21 | 2019-09-03 | Alibaba Group Holding Limited | Method and system for identifying a human or machine |
US20220283020A1 (en) * | 2019-09-03 | 2022-09-08 | Shinkawa Ltd. | Vibration detection system |
Families Citing this family (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2012171190A1 (en) * | 2011-06-15 | 2012-12-20 | 青岛海信信芯科技有限公司 | Television, control method and control device for the television |
JP2013219556A (en) * | 2012-04-09 | 2013-10-24 | Olympus Imaging Corp | Imaging apparatus |
TWI454966B (en) * | 2012-04-24 | 2014-10-01 | Wistron Corp | Gesture control method and gesture control device |
US20130286227A1 (en) * | 2012-04-30 | 2013-10-31 | T-Mobile Usa, Inc. | Data Transfer Reduction During Video Broadcasts |
JP5872981B2 (en) * | 2012-08-02 | 2016-03-01 | オリンパス株式会社 | Shooting equipment, moving body shooting method, shooting program |
KR20140099111A (en) * | 2013-02-01 | 2014-08-11 | 삼성전자주식회사 | Method for control a camera apparatus and the camera apparatus |
CN103297696B (en) * | 2013-05-24 | 2016-12-28 | 小米科技有限责任公司 | Image pickup method, device and terminal |
US10694106B2 (en) | 2013-06-14 | 2020-06-23 | Qualcomm Incorporated | Computer vision application processing |
CN104349197B (en) * | 2013-08-09 | 2019-07-26 | 联想(北京)有限公司 | A kind of data processing method and device |
JPWO2015029620A1 (en) | 2013-08-27 | 2017-03-02 | オリンパス株式会社 | Imaging apparatus, imaging method, and imaging program |
CN104463782B (en) * | 2013-09-16 | 2018-06-01 | 联想(北京)有限公司 | Image processing method, device and electronic equipment |
US9727915B2 (en) * | 2013-09-26 | 2017-08-08 | Trading Technologies International, Inc. | Methods and apparatus to implement spin-gesture based trade action parameter selection |
CN103945107B (en) * | 2013-11-29 | 2018-01-05 | 努比亚技术有限公司 | Image pickup method and filming apparatus |
US11435895B2 (en) | 2013-12-28 | 2022-09-06 | Trading Technologies International, Inc. | Methods and apparatus to enable a trading device to accept a user input |
CN104754202B (en) * | 2013-12-31 | 2019-03-29 | 联想(北京)有限公司 | A kind of method and electronic equipment of Image Acquisition |
US20150201124A1 (en) * | 2014-01-15 | 2015-07-16 | Samsung Electronics Co., Ltd. | Camera system and method for remotely controlling compositions of self-portrait pictures using hand gestures |
CN105874284B (en) * | 2014-05-27 | 2019-11-12 | 松下电器(美国)知识产权公司 | The control method of sensor performed by conditioner |
JP2016140030A (en) | 2015-01-29 | 2016-08-04 | 株式会社リコー | Image processing apparatus, imaging device, and image processing program |
TWI555378B (en) * | 2015-10-28 | 2016-10-21 | 輿圖行動股份有限公司 | An image calibration, composing and depth rebuilding method of a panoramic fish-eye camera and a system thereof |
JP6134411B1 (en) * | 2016-03-17 | 2017-05-24 | ヤフー株式会社 | Information processing apparatus, information processing system, information processing method, and information processing program |
JP6977813B2 (en) * | 2016-05-18 | 2021-12-08 | ヤマハ株式会社 | Automatic performance system and automatic performance method |
US11182853B2 (en) | 2016-06-27 | 2021-11-23 | Trading Technologies International, Inc. | User action for continued participation in markets |
CN106681503A (en) * | 2016-12-19 | 2017-05-17 | 惠科股份有限公司 | Display control method, terminal and display device |
JP7020263B2 (en) | 2018-04-17 | 2022-02-16 | 富士通株式会社 | Body orientation estimation program, body orientation estimation device, and body orientation estimation method |
CN110460772B (en) * | 2019-08-14 | 2021-03-09 | 广州织点智能科技有限公司 | Camera automatic adjustment method, device, equipment and storage medium |
BR112022018723A2 (en) * | 2020-03-20 | 2022-12-27 | Huawei Tech Co Ltd | METHODS AND SYSTEMS FOR CONTROLLING A DEVICE BASED ON MANUAL GESTURES |
WO2023032274A1 (en) * | 2021-08-31 | 2023-03-09 | ソニーセミコンダクタソリューションズ株式会社 | Information processing device, information processing method, and program |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0738873A (en) | 1993-07-23 | 1995-02-07 | Atr Tsushin Syst Kenkyusho:Kk | Method for real-time recognition and composition of human image |
CN1940964A (en) | 2005-09-28 | 2007-04-04 | 欧姆龙株式会社 | Apparatus, method, recording media, and program for recognition |
US20090109304A1 (en) | 2007-10-29 | 2009-04-30 | Ricoh Company, Limited | Image processing device, image processing method, and computer program product |
US20090228841A1 (en) | 2008-03-04 | 2009-09-10 | Gesture Tek, Inc. | Enhanced Gesture-Based Image Manipulation |
US20100027906A1 (en) | 2008-07-29 | 2010-02-04 | Ricoh Company, Ltd. | Image processing unit, noise reduction method, program and storage medium |
US20100073497A1 (en) | 2008-09-22 | 2010-03-25 | Sony Corporation | Operation input apparatus, operation input method, and program |
US20100164862A1 (en) * | 2008-12-31 | 2010-07-01 | Lucasfilm Entertainment Company Ltd. | Visual and Physical Motion Sensing for Three-Dimensional Motion Capture |
US20100202693A1 (en) | 2009-02-09 | 2010-08-12 | Samsung Electronics Co., Ltd. | Apparatus and method for recognizing hand shape in portable terminal |
JP2011078009A (en) | 2009-10-01 | 2011-04-14 | Olympus Corp | Imaging device and program for imaging device |
US20110122264A1 (en) | 2009-11-24 | 2011-05-26 | Yuji Yamanaka | Imaging apparatus, image processing method, and computer program product |
WO2011142480A1 (en) | 2010-05-14 | 2011-11-17 | Ricoh Company, Ltd. | Imaging apparatus, image processing method, and recording medium for recording program thereon |
US20110298946A1 (en) | 2009-02-20 | 2011-12-08 | Haike Guan | Image processing apparatus, image pickup apparatus, image processing method, and computer program |
US20110310007A1 (en) * | 2010-06-22 | 2011-12-22 | Microsoft Corporation | Item navigation using motion-capture data |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002063579A (en) * | 2000-08-17 | 2002-02-28 | Hitachi Plant Eng & Constr Co Ltd | Device and method for analyzing image |
JP2011253292A (en) * | 2010-06-01 | 2011-12-15 | Sony Corp | Information processing system, method and program |
2012
- 2012-02-16 JP JP2012031360A patent/JP6106921B2/en active Active
- 2012-04-06 US US13/441,252 patent/US8831282B2/en not_active Expired - Fee Related
- 2012-04-26 CN CN201210126650.1A patent/CN102761706B/en active Active
Patent Citations (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0738873A (en) | 1993-07-23 | 1995-02-07 | Atr Tsushin Syst Kenkyusho:Kk | Method for real-time recognition and composition of human image |
CN1940964A (en) | 2005-09-28 | 2007-04-04 | 欧姆龙株式会社 | Apparatus, method, recording media, and program for recognition |
US20090109304A1 (en) | 2007-10-29 | 2009-04-30 | Ricoh Company, Limited | Image processing device, image processing method, and computer program product |
US20090228841A1 (en) | 2008-03-04 | 2009-09-10 | Gesture Tek, Inc. | Enhanced Gesture-Based Image Manipulation |
US20100027906A1 (en) | 2008-07-29 | 2010-02-04 | Ricoh Company, Ltd. | Image processing unit, noise reduction method, program and storage medium |
JP2010074735A (en) | 2008-09-22 | 2010-04-02 | Sony Corp | Operation input apparatus, operation input method, and program |
US20100073497A1 (en) | 2008-09-22 | 2010-03-25 | Sony Corporation | Operation input apparatus, operation input method, and program |
US20100164862A1 (en) * | 2008-12-31 | 2010-07-01 | Lucasfilm Entertainment Company Ltd. | Visual and Physical Motion Sensing for Three-Dimensional Motion Capture |
US20100202693A1 (en) | 2009-02-09 | 2010-08-12 | Samsung Electronics Co., Ltd. | Apparatus and method for recognizing hand shape in portable terminal |
US20110298946A1 (en) | 2009-02-20 | 2011-12-08 | Haike Guan | Image processing apparatus, image pickup apparatus, image processing method, and computer program |
JP2011078009A (en) | 2009-10-01 | 2011-04-14 | Olympus Corp | Imaging device and program for imaging device |
US20110122264A1 (en) | 2009-11-24 | 2011-05-26 | Yuji Yamanaka | Imaging apparatus, image processing method, and computer program product |
WO2011142480A1 (en) | 2010-05-14 | 2011-11-17 | Ricoh Company, Ltd. | Imaging apparatus, image processing method, and recording medium for recording program thereon |
JP2011244046A (en) | 2010-05-14 | 2011-12-01 | Ricoh Co Ltd | Imaging apparatus, image processing method, and program storage medium |
US20110310007A1 (en) * | 2010-06-22 | 2011-12-22 | Microsoft Corporation | Item navigation using motion-capture data |
Non-Patent Citations (3)
Title |
---|
Chinese Office Action dated May 5, 2014, issued in Chinese Patent Application No. 201210126650.1 (with English translation). |
Machine translation of JPN 2011-078009. * |
U.S. Appl. No. 13/519,610, filed Jun. 28, 2012, Guan. |
Also Published As
Publication number | Publication date |
---|---|
JP2012239156A (en) | 2012-12-06 |
CN102761706A (en) | 2012-10-31 |
CN102761706B (en) | 2014-10-01 |
JP6106921B2 (en) | 2017-04-05 |
US20120275648A1 (en) | 2012-11-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8831282B2 (en) | Imaging device including a face detector | |
EP2273450B1 (en) | Target tracking and detecting in images | |
CN100493147C (en) | Image capturing device having a hand shake correction function and hand shake correction method | |
KR101431535B1 (en) | Apparatus and method for picturing image using function of face drecognition | |
JP5159515B2 (en) | Image processing apparatus and control method thereof | |
JP4819001B2 (en) | Imaging apparatus and method, program, image processing apparatus and method, and program | |
JP6184189B2 (en) | SUBJECT DETECTING DEVICE AND ITS CONTROL METHOD, IMAGING DEVICE, SUBJECT DETECTING DEVICE CONTROL PROGRAM, AND STORAGE MEDIUM | |
US8350918B2 (en) | Image capturing apparatus and control method therefor | |
JP4732299B2 (en) | Method for detecting specific subject image and digital camera | |
US8648960B2 (en) | Digital photographing apparatus and control method thereof | |
US20100329552A1 (en) | Method and apparatus for guiding user with suitable composition, and digital photographing apparatus | |
JP2009268086A (en) | Imaging apparatus | |
KR20140096843A (en) | Digital photographing apparatus and control method thereof | |
JP5105616B2 (en) | Imaging apparatus and program | |
KR101817659B1 (en) | Digital photographing apparatus and method of controlling the same | |
KR101599871B1 (en) | Photographing apparatus and photographing method | |
JP2011095985A (en) | Image display apparatus | |
JP5109853B2 (en) | Electronic camera | |
US20230148125A1 (en) | Image processing apparatus and method, and image capturing apparatus | |
JP5448868B2 (en) | IMAGING DEVICE AND IMAGING DEVICE CONTROL METHOD | |
EP2690859B1 (en) | Digital photographing apparatus and method of controlling same | |
JP2009246700A (en) | Imaging apparatus | |
JP5375943B2 (en) | Imaging apparatus and program thereof | |
JP2009130840A (en) | Imaging apparatus, control method thereof ,and program | |
JP2012120003A (en) | Imaging device, imaging device control method, and control program of the same |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: RICOH COMPANY, LTD., JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:GUAN, HAIKE;REEL/FRAME:028005/0862 Effective date: 20120323 |
|
FEPP | Fee payment procedure |
Free format text: PAYOR NUMBER ASSIGNED (ORIGINAL EVENT CODE: ASPN); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551) Year of fee payment: 4 |
|
FEPP | Fee payment procedure |
Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
LAPS | Lapse for failure to pay maintenance fees |
Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY |
|
STCH | Information on status: patent discontinuation |
Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362 |
|
FP | Lapsed due to failure to pay maintenance fee |
Effective date: 20220909 |