CN101262561B - Imaging apparatus and control method thereof - Google Patents


Info

Publication number
CN101262561B
Authority
CN
China
Prior art keywords
imaging device
image data
extraction
image
human figure
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2008100825795A
Other languages
Chinese (zh)
Other versions
CN101262561A (en)
Inventor
大石诚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fujifilm Corp
Original Assignee
Fujifilm Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from JP2007053646A (JP2008219451A)
Priority claimed from JP2007053645A (JP2008219450A)
Application filed by Fujifilm Corp
Publication of CN101262561A
Application granted
Publication of CN101262561B

Abstract

The invention discloses an imaging apparatus and a control method thereof. A digital camera has human extraction means, non-human extraction means, and composition judgment means. The human extraction means extracts a human figure region by analysis of image data. The non-human extraction means extracts a major subject other than a human figure by analysis of a region other than the human figure region having been extracted by the human extraction means. The composition judgment means evaluates arrangement of the human figure and the major subject according to results of the extraction, and judges whether composition is appropriate. Based on whether the composition is appropriate, timing to record the image data is determined. Preferably, recording means is controlled so as to record the image data at the determined timing, or the timing is notified to a user.

Description

Imaging apparatus and control method thereof
Technical field
The present invention relates to an imaging apparatus that performs shooting control based on the composition of the image to be captured, and to a control method thereof. The invention also relates to an imaging apparatus that performs automatic shooting using voice as a trigger, and to a control method thereof.
Background art
The AE (automatic exposure) and AF (automatic focus) functions of digital cameras improve year by year, so that even people unfamiliar with camera operation can capture sharp images with vivid color. However, how the picture is framed and when the shutter release button is pressed still depend on the photographer's skill. It therefore remains difficult for a beginner to capture an image with an appropriate composition.
To address this problem, Japanese Unexamined Patent Publication No. 2001-051338 discloses a camera that controls the recording operation by recognizing the orientation of a person's face and judging whether the face points in a predetermined direction. However, that publication discloses only a method of controlling shooting for the case of photographing a single person; it does not disclose shooting control for photographing a plurality of people, or subjects other than people. Meanwhile, Japanese Unexamined Patent Publication No. 2006-203346 discloses a camera that sets shooting conditions by analyzing the composition of the photographed scene. Although that publication proposes shooting control that, in addition to detecting human faces, detects the sky and the inclination of the captured image, it does not disclose a concrete method of controlling shooting for subjects other than human faces and the sky.
As another approach to the above problem, attempts have been made to obtain an image with an appropriate composition by controlling the timing of image-data recording based on images obtained before the shutter release button is pressed. Japanese Unexamined Patent Publication No. 2000-196934 discloses an imaging apparatus that watches a predetermined portion of the captured image and operates the shutter when that portion changes. However, this apparatus performs control by watching only a portion specified by the user, and does not perform shooting control that considers the composition of the entire image.
In addition, as yet another approach, digital cameras have been proposed that perform automatic shooting using a specific sound as a trigger indicating the moment to shoot. For example, Japanese Unexamined Patent Publication No. 2006-184589 discloses a digital camera that obtains an image by automatically performing a shooting operation when a specific phrase is recognized in the input from the camera's microphone. Although a digital camera that performs automatic shooting with voice as a trigger is convenient, such a camera sometimes performs unnecessary operations in response to unrelated voices. For example, in a crowded place such as a tourist attraction, the camera may respond to the voice of a nearby stranger. When taking a group photo, the camera may also shoot the moment someone says "say cheese," even though preparation for the shot is not complete.
Summary of the invention
The present invention solves the above problems of the prior art. An object of the present invention is to provide an imaging apparatus that enables a beginner to easily capture an image with an appropriate composition. Another object is to eliminate the inconvenience of unnecessarily performed automatic shooting operations while retaining the ease of use that voice-triggered automatic shooting provides.
To achieve the above objects, the present invention provides three types of imaging apparatus.
The first imaging apparatus comprises imaging means for generating image data representing a scene by photographing the scene, and recording means for recording the image data generated by the imaging means in a predetermined recording medium. The first imaging apparatus further comprises human extraction means, non-human extraction means, composition judgment means, and recording-timing determination means, all of which are described below.
The human extraction means extracts one or more image regions representing one or more human figures by analyzing the image data generated by the imaging means. For example, the human extraction means searches the image data for human faces, and outputs, as the extraction result, information representing the number of faces detected by the search and the position and size of each face. In this case, the facial expression of each face detected in the search may also be recognized, and information representing the recognized expression may additionally be output. Furthermore, the human extraction means may recognize the posture of each human figure included in the image data, and may output information representing the recognized posture as the extraction result.
The non-human extraction means extracts a major subject other than the human figures by analyzing, in the image data generated by the imaging means, the image region other than the one or more image regions extracted by the human extraction means. For example, the non-human extraction means extracts the major subject by filtering the image data with a high-pass filter. Alternatively, the non-human extraction means may recognize a predetermined, pre-registered target among the objects included in the image data and extract that target as the major subject. The non-human extraction means may also extract the major subject by combining the two methods described above.
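The high-pass filtering approach just described can be sketched as follows: pixels with strong local intensity changes (edges and detail) that lie outside the already-extracted human-figure rectangles become candidates for the major subject. This is a minimal illustration of the idea only, with an assumed 3x3 filter and threshold, not the patent's actual implementation.

```python
def high_pass(image):
    """3x3 Laplacian-style high-pass filter on a 2D list of gray values."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = abs(
                4 * image[y][x]
                - image[y - 1][x] - image[y + 1][x]
                - image[y][x - 1] - image[y][x + 1]
            )
    return out

def candidate_subject_pixels(image, human_regions, threshold=40):
    """Return (x, y) pixels with high-frequency content outside human regions.

    human_regions: list of (x, y, w, h) rectangles from the human extraction.
    """
    def inside_human(x, y):
        return any(rx <= x < rx + rw and ry <= y < ry + rh
                   for rx, ry, rw, rh in human_regions)

    hp = high_pass(image)
    return [(x, y)
            for y in range(len(image))
            for x in range(len(image[0]))
            if hp[y][x] > threshold and not inside_human(x, y)]
```

In a real camera the surviving pixels would then be clustered into a bounding box for the major subject; here only the filtering-and-masking step is shown.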
The composition judgment means evaluates, according to the extraction result of the human extraction means and the extraction result of the non-human extraction means, whether the arrangement of the one or more human figures and the major subject other than the human figures satisfies a predetermined condition, and judges from the evaluation of the arrangement whether the composition of the image data is appropriate.
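One plausible "predetermined condition" for such a judgment is the rule of thirds: the human figure and the major subject should each sit near a thirds intersection of the frame, and should not coincide. The rule-of-thirds criterion and the tolerance are assumptions for illustration; the patent leaves the concrete condition open.

```python
def near_third_point(center, frame_w, frame_h, tolerance=0.1):
    """True if center (x, y) lies near one of the four thirds intersections."""
    cx, cy = center
    thirds_x = (frame_w / 3, 2 * frame_w / 3)
    thirds_y = (frame_h / 3, 2 * frame_h / 3)
    return any(abs(cx - tx) <= tolerance * frame_w and
               abs(cy - ty) <= tolerance * frame_h
               for tx in thirds_x for ty in thirds_y)

def composition_ok(human_center, subject_center, frame_w, frame_h):
    """Judge the arrangement of the human figure and the major subject."""
    return (near_third_point(human_center, frame_w, frame_h) and
            near_third_point(subject_center, frame_w, frame_h) and
            human_center != subject_center)
```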
The recording-timing determination means determines the timing of recording the image data based on the judgment result of the composition judgment means.
In an embodiment of the present invention, in addition to the imaging means, recording means, human extraction means, non-human extraction means, composition judgment means, and recording-timing determination means, the first imaging apparatus further comprises recording control means for controlling the recording means so as to record the image data at the timing determined by the recording-timing determination means. In this embodiment, the image data is recorded automatically when an image with an appropriate composition is captured. Therefore, an image with an appropriate composition can always be obtained regardless of the photographer's skill.
In another embodiment of the present invention, the first imaging apparatus comprises notification means for giving notification of the timing determined by the recording-timing determination means. On receiving the notification, the photographer knows the timing at which an image with an appropriate composition can be obtained. By pressing the shutter release button at the notified timing, the photographer can therefore easily obtain an image with an appropriate composition.
The first imaging apparatus of the present invention may further comprise: composition proposal means for determining, using the extraction results of the human extraction means and the non-human extraction means, an arrangement of the one or more human figures and the major subject that satisfies the predetermined condition; and imaging control means for controlling the operation of the imaging means so as to generate image data in which the one or more human figures and the major subject are arranged in the arrangement determined by the composition proposal means. In a configuration comprising the composition proposal means and the imaging control means, when the composition of a captured image is inappropriate, the operation of the imaging means (such as zooming) is changed accordingly. The composition is thereby improved, and an image with an appropriate composition can be obtained quickly.
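The imaging-control side of this idea can be reduced to a small calculation: given the bounding box covering the human figure and the major subject together, pick a zoom factor that makes the pair fill a target fraction of the frame. The 0.6 fill ratio and the clamping range are illustrative assumptions, not values from the patent.

```python
def propose_zoom(group_w, group_h, frame_w, frame_h,
                 target_fill=0.6, max_zoom=3.0):
    """Zoom factor so the group's larger frame-relative extent ~= target_fill."""
    fill = max(group_w / frame_w, group_h / frame_h)
    zoom = target_fill / fill
    # Clamp: never zoom out below 1x, never exceed the lens's assumed range.
    return max(1.0, min(zoom, max_zoom))
```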
The first imaging apparatus of the present invention may further comprise: the composition proposal means described above; and image processing means for performing image processing on the image data so that the arrangement of the one or more human figures and the major subject agrees with the arrangement determined by the composition proposal means. In a configuration comprising the composition proposal means and the image processing means, when the composition of a captured image is inappropriate, an image with an appropriate composition can be generated automatically through image processing. An image with an appropriate composition can therefore be obtained quickly.
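One simple form such composition-correcting image processing could take is a crop that moves the subject onto a thirds intersection of the output frame. The patent does not fix a particular image-processing method; placing the subject on the upper-left thirds point is an assumption for illustration.

```python
def crop_to_thirds(subject_center, frame_w, frame_h, crop_w, crop_h):
    """Return (left, top) of a crop_w x crop_h window that puts the subject
    on the upper-left thirds point of the cropped image, clamped to the frame."""
    cx, cy = subject_center
    left = cx - crop_w / 3   # thirds point is crop_w/3 from the crop's left edge
    top = cy - crop_h / 3
    left = max(0, min(left, frame_w - crop_w))
    top = max(0, min(top, frame_h - crop_h))
    return left, top
```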
Preferably, said tape deck writes down the extraction result of said person extraction device and said non-personage's extraction element with view data in said recording medium.Like this, when the view data of reference record on personal computer etc. in recording medium, can extract the result and the edited image data through using.
In addition to the means described above, the first imaging apparatus may further comprise voice analysis means for detecting a predetermined characteristic related to voice by analyzing input voice. For example, the voice analysis means detects, as the predetermined characteristic, a predetermined change in volume, a predetermined phrase, or a characteristic registered in advance as the voice characteristic of a predetermined person. In this case, the composition judgment means judges whether the composition of the image data is appropriate based on the predetermined characteristic detected by the voice analysis means and on the evaluation of the arrangement. Preferably, in this case, the recording means records the extraction results of the human extraction means and the non-human extraction means and the detection result of the voice analysis means in the recording medium together with the image data. By considering voice in addition to the evaluation performed when judging the composition, the image data can be recorded at a more appropriate time.
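Of the example characteristics listed above, the "predetermined change in volume" is the easiest to sketch: split the audio samples into frames, compute each frame's RMS level, and report a sudden jump (e.g. the group shouting "cheese"). The frame length and the 4x jump ratio are illustrative assumptions.

```python
def rms(frame):
    """Root-mean-square level of one frame of audio samples."""
    return (sum(s * s for s in frame) / len(frame)) ** 0.5

def detect_volume_jump(samples, frame_len=4, jump_ratio=4.0):
    """Index of the first frame whose RMS jumps by jump_ratio, else None."""
    frames = [samples[i:i + frame_len]
              for i in range(0, len(samples) - frame_len + 1, frame_len)]
    levels = [rms(f) for f in frames]
    for i in range(1, len(levels)):
        if levels[i - 1] > 0 and levels[i] / levels[i - 1] >= jump_ratio:
            return i
    return None
```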
The first control method of the present invention causes an imaging apparatus to operate as the first imaging apparatus described above by controlling the apparatus in the following manner. First, an image region representing a human figure is extracted by analyzing the image data generated by the imaging means. A major subject other than the human figure is then extracted by analyzing, in the image data generated by the imaging means, the image region other than the image region representing the human figure. Whether the arrangement of the extracted human figure and the major subject satisfies a predetermined condition is evaluated, and whether the composition of the image data is appropriate is judged based on the evaluation of the arrangement. The timing of recording the image data is then determined based on the result of the composition judgment. In one embodiment, the recording means is controlled so as to record the image data at the determined timing. In another embodiment, the determined timing is notified to the user by controlling the operation of a predetermined output device such as a monitor, a loudspeaker, or a lamp.
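The sequence of this control method (extract humans, extract the major subject, judge the arrangement, time the recording) can be chained into a single check that the camera polls on every preview frame. All component functions here are hypothetical placeholders standing in for the means described above, not implementations from the patent.

```python
def ready_to_record(frame, extract_humans, extract_subject, judge):
    """Return True when the current preview frame should be recorded."""
    human_regions = extract_humans(frame)            # step 1: human extraction
    subject = extract_subject(frame, human_regions)  # step 2: non-human extraction
    return judge(human_regions, subject)             # steps 3-4: judge arrangement
```

A caller would invoke this once per preview frame and either trigger the recording means or notify the user the first time it returns True.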
The second imaging apparatus of the present invention comprises imaging means for generating image data representing a scene by photographing the scene, and recording means for recording the image data generated by the imaging means in a predetermined recording medium. The second imaging apparatus has human extraction means, voice analysis means, composition judgment means, and recording-timing determination means, all of which are described below.
The human extraction means extracts one or more image regions representing one or more human figures by analyzing the image data generated by the imaging means. For example, the human extraction means searches the image data for human faces, and outputs, as the extraction result, information representing the number of faces detected by the search and the position and size of each face. In this case, the facial expression of each face detected in the search may also be recognized, and information representing the recognized expression may additionally be output. Furthermore, the human extraction means may recognize the posture of each human figure included in the image data, and may output information representing the recognized posture as the extraction result.
The voice analysis means detects a predetermined characteristic related to voice by analyzing input voice. For example, the voice analysis means detects, as the predetermined characteristic, a predetermined change in volume, a predetermined phrase, or a characteristic registered in advance as the voice characteristic of a predetermined person. The composition judgment means judges whether the composition of the image data is appropriate based on the extraction result of the human extraction means and the detection result of the voice analysis means.
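In effect, the second apparatus's judgment is a gate: the voice trigger fires a recording only when the human-extraction result also satisfies the composition condition. The particular condition used here (an expected face count, all faces forward) is an assumption for illustration; the patent leaves the condition open.

```python
def should_record(face_count, all_faces_forward, voice_trigger_detected,
                  expected_faces):
    """Record only when the expected group is properly framed AND the
    voice trigger fired; a stranger's voice alone never records."""
    composition_ok = (face_count == expected_faces and all_faces_forward)
    return composition_ok and voice_trigger_detected
```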
The recording-timing determination means determines the timing of recording the image data based on the judgment result of the composition judgment means.
In an embodiment of the present invention, in addition to the imaging means, recording means, human extraction means, voice analysis means, composition judgment means, and recording-timing determination means, the second imaging apparatus further comprises recording control means for controlling the recording means so as to record the image data at the timing determined by the recording-timing determination means. In this embodiment, even when a voice that would trigger automatic shooting is produced, automatic shooting is not performed if the composition does not satisfy the predetermined condition. There is therefore no need to worry about unnecessary shooting in response to voice.
In another embodiment, the second imaging apparatus comprises notification means for giving notification of the timing determined by the recording-timing determination means. In this embodiment, the second imaging apparatus does not perform automatic shooting. Instead, when the composition satisfies the predetermined condition and a voice with the predetermined characteristic is produced, the apparatus notifies the user of the timing at which to press the shutter release button. The user therefore benefits from the same convenience as automatic shooting. In addition, since the shooting operation is not executed automatically, the apparatus never performs an unnecessary operation against the user's intention.
Preferably, the recording means records the extraction result of the human extraction means and the detection result of the voice analysis means in the recording medium together with the image data. In this way, when the image data recorded in the recording medium is viewed on a personal computer or the like, the image data can be edited using these results.
The second control method of the present invention causes an imaging apparatus to operate as the second imaging apparatus described above by controlling the apparatus in the following manner. First, an image region representing a human figure is extracted by analyzing the image data generated by the imaging means. In parallel with the extraction, a predetermined characteristic related to voice is detected by analyzing input voice. Thereafter, whether the composition of the image data is appropriate is judged based on the result of the extraction and the result of the detection. The timing of recording the image data is then determined based on the result of the judgment. In one embodiment, the recording means is controlled so as to record the image data at the determined timing. In another embodiment, the determined timing is notified to the user by controlling the operation of a predetermined output device.
The third imaging apparatus of the present invention comprises imaging means for generating image data representing a scene by photographing the scene, and recording means for recording the image data generated by the imaging means in a predetermined recording medium. The third imaging apparatus further comprises human extraction means, non-human extraction means, composition judgment means, and composition proposal means, all of which are described below.
The human extraction means extracts one or more image regions representing one or more human figures by analyzing the image data generated by the imaging means. For example, the human extraction means searches the image data for human faces, and outputs, as the extraction result, information representing the number of faces detected by the search and the position and size of each face. In this case, the facial expression of each face detected in the search may also be recognized, and information representing the recognized expression may additionally be output. Furthermore, the human extraction means may recognize the posture of each human figure included in the image data, and may output information representing the recognized posture as the extraction result.
The non-human extraction means extracts a major subject other than the human figures by analyzing, in the image data generated by the imaging means, the image region other than the one or more image regions extracted by the human extraction means. For example, the non-human extraction means extracts the major subject by filtering the image data with a high-pass filter. Alternatively, the non-human extraction means may recognize a predetermined, pre-registered target among the objects included in the image data and extract that target as the major subject. The non-human extraction means may also extract the major subject by combining the two methods described above.
The composition judgment means evaluates, according to the extraction result of the human extraction means and the extraction result of the non-human extraction means, whether the arrangement of the one or more human figures and the major subject other than the human figures satisfies a predetermined condition, and judges from the evaluation of the arrangement whether the composition of the image data is appropriate. The composition proposal means determines, using the extraction results of the human extraction means and the non-human extraction means, an arrangement of the one or more human figures and the major subject that satisfies the predetermined condition.
In an embodiment of the present invention, in addition to the imaging means, recording means, human extraction means, non-human extraction means, composition judgment means, and composition proposal means, the third imaging apparatus further comprises imaging control means for controlling the operation of the imaging means so as to generate image data in which the one or more human figures and the major subject are arranged in the arrangement determined by the composition proposal means. According to the imaging apparatus of this embodiment, when the composition of a captured image is inappropriate, the operation of the imaging means (such as the magnification ratio) is changed accordingly. The composition is therefore improved automatically.
In another embodiment of the present invention, the third imaging apparatus comprises image processing means for performing image processing on the image data so that the arrangement of the one or more human figures and the major subject agrees with the arrangement determined by the composition proposal means. According to the imaging apparatus of this embodiment, when the composition of a captured image is inappropriate, an image with the preferred composition can be generated automatically through image processing, improving the composition.
The third imaging apparatus may comprise recording control means for determining the timing of recording the image data according to the judgment result of the composition judgment means, and for controlling the recording means so as to record the image data at the determined timing. In a configuration with the recording control means, the image data is recorded automatically when an image with an appropriate composition is obtained. Therefore, an image with an appropriate composition can always be obtained regardless of the photographer's skill.
Alternatively, the third imaging apparatus may comprise, instead of the recording control means, notification means for determining the timing of recording the image data according to the judgment result of the composition judgment means and for giving notification of the determined timing. In a configuration with the notification means, the photographer is notified of the timing at which an image with an appropriate composition can be captured. By pressing the shutter release button at the notified timing, the photographer can obtain an image with an appropriate composition.
Preferably, the recording means records the extraction results of the human extraction means and the non-human extraction means in the recording medium together with the image data. In this way, when the image data recorded in the recording medium is viewed on a personal computer or the like, the image data can be edited using the extraction results.
In addition to the means described above, the third imaging apparatus may further comprise voice analysis means for detecting a predetermined characteristic related to voice by analyzing input voice. For example, the voice analysis means detects, as the predetermined characteristic, a predetermined change in volume, a predetermined phrase, or a characteristic registered in advance as the voice characteristic of a predetermined person. In this case, the composition judgment means judges whether the composition of the image data is appropriate based on the predetermined characteristic detected by the voice analysis means and on the evaluation of the arrangement described above. Preferably, in this case, the recording means records the extraction results of the human extraction means and the non-human extraction means and the detection result of the voice analysis means in the recording medium together with the image data. By considering voice in addition to the evaluation performed when judging the composition, the image data can be recorded at a more appropriate time.
The third control method of the present invention causes an imaging apparatus to operate as the third imaging apparatus described above by controlling the apparatus in the following manner. First, an image region representing a human figure is extracted by analyzing the image data generated by the imaging means. Subsequently, a major subject other than the human figure is extracted from the image data generated by the imaging means by analyzing the image region other than the image region representing the human figure. Thereafter, whether the arrangement of the extracted human figure and the major subject satisfies a predetermined condition is evaluated, and whether the composition of the image data is appropriate is judged based on the evaluation of the arrangement. Then, an arrangement of the extracted human figure and the major subject that satisfies the predetermined condition is determined. In one embodiment, the operation of the imaging means is controlled so as to generate image data in which the human figure and the major subject are arranged in the determined arrangement. In another embodiment, image processing is performed on the image data so that the arrangement of the human figure and the major subject agrees with the determined arrangement.
Brief description of the drawings
Figure 1A is a front perspective view of a digital camera;
Figure 1B is a rear view of the digital camera;
Fig. 2 illustrates the internal configuration of the digital camera;
Fig. 3 is a flow chart illustrating the operation of the digital camera (automatic shooting mode);
Fig. 4 is a flow chart illustrating the operation of the digital camera (shooting assistance mode);
Fig. 5 illustrates an example of timing notification;
Fig. 6 illustrates another example of timing notification;
Fig. 7 illustrates an example of a display for assisting shooting;
Fig. 8 illustrates another example of the shooting assistance display;
Fig. 9 illustrates the configuration of a timing detection unit;
Figs. 10A to 10D illustrate face detection processing;
Figs. 11A to 11D illustrate facial expression recognition processing;
Figs. 12A and 12B illustrate posture recognition processing;
Figs. 13A to 13D illustrate extraction processing for non-human objects;
Fig. 14 illustrates an example of voice analysis;
Fig. 15 is a flow chart illustrating an example of composition judgment processing;
Figs. 16 and 17 illustrate the composition judgment processing;
Figs. 18A to 18C illustrate composition proposal processing; and
Fig. 19 illustrates an example of a screen for selecting an image to be recorded.
Embodiment
Hereinafter, a digital camera that performs operation control by selectively using different control methods is disclosed as an embodiment of the method and apparatus of the present invention. This digital camera has four operation modes: a normal shooting mode, an image playback mode, an automatic shooting mode, and a shooting assistance mode.
The configuration of the digital camera is described first. Figs. 1A and 1B are external views of a digital camera 1: Fig. 1A is a front perspective view of the camera 1, and Fig. 1B is a rear view. As shown in Figs. 1A and 1B, the digital camera 1 has a taking lens 2, a shutter release button 3, a microphone 4, an operation dial and operation buttons 5a to 5f, a monitor 6, and an LED lamp 9. A loudspeaker 8 and a slot cover (not shown) that can be opened and closed are located on the bottom of the digital camera 1. A card slot for inserting a memory card 7 is provided under the slot cover.
Fig. 2 illustrates the internal configuration of the digital camera 1. As shown in Fig. 2, the digital camera 1 comprises an imaging unit that includes the taking lens 2, a lens drive unit 16, an aperture 13, an aperture drive unit 17, a CCD 14, and a timing generator (TG) 18. The taking lens 2 comprises lenses for various functions, such as a focusing lens for focusing on a subject and a zoom lens for realizing a zoom function. Using a miniature motor such as a stepper motor, the lens drive unit 16 adjusts the position of each lens so that the distance to the CCD 14 is appropriate for shooting. The aperture 13 comprises a plurality of aperture blades. Using a miniature motor such as a stepper motor, the aperture drive unit 17 adjusts the positions of the aperture blades so that the aperture opening size is appropriate for shooting. The CCD 14 is a CCD of 5 to 12 megapixels with a primary color filter, and discharges its stored charge according to a command signal from the timing generator 18. The timing generator 18 sends a signal to the CCD 14 so that the CCD 14 stores charge for the desired length of time, thereby adjusting the shutter speed.
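The timing generator realizes the shutter speed electronically: the CCD accumulates charge for the exposure time t. As a worked example of how t relates to the exposure settings, the standard photographic relation EV = log2(N^2 / t) for exposure value EV and aperture number N gives t = N^2 / 2^EV. This is the general textbook relation, not a formula from the patent.

```python
def exposure_time(ev, f_number):
    """Exposure (charge-accumulation) time in seconds for a given EV and f-stop,
    from the standard relation EV = log2(N^2 / t)."""
    return f_number ** 2 / (2 ** ev)
```

For example, EV 11 at f/4 gives a charge-accumulation time of 1/128 s.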
The digital camera 1 also has: an A/D conversion unit 15 for converting the signal output from the CCD 14 into a digital signal; an image input control unit 23 for sending the image data output from the A/D conversion unit 15 to another processing unit via a system bus 24; and a memory 22 for temporarily storing the image data sent from the image input control unit 23.
The digital camera 1 further comprises: a focus adjustment unit 20 for focusing the lens by instructing the lens driving unit 16 to move the lens; and an exposure adjustment unit 21 for determining the aperture value and shutter speed and for sending command signals to the aperture driving unit 17 and the timing generator 18. The digital camera 1 also has an image processing unit 25 for carrying out image processing on the image data stored in the memory 22. The image processing unit 25 carries out various types of processing so that the image looks attractive, such as: color gradation correction and brightness correction, so that the image has natural color and brightness; red-eye correction for replacing any red eye included in the image data with an appropriate color; and processing for correcting the composition in the case where the image composition is inappropriate. The image data having been subjected to image processing by the image processing unit 25 are stored in the memory 22 again.
In addition, the digital camera 1 has a display control unit 26 for controlling output of the image data stored in the memory 22 to the monitor 6. The display control unit 26 "thins" the image data stored in the memory 22 to a pixel count suitable for display before outputting the image data to the monitor 6. The display control unit 26 also controls display of screens used for setting operation conditions and the like.
Furthermore, the digital camera 1 comprises a read/write control unit 27 for controlling writing of the image data stored in the memory 22 to the memory card 7 and loading of image data stored in the memory card 7 into the memory 22. According to settings made by the user, the read/write control unit 27 records the image data obtained by shooting in the memory card 7 as an Exif (Exchangeable Image File Format) file, either without compression or after carrying out compression coding thereon. Exif is a file format defined by the Japan Electronic Industries Development Association (JEIDA). When playback of an image file stored in the memory card 7 is requested, the read/write control unit 27 loads the image data in the Exif file into the memory 22. In the case where compression has been carried out on the image data, the read/write control unit 27 loads the image data into the memory 22 after decompressing the image data.
The digital camera 1 also comprises: an LED control unit 19 for carrying out ON/OFF control of the LED 9; and an audio input/output control unit 12 for carrying out input/output control of sound with respect to the microphone 4, the speaker 8, an A/D conversion unit 10, and a D/A conversion unit 11. The audio input/output control unit 12 sends audio data, input from the microphone 4 and converted into digital data by the A/D conversion unit 10, to the memory 22 via the system bus 24, where the audio data are stored. Audio data supplied to the audio input/output control unit 12 from each processing unit and from an overall control unit described later are converted by the D/A conversion unit 11 and output to the speaker 8.
The digital camera 1 comprises a timing detection unit 28 for detecting timing of image acquisition. The timing detection unit 28 analyzes the image data and audio data stored in the memory 22, and outputs a signal indicating the timing at which the data in the memory 22 satisfy a predetermined condition.
The digital camera 1 has an overall control unit 30, which comprises a CPU (Central Processing Unit) 31, a RAM (Random Access Memory) 32 storing operation/control programs, and an EEPROM (Electrically Erasable and Programmable Read Only Memory) 33 storing various set values. The CPU 31 of the overall control unit 30 refers to the set values stored in the EEPROM 33, and selects and executes one of the programs stored in the RAM 32 according to the set values. By detecting how the shutter release button 3 or the operation dials/buttons 5a to 5f are operated, or by receiving the result of processing from each processing unit, the overall control unit 30 sends an instruction signal indicating the processing to be carried out to the LED control unit 19, the focus adjustment unit 20, the exposure adjustment unit 21, the image input control unit 23, the image processing unit 25, the display control unit 26, the read/write control unit 27, the timing detection unit 28, or the audio input/output control unit 12. In this manner, the operation of the digital camera 1 is controlled.
In the ordinary shooting mode, the automatic shooting mode, and the shooting assistance mode, images are obtained through focus adjustment, exposure control, flash control, image processing, recording, and the like carried out by each processing unit under control of the overall control unit 30. In the playback mode, an image stored in the memory card 7 is output to the monitor 6 under control of the overall control unit 30. In a setting mode, a setting screen is displayed on the monitor 6 under control of the overall control unit 30, and operation input is received from the operation dials/buttons 5a to 5f. Information selected from the setting screen through the user's operation of the dials/buttons 5a to 5f, or information input from the memory card 7, is stored in the EEPROM 33.
The automatic shooting mode and the shooting assistance mode will now be described in further detail. Fig. 3 is a flow chart showing the operation of the digital camera 1 set in the automatic shooting mode. When set in the automatic shooting mode, the digital camera 1 starts generating image data representing the scene viewed through the lens (S101). The digital camera 1 judges whether the composition of the image represented by the generated image data is appropriate (S102). In the case where the composition is appropriate, the digital camera 1 records the image in the memory card 7 (S103), regardless of whether the user has operated the shutter release button 3. In the case where the composition is inappropriate, the digital camera 1 suggests a better composition (S104), and controls the operation of the imaging unit or causes the image processing unit 25 to carry out predetermined processing (S105), so that the image data generated at step S101 come to have the composition suggested at step S104. For example, in the case where the main subject is too small, the digital camera causes the imaging unit to zoom. In the case where the main subjects are not arranged in a well-balanced manner, the digital camera 1 instructs the image processing unit 25 to carry out image processing in which the subject regions are cropped and moved or enlarged. Alternatively, when a subject that should be upright is tilted, the digital camera 1 causes the image processing unit 25 to carry out rotation processing so that the subject looks upright.
The image data newly generated by the imaging unit (S101) or by the image processing unit 25 are evaluated again at step S102. The above flow is repeated until a mode change operation is detected (S106).
Fig. 4 is a flow chart showing the operation of the digital camera 1 in the case where the digital camera is set in the shooting assistance mode. When set in the shooting assistance mode, the digital camera 1 starts generating image data representing the scene viewed through the lens (S201). The digital camera 1 then judges (evaluates) the composition of the image represented by the image data (S202).
In the case where the composition is appropriate, the digital camera 1 notifies the user of the shooting timing (S203). Figs. 5 and 6 show examples of how the timing is notified. Fig. 5 shows an example of notification by displaying a mark 34 on the monitor 6; the mark 34 prompts the user to press the shutter release button. A message such as "Good time to shoot" may be displayed instead of the mark 34. In the example of Fig. 6, the timing is notified by causing the LED lamp 9 to blink. The timing may also be notified by voice from the speaker.
In the case where the composition is inappropriate, the digital camera 1 suggests a better composition (S204). By displaying the suggested composition on the monitor 6 (hereinafter, this display will be referred to as assistance display), the digital camera 1 prompts the photographer to change how the camera is held for image capture, or to operate one of the predetermined operation buttons (S205). Figs. 7 and 8 show examples of the assistance display. Fig. 7 shows an example in which a better composition is suggested by superposing a frame 35 on the display without processing the image data in any way. In the example shown in Fig. 8, an image whose composition has been improved through image processing is displayed, together with a mark 36 in a corner of the screen suggesting how to compose the shot in order to obtain the image being displayed. Furthermore, a message such as "Please zoom in" or "Please pan the camera slightly to the left" may be displayed, or output as voice, in order to suggest the better composition. In the shooting assistance mode, the above flow is repeated until a mode change operation is detected (S206).
The flow used for judging composition at steps S102 and S202 and for suggesting composition at steps S104 and S204 will be described below in further detail. The timing detection unit 28 shown in Fig. 2 carries out the judgment and suggestion of composition. Fig. 9 illustrates the configuration of the timing detection unit 28. As shown in Fig. 9, the timing detection unit 28 comprises human extraction means 41, non-human extraction means 42, voice analysis means 43, composition judgment means 44, and composition suggestion means 45. The timing detection unit 28 may be a circuit comprising an LSI that functions as the means 41 to 45, or may be a microcomputer equipped with software for executing the flows of the means 41 to 45.
The human extraction means 41 reads the image data stored in the memory 22, and searches the image data for one or more human figure regions (hereinafter referred to simply as human figure regions, including the case where the number of human figure regions is one). In the present embodiment, the human extraction means 41 detects a human figure by searching for a human face. In the case where the human extraction means 41 detects human faces, the human extraction means 41 adds an identifier such as a serial number to each face, and calculates the area of each face region, the area of each region representing a whole body including the face (hereinafter referred to as a whole-body region), and the coordinates of the centroid. In the case where the area of any one of the face regions exceeds a predetermined value, the centroid coordinates of the face region are used as the centroid coordinates. Otherwise, the centroid coordinates of the corresponding whole-body region are used as the centroid coordinates. For example, when the face regions are comparatively large, as shown in the examples of Figs. 10A and 10B, the centroid coordinates of each face region are calculated. In the case where the face regions are comparatively small, as shown in the examples of Figs. 10C and 10D, the centroid coordinates of each whole-body region indicated by cross hairs are calculated. After searching the whole region, the human extraction means stores the number of detected human figures, the range and area of each face region, the range and area of each whole-body region, and the centroid coordinates in a memory (not shown) of the timing detection unit 28, as information representing the number of detected human figures and their positions and sizes.
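The centroid selection described above can be sketched as follows. This is an illustrative reading of the rule, not code from the patent; the rectangular region representation, the function names, and the threshold value are assumptions.

```python
# Assumed value: the face-area threshold above which the face centroid
# is used instead of the whole-body centroid.
FACE_AREA_THRESHOLD = 2000  # pixels

def region_centroid(x, y, w, h):
    """Centroid of an axis-aligned rectangular region (x, y, w, h)."""
    return (x + w / 2.0, y + h / 2.0)

def person_centroid(face_region, body_region):
    """Use the face centroid when the face region is large enough;
    otherwise fall back to the whole-body centroid (Figs. 10A-10D)."""
    fx, fy, fw, fh = face_region
    if fw * fh > FACE_AREA_THRESHOLD:
        return region_centroid(*face_region)
    return region_centroid(*body_region)

# A large face: the centroid is taken from the face region itself.
print(person_centroid((100, 80, 60, 60), (90, 80, 80, 240)))  # (130.0, 110.0)
# A small face: the centroid is taken from the whole-body region.
print(person_centroid((100, 80, 20, 20), (90, 80, 80, 240)))  # (130.0, 200.0)
```

The same per-person record (identifier, face area, body area, centroid) would then be stored in the timing detection unit's memory.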
Various types of methods are known for detecting a face through searching. For example, as introduced in Japanese Unexamined Patent Publication No. 2001-51338, there are known a method of detecting a skin-colored region as a face and a method of detecting a face by judging the presence of facial components (such as hair, eyes, and a mouth) having geometric characteristics. Any known method may be used for the face detection of the human extraction means 41.
The human extraction means 41 then recognizes the facial expression of each detected face. However, facial expression recognition is carried out only in the case where the facial expression recognition function is set to ON in the detailed settings of the automatic shooting mode. Alternatively, facial expression recognition may be carried out only in the case where the size of any one of the detected faces exceeds a predetermined value. In the present embodiment, the human extraction means 41 recognizes four kinds of facial expressions, namely smiling, anger, crying, and surprise, as shown respectively in the examples of Figs. 11A, 11B, 11C, and 11D. As is apparent from the examples of Figs. 11A to 11D, these expressions are respectively characterized by how the eyes and mouth are opened and how the eyebrows and the corners of the mouth are turned. Therefore, a facial expression can be recognized based on such characteristics of each facial component. Various types of methods are known for recognizing facial expressions, such as the method described in Japanese Unexamined Patent Publication No. 2001-51338. Any known method may be used for the facial expression recognition of the human extraction means 41. The human extraction means 41 stores the recognized facial expressions in the memory of the timing detection unit 28.
The human extraction means 41 further recognizes the pose of each human figure whose face has been detected. Pose recognition is carried out only in the case where the pose recognition function is set to ON in the detailed settings of the automatic shooting mode. Alternatively, facial expression recognition may be carried out in the case where the size of any one of the detected faces exceeds a predetermined value, and pose recognition may be carried out otherwise.
In this embodiment, the memory of the timing detection unit 28 stores in advance data representing geometric characteristics of each known pose. For example, registered in advance are the pose of extending the forefinger and middle finger shown in Fig. 12A (a peace sign), the pose of raising an arm into the air shown in Fig. 12B (cheering, or a victory pose), the pose of forming a rough circle with a fingertip touching the thumb tip while raising the remaining fingers (OK, or a money sign in Japan), and the thumbs-up pose (good). The human extraction means 41 compares the geometric characteristics extracted from the region around each detected face in the image data read from the memory 22 with the registered data. In the case where the extracted characteristics agree with one of the registered poses, the human extraction means 41 stores the name or a predefined identifier of the pose in the memory of the timing detection unit 28.
Various methods are also known for pose recognition, including the method described in Japanese Unexamined Patent Publication No. 2001-51338. Any known method may be used for the pose recognition processing of the human extraction means 41.
Thereafter, the human extraction means 41 calculates the total area of the face regions. For example, in the examples shown in Figs. 10A to 10D, the human extraction means 41 calculates the total area of the regions shown by the broken-line frames. However, the human extraction means may instead calculate the total area of the whole-body regions.
In the case where the calculated total area of the regions exceeds a predetermined threshold, the human extraction means 41 supplies the information stored in the memory of the timing detection unit 28, namely the number of faces, the area of each face region, the area of each whole-body region, the centroid coordinates, the facial expressions, and the poses, to the composition judgment means 44 only. Otherwise, the human extraction means 41 supplies the information stored in the memory to the composition judgment means 44 and to the non-human extraction means 42.
The non-human extraction means 42 extracts a main subject other than the human figures in the image data. In this embodiment, the non-human extraction means 42 reads the image data stored in the memory 22, and deletes the image data portions corresponding to the human figure regions from the image data by replacing the pixel values of the regions corresponding to the human figures, including their faces and bodies, with 0 or by another method. For example, suppose that the image data read from the memory 22 are image data including a person 50a, a person 50b, and a subject 51 other than a person, as shown in Fig. 13A. The human extraction means 41 supplies information such as the centroid coordinates of the regions 52a and 52b enclosed by the broken-line frames. By deleting the image data portions corresponding to the human figure regions 52a and 52b from the image data, residual image data including only the subject 51 are obtained, as shown in Fig. 13B.
The non-human extraction means 42 carries out a filtering operation using a high-pass filter on the image data from which the human figure regions 52a and 52b have been excluded. In this manner, an edge image 53 in which the edges of the subject 51 have been extracted is obtained, as shown for example in Fig. 13C. The edge image 53 is an image including the contours of the subjects other than the human figures in the image data, and by analyzing the edge image, a rough region 54 in which the subject 51 is located can be identified, as shown in Fig. 13D. The non-human extraction means 42 calculates the area and the centroid coordinates of the identified region 54, and supplies the calculated area and coordinates to the composition judgment means 44.
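The two steps above, erasing the human figure regions and then high-pass filtering what remains, can be sketched as below. This is a minimal toy version on a grayscale pixel grid, assuming rectangular person regions and a 3x3 Laplacian kernel as the high-pass filter; the patent does not specify a particular kernel.

```python
def mask_regions(image, regions):
    """Replace pixels inside each (x, y, w, h) human figure region with 0."""
    out = [row[:] for row in image]
    for x, y, w, h in regions:
        for r in range(y, y + h):
            for c in range(x, x + w):
                out[r][c] = 0
    return out

def high_pass(image):
    """3x3 Laplacian filter: flat areas go to 0, edges remain non-zero."""
    h, w = len(image), len(image[0])
    edge = [[0] * w for _ in range(h)]
    for r in range(1, h - 1):
        for c in range(1, w - 1):
            edge[r][c] = abs(4 * image[r][c]
                             - image[r - 1][c] - image[r + 1][c]
                             - image[r][c - 1] - image[r][c + 1])
    return edge

# A 6x6 image: a bright block standing in for subject 51, plus one
# "person" pixel that the human figure mask erases first.
img = [[0] * 6 for _ in range(6)]
for r in range(1, 4):
    for c in range(1, 3):
        img[r][c] = 9            # the non-human subject
img[4][4] = 7                    # the person pixel to be masked out
masked = mask_regions(img, [(4, 4, 1, 1)])
edges = high_pass(masked)
print(masked[4][4])              # 0: the person region is gone
print(edges[2][1] > 0)           # True: the subject's contour survives
```

The connected non-zero responses in `edges` correspond to the contour from which the rough region 54 and its centroid would then be derived.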
Can carry out the method for only extracting CF composition (corresponding to the edge) through Fourier transform, replace high-pass filtering to handle, as the method in the object zone beyond the identification character graphic.Replacedly, can adopt the method for extracting main object through the analysis of using color information, with the alternative frequency analysis.For example, represent at a pixel value to stay this value under the situation of predetermined color.Otherwise this value is replaced by 0 or 1.Like this, image is divided into two zones, then, extracts zone with pre-color or the zone with object of the color beyond the pre-color.In addition, for the target that trends towards taking with the personage continually (, receiving an acclaim), can discern the data that generate expression object zone through using evaluation algorithm (such as Adaboost algorithm) based on study as pet such as animal.
Meanwhile, image sharpness sometimes depends on the shutter speed at the time the image data were obtained, and image color depends in some cases on the photometric value or the aperture. Therefore, by taking various types of adjustment values and set values into consideration at the time of image analysis, identification of a subject region can be made easier.
The voice analysis means 43 analyzes the sound input from the microphone 4 and carries out the detection described below. However, in this embodiment, the following flow is carried out only in the case where the voice analysis function is set to ON. First, the voice analysis means 43 constantly measures the volume of the sound input from the microphone 4, and compares the volume with a predetermined threshold. Fig. 14 is a graph in which the horizontal axis and the vertical axis respectively represent time and volume. In the example of Fig. 14, the voice analysis means 43 detects the time T, that is, the moment at which the volume suddenly increases and exceeds the threshold Th. When shooting at a sporting event or a party, the moment when cheering breaks out, such as when a goal is scored in a soccer match or a toast is made at a wedding, is usually a good photo opportunity. Therefore, by detecting the moment at which the volume changes suddenly, a photo opportunity can be detected. Alternatively, instead of detecting a volume change, detection may be carried out only on the volume exceeding the threshold Th, since the time of cheering can always be regarded as a photo opportunity. Conversely, in the case where a photo opportunity arises when it becomes quiet, such as when photographing the face of a sleeping child, the moment at which the volume falls below a threshold, or the state in which the volume is below the threshold, can be detected. Which result of the volume analysis is to be detected can be changed according to the settings.
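The threshold-crossing detection above can be sketched as follows, for both the "cheering" (rising) case and the "it becomes quiet" (falling) case. The sample values and function name are invented for illustration; real input would be a stream of volume measurements from the microphone.

```python
def first_crossing(volumes, threshold, rising=True):
    """Return the index of the first sample that crosses the threshold
    in the requested direction, or None if no crossing occurs."""
    for t in range(1, len(volumes)):
        if rising and volumes[t - 1] <= threshold < volumes[t]:
            return t
        if not rising and volumes[t - 1] >= threshold > volumes[t]:
            return t
    return None

samples = [10, 12, 11, 45, 50, 20, 8]   # a sudden cheer around t=3
Th = 30
print(first_crossing(samples, Th))                # 3 (cheering starts)
print(first_crossing(samples, Th, rising=False))  # 5 (it quiets down)
```

Which direction is watched, or whether only the level itself is compared against Th, would be selected by the camera's settings as described above.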
The voice analysis means 43 also recognizes spoken phrases, and compares each phrase with particular phrases registered in advance. The registered data may be stored in the memory of the timing detection unit 28, holding phrases that are likely to be said while the shutter release button is pressed, such as "Say cheese" or "Cheers". In this embodiment, a voice can be registered as one item of the registered data, and a voice can be registered in association with a phrase. By comparison with the registered data, the voice analysis means 43 can detect (a) the moment at which one of the registered phrases is spoken, (b) the moment at which a person whose voice has been registered utters a voice, and (c) the moment at which such a person speaks the phrase in that voice. Which of the moments (a) to (c) is detected is, in principle, determined by the settings. However, processing different from the settings may be carried out according to the state of data registration. For example, even in the case where detection of the moment (c) has been set, the moment (a) may be detected if no voice has been registered.
Whether volume detection is carried out, or phrase comparison detection is carried out, or both, depends on the settings.
Next, the flow carried out by the composition judgment means 44 will be described. As shown in Fig. 9, the composition judgment means 44 is supplied with the image data read from the memory 22, the extraction results from the human extraction means 41 and the non-human extraction means 42, and the detection result from the voice analysis means 43. However, in the case where extraction or detection has not been carried out, a value (such as 0) representing the absence of the information is input.
Fig. 15 is a flow chart showing an example of the flow carried out by the composition judgment means 44. The composition judgment means 44 receives the information on the range and area of each face region, the range and area of each whole-body region, the centroid coordinates, the facial expressions, and the poses from the human extraction means 41; the information on the range, area, and centroid coordinates of the subjects other than human figures from the non-human extraction means 42; and the information on the result of voice analysis from the voice analysis means 43.
The composition judgment means 44 first evaluates the balance of the arrangement of the subjects including the human figures (S301). In the case where the human extraction means 41 and the non-human extraction means 42 have detected N people (where N is an integer) and M subjects (where M is an integer) respectively, the composition judgment means 44 calculates the centroid coordinates of the M+N regions as a whole, based on the centroid coordinates of the extracted human figure regions and subject regions. For example, for the image represented by the example shown in Figs. 13A to 13D, the composition judgment means 44 takes the three regions (namely, the human figure regions 52a and 52b and the subject region 54) as a whole, and calculates the overall centroid coordinates G from the centroid coordinates g1 of the region 52a, the centroid coordinates g2 of the region 52b, and the centroid coordinates g3 of the subject region 54. If the centroid G is located within a predetermined region 55 in the central portion of the picture, the composition judgment means 44 judges that the balance of the arrangement is appropriate. Otherwise, the composition judgment means 44 judges that the balance is inappropriate.
In calculating the centroid coordinates of the N+M regions as a whole, the coordinates may be obtained after weighting the centroid coordinates of each region according to its area. If a larger weight is given to a region having a larger area, the overall centroid position lies closer to the regions of larger area. For example, in the example shown in Fig. 17, the centroid obtained by averaging the regions with equal weight is located at the point GA, which is outside the region 55. In this case, the composition judgment means 44 judges that the balance of the arrangement is inappropriate. However, in the case where a larger weight is given to a region having a larger area, the calculated centroid is located at the point GB, which is inside the region 55. In this case, the balance of the arrangement is judged to be appropriate.
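The contrast between the equal-weight centroid (GA) and the area-weighted centroid (GB) can be sketched as below. The region data, the coordinates, and the bounds of the central region 55 are assumptions chosen to reproduce the situation of Fig. 17, where weighting changes the judgment.

```python
def overall_centroid(regions, weighted=False):
    """regions: list of (area, (cx, cy)) for the N+M extracted regions.
    Equal weights give GA; area weights give GB."""
    total_w = gx = gy = 0.0
    for area, (cx, cy) in regions:
        w = area if weighted else 1.0
        gx += w * cx
        gy += w * cy
        total_w += w
    return (gx / total_w, gy / total_w)

def balance_ok(centroid, center_box):
    """True if the centroid lies inside the central region 55."""
    x0, y0, x1, y1 = center_box
    x, y = centroid
    return x0 <= x <= x1 and y0 <= y <= y1

# Two small person regions at the left, one large subject near center.
regions = [(100, (10.0, 50.0)), (100, (20.0, 50.0)), (1000, (55.0, 50.0))]
box = (40.0, 40.0, 60.0, 60.0)  # assumed bounds of region 55
print(balance_ok(overall_centroid(regions), box))                 # False (GA)
print(balance_ok(overall_centroid(regions, weighted=True), box))  # True  (GB)
```

With equal weights the small off-center regions pull the centroid out of the box; area weighting lets the dominant subject decide, matching the two outcomes described for GA and GB.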
In addition to the arrangement balance evaluation, the composition judgment means 44 also evaluates rotational deviation for some subjects. Rotational deviation refers to a difference between the direction or orientation of a subject in the image and the direction or orientation of the subject in the real world. For example, in the case where a skyscraper that should be upright looks tilted in the image, the composition judgment means 44 judges that rotational deviation is observed. In the case where the non-human extraction means 42 extracts subjects by using a learning-based evaluation algorithm, the non-human extraction means 42 can judge not only the contour of a subject but also the type of the subject. For such a subject, the non-human extraction means 42 supplies information representing the subject type to the composition judgment means 44. In the case where the extracted subject is a subject that is horizontal or vertical in the real world, such as a skyscraper or the horizon, the composition judgment means 44 calculates the direction or orientation of the extracted subject and judges whether rotational deviation exists.
At step S302, in the case where the composition judgment means 44 has judged that the arrangement balance is inappropriate, or in the case where the composition judgment means 44 has judged that rotational deviation is observed, the composition judgment means 44 outputs a judgment result (NG) representing that the composition is inappropriate (S306).
In the case where the composition judgment means 44 has judged that the arrangement balance is appropriate and that no rotational deviation is observed, the composition judgment means 44 then judges, based on the facial expression information supplied by the human extraction means 41, whether the facial expression of a human figure is a specific facial expression worth photographing (S303). Alternatively, the composition judgment means 44 may judge whether the facial expression has changed, by comparison with the facial expression information supplied shortly before the judgment. However, the judgment of facial expression may be carried out only in the case where the area of any one of the detected face regions exceeds a predetermined value. In the case where the facial expression is a specific expression (or in the case where a change in facial expression is observed), the composition judgment means 44 outputs a judgment result (OK) representing that the composition is appropriate (S307).
In the case where the facial expression is not a specific expression (or in the case where no change in facial expression is observed), the composition judgment means 44 judges, based on the pose information supplied from the human extraction means 41, whether any one of the human figures is showing a pose worth photographing (S304). Alternatively, the composition judgment means 44 may judge whether a change in the motion of a human figure is observed, by comparison with the pose information supplied shortly before the judgment. However, the pose judgment may be carried out only in the case where the area of any one of the detected human figure regions is a predetermined value or larger. In the case where a specific pose or a change in motion is observed, the composition judgment means 44 outputs a judgment result (OK) representing that the composition is appropriate (S307).
In the case where no specific pose or change in motion is observed, the composition judgment means 44 judges whether a specific sound has been detected, based on the information supplied from the voice analysis means 43 (S305). In the case where no specific sound has been detected, the composition judgment means outputs a judgment result representing that the composition is inappropriate (S306). In the case where a specific sound has been detected, the composition judgment means 44 outputs a judgment result representing that the composition is appropriate (S307).
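The cascade of steps S301 to S307 can be sketched as a single function. The boolean inputs stand in for the results of the analyses described above; this is an illustrative reading of the flow chart of Fig. 15, not code from the patent.

```python
def judge_composition(balance_ok, no_rotation, specific_expression,
                      specific_pose, specific_sound):
    """Return "OK" (S307) or "NG" (S306) for one evaluation cycle."""
    # S301/S302: arrangement balance and rotational deviation come first;
    # either failing ends the judgment with NG.
    if not (balance_ok and no_rotation):
        return "NG"                       # S306
    # S303: a facial expression worth shooting is sufficient on its own.
    if specific_expression:
        return "OK"                       # S307
    # S304: otherwise look for a pose (or a change in motion).
    if specific_pose:
        return "OK"                       # S307
    # S305: finally fall back to the voice analysis result.
    return "OK" if specific_sound else "NG"

print(judge_composition(True, True, False, False, True))   # OK (via sound)
print(judge_composition(False, True, True, True, True))    # NG (balance fails)
```

Note that balance and rotation act as gate conditions, while expression, pose, and sound are alternatives: any one of the three is enough once the gates pass.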
In the case where the composition judgment means 44 outputs a judgment result representing that the composition is appropriate, the timing detection unit 28 sends the judgment result to the overall control unit 30. In the case where the digital camera 1 is set in the automatic shooting mode, the overall control unit 30 having received the result instructs the read/write control unit 27 to record the image data stored in the memory 22 in the memory card 7. In the case where the digital camera 1 is set in the shooting assistance mode, the overall control unit 30 instructs the display control unit 26 to display a mark or message on the monitor indicating the photo opportunity (see Fig. 5). Alternatively, the overall control unit 30 instructs the LED control unit 19 to blink the LED 9 (see Fig. 6).
In this embodiment, the read/write control unit 27 records the information used for the composition judgment in the memory card 7 as accompanying information of the image data. More specifically, the information is recorded in a tag of the Exif file. In the case where the composition has been judged to be inappropriate, the composition judgment means 44 supplies the information used for the judgment to the composition suggestion means 45. The composition suggestion means 45 carries out the following flow by using this information.
The composition suggestion means 45 analyzes the information supplied from the composition judgment means 44, and suggests a better composition for an image whose composition has been judged to be inappropriate. Suggesting a composition refers to determining an arrangement that satisfies the composition judgment conditions. The determined composition is output together with information on the processing to be carried out in order to obtain an image having that composition. For example, in the case where the regions 52a, 52b, and 54 extracted in the image are arranged substantially toward the lower left, as shown in the example of Fig. 18A, a composition in which the centroid G of the regions 52a, 52b, and 54 is located in the central portion of the image is suggested, as shown in Fig. 18B. Alternatively, as shown in the example of Fig. 18C, a composition in which the centroid G of the regions 52a, 52b, and 54 is located in the central portion of the image and the subjects look larger is suggested. Two types of information are output as the information on the processing to be carried out for obtaining the image of the suggested composition.
The first information output by the composition suggestion means 45 is the information required for converting the image data obtained in shooting into image data of the better composition through image processing. For example, in the example of Fig. 18B, the information on the region to be cropped (the bold frame in Fig. 18B) and on the direction of motion (motion vector) of the centroid G is output. In the example shown in Fig. 18C, for example, the information on the region to be cropped, the direction of motion, and the magnification ratio is output.
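A hypothetical sketch of this "first information" is shown below: given the overall centroid G and the image size, it computes the motion vector that would move G to the image center, and a crop window centered on G. The frame geometry, clamping behavior, and function name are assumptions for illustration; the patent only specifies that a crop region and a motion vector are output.

```python
def suggest_crop(image_size, centroid, crop_size):
    """Return (crop_window, motion_vector) for recentering centroid G."""
    iw, ih = image_size
    cw, ch = crop_size
    gx, gy = centroid
    # Motion vector of G: from its current position to the image center.
    vector = (iw / 2.0 - gx, ih / 2.0 - gy)
    # Crop window centered on G, clamped to stay inside the image bounds.
    x = min(max(gx - cw / 2.0, 0), iw - cw)
    y = min(max(gy - ch / 2.0, 0), ih - ch)
    return (x, y, cw, ch), vector

# Subjects clustered toward the lower left, as in Fig. 18A.
crop, vec = suggest_crop((640, 480), (160.0, 360.0), (320, 240))
print(crop)  # (0.0, 240.0, 320, 240)
print(vec)   # (160.0, -120.0)
```

For the Fig. 18C case, a magnification ratio (e.g. the ratio of the image size to the crop size) would be output alongside the same two items.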
The second information output by the composition suggestion means 45 is the information required for obtaining image data of the preferable composition by photographing again. For example, in the example shown in Figure 18B, information representing an operation of panning the camera to the left is output. In the example shown in Figure 18C, information representing an operation of panning the camera to the left and the magnification ratio to be set is output.
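As a rough illustration of how the first information (trimming region and magnification ratio) might be derived, the following sketch computes a crop box that places the centroid G at the center of the trimmed image. The clamping policy and all names are assumptions, not taken from the patent.

```python
# Sketch: derive a trimming (crop) box from the centroid and an optional
# magnification ratio. magnification > 1 makes the subjects look larger,
# as in Figure 18C.

def trim_box(image_size, centroid, magnification=1.0):
    """Crop box (left, top, right, bottom) that places `centroid` at the
    center of the trimmed image, clamped to stay inside the frame."""
    w, h = image_size
    cw, ch = w / magnification, h / magnification   # trimmed size
    gx, gy = centroid
    left = min(max(gx - cw / 2, 0), w - cw)
    top = min(max(gy - ch / 2, 0), h - ch)
    return left, top, left + cw, top + ch

print(trim_box((320, 240), (43.3, 161.1), magnification=1.5))
```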
For rotational deviation, which is another reason for an inappropriate composition, the composition suggestion means 45 outputs, as the first information, information representing the direction and angle of rotation for tilt correction, and outputs, as the second information, information for tilting the camera to the left or to the right. In the case where the reason for the inappropriate composition is a facial expression, a posture or a voice, correction cannot be carried out through image processing. Therefore, information representing that the reason is the facial expression, the posture or the voice is output.
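The two rotational-deviation outputs can likewise be sketched. The sign convention (positive tilt meaning the scene leans clockwise in the frame) and the mapping to "tilt left/right" are assumptions for illustration only; the patent does not fix them.

```python
# Sketch: turn a detected rotational deviation into the two kinds of output
# described above: the first information for image processing, the second
# for re-shooting. Names and sign convention are illustrative.

def tilt_correction(tilt_deg):
    """Return (first_information, second_information) for a tilt in degrees."""
    first = {"rotate_direction": "ccw" if tilt_deg > 0 else "cw",
             "rotate_angle_deg": abs(tilt_deg)}                 # slant correction
    side = "left" if tilt_deg > 0 else "right"
    second = f"tilt the camera to the {side}"                   # user instruction
    return first, second

print(tilt_correction(7.5))
```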
The information output by the composition suggestion means 45 is sent to the overall control unit 30. The overall control unit 30, having received the information, judges whether the digital camera 1 is set to the automatic photography mode or to the photography assistance mode, and carries out processing according to the mode.
In the case where the digital camera 1 is set to the automatic photography mode, the overall control unit 30 instructs the image processing unit 25 to read the image data from the memory 22 and to carry out the image processing required for improving the composition (such as trimming, enlargement/reduction and rotation). The overall control unit 30 also instructs the display control unit 26 to display the image data processed by the image processing unit 25 on the monitor 6. In addition, the overall control unit 30 instructs the read/write control unit 27 to record the image data processed by the image processing unit 25 in the memory card 7.
In this embodiment, after receiving the instruction, the display control unit 26 displays a selection screen as shown in Figure 19, letting the user select whether to record the data of the image of the suggested composition (the suggested image), the data of the photographed image (the photographed image), or both the photographed image and the suggested image. The read/write control unit 27 records the image data selected on the screen in the memory card 7. Alternatively, the selection screen may not be displayed, and only the suggested image may be recorded, or the suggested image may be recorded together with the photographed image.
In this embodiment, the read/write control unit 27, having received the instruction, records in the memory card 7 the information used for the composition judgment by the composition judgment means 44, namely the centroid coordinates of the N+M regions as a whole, the detected rotational deviation, the orientations, facial expressions and postures of the subjects, and the detected voice, as accompanying information of the image data. Furthermore, the read/write control unit 27 records in the memory card 7 the first information output by the composition suggestion means 45, namely the information required for converting the image data obtained in photography into image data of the preferable composition, as accompanying information of the image data. More specifically, the information described above is recorded in a tag of the Exif file.
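The accompanying information enumerated above might be serialized as follows before being written into an Exif tag. Every field name and the JSON layout are assumptions for illustration; the patent specifies only which values are recorded, not their on-card format.

```python
import json

# Sketch: one plausible serialization of the accompanying information that
# the read/write control unit 27 records with the image data. Field names
# are hypothetical; values echo the examples of Figure 18.

accompanying_info = {
    "centroid": [43.3, 161.1],           # centroid of the N+M regions as a whole
    "rotational_deviation_deg": 0.0,
    "subjects": [{"orientation": "front", "expression": "smile", "posture": "raised_hand"}],
    "detected_voice": "cheese",
    "first_information": {               # to convert into the suggested composition
        "trim_box": [0, 80, 213, 240],
        "motion_vector": [116.7, -41.1],
        "magnification": 1.5,
    },
}

payload = json.dumps(accompanying_info).encode("utf-8")  # bytes for an Exif tag
print(len(payload) > 0)
```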
The information recorded in the tag of the Exif file can be used when the image is edited on a personal computer. For example, if the photographed image and the first information output by the composition suggestion means 45 are available, an image equivalent to the suggested image can be generated on the personal computer. Therefore, the size of the image file can be reduced, since the suggested image itself need not be recorded. Furthermore, by editing the photographed image based on the first information output by the composition suggestion means 45, an image whose composition differs slightly from the composition suggested by the composition suggestion means 45 can also be generated.
In the case where the digital camera 1 is set to the photography assistance mode, the overall control unit 30 instructs the image processing unit 25 to read the image data from the memory 22 and to carry out the processing required for improving the composition (such as trimming, transformation, enlargement/reduction and rotation). The overall control unit 30 also instructs the display control unit 26 to display the image data processed by the image processing unit 25 together with a mark or a message generated according to the second information output by the composition suggestion means 45. In this manner, the assistance display described with reference to Figures 7 and 8 is carried out.
In this embodiment, if the digital camera is set to the automatic photography mode, the image data is automatically recorded in the memory card when the balance of the arrangement is appropriate and a predetermined facial expression, posture or voice is detected. Therefore, even a person unfamiliar with photography can always obtain an image of appropriate composition. In addition, the digital camera does not respond when a predetermined voice alone is detected. Therefore, a photography operation is not carried out unnecessarily before the camera is pointed at the subject, or in response to a voice of a person who happens to be near the photographed scene. In other words, the inconvenience of unnecessary photography is resolved while the convenience of automatic photography triggered by voice is retained.
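The trigger logic of this paragraph reduces to a simple conjunction, sketched below with illustrative flag names (the patent does not name these conditions as booleans).

```python
# Sketch: recording fires only when the arrangement balance is appropriate
# AND a predetermined facial expression, posture or voice is detected.
# A voice alone never triggers recording.

def should_record(balance_ok, expression_ok=False, posture_ok=False, voice_ok=False):
    return balance_ok and (expression_ok or posture_ok or voice_ok)

assert not should_record(balance_ok=False, voice_ok=True)  # bystander's voice alone: no shot
assert should_record(balance_ok=True, expression_ok=True)  # good composition + smile: record
```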
In the photography assistance mode, the photographer is notified of the right moment for photography. Therefore, by pressing the shutter release button at the notified timing, an image of appropriate composition can easily be obtained while convenience equivalent to that of automatic photography is enjoyed. In the case where a predetermined voice alone is detected, the notification is not carried out. The notification is carried out only in the case where the balance of the arrangement in the image is appropriate for the composition and an appropriate change in facial expression, a posture or a voice is detected. Therefore, erroneous notification is avoided.
In the case where an image of appropriate composition has not been obtained, the operation of the imaging unit is controlled so that an image of appropriate composition is obtained, or image processing is carried out on the image already obtained. Therefore, the user can obtain an image of appropriate composition without changing his/her standing position, changing the manner of image capture, or adjusting settings such as the magnification ratio.
In the embodiments described above, methods of judgment and suggestion have been described, such as a method of judging or suggesting a composition by calculating centroid coordinates for each region. However, various types of conditions and data can be adopted as the conditions to be satisfied by the composition and as the data used for the judgment, and the conditions and data are not necessarily limited to the examples shown in the embodiments described above. In the embodiments described above, still-image photography has been described as an example. However, the present invention is also useful in determining the timing for starting the capture of a moving image.
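As one concrete example of the "various types of conditions and data" mentioned above, a predetermined volume change (one of the voice characteristics recited in the claims) could be detected roughly as follows. The window size, the threshold ratio, and the representation of the voice as a list of volume samples are all arbitrary assumptions.

```python
# Sketch: detect a predetermined volume change as a jump between the mean
# volume of two adjacent windows (e.g. a sudden shout such as "cheese!").

def detect_volume_change(samples, window=4, threshold=2.0):
    """True if the mean volume of any window is at least `threshold` times
    the mean of the preceding window."""
    for i in range(window, len(samples) - window + 1, window):
        prev = sum(samples[i - window:i]) / window
        curr = sum(samples[i:i + window]) / window
        if prev > 0 and curr / prev >= threshold:
            return True
    return False

quiet_then_shout = [1, 1, 1, 1, 1, 1, 1, 1, 5, 5, 5, 5]
print(detect_volume_change(quiet_then_shout))  # → True
```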

Claims (45)

1. An imaging apparatus having imaging means for generating image data representing a scene by photographing the scene and recording means for recording the image data generated by the imaging means in a predetermined recording medium, said imaging apparatus comprising:
person extraction means for carrying out a search for human faces in the image data generated by said imaging means so as to extract an image region representing a human figure or a plurality of image regions representing a plurality of human figures, and for outputting, as an extraction result, information representing the number of human faces detected through said search and the position and size of each of the faces;
non-person extraction means for carrying out extraction of a major subject other than said one or more human figures by analyzing a region other than said one or more image regions extracted by said person extraction means from the image data generated by said imaging means, and for outputting, as an extraction result, information on the range, area and centroid coordinates of the subject other than said human figures;
composition judgment means for assessing the balance of the arrangement of said one or more human figures and said major subject other than said one or more human figures according to the extraction result of said person extraction means and the extraction result of said non-person extraction means, for judging that the composition of the image data is appropriate when the balance of the arrangement is appropriate, and for judging that the composition of the image data is inappropriate when the balance of the arrangement is inappropriate; and
timing detection means for detecting a timing at which the image data is to be obtained in the case where the judgment result of said composition judgment means is that the composition of said image data is appropriate.
2. The imaging apparatus according to claim 1, further comprising recording control means for controlling said recording means so as to record the image data at the timing detected by said timing detection means.
3. The imaging apparatus according to claim 1, further comprising notification means for notifying the timing detected by said timing detection means.
4. The imaging apparatus according to claim 1, further comprising:
composition suggestion means for determining an arrangement of said one or more human figures and said major subject that satisfies a predetermined condition, by using the extraction results of said person extraction means and said non-person extraction means; and
imaging control means for controlling the operation of said imaging means so as to generate image data in which said one or more human figures and said major subject are arranged in the arrangement determined by said composition suggestion means.
5. The imaging apparatus according to claim 1, further comprising:
composition suggestion means for determining an arrangement of said one or more human figures and said major subject that satisfies a predetermined condition, by using the extraction results of said person extraction means and said non-person extraction means; and
image processing means for carrying out image processing on the image data so as to make the arrangement of said one or more human figures and said major subject agree with the arrangement determined by said composition suggestion means.
6. The imaging apparatus according to claim 1, wherein said recording means records the extraction results of said person extraction means and said non-person extraction means in said recording medium together with the image data.
7. The imaging apparatus according to claim 1, wherein said person extraction means recognizes a facial expression of one or more faces detected in said search, and further outputs information representing the recognized facial expression.
8. The imaging apparatus according to claim 1, wherein said person extraction means recognizes a posture of the one or more human figures included in the image data, and outputs information representing the recognized posture as an extraction result.
9. The imaging apparatus according to claim 1, wherein said non-person extraction means extracts said major subject by carrying out filtering processing on the image data using a high-pass filter.
10. The imaging apparatus according to claim 1, wherein said non-person extraction means extracts a predetermined target registered in advance as said major subject, by carrying out recognition of the target among targets included in the image data.
11. The imaging apparatus according to claim 1, further comprising speech analysis means for carrying out detection of a predetermined characteristic related to voices by analyzing an input voice, wherein
said composition judgment means carries out the judgment as to whether the composition of the image data is appropriate, based on the predetermined characteristic detected by said speech analysis means and the assessment of said arrangement.
12. The imaging apparatus according to claim 11, wherein said recording means records the extraction results of said person extraction means and said non-person extraction means and the detection result of said speech analysis means in said recording medium together with the image data.
13. The imaging apparatus according to claim 11, wherein said speech analysis means detects a predetermined volume change as the predetermined characteristic.
14. The imaging apparatus according to claim 11, wherein said speech analysis means detects a predetermined phrase as the predetermined characteristic.
15. The imaging apparatus according to claim 11, wherein said speech analysis means detects, as the predetermined characteristic, a characteristic registered in advance as a voice characteristic of a predetermined person.
16. A method of controlling an imaging apparatus having imaging means for generating image data representing a scene by photographing the scene and recording means for recording the image data generated by the imaging means in a predetermined recording medium, said method comprising the steps of:
carrying out a search for human faces in the image data generated by said imaging means so as to extract an image region representing a human figure, and outputting, as an extraction result, information representing the number of human faces detected through said search and the position and size of each of the human faces;
extracting a major subject other than said human figure by analyzing a region other than the image region representing the human figure in the image data generated by said imaging means, and outputting, as an extraction result, information on the range, area and centroid coordinates of the subject other than said human figure;
assessing the balance of the arrangement of the extracted human figure and said major subject;
judging that the composition of the image data is appropriate when the balance of the arrangement is appropriate, and judging that the composition of the image data is inappropriate when the balance of the arrangement is inappropriate; and
detecting a timing at which the image data is to be obtained in the case where the composition of said image data is appropriate.
17. The method of controlling an imaging apparatus according to claim 16, further comprising the step of: controlling said recording means so as to record the image data at the detected timing.
18. The method of controlling an imaging apparatus according to claim 16, further comprising the step of: notifying a user of the detected timing by controlling the operation of a predetermined output device.
19. An imaging apparatus having imaging means for generating image data representing a scene by photographing the scene and recording means for recording the image data generated by the imaging means in a predetermined recording medium, said imaging apparatus comprising:
person extraction means for carrying out a search for human faces in the image data generated by said imaging means so as to extract an image region representing a human figure or a plurality of image regions representing a plurality of human figures, and for outputting, as an extraction result, information representing the number of human faces detected through said search and the position and size of each of the faces;
speech analysis means for carrying out detection of a predetermined characteristic related to voices by analyzing an input voice, said predetermined characteristic including at least one of a predetermined volume change, a predetermined phrase, and a characteristic registered in advance as a voice characteristic of a predetermined person;
composition judgment means for carrying out a judgment as to whether the composition of the image data is appropriate, according to the extraction result of said person extraction means and the detection result of said speech analysis means; and
timing detection means for detecting a timing at which the image data is to be obtained in the case where the judgment result of said composition judgment means is that the composition of said image data is appropriate.
20. The imaging apparatus according to claim 19, further comprising recording control means for controlling said recording means so as to record the image data at the timing detected by said timing detection means.
21. The imaging apparatus according to claim 19, further comprising notification means for notifying the timing detected by said timing detection means.
22. The imaging apparatus according to claim 19, wherein said recording means records the extraction result of said person extraction means and the detection result of said speech analysis means in said recording medium together with the image data.
23. The imaging apparatus according to claim 19, wherein said person extraction means recognizes a facial expression of one or more faces detected in said search, and further outputs information representing the recognized facial expression.
24. The imaging apparatus according to claim 19, wherein said person extraction means recognizes a posture of the one or more human figures included in the image data, and outputs information representing the recognized posture as an extraction result.
25. A method of controlling an imaging apparatus having imaging means for generating image data representing a scene by photographing the scene and recording means for recording the image data generated by the imaging means in a predetermined recording medium, said method comprising the steps of:
carrying out a search for human faces in the image data generated by said imaging means so as to extract an image region representing a human figure, and outputting, as an extraction result, information representing the number of human faces detected through said search and the position and size of each of the human faces;
carrying out detection of a predetermined characteristic related to voices by analyzing an input voice, said predetermined characteristic including at least one of a predetermined volume change, a predetermined phrase, and a characteristic registered in advance as a voice characteristic of a predetermined person;
carrying out a judgment as to whether the composition of the image data is appropriate, according to the result of said extraction and the result of said detection; and
detecting a timing at which the image data is to be obtained in the case where the composition of said image data is appropriate.
26. The method of controlling an imaging apparatus according to claim 25, further comprising the step of: controlling said recording means so as to record the image data at the detected timing.
27. The method of controlling an imaging apparatus according to claim 25, further comprising the step of: notifying a user of the detected timing by controlling the operation of a predetermined output device.
28. An imaging apparatus having imaging means for generating image data representing a scene by photographing the scene and recording means for recording the image data generated by the imaging means in a predetermined recording medium, said imaging apparatus comprising:
person extraction means for carrying out a search for human faces in the image data generated by said imaging means so as to extract an image region representing a human figure or a plurality of image regions representing a plurality of human figures, and for outputting, as an extraction result, information representing the number of human faces detected through said search and the position and size of each of the faces;
non-person extraction means for carrying out extraction of a major subject other than said one or more human figures by analyzing a region other than said one or more image regions extracted by said person extraction means from the image data generated by said imaging means, and for outputting, as an extraction result, information on the range, area and centroid coordinates of the subject other than said human figures;
composition judgment means for assessing the balance of the arrangement of said one or more human figures and said major subject other than said one or more human figures according to the extraction result of said person extraction means and the extraction result of said non-person extraction means, for judging that the composition of the image data is appropriate when the balance of the arrangement is appropriate, and for judging that the composition of the image data is inappropriate when the balance of the arrangement is inappropriate; and
composition suggestion means for determining an appropriate arrangement of said one or more human figures and said major subject by using the extraction results of said person extraction means and said non-person extraction means.
29. The imaging apparatus according to claim 28, further comprising imaging control means for controlling the operation of said imaging means so as to generate image data in which said one or more human figures and said major subject are arranged in the arrangement determined by said composition suggestion means.
30. The imaging apparatus according to claim 28, further comprising image processing means for carrying out image processing on the image data so as to make the arrangement of said one or more human figures and said major subject agree with the arrangement determined by said composition suggestion means.
31. The imaging apparatus according to claim 28, further comprising recording control means for detecting a timing at which the image data is to be obtained in the case where the judgment result of said composition judgment means is that the composition of said image data is appropriate, and for controlling said recording means so as to record the image data at the detected timing.
32. The imaging apparatus according to claim 28, further comprising notification means for detecting a timing at which the image data is to be obtained in the case where the judgment result of said composition judgment means is that the composition of said image data is appropriate, and for notifying the detected timing.
33. The imaging apparatus according to claim 28, wherein said recording means records the extraction results of said person extraction means and said non-person extraction means in said recording medium together with the image data.
34. The imaging apparatus according to claim 28, wherein said person extraction means recognizes a facial expression of one or more faces detected in said search, and further outputs information representing the recognized facial expression.
35. The imaging apparatus according to claim 28, wherein said person extraction means recognizes a posture of the one or more human figures included in the image data, and outputs information representing the recognized posture as an extraction result.
36. The imaging apparatus according to claim 28, wherein said non-person extraction means extracts said major subject by carrying out filtering processing on the image data using a high-pass filter.
37. The imaging apparatus according to claim 28, wherein said non-person extraction means extracts a predetermined target registered in advance as said major subject, by carrying out recognition of the target among targets included in the image data.
38. The imaging apparatus according to claim 28, further comprising speech analysis means for carrying out detection of a predetermined characteristic related to voices by analyzing an input voice, wherein
said composition judgment means carries out the judgment as to whether the composition of the image data is appropriate, based on the predetermined characteristic detected by said speech analysis means and the assessment of said arrangement.
39. The imaging apparatus according to claim 38, wherein said recording means records the extraction results of said person extraction means and said non-person extraction means and the detection result of said speech analysis means in said recording medium together with the image data.
40. The imaging apparatus according to claim 38, wherein said speech analysis means detects a predetermined volume change as the predetermined characteristic.
41. The imaging apparatus according to claim 38, wherein said speech analysis means detects a predetermined phrase as the predetermined characteristic.
42. The imaging apparatus according to claim 38, wherein said speech analysis means detects, as the predetermined characteristic, a characteristic registered in advance as a voice characteristic of a predetermined person.
43. A method of controlling an imaging apparatus having imaging means for generating image data representing a scene by photographing the scene and recording means for recording the image data generated by the imaging means in a predetermined recording medium, said method comprising the steps of:
carrying out a search for human faces in the image data generated by said imaging means so as to extract an image region representing a human figure, and outputting, as an extraction result, information representing the number of human faces detected through said search and the position and size of each of the human faces;
extracting a major subject other than said human figure from the image data generated by the imaging means by analyzing a region other than the image region representing the human figure, and outputting, as an extraction result, information on the range, area and centroid coordinates of the subject other than said human figure;
assessing the balance of the arrangement of the extracted human figure and said major subject;
judging that the composition of the image data is appropriate when the balance of the arrangement is appropriate, and judging that the composition of the image data is inappropriate when the balance of the arrangement is inappropriate; and
determining an appropriate arrangement of the extracted human figure and said major subject.
44. The method of controlling an imaging apparatus according to claim 43, further comprising the step of: controlling the operation of said imaging means so as to generate image data in which said human figure and said major subject are arranged in the determined arrangement.
45. The method of controlling an imaging apparatus according to claim 43, further comprising the step of: carrying out image processing on the image data so as to make the arrangement of said human figure and said major subject agree with the determined arrangement.
CN2008100825795A 2007-03-05 2008-03-05 Imaging apparatus and control method thereof Expired - Fee Related CN101262561B (en)

Applications Claiming Priority (9)

Application Number Priority Date Filing Date Title
JP2007053645 2007-03-05
JP2007053646 2007-03-05
JP2007-053644 2007-03-05
JP2007053646A JP2008219451A (en) 2007-03-05 2007-03-05 Imaging device and control method thereof
JP2007053645A JP2008219450A (en) 2007-03-05 2007-03-05 Imaging device and control method thereof
JP2007053644A JP2008219449A (en) 2007-03-05 2007-03-05 Imaging device and control method thereof
JP2007-053645 2007-03-05
JP2007053644 2007-03-05
JP2007-053646 2007-03-05

Publications (2)

Publication Number Publication Date
CN101262561A CN101262561A (en) 2008-09-10
CN101262561B true CN101262561B (en) 2012-11-21

Family

ID=39838959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008100825795A Expired - Fee Related CN101262561B (en) 2007-03-05 2008-03-05 Imaging apparatus and control method thereof

Country Status (2)

Country Link
JP (1) JP2008219449A (en)
CN (1) CN101262561B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP5640466B2 (en) * 2010-05-31 2014-12-17 株式会社ニコン Digital camera
JP5577900B2 (en) * 2010-07-05 2014-08-27 ソニー株式会社 Imaging control apparatus, imaging control method, and program
JP5675326B2 (en) * 2010-12-24 2015-02-25 キヤノン株式会社 Image processing apparatus, main subject selection method and program
KR101794350B1 (en) 2011-02-08 2017-12-04 삼성전자주식회사 Method for capturing picture in a portable terminal
CN102821263A (en) * 2011-06-09 2012-12-12 华晶科技股份有限公司 Image storage method
JP5774425B2 (en) * 2011-09-16 2015-09-09 Kddi株式会社 Image analysis apparatus and image evaluation apparatus
CN103491304B (en) * 2013-09-25 2017-04-05 深圳市金立通信设备有限公司 A kind of photographic method and mobile terminal
US9854139B2 (en) * 2014-06-24 2017-12-26 Sony Mobile Communications Inc. Lifelog camera and method of controlling same using voice triggers
CN105430247A (en) * 2015-11-26 2016-03-23 上海创米科技有限公司 Method and device for taking photograph by using image pickup device
JP2017187801A (en) * 2017-07-05 2017-10-12 株式会社ニコン Imaging device, control program, and electronic apparatus
CN112997115B (en) * 2018-11-08 2022-07-26 佳能株式会社 Image pickup apparatus and electronic device having optical input device
CN111028198B (en) * 2019-07-12 2024-02-23 北京达佳互联信息技术有限公司 Image quality evaluation method, device, terminal and readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3971240B2 (en) * 2002-05-21 2007-09-05 オリンパス株式会社 Camera with advice function
JP4325385B2 (en) * 2003-12-09 2009-09-02 株式会社ニコン Digital camera and digital camera image acquisition method
JP2006191524A (en) * 2004-12-09 2006-07-20 Nikon Corp Auto framing device and photographing device
JP4904691B2 (en) * 2004-12-28 2012-03-28 カシオ計算機株式会社 Camera device and photographing method

Also Published As

Publication number Publication date
JP2008219449A (en) 2008-09-18
CN101262561A (en) 2008-09-10

Similar Documents

Publication Publication Date Title
CN101262561B (en) Imaging apparatus and control method thereof
US7995106B2 (en) Imaging apparatus with human extraction and voice analysis and control method thereof
JP5818799B2 (en) Estimating the aesthetic quality of digital images
JP4640456B2 (en) Image recording apparatus, image recording method, image processing apparatus, image processing method, and program
CN100556078C (en) Camera head, image processing apparatus and image processing method
JP4898532B2 (en) Image processing apparatus, photographing system, blink state detection method, blink state detection program, and recording medium on which the program is recorded
CN101325658B (en) Imaging device, imaging method and computer program
JP5128880B2 (en) Image handling device
US20070159533A1 (en) Image filing method, digital camera, image filing program and video recording player
JP4474885B2 (en) Image classification device and image classification program
JP2011109428A (en) Information processing apparatus, information processing method, and program
JP2009141516A (en) Image display device, camera, image display method, program, image display system
JP5125734B2 (en) Imaging apparatus, image selection method, and image selection program
CN105744144A (en) Image creation method and image creation apparatus
JP2005020446A (en) Device and program for photographing image
JP2005045600A (en) Image photographing apparatus and program
JP5206422B2 (en) Composition selection device and program
JP2010178259A (en) Digital camera
JP2008219450A (en) Imaging device and control method thereof
JP2008225886A (en) Image display device, image display method, and program
JP5640377B2 (en) Image processing apparatus, camera, and image processing program
JP2010171849A (en) Image reproducing apparatus and electronic camera
JP2008103850A (en) Camera, image retrieval system, and image retrieving method
JP5370577B2 (en) Composition selection device and program
JP2011139498A (en) Imaging device and control method thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20121121

Termination date: 20180305
