WO2019065454A1 - Imaging Device and Control Method Thereof (撮像装置およびその制御方法) - Google Patents
- Publication number
- WO2019065454A1 (PCT/JP2018/034818)
- Authority
- WO
- WIPO (PCT)
Classifications
- G03B15/00: Special procedures for taking photographs; apparatus therefor
- G03B17/00: Details of cameras or camera bodies; accessories therefor
- G03B17/02: Bodies
- G03B17/56: Accessories
- G03B17/561: Support-related camera accessories
- G03B5/00: Adjustment of optical system relative to image or object surface other than for focusing
- G03B7/091: Control of exposure; digital circuits
- G03B37/02: Panoramic or wide-screen photography with scanning movement of lens or cameras
- G03B2205/0046: Movement of one or more optical elements for zooming
- G03B2206/00: Systems for exchange of information between different pieces of apparatus
- G06N20/00: Machine learning
- G06T5/92: Dynamic range modification of images based on global image properties
- H04N23/51: Housings
- H04N23/54: Mounting of pick-up tubes, electronic image sensors, deviation or focusing coils
- H04N23/55: Optical parts specially adapted for electronic image sensors; mounting thereof
- H04N23/56: Cameras provided with illuminating means
- H04N23/58: Means for changing the camera field of view without moving the camera body
- H04N23/61: Control of cameras based on recognised objects
- H04N23/617: Upgrading or updating of programs or applications for camera control
- H04N23/62: Control of parameters via user interfaces
- H04N23/64: Computer-aided capture of images
- H04N23/667: Camera operation mode switching
- H04N23/6812: Motion detection based on additional sensors, e.g. acceleration sensors
- H04N23/687: Vibration or motion blur correction by mechanical compensation, shifting the lens or sensor position
- H04N23/69: Control of means for changing angle of the field of view, e.g. optical zoom
- H04N23/695: Control of camera direction for changing a field of view, e.g. pan, tilt
- H04N23/81: Camera processing pipelines for suppressing or minimising disturbance in the image signal generation
- H04N23/88: Colour balance, e.g. white-balance circuits or colour temperature control
- H04N5/77: Interface circuits between a recording apparatus and a television camera
- H04N9/68: Circuits for controlling the amplitude of colour signals, e.g. automatic chroma control
Definitions
- The present invention relates to an imaging device and a control method thereof.
- Normally, in an imaging device such as a camera, the user decides on a subject and performs shooting by operating the device.
- Such an imaging apparatus may be provided with a function of detecting a user's operation error and notifying the user, or of detecting the external environment and notifying the user when it is not suitable for photographing.
- In contrast to imaging apparatuses that shoot in response to such user operations, there is a life-log camera that shoots regularly and continuously without the user giving a shooting instruction.
- A life-log camera is used while attached to the user's body with a strap or the like, and records scenes the user sees in daily life as images at regular time intervals.
- Because a life-log camera does not shoot at timings intended by the user but at fixed time intervals, it can preserve as images unexpected moments that would not normally be photographed.
- The present invention has been made in view of the above problems, and an object thereof is to provide an imaging device capable of acquiring images that suit the user's preferences without the user performing any special operation.
- A control method of an image pickup apparatus according to the present invention comprises a change step of changing processing of the image pickup apparatus based on first data relating to images photographed by a photographing unit. In the change step, when the processing of the imaging device is changed, the first data of photographed images shot in response to a user instruction are weighted more heavily than the first data of photographed images processed automatically.
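The weighting in this change step can be illustrated with a minimal sketch. All names below are hypothetical and the aggregation is deliberately simplified: in the patent the first data would feed a learning process (e.g. a neural network), whereas this example merely computes a weighted average of feature vectors, giving user-instructed shots a larger weight than automatically captured ones.

```python
def weighted_feature_average(samples, user_factor=3.0):
    """Average feature vectors, weighting user-instructed shots more heavily.

    samples: list of (feature_vector, was_user_instructed) pairs.
    user_factor: hypothetical weight multiplier for user-instructed shots.
    """
    dim = len(samples[0][0])
    acc = [0.0] * dim
    total = 0.0
    for features, user_instructed in samples:
        w = user_factor if user_instructed else 1.0  # larger weight for user shots
        total += w
        for i, f in enumerate(features):
            acc[i] += w * f
    return [a / total for a in acc]
```

With `user_factor=3.0`, one user-instructed sample influences the result as much as three automatically captured ones, which mirrors the asymmetry the claim describes.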
- FIG. 7 is a diagram for explaining an example of area division within the shooting angle of view, together with an example of an image within the angle of view.
- FIG. 8 is a view schematically showing detection of a shooting-direction change operation, together with an image captured at time ta and an image captured when the user rotates the lens barrel 102 rightward with respect to the fixed portion 103.
- FIG. 17 is a view showing an image in which the target subject has entered the angle of view after the pan axis was rotated toward a new subject by the user's shooting-direction change operation, with the control output of the compensator 1702 turned off at time tc, together with an image in which the new subject is captured afterward.
- FIG. 10 is a diagram for describing a notification on the smart device 301 that a subject has been registered, together with a flowchart explaining automatic shooting processing.
- FIG. 1 is a view schematically showing an imaging device of the first embodiment.
- The imaging apparatus 101 shown in FIG. 1A is provided with an operation member for operating the power switch (hereinafter referred to as the power button; the operation may also be a tap, flick, or swipe on a touch panel).
- A lens barrel 102, a housing containing the imaging lens group and the image sensor used for capturing images, is attached to the imaging apparatus 101, which is provided with a rotation mechanism capable of rotationally driving the lens barrel 102 with respect to a fixed portion 103.
- The tilt rotation unit 104 is a motor-driven mechanism capable of rotating the lens barrel 102 in the pitch direction shown in FIG. 1B.
- The pan rotation unit 105 is a motor-driven mechanism capable of rotating the lens barrel 102 in the yaw direction.
- Thus, the lens barrel 102 can be rotated about one or more axes.
- FIG. 1B defines the axes at the position of the fixed portion 103.
- Both the angular velocity meter 106 and the accelerometer 107 are mounted on the fixed portion 103 of the imaging apparatus 101.
- Vibration of the imaging apparatus 101 is detected based on the outputs of the angular velocity meter 106 and the accelerometer 107, and the tilt rotation unit and the pan rotation unit are rotationally driven based on the detected swing angle.
- In this way, the shake of the lens barrel 102, which is the movable portion, is corrected, and its tilt is corrected.
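The correction loop described above (detect the swing angle, then drive the pan and tilt rotation units to cancel it) can be sketched as a simple proportional compensation. The gain and the per-step travel limit are hypothetical values, not taken from the patent:

```python
def shake_compensation_command(swing_angle_deg, gain=1.0, max_step_deg=5.0):
    """Return a rotation command that opposes the detected swing angle.

    A positive swing of the camera body produces a negative (opposing) drive
    of the movable lens barrel, clamped to the mechanism's travel limit.
    """
    command = -gain * swing_angle_deg
    return max(-max_step_deg, min(max_step_deg, command))
```

A real stabilizer would run this per axis (pan and tilt) at the sensor sampling rate, typically with integral and derivative terms as well; the clamp models the finite range of the rotation mechanism.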
- FIG. 2 is a block diagram showing the configuration of the imaging device of the present embodiment.
- The first control unit 223 includes a processor (for example, a CPU, GPU, microprocessor, or MPU) and memory (for example, DRAM or SRAM). It executes various processes to control each block of the imaging apparatus 101 and to control data transfer between the blocks.
- A non-volatile memory (EEPROM) 216 is an electrically erasable and recordable memory that stores constants, programs, and the like used for the operation of the first control unit 223.
- the zoom unit 201 includes a zoom lens that performs magnification change.
- the zoom drive control unit 202 drives and controls the zoom unit 201.
- the focus unit 203 includes a lens that performs focus adjustment.
- the focus drive control unit 204 drives and controls the focus unit 203.
- The image sensor receives light incident through each lens group and outputs charge information corresponding to the light amount to the image processing unit 207 as analog image data.
- The image processing unit 207 applies image processing such as distortion correction, white balance adjustment, and color interpolation to the digital image data obtained by A/D conversion, and outputs the processed digital image data.
- The digital image data output from the image processing unit 207 is converted into a recording format such as JPEG by the image recording unit 208 and transmitted to the memory 215 or to a video output unit 217 described later.
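As one concrete example of the processing steps listed above, white balance adjustment can be expressed as per-channel gains applied to each pixel. The gain values below are illustrative only and are not taken from the patent:

```python
def apply_white_balance(rgb, gains):
    """Multiply each 8-bit RGB channel by its white-balance gain, clipping to 255."""
    return tuple(min(255, int(round(c * g))) for c, g in zip(rgb, gains))
```

In practice the gains are derived from scene statistics (e.g. so that neutral areas come out gray) and applied to the whole image before color interpolation or after it, depending on the pipeline.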
- the lens barrel rotation drive unit 205 drives the tilt rotation unit 104 and the pan rotation unit 105 to drive the lens barrel 102 in the tilt direction and the pan direction.
- The device shake detection unit 209 includes, for example, an angular velocity meter (gyro sensor) 106 that detects angular velocity about three axes of the imaging apparatus 101, and an accelerometer (acceleration sensor) 107 that detects acceleration along three axes of the apparatus.
- The device shake detection unit 209 calculates the rotation angle of the apparatus, the shift amount of the apparatus, and the like based on the detected signals.
- the audio input unit 213 acquires an audio signal around the imaging device 101 from a microphone provided in the imaging device 101, performs analog-to-digital conversion, and transmits the signal to the audio processing unit 214.
- the audio processing unit 214 performs processing related to audio such as optimization processing of the input digital audio signal.
- the first control unit 223 transmits the audio signal processed by the audio processing unit 214 to the memory 215.
- the memory 215 temporarily stores the image signal and the audio signal obtained by the image processing unit 207 and the audio processing unit 214.
- The image processing unit 207 and the audio processing unit 214 read out the image signal and the audio signal temporarily stored in the memory 215, encode them, and generate a compressed image signal and a compressed audio signal. The first control unit 223 transmits the compressed image signal and the compressed audio signal to the recording and reproducing unit 220.
- the recording and reproducing unit 220 records, on the recording medium 221, the compressed image signal and the compressed sound signal generated by the image processing unit 207 and the sound processing unit 214, and other control data related to photographing.
- The first control unit 223 transmits the audio signal generated by the audio processing unit 214 and the compressed image signal generated by the image processing unit 207 to the recording and reproducing unit 220, and causes the recording medium 221 to record them.
- the recording medium 221 may be a recording medium built in the imaging device 101 or a removable recording medium.
- the recording medium 221 can record various data such as a compressed image signal, a compressed audio signal, and an audio signal generated by the imaging device 101, and a medium having a larger capacity than the non-volatile memory 216 is generally used.
- The recording medium 221 may be any recording medium, such as a hard disk, an optical disk, a magneto-optical disk, a CD-R, a DVD-R, a magnetic tape, a non-volatile semiconductor memory, or a flash memory.
- the recording and reproducing unit 220 reads (reproduces) the compressed image signal, the compressed audio signal, the audio signal, various data, and the program recorded on the recording medium 221. Then, the first control unit 223 transmits the read compressed image signal and compressed audio signal to the image processing unit 207 and the audio processing unit 214.
- The image processing unit 207 and the audio processing unit 214 temporarily store the compressed image signal and the compressed audio signal in the memory 215, decode them according to a predetermined procedure, and transmit the decoded signals to the video output unit 217 and the audio output unit 218.
- The voice input unit 213 acquires signals from a plurality of microphones mounted on the imaging apparatus 101, and the voice processing unit 214 can detect the direction of a sound in the plane in which the plurality of microphones are installed, which is used in processing described later. Furthermore, the voice processing unit 214 detects specific voice commands.
- In addition to several commands registered in advance, the voice commands may be configured so that the user can register a specific voice in the imaging device. The voice processing unit 214 also performs sound scene recognition, in which the sound scene is determined by a network trained in advance by machine learning on a large amount of audio data. For example, networks for detecting specific scenes such as "cheering," "applause," or "speaking" are set in the audio processing unit 214. When a specific sound scene or a specific voice command is detected, a detection trigger signal is output to the first control unit 223 or the second control unit 211.
- In the present embodiment, a second control unit 211, provided separately from the first control unit 223 that controls the entire main system of the imaging apparatus 101, controls the power supply of the first control unit 223.
- The first power supply unit 210 and the second power supply unit 212 supply power for operating the first control unit 223 and the second control unit 211, respectively. Pressing the power button provided on the imaging apparatus 101 supplies power to both the first control unit 223 and the second control unit 211; however, as described later, the first control unit 223 can also instruct the first power supply unit 210 to turn off its own power supply. Even while the first control unit 223 is not operating, the second control unit 211 remains in operation and receives information from the device shake detection unit 209 and the audio processing unit 214. The second control unit determines, based on these various inputs, whether to start the first control unit 223, and when it decides to start it, instructs the first power supply unit to supply power.
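The activation decision of the second control unit can be sketched as follows. The trigger inputs mirror the ones named above (shake detection and audio processing); the threshold value is hypothetical:

```python
def should_start_first_controller(shake_level, voice_command_detected,
                                  sound_scene_detected, shake_threshold=0.5):
    """Decide whether the always-on second controller should power up the
    first (main) controller, based on low-power sensor inputs."""
    return (voice_command_detected
            or sound_scene_detected
            or shake_level > shake_threshold)
```

Keeping only this small decision loop powered while the main controller sleeps is what allows the camera to wait for shooting opportunities without draining the battery.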
- the audio output unit 218 outputs an audio pattern set in advance from a speaker incorporated in the imaging apparatus 101, for example, at the time of shooting.
- The LED control unit 224 controls an LED provided on the imaging apparatus 101 using, for example, a preset lighting or blinking pattern at the time of shooting or the like.
- the video output unit 217 includes, for example, a video output terminal, and transmits an image signal to display a video on a connected external display or the like. Further, the audio output unit 218 and the video output unit 217 may be one combined terminal, for example, a terminal such as an HDMI (registered trademark) (High-Definition Multimedia Interface) terminal.
- The communication unit 222 performs communication between the imaging apparatus 101 and an external device, transmitting and receiving data such as audio signals, image signals, compressed audio signals, and compressed image signals. It also receives shooting start and end commands and control signals related to shooting, such as pan/tilt and zoom driving, so that the imaging apparatus 101 can be driven based on instructions from an external device capable of mutual communication with it. In addition, information such as various parameters related to learning processed by a learning processing unit 219, described later, is transmitted and received between the imaging apparatus 101 and the external device.
- the communication unit 222 is, for example, a wireless communication module such as an infrared communication module, a Bluetooth (registered trademark) communication module, a wireless LAN communication module, a wireless USB, or a GPS receiver.
- FIG. 3 is a diagram showing a configuration example of a wireless communication system of the imaging apparatus 101 and the external apparatus 301.
- the imaging apparatus 101 is a digital camera having a photographing function
- the external apparatus 301 is a smart device including a Bluetooth communication module and a wireless LAN communication module.
- The imaging apparatus 101 and the smart device 301 communicate with each other, for example, via a wireless LAN 302 conforming to the IEEE 802.11 standard series, and via communication such as Bluetooth Low Energy (hereinafter "BLE"), which has a master-slave relationship between, for example, a control station and a subordinate station.
- Here, the first communication, such as wireless LAN, can perform faster communication than the second communication, such as BLE, while the second communication is assumed to consume less power than the first communication and/or to have a shorter communicable distance.
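- As an illustrative sketch only (the text does not specify a selection policy), the trade-off between the two links could be expressed as a simple chooser; the payload threshold and function name are assumptions:

```python
def choose_channel(payload_bytes: int, battery_low: bool,
                   small_payload_limit: int = 512) -> str:
    """Pick a transport for a message (illustrative policy only).

    BLE is the low-power, short-range, slower second communication;
    wireless LAN is the faster first communication.
    """
    if battery_low or payload_bytes <= small_payload_limit:
        return "BLE"          # small control messages, or conserve power
    return "wireless LAN"     # bulk data such as image files
```

Under this sketch, a short command would go over BLE, while a compressed image file would use the wireless LAN link.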
- the configuration of the smart device 301 will be described with reference to FIG.
- the smart device 301 includes, for example, a public line control unit 406 for public wireless communication in addition to the wireless LAN control unit 401 for wireless LAN and the BLE control unit 402 for BLE.
- the smart device 301 further includes a packet transmitting / receiving unit 403.
- The wireless LAN control unit 401 performs RF control of the wireless LAN, communication processing, and protocol processing regarding wireless LAN communication, by a driver that performs various controls of communication by a wireless LAN conforming to the IEEE 802.11 standard series.
- The BLE control unit 402 performs RF control of BLE, communication processing, and protocol processing regarding BLE communication, by a driver that performs various controls of communication by BLE.
- The public line control unit 406 performs RF control of public wireless communication, communication processing, and protocol processing related to public wireless communication, by a driver that performs various controls of public wireless communication.
- The public wireless communication conforms to, for example, the International Mobile Telecommunications (IMT) standard or the Long Term Evolution (LTE) standard.
- the packet transmission / reception unit 403 performs processing for performing transmission and / or reception of packets related to communication by wireless LAN and BLE and public wireless communication.
- The smart device 301 is described here as performing at least one of transmission and reception of packets in communication, but other communication formats, such as circuit switching, may be used instead of packet switching.
- the smart device 301 further includes, for example, a control unit 411, a storage unit 404, a GPS reception unit 405, a display unit 407, an operation unit 408, an audio input audio processing unit 409, and a power supply unit 410.
- the control unit 411 controls the entire smart device 301, for example, by executing a control program stored in the storage unit 404.
- the storage unit 404 stores, for example, a control program executed by the control unit 411 and various information such as parameters required for communication. Various operations described later are realized by the control unit 411 executing a control program stored in the storage unit 404.
- the power supply unit 410 supplies power to the smart device 301.
- The display unit 407 has a function of outputting visually recognizable information, such as an LCD or LEDs, and/or of outputting sound, such as through a speaker, and displays various information.
- the operation unit 408 is, for example, a button for receiving an operation of the smart device 301 by the user.
- the display unit 407 and the operation unit 408 may be configured by a common member such as a touch panel, for example.
- The voice input/voice processing unit 409 may be configured to acquire the voice uttered by the user from, for example, a general-purpose microphone built into the smart device 301, and to acquire a user operation command by voice recognition processing.
- In addition, a voice command can be acquired from the user's utterance via a dedicated application in the smart device, and then registered, via the wireless LAN communication 302, as a specific voice command to be recognized by the voice processing unit 214 of the imaging apparatus 101.
- The GPS (Global Positioning System) reception unit 405 receives GPS signals notified from satellites, analyzes them, and estimates the current position (longitude/latitude information) of the smart device 301.
- position estimation may be performed using Wi-Fi Positioning System (WPS) or the like to estimate the current position of the smart device 301 based on information of wireless networks present in the surroundings.
- When the acquired current GPS position information lies within a preset position range (within a predetermined radius), movement information is notified to the imaging apparatus 101 via the BLE control unit 402 and used as a parameter for the automatic shooting and automatic editing described later.
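- A minimal sketch of that position-range check, assuming a great-circle (haversine) distance to a centre point set in advance; the text only requires testing whether the current position lies within a predetermined radius:

```python
import math

def within_radius(lat: float, lon: float,
                  center_lat: float, center_lon: float,
                  radius_m: float) -> bool:
    """True if (lat, lon) is within radius_m metres of the preset centre,
    using the haversine great-circle distance."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat), math.radians(center_lat)
    dp = math.radians(center_lat - lat)
    dl = math.radians(center_lon - lon)
    a = (math.sin(dp / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2)
    distance = 2 * r * math.asin(math.sqrt(a))
    return distance <= radius_m
```

The notification to the imaging apparatus would then fire on the transition of this predicate (entering or leaving the range), not on every GPS fix.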
- As described above, the smart device 301 exchanges data with the imaging apparatus 101 by communication using the wireless LAN control unit 401 and the BLE control unit 402. For example, it transmits and receives data such as an audio signal, an image signal, a compressed audio signal, and a compressed image signal. The smart device also issues operation instructions, such as shooting, to the imaging apparatus 101, transmits voice command registration data, and sends notifications of predetermined position detection based on GPS position information and of location movement. It also sends and receives learning data via a dedicated application in the smart device.
- the external device 301 is not limited to the smart device 301.
- the display unit 407 and the operation unit 408 may be omitted, and an apparatus specialized for voice input may be used.
- In this apparatus, the voice uttered by the user is acquired from the above-described microphone, a user operation command is acquired by voice recognition processing, and the command is notified to the imaging apparatus 101.
- This device may also have voice recognition, a communication function with the cloud, and a function of reading out news through a speaker. It may further have a function of outputting the result of a search-engine query as sound, or a dialogue-system function.
- FIG. 5 is a diagram showing a configuration example of an external apparatus 501 that can communicate with the imaging apparatus 101.
- the imaging apparatus 101 is a digital camera having a photographing function
- the external apparatus 501 is a wearable device including various sensing units capable of communicating with the imaging apparatus 101 by, for example, a Bluetooth communication module.
- The wearable device 501 is configured to be attached to, for example, the arm of the user, and is equipped with sensors capable of detecting biological information such as the user's pulse, heart rate, and blood flow at a predetermined cycle, and with an acceleration sensor or the like capable of detecting the user's exercise state.
- The biological information detection unit 502 includes, for example, a pulse sensor that detects a pulse, a heartbeat sensor that detects a heartbeat, a blood flow sensor that detects blood flow, and a sensor that detects a change in electric potential caused by contact of the skin with a conductive polymer. The present embodiment will be described using a heartbeat sensor as the biological information detection unit 502.
- the heart rate sensor irradiates the skin with infrared light using, for example, an LED or the like, and detects the heart rate of the user by detecting the infrared light transmitted through the body tissue with a light receiving sensor and processing the signal.
- the biological information detection unit 502 outputs the detected biological information as a signal to a control unit 607 described later.
- The shake detection unit 503, which detects the motion state of the user, is equipped with, for example, an acceleration sensor or a gyro sensor, and can detect, based on the acceleration information, whether the user is moving, whether the user is swinging an arm, and other motions.
- an operation unit 505 that receives an operation of the wearable device 501 by the user, and a display unit 504 that outputs visually recognizable information such as an LCD or an LED are mounted.
- the configuration of the wearable device 501 will be described with reference to FIG.
- the wearable device 501 includes, for example, a control unit 607, a communication unit 601, a biological information detection unit 502, a shake detection unit 503, a display unit 504, an operation unit 505, a power supply unit 606, and a storage unit 608.
- the control unit 607 controls the entire wearable device 501, for example, by executing a control program stored in the storage unit 608.
- the storage unit 608 stores, for example, a control program executed by the control unit 607 and various information such as parameters required for communication. Various operations described later are realized, for example, by the control unit 607 executing a control program stored in the storage unit 608.
- the power supply unit 606 supplies power to the wearable device 501.
- The display unit 504 has a function of outputting visually recognizable information, such as an LCD or LEDs, and/or of outputting sound, such as through a speaker, and displays various information.
- the operation unit 505 is, for example, a button for receiving an operation of the wearable device 501 by the user.
- the display unit 504 and the operation unit 505 may be configured by a common member such as a touch panel, for example.
- The operation unit 505 may also acquire the voice uttered by the user from, for example, a general-purpose microphone built into the wearable device 501, process it by voice processing, and acquire a user operation instruction by voice recognition processing.
- The various detection information processed by the control unit 607 from the biological information detection unit 502 and the shake detection unit 503 is transmitted to the imaging apparatus 101 by the communication unit 601.
- For example, the detection information is transmitted to the imaging apparatus 101 at the timing at which a change in the user's heartbeat is detected, or at the timing at which the movement state changes, such as between walking, running, and stopping. Detection information is also transmitted, for example, at the timing at which a preset arm-swing motion is detected, or at the timing at which movement over a preset distance is detected.
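- The change-triggered notification could be sketched as follows; the class name, the heartbeat threshold, and the `send` callback (standing in for the BLE transfer by the communication unit 601) are all assumptions:

```python
class DetectionNotifier:
    """Wearable-side sketch: send detection info only at the timing at
    which the monitored state changes, rather than on every sample."""

    def __init__(self, send):
        self.send = send          # callable standing in for the BLE transfer
        self.last_motion = None
        self.last_hr = None

    def update(self, motion_state: str, heart_rate: int, hr_delta: int = 15):
        # Notify at the change timing of the movement state
        # (e.g. walking / running / stopped).
        if motion_state != self.last_motion:
            self.send({"event": "motion", "state": motion_state})
            self.last_motion = motion_state
        # Notify when the heartbeat changes by at least hr_delta bpm.
        if self.last_hr is None or abs(heart_rate - self.last_hr) >= hr_delta:
            self.send({"event": "heartbeat", "bpm": heart_rate})
            self.last_hr = heart_rate
```

Sending only on state transitions keeps the BLE link mostly idle, which matches the low-power role given to the second communication above.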
- FIG. 30 shows an example of a hand-held operation attachment.
- The imaging apparatus main body 101 may be configured without operation members such as a shutter button for instructing shooting, in which case the imaging apparatus 101 can be operated by the operation members provided on the attachment.
- the hand-held operation attachment 5001 may be provided with a changeover switch 5005 capable of switching between an auto setting mode depending on the camera and a mode in which the user can perform manual camera operation.
- When the changeover switch 5005 is set to the manual camera operation mode, pan/tilt driving is performed for camera-shake correction, but a large pan/tilt angle change is not performed for subject search.
- an attachment detection unit 5002 that can detect whether the attachment 5001 is connected to the imaging device 101 may be provided without providing the changeover switch 5005.
- The detection of attachment of the accessory may use an existing method such as a voltage change or an ID.
- There are also cases where the user wants to change the pan/tilt direction manually. For this purpose, the hand-held operation attachment 5001 may be provided with an operation member 5003 by which the pan/tilt direction can be changed.
- The operation member 5003 may be freely translatable in XY coordinates, with pan and tilt moved according to the operated direction. For example, when the operation member is moved upward, the tilt is driven upward; when it is moved downward, the tilt is driven downward; and when it is moved right or left, the pan is driven in the corresponding direction.
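- A minimal sketch of that mapping, assuming a normalized XY stick deflection and an illustrative maximum drive speed (the axis convention and gain are assumptions; the text only fixes the directions qualitatively):

```python
def stick_to_pan_tilt(x: float, y: float, max_deg_per_s: float = 30.0):
    """Map the XY translation of the operation member to pan/tilt drive
    speeds. Right/up are taken as positive; inputs are clamped so a hard
    push saturates at the maximum speed."""
    clamp = lambda v: max(-1.0, min(1.0, v))
    pan_speed = clamp(x) * max_deg_per_s   # left/right drives the pan
    tilt_speed = clamp(y) * max_deg_per_s  # up/down drives the tilt
    return pan_speed, tilt_speed
```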
- a shutter button 5004 may be provided which allows the user to shoot at any timing.
- a switch 5006 which can switch a shooting mode (for example, still image shooting mode / moving image shooting mode / panorama shooting mode / time-lapse shooting mode) may be provided.
- the method of instructing the imaging device 101 from the hand-held operation attachment 5001 may use non-contact communication means.
- An operation instruction could also be given through connectors for electric signals provided on the imaging apparatus 101 and the hand-held operation attachment 5001, but since the imaging apparatus contains its own battery, a battery connector on the hand-held operation attachment is unnecessary. If a connector were provided solely for operations such as release, a drip-proof function would have to be added at the connection portion, various parts would be needed, and the apparatus would become bulky and more costly.
- The non-contact communication means may use Bluetooth Low Energy (BLE), Near Field Communication (NFC), or any other method.
- The radio wave generation power of the hand-held operation attachment 5001 may be small, with a small power source capacity; for example, the power source may be a button battery, or a means that generates a small amount of power when the shutter button 5004 is pressed.
- an attachment having an operation member separate from the imaging device and instructing the release of the imaging device and an operation member instructing the rotation mechanism of the imaging device may be attached to the imaging device.
- an attachment having an operation member that issues an imaging mode change instruction capable of setting any two or more of the still image mode, the moving image mode, the panoramic mode, and the time lapse mode of the imaging unit may be attached to the imaging apparatus.
- the operation instruction by the operation member from the attachment to the imaging device is notified by the non-contact communication means.
- The imaging apparatus may detect information on the attachment attached to it, and change the control frequency band of the shake correction unit based on that attachment information. By detecting the attachment information, it is also possible to change whether or not to perform inclination correction that holds the angle in a certain direction with respect to the direction of gravity.
- the low frequency side of the shake correction control band may be cut by detecting the attachment information.
- FIG. 32 shows an example of a configuration that can be mechanically attached to the accessory shoe 3202 of the camera 3201 different from the imaging device 101.
- With this configuration, the mounting direction of the imaging apparatus 101 relative to the camera 3201, and the difference in angle between the optical axis direction of the camera 3201 and that of the imaging apparatus 101, become known. It therefore becomes easy to control the camera 3201 and the imaging apparatus 101 cooperatively.
- Information may be exchanged between the imaging apparatus 101 and the camera 3201 by providing an electrical contact at the location where they connect to the accessory shoe. The imaging apparatus 101 and the camera 3201 may also be configured to exchange information via a communication cable such as USB, by wireless communication (BLE, NFC, etc.), or by another method.
- FIG. 7 is a flowchart for explaining an example of the operation of the first control unit 223 of the imaging apparatus 101 according to the present embodiment.
- The first power supply unit 210 causes the power supply unit to supply power to the first control unit 223 and to each block of the imaging apparatus 101.
- In step 701, the activation condition is read.
- The activation conditions are as follows: (1) the power button is manually pressed to turn on the power; (2) the power is turned on in response to an instruction from an external device (for example, 301) via external communication (for example, BLE communication); (3) the power is turned on from the Sub processor (second control unit 211).
- the start condition read in here is used as one parameter element at the time of subject search and automatic photographing, which will be described later.
- the process proceeds to step 702.
- The sensors read here include sensors that detect vibration, such as the gyro sensor and the acceleration sensor of the apparatus shake detection unit 209.
- The rotational positions of the tilt rotation unit 104 and the pan rotation unit 105 may also be read.
- it may be a voice level detected by the voice processing unit 214, a detection trigger of specific voice recognition, or a sound direction detection.
- a sensor that detects environmental information also acquires information.
- a temperature sensor that detects the temperature around the imaging device 101 at a predetermined cycle
- an air pressure sensor that detects a change in air pressure around the imaging device 101.
- an illuminance sensor that detects the brightness around the imaging device 101
- a humidity sensor that detects the humidity around the imaging device 101
- a UV sensor that detects the amount of ultraviolet light around the imaging device 101, or the like may be provided.
- A change rate over predetermined time intervals is calculated from the detected information, yielding a temperature change amount, an air pressure change amount, a brightness change amount, a humidity change amount, an ultraviolet change amount, and so on, which are used for determinations such as the automatic shooting described later.
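- The change-amount calculation could be sketched with a fixed-length sample window; the window length and class name are assumptions:

```python
from collections import deque

class EnvChangeTracker:
    """Keep readings over a fixed interval and report the change amount
    (newest minus oldest), as used for the automatic-shooting decision."""

    def __init__(self, window: int):
        # deque with maxlen drops the oldest sample automatically
        self.samples = deque(maxlen=window)

    def add(self, value: float) -> float:
        self.samples.append(value)
        return self.samples[-1] - self.samples[0]
```

One tracker per quantity (temperature, air pressure, brightness, humidity, ultraviolet) would then feed its change amount into the determination logic.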
- When the various sensors have been read in step 702, the process proceeds to step 703.
- In step 703, it is detected whether communication from the external device has been instructed, and when it has, communication with the external device is performed.
- data such as an audio signal, an image signal, a compressed audio signal, or a compressed image signal is transmitted or received from the smart device 301 via wireless LAN or BLE.
- Operation instructions such as shooting by the imaging apparatus 101, voice command registration data, predetermined position detection notifications based on GPS position information, location movement notifications, and learning data are read from the smart device 301.
- Also in step 703, when there is an update of biological information such as the user's exercise information, arm action information, or heartbeat from the wearable device 501, that information is read via BLE.
- The various sensors for detecting the environmental information described above may be mounted on the imaging apparatus 101, or may be mounted on the smart device 301 or the wearable device 501; in the latter case, the environmental information is also read via BLE.
- When the communication reading from the external device has been performed in step 703, the process proceeds to step 704.
- In step 704, mode setting determination is performed.
- The mode set in step 704 is determined and selected from the following.
- Automatic shooting mode: When it is determined, from each piece of detection information set by learning described later (image, sound, time, vibration, location, change in the body, environmental change), the elapsed time since transition to the automatic shooting mode, past shooting information, and the like, that automatic shooting should be performed, the automatic shooting mode is set.
- In the automatic shooting mode, pan/tilt and zoom are driven to automatically search for a subject based on each piece of detection information (image, sound, time, vibration, location, body change, environmental change). Then, when it is determined that it is the timing at which a shot matching the user's preference can be taken, the shooting method is determined from among various methods such as single still image shooting, continuous still image shooting, moving image shooting, panoramic shooting, and time-lapse shooting, and shooting is performed automatically.
- In the automatic editing mode processing (step 712), selection of still images and moving images based on learning is performed, and automatic editing processing is performed to create a highlight moving image summarized into a single moving image, using image effects and an edited-moving-image duration based on learning.
- The imaging apparatus 101 automatically extracts images likely to match the user's preference and automatically transfers them to the smart device 301.
- The extraction of the user's preferred images is performed using a score, described later, that is added to each image and represents the user's preference.
- In the learning mode processing (step 716), learning according to the user's preference is performed. Learning according to the user's preference is performed using a neural network, based on information such as each operation on the smart device 301 and notifications of learning information from the smart device 301. The information on each operation on the smart device 301 includes, for example, image acquisition information from the imaging apparatus, information for which a manual editing instruction was given via a dedicated application, and determination values input by the user for images in the imaging apparatus.
- learning about detection such as registration of personal authentication, voice registration, sound scene registration, general object recognition registration, etc., and learning of the conditions of the above-mentioned low power consumption mode, etc. are simultaneously performed.
- Automatic file deletion mode [Mode judgment condition] If it is determined that automatic file deletion should be performed from the elapsed time since the previous automatic file deletion and the remaining capacity of the non-volatile memory 216 storing the image, the automatic file deletion mode is set.
- Files to be automatically deleted are specified (selection processing) from among the images in the non-volatile memory 216, based on the tag information and the shooting date and time of each image.
- In step 705, it is determined whether the mode setting determined in step 704 is the low power consumption mode.
- When the determination conditions of none of the modes described later ("automatic shooting mode", "automatic editing mode", "automatic image transfer mode", "learning mode", and "automatic file deletion mode") are satisfied, it is determined that the mode is the low power consumption mode, and the process proceeds to step 705.
- If it is determined in step 705 that the low power consumption mode condition is satisfied, the process proceeds to step 706.
- In step 706, the various parameters relating to the activation factors to be determined in the Sub processor (the shake detection determination parameter, the sound detection parameter, and the time lapse detection parameter) are notified to the Sub processor (second control unit 211). The values of these parameters change as they are learned in the learning process described later.
- When the processing of step 706 ends, the process proceeds to step 707, the power of the Main processor (first control unit 223) is turned off, and the processing ends.
- If it is determined in step 705 that the mode is not the low power consumption mode, the process proceeds to step 709, where it is determined whether the mode setting is the automatic shooting mode. If so, the process proceeds to step 710, where automatic shooting mode processing is performed. When the processing ends, the process returns to step 702 and repeats. If it is determined in step 709 that the automatic shooting mode is not set, the process proceeds to step 711.
- In step 711, it is determined whether the mode setting is the automatic editing mode. If so, the process proceeds to step 712, where automatic editing mode processing is performed. When the processing ends, the process returns to step 702 and repeats. If it is determined in step 711 that the automatic editing mode is not set, the process proceeds to step 713.
- In step 713, it is determined whether the mode setting is the automatic image transfer mode. If so, the process proceeds to step 714, where automatic image transfer mode processing is performed. When the processing ends, the process returns to step 702 and repeats. If it is determined in step 713 that the automatic image transfer mode is not set, the process proceeds to step 715.
- In step 715, it is determined whether the mode setting is the learning mode. If so, the process proceeds to step 716, where learning mode processing is performed. When the processing ends, the process returns to step 702 and repeats. If it is determined in step 715 that the learning mode is not set, the process proceeds to step 717.
- In step 717, it is determined whether the mode setting is the automatic file deletion mode. If so, the process proceeds to step 718, where automatic file deletion mode processing is performed. When the processing ends, the process returns to step 702 and repeats. If it is determined in step 717 that the automatic file deletion mode is not set, the process returns to step 702 and repeats.
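- The dispatch of steps 704 to 718 amounts to a determine-then-run loop; a simplified sketch, with mode names and handler signatures as assumptions:

```python
def run_control_loop(determine_mode, handlers, iterations):
    """Sketch of the step 704-718 dispatch: decide the mode, run its
    handler if one is registered, then repeat from the sensor-reading
    step. Returns the list of modes actually executed."""
    executed = []
    for _ in range(iterations):
        mode = determine_mode()     # step 704: mode setting determination
        handler = handlers.get(mode)
        if handler is not None:     # steps 709-718: run the matching mode
            handler()
            executed.append(mode)
        # otherwise fall through and repeat from step 702
    return executed
```

A table of handlers (automatic shooting, automatic editing, automatic image transfer, learning, automatic file deletion) replaces the chain of if-determinations in the flowchart without changing its behavior.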
- FIG. 8 is a flowchart for explaining an example of the operation of the second control unit 211 of the imaging apparatus 101 according to the present embodiment.
- In the same manner as the first power supply unit 210 causes the power supply unit to supply power to the first control unit 223, the second power supply unit 212 causes the power supply unit to supply power to the second control unit 211.
- the Sub processor (second control unit 211) is activated, and the process of FIG. 8 starts.
- In step 801, it is determined whether a predetermined period, which is the sampling cycle, has elapsed. For example, if it is set to 10 msec, the process proceeds to step 802 every 10 msec. If it is determined that the predetermined period has not elapsed, the Sub processor returns to step 801 without performing any processing and waits for the predetermined period to elapse.
- In step 802, learning information is read.
- the learning information is information transferred when performing information communication with the Sub processor in step 706 of FIG. 7, and, for example, the following information is read.
- (1) The determination condition of specific shake detection, (2) the determination condition of specific sound detection, and (3) the determination condition of time lapse determination are read.
- The shake detection value acquired in step 803 is the output value of a sensor that detects vibration, such as the gyro sensor or the acceleration sensor of the device shake detection unit 209.
- When the shake detection value is acquired in step 803, the process proceeds to step 804, where processing for detecting a preset shake state is performed.
- the determination process is changed according to the learning information read in step S802.
- Tap Detection A state in which the user taps the imaging device 101 with, for example, a fingertip can be detected from an output value of an acceleration sensor attached to the imaging device 101.
- By passing the output of the acceleration sensor through a band pass filter (BPF) set to a specific frequency region, the signal region of the acceleration change caused by a tap can be extracted.
- Tap detection is performed based on whether the number of times the acceleration signal after BPF exceeds the predetermined threshold value ThreshA is the predetermined number of times CountA during the predetermined time TimeA. For double taps, CountA is set to 2, and for triple taps, CountA is set to 3. Further, Time A and Thresh A can also be changed by learning information.
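- A simplified sketch of the tap count test, assuming one threshold crossing per tap pulse in the band-pass-filtered signal (the actual filtering and debouncing are not specified in the text):

```python
def detect_tap(bpf_signal, thresh_a, count_a, time_a, sample_dt):
    """Count threshold crossings of the band-pass-filtered acceleration
    within TimeA and report a tap gesture when exactly CountA are seen.
    `sample_dt` is the sampling interval of the signal."""
    window = int(time_a / sample_dt)      # number of samples inside TimeA
    crossings = 0
    prev_above = False
    for sample in bpf_signal[:window]:
        above = abs(sample) > thresh_a
        if above and not prev_above:      # rising edge = one tap pulse
            crossings += 1
        prev_above = above
    return crossings == count_a
```

With CountA = 2 this detects a double tap and with CountA = 3 a triple tap, matching the parameterization described above; TimeA and ThreshA would be updated from the learning information.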
- the swing state of the imaging device 101 can be detected from the output value of a gyro sensor or an acceleration sensor attached to the imaging device 101.
- Specifically, the low frequency component of the output of the gyro sensor or the acceleration sensor is cut by an HPF and the high frequency component is cut by an LPF, after which absolute value conversion is performed.
- the vibration detection is performed based on whether or not the number of times the calculated absolute value exceeds the predetermined threshold ThreshB is equal to or more than the predetermined number CountB during the predetermined time TimeB. For example, it is possible to determine whether the shake is small when the imaging apparatus 101 is placed on a desk or the like, or is large when the imaging apparatus 101 is worn and walked with a wearable. Moreover, it is also possible to detect a fine shaking state according to the shaking level by having a plurality of conditions of the judgment threshold value and the judgment count number.
- Time B, Thresh B, and Count B can also be changed by learning information.
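- A rough sketch of the shake determination, using a first difference as a stand-in HPF and a 3-point moving average as a stand-in LPF (the real filters and their cutoffs are not specified in the text):

```python
def classify_shake(raw, thresh_b, count_b):
    """Approximate the HPF -> LPF -> absolute value -> threshold-count
    pipeline on a list of sensor samples and decide whether the device
    is in a shaking state."""
    # Stand-in HPF: first difference removes the DC / low-frequency part.
    hp = [b - a for a, b in zip(raw, raw[1:])]
    # Stand-in LPF: 3-point moving average smooths high-frequency noise.
    lp = [(hp[i - 1] + hp[i] + hp[i + 1]) / 3 for i in range(1, len(hp) - 1)]
    # Count absolute values above ThreshB; CountB or more means shaking.
    over = sum(1 for v in lp if abs(v) > thresh_b)
    return over >= count_b
```

Using several (ThreshB, CountB) pairs would give the finer shake levels mentioned above, e.g. distinguishing a device resting on a desk from one worn while walking.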
- the above has described the method of specific swing state detection by condition determination of the swing detection sensor.
- It is also possible to detect a specific shake state registered in advance by inputting the sensor output into a shake state determination unit that uses a trained neural network.
- In that case, the learning information read in step 802 consists of the weight parameters of the neural network.
- When the specific shake state detection processing has been performed in step 804, the process proceeds to step 805, where predetermined specific sound detection processing is performed.
- the detection determination process is changed according to the learning information read in step S802.
- Specific voice command detection A specific voice command is detected.
- the voice command allows the user to register a specific voice in the imaging device, in addition to several commands registered in advance.
- Specific Sound Scene Recognition: Sound scene determination is performed by a network trained in advance by machine learning on a large amount of audio data. For example, specific scenes such as "cheering", "applause", and "speaking" are detected. The scenes to be detected change with learning.
- Sound Level Determination Detection based on sound level determination is performed by adding time during which the magnitude of the sound level exceeds the level predetermined value during a predetermined time.
- the predetermined time, the magnitude of the predetermined level, and the like change by learning.
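- A minimal sketch of the sound level determination, accumulating the time the level stays above the predetermined value over the observation window; the parameter names are assumptions:

```python
def sound_level_detected(levels, level_thresh, min_over_s, sample_dt):
    """Add up the time during which the sound level exceeds the
    predetermined level within the window, and detect when that total
    reaches the predetermined duration. `sample_dt` is the interval
    between level samples in seconds."""
    over_time = sum(sample_dt for lv in levels if lv > level_thresh)
    return over_time >= min_over_s
```

Both `level_thresh` and `min_over_s` correspond to the level and time values that the text says change by learning.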
- the above-described determination processing is performed in the voice processing unit 214, and it is determined in step 805 whether a specific sound is detected by each setting learned in advance.
- When the specific sound detection processing has been performed in step 805, the process proceeds to step 806.
- In step 806, it is determined whether the Main processor (first control unit 223) is in the OFF state. If the Main processor is OFF, the process proceeds to step 807, where processing for detecting the lapse of a preset time is performed.
- the detection determination process is changed according to the learning information read in step S802.
- the learning information is information transferred when performing information communication with the Sub processor (second control unit 211) in step 706 described with reference to FIG.
- TimeC is a parameter that changes according to learning information.
- When the time lapse detection processing has been performed in step 807, the process proceeds to step 808, where it is determined whether the low power consumption mode cancellation condition is satisfied.
- the low power consumption mode release condition is determined by the following. (1) Judgment condition of specific vibration detection (2) Judgment condition of specific sound detection (3) Judgment condition of time lapse judgment
- Through the specific shake state detection processing in step 804, it can be determined whether the determination condition of specific shake detection has been met. Likewise, through the specific sound detection processing in step 805, it can be determined whether the determination condition of specific sound detection has been met, and through the time lapse detection processing in step 807, whether the determination condition of time lapse detection has been met. Therefore, if any one or more of these conditions are met, it is determined that the low power consumption mode is to be cancelled.
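- Step 808 therefore reduces to an OR over the three learned conditions; a sketch that also collects the reasons to be notified in step 810 (names are illustrative):

```python
def should_wake_main(shake_hit: bool, sound_hit: bool, time_hit: bool):
    """Decide low-power-mode release (step 808): any one condition is
    enough. Also return which conditions fired, since the Main processor
    is notified of them (step 810)."""
    reasons = [name for name, hit in
               (("shake", shake_hit), ("sound", sound_hit), ("time", time_hit))
               if hit]
    return (len(reasons) > 0, reasons)
```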
- When a release condition is determined to be satisfied in step 808, the process proceeds to step 809 to turn on the power of the main processor, and in step 810 the main processor is notified of the condition (shake, sound, or time) on which the low power consumption mode release was determined. The flow then returns to step 801 to loop the process.
- If it is determined in step 808 that none of the release conditions is met, so that no low power consumption mode release determination is made, the process returns to step 801 to loop the process.
- If it is determined in step 806 that the main processor is in the ON state, the information acquired in steps 803 to 805 is notified to the main processor, and the process returns to step 801 to loop the process.
- The conditions for transitioning to the low power consumption mode and the conditions for canceling it are learned based on the user's operation.
- the imaging operation can be performed according to the convenience of the user who owns the imaging device 101. The method of learning will be described later.
- Cancellation of the low power consumption mode may also be determined based on environmental information.
- Environmental information can be determined based on whether the absolute amount or change amount of temperature, atmospheric pressure, brightness, humidity, or ultraviolet light amount exceeds a predetermined threshold, and the threshold can be changed by learning described later.
- In step S901, the image processing unit 207 performs image processing on the signal acquired by the imaging unit 206 to generate an image for subject recognition.
- subject recognition such as human and object recognition is performed.
- the face or human body of the subject is detected.
- A pattern for determining the face of a person is defined in advance, and a portion of the captured image that matches the pattern can be detected as a face image of the person.
- At the same time, a reliability indicating the certainty of the subject's face is also calculated; the reliability is calculated from, for example, the size of the face area in the image, the degree of coincidence with the face pattern, and the like.
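A hedged illustration of such a reliability computation, with purely illustrative weights and normalisation (the actual device tunes its criteria by learning):

```python
def face_reliability(face_area: int, image_area: int,
                     pattern_match: float,
                     w_size: float = 0.5, w_match: float = 0.5) -> float:
    """Reliability of a face detection, combining the relative size of
    the face region in the image with the degree of coincidence with
    the face pattern (both normalised to 0..1).  Weights and the
    saturation factor are illustrative assumptions."""
    size_score = min(1.0, face_area / image_area * 10.0)  # saturate for large faces
    return w_size * size_score + w_match * pattern_match
```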
- A method of extracting a characteristic subject using a histogram of hue, saturation, or the like in the captured image is also performed.
- In this case, a process is performed in which the distribution derived from the histogram of hue, saturation, etc. is divided into a plurality of sections and the captured image is classified section by section.
- For example, a histogram of a plurality of color components is created for the captured image, divided at its mountain-shaped distribution ranges, the captured image is classified into regions belonging to the same combination of sections, and the image region of the subject is thereby recognized.
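The histogram-based classification above might be sketched as follows; the empty-bin splitting rule and the bin layout are simplifying assumptions for illustration, not the patented method itself:

```python
def histogram_sections(hist):
    """Split a color-component histogram into contiguous 'mountain'
    ranges separated by empty bins; returns (start, end) bin ranges."""
    sections, start = [], None
    for i, count in enumerate(hist):
        if count > 0 and start is None:
            start = i
        elif count == 0 and start is not None:
            sections.append((start, i - 1))
            start = None
    if start is not None:
        sections.append((start, len(hist) - 1))
    return sections

def classify_bin(value_bin, sections):
    """Return the index of the section a pixel's histogram bin belongs to."""
    for idx, (lo, hi) in enumerate(sections):
        if lo <= value_bin <= hi:
            return idx
    return -1
```

Pixels falling in the same section combination would then be grouped into one candidate subject region.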
- Each subject information can be obtained from the imaging information by the above method.
- In step S902, the image shake correction amount is calculated. Specifically, first, the absolute angle of the imaging device is calculated based on the angular velocity and acceleration information acquired by the device shake detection unit 209. Then, an image stabilization angle for moving the tilt rotation unit 104 and the pan rotation unit 105 in the angular direction that cancels out the absolute angle is determined and used as the image shake correction amount. In this image shake correction amount calculation processing, the calculation method can be changed by the learning processing described later.
- In step S903, the state of the imaging apparatus is determined. Based on the angle and movement amount detected from angular velocity information, acceleration information, GPS position information, and the like, it is determined what vibration/motion state the imaging device is currently in.
- Subject information such as the surrounding scenery changes greatly depending on the distance moved.
- In the "stationary shooting state", the angle of the imaging device 101 itself can be considered not to change, so a subject search for stationary shooting can be performed.
- a subject search process is performed.
- the subject search is configured by the following processing.
- area division is performed all around the position of the imaging device (the origin O is at the imaging device position).
- division is performed at 22.5 degrees in each of the tilt direction and the pan direction.
- When division is performed as shown in FIG. 13A, the horizontal circumference decreases as the angle in the tilt direction moves away from 0 degrees, and the area of each region shrinks. Therefore, as shown in FIG. 13B, when the tilt angle is 45 degrees or more, the area range in the horizontal direction is set larger than 22.5 degrees.
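The variable-width area division could be sketched like this; the 45-degree pan width used for high tilt angles is an illustrative value, since the text only states that the range is set "larger than 22.5 degrees":

```python
def pan_division_width(tilt_angle_deg: float) -> float:
    """Horizontal (pan) width of one area, in degrees.  The sphere is
    divided every 22.5 degrees, but near the poles (|tilt| >= 45 deg)
    the horizontal circumference shrinks, so a wider pan range
    (45 degrees here, an assumed value) is used per area."""
    return 45.0 if abs(tilt_angle_deg) >= 45.0 else 22.5

def area_index(pan_deg: float, tilt_deg: float):
    """Map a (pan, tilt) direction to its (pan_index, tilt_index) area."""
    width = pan_division_width(tilt_deg)
    return int(pan_deg % 360.0 // width), int((tilt_deg + 90.0) // 22.5)
```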
- FIGS. 13C and 13D show an example of area division within the shooting angle of view.
- An axis 1301 is the direction of the imaging apparatus 101 at the time of initialization, and area division is performed with this direction angle as a reference position.
- Reference numeral 1302 denotes the angle of view area of the image being captured, and an example of the image at that time is shown in FIG. 13D. In the image projected at the angle of view, the image is divided as shown by 1303 to 1318 in FIG. 13D based on the area division.
- Next, for each area obtained by the division, an importance level indicating the search priority is calculated according to the subjects present in the area and the scene situation of the area.
- The importance level based on the condition of the subject is calculated from, for example, the number of persons present in the area, the size of each person's face, the face direction, the certainty of face detection, the facial expression, and the personal authentication result of the person.
- The importance level according to the situation of the scene is based on, for example, the general object recognition result, the scene discrimination result (blue sky, backlight, sunset scene, etc.), the sound level and voice recognition result from the direction of the area, motion detection information in the area, and the like.
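A hypothetical scoring function combining the subject and scene cues above (all weights are assumptions; in the device they are adapted by the learning processing):

```python
def area_importance(num_faces: int, face_size: float, face_confidence: float,
                    scene_score: float, sound_level: float,
                    motion_detected: bool) -> float:
    """Illustrative importance level for one search area, combining
    subject cues (people found, face size/confidence, both 0..1) with
    scene cues (scene discrimination score, sound level from the
    area's direction, motion detection)."""
    subject = num_faces * (0.5 * face_size + 0.5 * face_confidence)
    scene = scene_score + sound_level + (1.0 if motion_detected else 0.0)
    return subject + scene
```

The pan/tilt search target would then be the area with the highest score.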
- The vibration state of the imaging device is detected in the state determination (S903) of the imaging device, and the importance level can be changed according to the vibration state. For example, when the "stationary shooting state" is determined, a subject search is performed centering on a high-priority subject registered in face authentication (for example, the user of the imaging apparatus), and when that face is detected, the importance level is determined to be high.
- The automatic shooting described later is also performed with priority given to that face; thus, even if the user of the imaging apparatus usually wears the apparatus and spends much time shooting while carrying it, removing the apparatus and placing it on a desk makes it possible to leave many images in which the user appears.
- The importance level is also changed according to past shooting information. Specifically, the importance level may be lowered for an area continuously designated as the search area for a predetermined time, or lowered for a predetermined time for an area photographed in S910 described later.
- In step S905, pan and tilt driving is performed. Specifically, the pan/tilt drive amount is calculated by adding the drive amount at the control sampling rate based on the image shake correction amount and the pan/tilt search target angle, and the lens barrel rotation drive unit 205 drives and controls the tilt rotation unit 104 and the pan rotation unit 105.
- In step S906, the zoom unit 201 is controlled to perform zoom driving. Specifically, the zoom is driven according to the state of the search target subject determined in S904. For example, when the search target subject is the face of a person and the face on the image is too small, it cannot be detected because it is smaller than the minimum detectable size, and there is a risk of losing sight of it. In such a case, control is performed to increase the size of the face on the image by zooming to the telephoto side. On the other hand, when the face on the image is too large, movement of the subject or of the imaging apparatus itself easily puts the subject out of the angle of view. In such a case, control is performed to reduce the size of the face on the screen by zooming to the wide-angle side. By performing zoom control in this manner, a state suitable for tracking the subject can be maintained.
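The zoom control described here amounts to simple hysteresis on the detected face size; the pixel thresholds below are illustrative assumptions:

```python
def zoom_step(face_size_px: int, min_size: int = 40, max_size: int = 120) -> int:
    """Return +1 to zoom toward telephoto when the detected face is too
    small to track reliably, -1 to zoom toward wide angle when it is so
    large the subject easily leaves the frame, and 0 otherwise."""
    if face_size_px < min_size:
        return 1   # tele: enlarge the face on the image
    if face_size_px > max_size:
        return -1  # wide: shrink the face on the screen
    return 0
```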
- the subject search may be performed with an imaging system that captures an image in all directions at once using a plurality of wide-angle lenses.
- When image processing such as subject detection is performed on the entire image obtained by such omnidirectional shooting, a vast amount of processing is required. Therefore, a part of the image is cut out, and the subject search processing is performed within the cut-out image range. As in the method described above, the importance level for each area is calculated, the cut-out position is changed based on the importance level, and the automatic shooting determination described later is performed. This makes it possible to reduce power consumption and perform high-speed subject search by image processing.
- In step S907, it is determined whether or not there is a (manual) shooting instruction from the user; if there is a shooting instruction, the process advances to step S910. The manual shooting instruction may be a press of the shutter button provided on the imaging apparatus 101 or of the shutter button provided on the hand-held operation attachment 5001. Alternatively, tapping the housing of the imaging device with a finger or the like (tap), voice command input, an instruction from an external device, or the like may be used.
- The shooting instruction by tap operation is a method in which, when the user taps the housing of the imaging device, the device shake detection unit 209 detects a short burst of continuous high-frequency acceleration and uses it as a trigger for shooting.
- Voice command input is a shooting instruction method in which, when the user utters a predetermined phrase instructing shooting (for example, "take a photo"), the voice processing unit 214 recognizes the voice and uses it as a trigger for shooting.
- The instruction from an external device is, for example, a shooting instruction method that uses as a trigger a shutter instruction signal transmitted via a dedicated application from a smartphone or the like connected to the imaging device by Bluetooth.
- In step S908, automatic shooting determination is performed.
- In the automatic shooting determination, it is determined whether automatic shooting is to be performed, and the shooting method (still image shooting, still image continuous shooting, moving image shooting, panorama shooting, time-lapse shooting, etc.) is decided.
- For example, a scene may be considered in which the user wears the imaging device and shoots while holding it out slightly forward.
- In such a case, the user would usually shoot an ordinary still image.
- By presenting the image obtained by panorama shooting to the user, the user learns of such a shooting method and can make use of it in future shooting.
- a scene is determined by, for example, detecting the moving distance of the imaging device from the holding state to shooting.
- the preferred imaging method may differ depending on the manner of holding the imaging device, so that it is possible to switch the imaging method according to the state of the imaging device at the time of imaging.
- In the case of a scene in which the subject is slightly above the user and the user aims the device upward to shoot it, the user would usually shoot a still image. Therefore, in order to make these determinations, for example, the subject distance is detected, which makes it possible to discriminate the scene.
- the imaging method can be switched according to the state of the imaging device at the time of imaging and the state of the viewed object.
- The user switches the imaging device, which had been hanging from the neck, to a hand-held grip and shoots while pointing it upward.
- This may be, for example, a scene where a high-rise building is photographed at a sightseeing spot.
- By presenting the user with the image obtained by vertical panoramic shooting, the user learns of such a shooting method and can make use of it in future shooting.
- To discriminate this scene, the holding angle is detected. At this time, by further determining, as the state of the subject, for example, the distance to the subject and the distances to the regions above, below, left and right of the subject, it is also possible to improve the accuracy of discriminating which of vertical panorama and horizontal panorama is preferable. That is, if the subject and the regions above and below it are at the same distance, it can be determined that panoramic shooting in the vertical direction should be performed. In addition, for 360-degree shooting, the user may switch the imaging device from hanging from the neck to hand-held and shoot while holding it directly above.
- This may be, for example, a scene in which an image is taken to look around at the top of a mountain.
- Normally, the user sets the 360-degree shooting mode using an external device and then issues a shooting instruction. Therefore, when the user operates the external device in such a case, presenting a UI that prompts the transition to 360-degree shooting can reduce the effort of the user's operation. Furthermore, after repeating this several times, the user can be expected to want to take a 360-degree picture just by holding the device up and pressing the shutter button, without operating the external device. Therefore, in such a case, for example, when the moving direction of the imaging apparatus from the holding state to shooting is upward, 360-degree shooting is performed, which reduces the user's effort for shooting.
- As in <Determination of whether to perform automatic shooting> described above, the shooting method can also be determined by judgment based on a neural network. In this determination processing, the determination condition can also be changed for each user by the learning processing described later. In that case, in the initial stage of learning, a plurality of images are recorded by a plurality of shooting methods, and in the learning processing described later, the determination condition is changed according to which shooting method the user preferred.
- Such determination processing can also be applied to the automatic shooting determination in S908 when there is no manual shooting instruction. That is, when it is determined that the user is holding up the imaging device, detecting how the device is held up similarly makes it possible to determine a shooting method reflecting the user's intention.
- (1) Determination of whether to perform automatic shooting. This determination is performed based on the following two results. First, based on the importance level for each area obtained in S904, when the importance level exceeds a predetermined value, it is determined that automatic shooting is to be performed. Second, determination based on a neural network is used.
- An example of a multi-layer perceptron network is shown in FIG. 12 as an example of a neural network.
- A neural network is used to predict an output value from input values; by learning in advance from input values and exemplar output values for those inputs, it can estimate, for a new input value, an output value that conforms to the learned exemplars.
- 1201 and its vertically arranged circles are neurons in the input layer
- 1203 and its vertically arranged circles are neurons in the intermediate layer
- 1204 are neurons in the output layer.
- Arrows such as 1202 indicate connections connecting each neuron.
- The characteristics of the subject given as inputs include the current zoom magnification, the general object recognition result at the current angle of view, the face detection result, the number of faces in the current angle of view, the degree of smile and eye closure of each face, the face angle, the face authentication ID number, the gaze angle of the subject person, the scene discrimination result, the detection result of a specific composition, and the like. Also used are the elapsed time since the previous shooting, the current time, GPS position information, the amount of change from the previous shooting position, the current voice level, whether a person is speaking, and whether applause or cheering has arisen.
- vibration information (acceleration information, state of the imaging device)
- environment information (temperature, atmospheric pressure, illuminance, humidity, amount of ultraviolet light)
- notification information (user's exercise information, arm action information, biological information such as heartbeat, etc.)
- Each feature is converted into a numerical value in a predetermined range and given to a neuron of the input layer as a feature amount. Therefore, the input layer requires as many neurons as the number of feature quantities used.
- In the judgment based on this neural network, the output value changes when the connection weights between neurons are changed by the learning processing described later, and the result of the judgment can thus be adapted to the learning result.
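A minimal forward pass for a multi-layer perceptron like the one in FIG. 12 might look like the following sketch (one hidden layer, sigmoid activations; the actual topology and the connection weights come from the learning processing and are not specified in the text):

```python
import math

def mlp_forward(features, w_hidden, w_out):
    """Forward pass of a small multi-layer perceptron: one hidden layer
    with sigmoid activations and a single output neuron.  w_hidden is a
    list of per-hidden-neuron weight vectors (each ending with a bias
    term), w_out the output neuron's weights (ending with its bias)."""
    sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
    hidden = [sigmoid(sum(w * f for w, f in zip(ws, features)) + ws[-1])
              for ws in w_hidden]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)) + w_out[-1])
```

The returned value in (0, 1) could be compared with a threshold to decide whether automatic shooting fires; changing the weights changes the output for the same features, which is how learning adapts the judgment.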
- the frequency of photographing is set to be increased.
- It is determined which of still image shooting, moving image shooting, continuous shooting, panorama shooting, and the like is to be executed, based on the state of the imaging apparatus and of the surrounding subjects detected in S901 to S904. For example, when the subject (person) is stationary, still image shooting is executed, and when the subject is moving, moving image shooting or continuous shooting is executed.
- A panoramic shooting process may also be performed in which images taken sequentially while driving pan and tilt are combined to generate a panoramic image.
- The shooting method can also be determined by judging, based on a neural network, the various information detected before shooting. The determination conditions can be changed by the learning processing described later.
- In step S909, if it was determined by the automatic shooting determination in step S908 that shooting is to be performed, the process advances to step S910; otherwise, the process advances to the end of the shooting mode processing.
- Before shooting, the imaging device may perform notification processing to inform the person to be photographed that shooting will take place.
- The method of notification may use, for example, voice from the audio output unit 218, LED lighting by the LED control unit 224, or a motion operation that visually guides the subject's line of sight by driving pan/tilt.
- The predetermined conditions are, for example, the number of faces within the angle of view, the degree of smile/eye closure of each face, the gaze angle or face angle of the subject person, the face authentication ID number, the number of persons registered for personal authentication, and the like.
- the information includes vibration information (acceleration information, state of imaging device), environment information (temperature, atmospheric pressure, illuminance, humidity, amount of ultraviolet light), and the like.
- the method and timing of the notification can also be determined by judgment based on neural network based on information of the shot image or various information detected before shooting. Further, in this determination process, the determination condition can be changed by a learning process described later.
- In step S911, editing processing such as processing the image generated in step S910 or adding it to a moving image is performed. Specifically, the image processing includes trimming based on the face of a person or the in-focus position, image rotation, HDR (high dynamic range) effect, blur effect, color conversion filter effect, and the like. A plurality of processed images may be generated from the image generated in S910 by combinations of the above processes and stored separately from the image generated in S910. As for moving image processing, a captured moving image or still image may be added to an already generated edited moving image while applying special effect processing such as slide, zoom, and fade. In the editing of S911 as well, the image processing method can be determined by judgment based on a neural network using information of the photographed image or various information detected before shooting, and the determination condition of this judgment processing can be changed by the learning processing described later.
- In step S912, learning information generation processing for the photographed image is performed.
- the information used for the learning process described later is generated and recorded.
- Specifically, the information includes the zoom magnification at the time of shooting, the general object recognition result at the time of shooting, the face detection result, the number of faces in the shot image, the degree of smile/eye closure of each face, the face angle, the face authentication ID number, the gaze angle of the subject person, and the like.
- These pieces of information are generated and recorded as tag information in the photographed image file.
- the information may be written to the non-volatile memory 216 or may be stored in the recording medium 221 as so-called catalog data in the form of a list of information of each photographed image.
- In step S913, the past shooting information is updated. Specifically, of the counts described in the explanation of S908 (the number of shots for each area, the number of shots for each person registered for personal authentication, the number of shots for each subject recognized by general object recognition, and the number of shots for each scene determined by scene discrimination), the count corresponding to the image shot this time is incremented by one.
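The counters updated in step S913 could be maintained as in the following sketch; the class and field names are hypothetical:

```python
from collections import defaultdict

class PastShootingInfo:
    """Counters updated in step S913: shots per search area, per
    personally-authenticated person, per recognised object category,
    and per determined scene.  Each counter matching the image just
    taken is incremented by one."""
    def __init__(self):
        self.per_area = defaultdict(int)
        self.per_person = defaultdict(int)
        self.per_object = defaultdict(int)
        self.per_scene = defaultdict(int)

    def update(self, area, person_ids, objects, scene):
        self.per_area[area] += 1
        for pid in person_ids:
            self.per_person[pid] += 1
        for obj in objects:
            self.per_object[obj] += 1
        self.per_scene[scene] += 1
```

The importance-level adjustment of S904 (lowering priority for recently photographed areas) would read these counters back.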
- the user (manual) imaging instruction also includes an instruction by voice command input.
- the voice command input includes voice command input (for example, "take my photo") when the user wants to perform shooting including himself. Then, in a search process using pan / tilt or zoom, a subject uttering a voice is searched, and photographing including a subject uttering a voice command is performed within a photographing angle of view.
- FIG. 24 shows the processing determined within the processing of S907 in FIG. 9.
- In the manual shooting instruction processing of step S907, it is determined whether or not shooting by voice command input has been instructed.
- In S2401, the voice processing unit 214 determines whether a specific voice command input (for example, "take my photo") is detected. If no voice command is detected, the process proceeds to S2416, and the voice manual shooting determination process ends without a manual shooting determination being made. If the voice command is detected in S2401, the process proceeds to S2402.
- Candidates for the first sound direction, the second sound direction, and further the third and fourth sound directions are calculated in descending order of the reliability of sound direction detection.
- If the accuracy of the sound direction detection is very high, it is not necessary to calculate a plurality of candidates or to perform the subsequent search processing.
- However, the detected sound direction may contain error due to ambient noise at the time of voice command detection or the influence of the surrounding environment such as sound reflection, so a plurality of candidates are calculated.
- the time for uttering a pre-registered voice command can be predicted to some extent (for example, when “take my photo” is a command, the time required to utter the command is set as a parameter) .
- From all the sound direction detection values detected within this predetermined time, histogram processing as shown in FIG. 25 is performed, and the first peak 2501 is set as the first sound direction and the second peak 2502 as the second sound direction. After the first and second sound directions are calculated, the process proceeds to S2403.
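The peak extraction could be sketched as below; the 10-degree bin width is an assumption, and ties between equally tall histogram peaks follow `Counter.most_common` ordering:

```python
from collections import Counter

def top_two_sound_directions(direction_samples, bin_width=10):
    """From raw sound-direction detections (degrees) accumulated over
    the expected duration of the voice command, build a histogram and
    return the centres of the two tallest peaks (cf. FIG. 25) as the
    first and second sound-direction candidates."""
    bins = Counter(int(d // bin_width) for d in direction_samples)
    ranked = [b * bin_width + bin_width / 2
              for b, _ in bins.most_common(2)]
    return ranked[0], (ranked[1] if len(ranked) > 1 else None)
```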
- In step S2403, it is determined whether the pan/tilt retry setting has been made.
- The pan/tilt retry setting is made in S2415 described later; when the voice manual shooting determination process of this flow is first started, the pan/tilt retry setting has not been made.
- If pan/tilt retry is not set, the process advances to S2404, and the first sound direction calculated in S2402 is set as the sound direction. If it is determined in S2403 that pan/tilt retry is set, the process advances to S2405, and the second sound direction calculated in S2402 is set as the sound direction.
- the process proceeds to S2406.
- In step S2406, it is determined whether the difference between the set sound direction and the current pan/tilt angle is outside a predetermined range, that is, whether the difference between the sound direction and the current center of the angle of view is outside the predetermined range. If it is outside the predetermined range, the process advances to step S2407, pan/tilt driving is performed so that the detected sound direction comes to the center of the angle of view, and the process advances to step S2408. If the difference between the sound direction and the current center of the angle of view is within the predetermined range in S2406, the sound direction is already near the center of the angle of view, so the process proceeds to S2408 without driving pan/tilt.
- In step S2408, it is determined by image processing and analysis of the captured image whether the main subject is within the current angle of view. Specific determination methods are shown below.
- Main subject detection by a convolutional neural network. Main subject detection by a convolutional neural network is known as a general machine learning technique for image recognition processing. The convolutional neural network yields the presence or absence of the detected main subject (the subject who uttered the voice) and, if present, its position information on the image. Alternatively, the main subject may be estimated by applying a convolutional neural network to each image obtained by cutting out the region of each person based on face detection or human body detection results. This convolutional neural network is prepared in advance, trained on images of persons who uttered voice commands, but it can also continue learning during use by a method described later.
- Immediately after uttering a voice command toward the imaging device 101, the subject is very likely to be facing the camera. Therefore, the determination may be made simply by applying weighting factors to the detection results of the face authentication ID number, facial expression, face angle, gaze direction, and gesture determination. If the face authentication ID is already registered, the person is highly likely to be the main subject. If the degree of smile of the facial expression is high, the person is likely to be the main subject. If the face angle or gaze direction points toward the camera, the person is highly likely to be the main subject. If the person is making a gesture (for example, waving a hand toward the camera), the person is likely to be the main subject. The main subject may be estimated using any one or more of these pieces of information.
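A simple weighted estimation along these lines could look as follows; all weights and the threshold are illustrative assumptions, and in the device they would be adapted by learning:

```python
def main_subject_score(registered_face: bool, smile_degree: float,
                       facing_camera: bool, gesturing: bool) -> float:
    """Illustrative weighted score for deciding whether a detected
    person is the main subject who uttered the voice command."""
    score = 0.0
    score += 0.4 if registered_face else 0.0   # face authentication ID known
    score += 0.2 * smile_degree                # facial expression (0..1)
    score += 0.3 if facing_camera else 0.0     # face angle / gaze toward camera
    score += 0.3 if gesturing else 0.0         # e.g. waving a hand at the camera
    return score

def is_main_subject(score: float, threshold: float = 0.5) -> bool:
    return score >= threshold
```

The person with the highest score above the threshold would be treated as the main subject for composition adjustment in S2410.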
- Any of these methods can be used to determine whether the main subject is within the current angle of view, or the determination may be made by combining (1) to (3).
- In step S2409, it is determined whether a main subject was found in the processing of step S2408. If it is determined in S2409 that there is a main subject, the process advances to S2410. In S2410, zoom and pan/tilt are driven to adjust the composition, and the flow proceeds to S2411.
- the determination of the composition suitable for capturing an image including the main subject may be made by a neural network. Further, by changing the connection weight between the neurons by learning processing described later, the output value changes, and the result of the determination can be adapted to the learning result.
- In step S2411, it is determined that a manual shooting instruction has been issued, and the process advances to step S2416 to end the voice manual shooting determination process. Then the process advances to step S910 in FIG. 9 to start shooting.
- In step S2412, it is determined whether a predetermined time has elapsed since pan/tilt driving was completed in S2407 (or, if it was determined in S2406 that the sound direction and the current center of the angle of view are within the predetermined range, since that determination). If the predetermined time has not elapsed, the process advances to step S2413 to perform a search by zoom.
- If the subject uttering the voice command is very small within the angle of view, the face is small and its resolution is low, which may affect the detection accuracy of the image analysis. In that case, the zoom is driven in the direction that narrows the angle of view, and the processing from S2408 is performed again.
- In the opposite case, the zoom is driven in the direction that widens the angle of view, and the processing from S2408 is performed again.
- The process then proceeds to S2416, and the voice manual shooting determination process ends without a manual shooting determination being made.
- a means for performing notification processing may be employed.
- sound from the audio output unit 218 or LED lighting light by the LED control unit 224 may be used.
- motion operation may be performed to visually guide the subject's line of sight by driving pan / tilt, or a method of communicating and notifying the smart device 301 or the wearable device 501 may be adopted.
- the voice command input may be a voice command input (for example, “register me”, “follow me”, etc.) when the user wants to register himself as a main subject.
- In the search process, a subject uttering the voice is searched for, and that subject is registered. After registration, automatic shooting is performed centering on the registered subject.
- shooting can be performed while always holding the subject registered within the angle of view by pan / tilt or zoom driving.
- At the time of registration, pan/tilt or zoom is driven so that face recognition registration (registering the face at suitable angles) and detection/registration of the color of the clothes being worn are easy to perform.
- the smart device 301 may be notified that the subject has been registered, or the image data of the registered subject may also be transmitted so that the user can confirm.
- In that case, the communication means 222 transmits data 2601 to the smart device 301 to make a notification 2602 indicating that the subject has been registered.
- The image data is also transmitted 2603 and displayed so that the registered subject 2604 can be confirmed on the smart device.
- A related image of the registered subject 2604 may be superimposed in or near (for example, below) the face frame to indicate that the imaging apparatus 101 is performing face authentication. It may be displayed during moving image shooting or during moving image playback.
- Shooting in the sound direction and subject registration by voice command input have been described using both pan/tilt and zoom driving, but shooting and subject registration can also be performed using only pan/tilt driving, or using only zoom driving.
- When only the zoom drive is used, after the sound direction is detected, the zoom is driven so that the sound direction falls within the angle of view, and the main subject is searched for by zoom driving to perform shooting and subject registration.
- In step 704 of FIG. 7, it is determined whether or not the automatic editing process (creation of a highlight moving image) is to be performed.
- If it is determined that automatic editing is to be performed, the automatic editing mode process of step 712 is performed.
- FIG. 10 shows the process flow of determining whether or not to shift to the automatic editing mode, which is determined in the mode setting determination process of step 704.
- In step 1001, the elapsed time TimeD since the previous automatic editing process was performed is acquired, and the process proceeds to step 1002.
- In step 1002, learning information, scores, and the like corresponding to each image captured after the time of the previous editing process are acquired, and the process proceeds to step 1003.
- In step 1003, an evaluation value DB for determining whether or not automatic editing should be performed is calculated from the data acquired in step 1002.
- The evaluation value is calculated, for example, by extracting image features from each piece of image information and increasing the value when many types of features are present. In addition, the score reflecting the user's preference described above is calculated for each image, and the evaluation value is also increased when there are many images with high scores. The evaluation value is also calculated to be high when the number of captured images is large. Thus, the evaluation value DB depends on the height of the scores, the number of images, and the types of features.
- In step 1004, a threshold value DA is calculated from TimeD. For example, the threshold DAa used when TimeD is smaller than a predetermined value is set larger than the threshold DAb used when TimeD is larger than the predetermined value, so that the threshold decreases with the passage of time.
- When step 1004 ends, the process proceeds to step 1005, and if the evaluation value DB is larger than the threshold value DA, the process proceeds to step 1006.
- In step 1006, since data to be automatically edited has been obtained since the last automatic editing, or since it is determined that automatic editing should be performed because a long time has elapsed, the automatic editing mode is set to TRUE and the automatic editing mode determination ends. If it is determined in step 1005 that the evaluation value DB is equal to or less than the threshold value DA, it is determined that the data to be automatically edited is not yet sufficient, the automatic editing mode is set to FALSE so that the automatic editing process is not performed, and the automatic editing mode determination ends.
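The determination of steps 1003 to 1005 can be sketched as follows; the way DB combines the scores, the image count, and the variety of features, and the exact time decay of DA, are illustrative stand-ins for the learned values, and all names and constants are hypothetical:

```python
def should_auto_edit(scores, feature_types, elapsed_hours,
                     base_threshold=100.0, decay=0.5):
    """Decide whether to enter the automatic editing mode.

    scores: per-image user-preference scores since the last edit.
    feature_types: set of distinct feature types seen in those images.
    elapsed_hours: time since the previous automatic editing (TimeD).
    """
    # Evaluation value DB: higher scores, more images, and more feature
    # variety all raise it (weights are illustrative).
    db = sum(scores) + 10 * len(feature_types) + 2 * len(scores)
    # Threshold DA decays as more time passes since the last edit.
    da = base_threshold / (1.0 + decay * elapsed_hours)
    return db > da
```

With no new material and no elapsed time the threshold is not met, while a modest amount of scored material after a day easily exceeds the decayed threshold.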
- Next, the processing in the automatic editing mode process (step 712) will be described.
- the detailed flow of the automatic editing mode process is shown in FIG.
- In step S1101, the first control unit 223 executes selection processing of the still images and moving images stored in the recording medium 221 to select the images to be used for editing, and the process proceeds to step S1102.
- In the image selection processing mentioned here, metadata such as the number of faces, the size of the faces, and the color groups in the captured still images and moving images is extracted for each image and converted into an evaluation value, and images whose evaluation value is equal to or greater than a predetermined threshold are listed.
- The selection ratio of still images to moving images is determined by learning described later, and selection is performed preferentially in consideration of the user's settings, the shooting frequency, and each setting.
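The metadata-to-evaluation-value selection above might look like the following sketch; the field names, weighting, and the handling of the still/moving ratio are assumptions for illustration, not the embodiment's values:

```python
def select_images(images, threshold=0.5, still_ratio=0.7, max_count=10):
    """Select editing candidates from per-image metadata.

    Each image is a dict with hypothetical keys: 'faces' (face count),
    'face_size' (0-1 relative size), 'is_still' (bool), and 'score'
    (user-preference score).
    """
    def evaluate(img):
        # Convert metadata into a single evaluation value (weights are
        # illustrative; learning would tune them).
        return (0.2 * min(img["faces"], 3) / 3
                + 0.3 * img["face_size"]
                + 0.5 * img["score"])

    ranked = sorted((i for i in images if evaluate(i) >= threshold),
                    key=evaluate, reverse=True)
    # Split the quota between stills and movies by the learned ratio.
    n_still = int(max_count * still_ratio)
    stills = [i for i in ranked if i["is_still"]][:n_still]
    movies = [i for i in ranked if not i["is_still"]][:max_count - n_still]
    return stills + movies
```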
- In step S1102, the first control unit 223 and the image processing unit 207 apply image effects to the images selected in step S1101, and the process advances to step S1103.
- The image effects applied here include, for still images, trimming processing centered on the person's face or the in-focus position, image rotation processing, HDR (high dynamic range) effects, blur effects, special effect processing such as slide, zoom, and fade, and color filter effects.
- Color filter effects are likewise applied to moving images.
- In step S1103, the first control unit 223 sets the image reproduction time, and the process advances to S1104.
- an appropriate image reproduction time is set based on learning described later.
- In step S1104, the first control unit 223 sets the music (BGM) to be added to the highlight moving image created in step S1105, and the process advances to step S1105.
- In the music (BGM) setting, the music most appropriate to provide to the user is set from the learning results described later.
- In step S1105, the first control unit 223 creates the highlight moving image using the results of the series of processes performed in S1101 to S1104.
- the created highlight moving image is stored in the recording medium 221.
- The selection of images, the application of image effects, and the selection of the reproduction time and BGM described above can be determined by judgment using a neural network based on the tag information added to each image (information on the captured image and various information detected before shooting). The judgment conditions of this determination processing can also be changed by the learning processing described later.
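The S1101 to S1105 flow can be summarized as a pipeline in which each stage is a learned or rule-based component; the callables and the returned structure below are purely illustrative:

```python
def create_highlight(images, select, apply_effects, play_time, pick_bgm):
    """Sketch of the S1101-S1105 flow: selection, effects, timing,
    BGM, then assembly. Each argument is a learned or rule-based
    callable supplied by the caller."""
    chosen = select(images)                          # S1101: selection
    edited = [apply_effects(img) for img in chosen]  # S1102: effects
    duration = play_time(edited)                     # S1103: play time
    bgm = pick_bgm(edited, duration)                 # S1104: BGM choice
    # S1105: assemble the highlight moving image from the results.
    return {"clips": edited, "duration": duration, "bgm": bgm}
```

Because each stage is a callable, the learned judgment conditions can be swapped in without changing the pipeline itself.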
- Automatic file deletion will be described next. If the recording medium runs out of free space, shooting cannot be performed, and the user may be unable to shoot an intended scene, or a target scene may fail to be captured in automatic shooting. Images can be deleted by user operation, but this is cumbersome. Therefore, captured images that meet certain conditions need to be deleted automatically by the processing described below. On the other hand, since there is a risk of deleting an image the user will need later, appropriate images must be selected and deleted.
- First, the free space of the recording medium is confirmed.
- Next, the target number of images to delete is determined according to the free space of the recording medium. For example, the target number of deleted images is increased as the free space decreases, and is also increased as the set shooting frequency increases. The target number may also be changed by learning described later.
- Subsequently, a list is created by sorting the captured images stored in the recording medium in descending order of the score, described later, that quantifies the user's preference for each image. From S2904, it is determined one by one, in order from the top of the sorted list, whether each image should be deleted, and the deletion processing is executed. In step S2905, it is determined whether the target image on the list meets the deletion conditions.
- The deletion conditions may include, for example, that the image is not an image manually shot by the user and that the image is not an image highly rated by the user. Since such images are liked by the user or may be needed later, it is desirable to exclude them from deletion.
- Further conditions may include that the image has already been transferred to an external communication device such as a smart device in the automatic transfer mode and that the user has not browsed the image from the external communication device. If the image has been transferred, the transferred copy can be used, so deleting it is unlikely to disadvantage the user. Also, deleting an automatically captured image that the user has never browsed is considered to cause no disadvantage, since the user is not aware of it. If the deletion conditions are satisfied, the process advances to step S2906 to delete the image and then proceeds to step S2907; if not, the process proceeds to S2907 without deleting the image. In S2907, it is determined whether the target number of deletions has been reached. If it has been reached, the automatic deletion mode processing ends. If not, the process returns to S2904 and is repeated sequentially for the next image on the list. If there is no target image left on the list in S2904, the processing ends.
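The deletion loop of S2904 to S2907 might be sketched as below; the flag names are hypothetical, and the lowest-score-first order follows the preferential deletion of low-score images described later:

```python
def auto_delete(images, target_count):
    """Return up to target_count images to delete, lowest score first.

    Skips manual shots and highly rated shots, and only deletes images
    that were transferred in the automatic transfer mode but never
    browsed by the user. All field names are hypothetical.
    """
    deletable = []
    for img in sorted(images, key=lambda i: i["score"]):
        if len(deletable) >= target_count:
            break  # target number of deletions reached (S2907)
        if img["manual"] or img["rated_high"]:
            continue  # images the user likely wants to keep
        if not img["transferred"] or img["browsed"]:
            continue  # keep untransferred or already-browsed images
        deletable.append(img)
    return deletable
```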
- Next, learning in accordance with the user's preference, performed by the learning processing unit 219, will be described.
- In the present embodiment, a neural network is used to predict an output value from input values; by learning actual input values and actual output values in advance, an output value can be estimated for a new input value.
- By using the neural network, learning according to the user's preference is performed for the automatic shooting, automatic editing, and subject search described above.
- Learning is also applied to subject registration (face recognition, general object recognition, etc.).
- the elements to be learned by the learning process are as follows.
- First, the learning for automatic shooting will be described.
- learning is performed to automatically take an image that suits the user's preference.
- In automatic shooting, the learning information generation process is performed after shooting (step S912).
- An image to be learned is selected by a method to be described later, and learning is performed by changing weights of the neural network based on learning information included in the image.
- The learning is performed by changing the neural network that determines the automatic shooting timing and the shooting method (still image shooting, moving image shooting, continuous shooting, panoramic shooting, etc.).
- Next, for automatic editing, an image to be learned is selected by a method to be described later, and learning is performed by changing the weights of the neural network based on learning information included in the image.
- Various detection information obtained at shooting or immediately before shooting is input to the neural network, which determines the application of image effects (trimming processing, rotation processing, HDR effect, blur effect, slide, zoom, fade, color conversion filter effects, BGM, time, and the still image/moving image ratio).
- Next, the learning for subject search will be described.
- learning is performed to automatically search for a subject matching the user's preference.
- In the subject search process (step S904), the importance level of each area is calculated, pan/tilt and zoom are driven, and the subject search is performed.
- The learning is performed based on the captured images and the detection information during the search, and is realized by changing the weights of the neural network.
- Various detection information during the search operation is input to the neural network, the importance level is calculated, and the pan/tilt angle is set based on the importance level to perform a subject search reflecting the learning.
- Besides the setting of the pan/tilt angle, learning is also performed for the pan/tilt driving speed, the acceleration, and the frequency of movement.
- For subject registration, learning is performed to automatically register and rank subjects according to the user's preference. As the learning, for example, face recognition registration, general object recognition registration, gesture and voice recognition registration, and scene recognition by sound are performed.
- In the authentication registration, authentication registration is performed for persons and objects, and ranks are set based on the number and frequency of image acquisitions, the number and frequency of manual shootings, and the frequency of appearance of the subject during search. The registered information is used as an input for the determinations made by each neural network.
- an image to be learned is selected by a method described later, and learning is performed by changing weights of the neural network based on learning information included in the image.
- Information on how the notification operation was performed immediately before shooting is embedded in the captured image, and the detection information added to the selected image and the notification operation information immediately before the shooting are learned as teacher data.
- Each piece of detection information immediately before shooting is input to the neural network, which determines whether or not to perform notification and each notification operation (sound (sound level / type of sound / timing), LED light (color, lighting time, blink interval), and pan/tilt motion (motion, driving speed)).
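Selecting one of several prepared notification methods from learned scores might look like the following; the feature vector, the weight table, and the linear scoring are illustrative substitutes for the neural network:

```python
def choose_notification(features, weights,
                        methods=("sound", "led", "pan_tilt")):
    """Pick a notification action from pre-prepared methods.

    features: detection values just before shooting (e.g. subject
    distance, ambient noise). weights: one learned weight vector per
    method. A fixed 'none' baseline of 0 represents not notifying.
    All names are illustrative.
    """
    def score(w):
        # Linear stand-in for the network's per-method output.
        return sum(f * wi for f, wi in zip(features, w))

    options = {"none": 0.0}
    for m in methods:
        options[m] = score(weights[m])
    return max(options, key=options.get)  # highest-scoring action wins
```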
- The learning of each notification operation may also be performed by selecting which notification method to use from among notification methods prepared in advance (combinations of sound, LED light, and pan/tilt motion).
- Separate neural networks may also be provided for the notification operations of sound, LED light, and pan/tilt motion to learn each operation.
- For tap detection, the predetermined time TimeA and the predetermined threshold ThreshA are changed by learning. Provisional tap detection is also performed with the above-described tap detection threshold lowered, and the parameters TimeA and ThreshA are set so that tap detection becomes easier depending on whether provisional tap detection was determined before the actual tap detection. Also, if it is determined from the detection information after tap detection that the tap was not an activation factor (there is no shooting target as a result of the subject search or the automatic shooting determination described above), the parameters TimeA and ThreshA are changed so that detection becomes more difficult. The determination of whether there is a shooting target at the time of activation varies with the subject detection information embedded in the images learned by the learning method described later.
- For shake state detection, the predetermined time TimeB, the predetermined threshold ThreshB, the predetermined count CountB, and the like are changed by learning.
- If the activation condition based on the shake state is met and activation is performed, but it is determined from the detection information for a predetermined time after activation that it was not an activation factor (there is no shooting target as a result of the subject search and the automatic shooting determination described above), the parameters of the shake state determination are changed, and learning is performed so that activation becomes more difficult.
- Conversely, if the shooting frequency in a state of large shaking is high, the parameters are set so that activation by the shake state determination becomes easier.
- The determination of whether there is a shooting target at the time of activation, and the determination of whether the shooting frequency is high in a state of large shaking, vary with the subject detection information embedded in the images learned by the learning method described later and with the shake information at the time of shooting.
- [Environmental information detection] Learning can be performed by the user manually setting, for example through communication with the dedicated application of the external device 301, the conditions of environmental information change on which the apparatus should activate.
- Activation can be performed according to specific conditions on the absolute amount or the amount of change of temperature, atmospheric pressure, brightness, humidity, and ultraviolet light.
- Learning based on each piece of environmental information is also possible. For example, if many images shot at times of temperature rise are learned, learning is performed so that the apparatus is driven more easily when the temperature rises.
- The above parameters also change depending on the remaining battery capacity. For example, when the remaining battery level is low, it becomes difficult to enter the various determinations, and when the remaining battery level is high, it becomes easy to enter them.
- Specifically, for the shake state detection result and the sound scene detection of sound detection, which are conditions under which the user does not necessarily want the imaging apparatus to activate, the ease of each detection determination changes according to the remaining battery level.
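The battery-dependent ease of detection could be realized by simply scaling each wake-up detection threshold by the remaining battery level, as in this illustrative sketch (the scaling factors are assumptions, not values from the embodiment):

```python
def detection_threshold(base_threshold, battery_level):
    """Scale a wake-up detection threshold by remaining battery.

    battery_level is in [0, 1]. With a full battery the threshold is
    lowered (detection, and thus activation, becomes easier); when the
    battery is nearly empty it is raised (activation becomes harder).
    """
    factor = 1.5 - battery_level  # 1.5x at empty, 0.5x at full
    return base_threshold * factor
```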
- When the result of the mode setting determination 704 is none of "automatic shooting mode", "automatic editing mode", "automatic image transfer mode", "learning mode", and "automatic file deletion mode", the apparatus enters the low power consumption mode.
- the determination conditions of each mode are as described above, but the conditions under which each mode is determined also change by learning.
- In the automatic shooting mode, as described above, the importance level of each area is determined, and automatic shooting is performed while searching for a subject with pan and tilt.
- Since the importance level of each area is calculated based on the number and size of subjects such as persons and objects in the area, the importance levels of all areas are low in a situation where there are no subjects in the surroundings.
- Therefore, the automatic shooting mode may be canceled based on a condition such as whether the importance levels of all areas, or a value obtained by adding up the importance levels of the areas, is equal to or less than a predetermined threshold.
- At this time, the predetermined threshold may be changed according to the elapsed time since the transition to the automatic shooting mode, so that the transition to the low power consumption mode is facilitated as the elapsed time increases. Further, by changing the predetermined threshold according to the remaining battery capacity, low power consumption mode control that takes battery endurance into consideration can be performed.
- For example, when the remaining battery level is low the threshold is decreased, and when the remaining battery level is high the threshold is increased.
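One way to realize this threshold control is sketched below, assuming the condition stated above (summed importance at or below the threshold cancels the automatic shooting mode). The time term raises the threshold so that the low power transition becomes easier as the elapsed time grows, while the battery term follows the text (a lower remaining battery lowers the threshold); the gains are illustrative:

```python
def low_power_threshold(base, minutes_in_auto, battery_level,
                        time_gain=0.01):
    """Threshold compared against summed area importance.

    Summed importance <= threshold cancels the automatic shooting mode
    and allows the low power consumption mode. battery_level in [0, 1].
    """
    time_term = 1.0 + time_gain * minutes_in_auto  # grows with time
    battery_term = 0.5 + battery_level             # 0.5x empty, 1.5x full
    return base * time_term * battery_term
```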
- Based on these, the parameter of the next low power consumption mode cancellation condition (the elapsed time threshold TimeC) is set in the Sub processor, based on the elapsed time since the previous transition to the automatic shooting mode and on the number of images shot.
- The learning of the low power consumption mode conditions can be performed, for example, by manually setting the shooting frequency, the activation frequency, and the like through communication with the dedicated application of the external device 301. An average value of the elapsed time from power-on to power-off of the imaging apparatus 101, or distribution data of that time for each time zone, may also be accumulated to learn each parameter. In this case, for a user whose time from power-on to power-off is short, the time interval for returning from the low power consumption mode and for transitioning to the low power consumption state is learned to become short, and for a user whose time from power-on to power-off is long, it is learned to become long. Learning is also performed based on the detection information during search: while it is determined that there are many subjects set as important by the learning, the time interval for returning from the low power consumption mode and for transitioning to the low power consumption state becomes short, and while there are few important subjects, it is learned to become long.
- Next, the learning for automatic file deletion will be described.
- For automatic file deletion, learning is performed regarding the free space of files and the selection of images to be deleted preferentially.
- An image to be learned is selected by a method described later, and learning can be performed by changing weights of the neural network based on learning information included in the image.
- As described above, a score reflecting the user's preference is calculated for each image, and images with low scores are preferentially deleted from the recording medium 221.
- Learning is performed based not only on the score but also on the shooting date and time embedded in each image in the recording medium 221 and on the editing content of the highlight moving image (automatically edited moving image) selected by the method described later.
- For example, if the acquired highlight moving image contains many images shot at short time intervals, files with older shooting dates and times are preferentially deleted; but if it also contains images shot at long time intervals, files with high scores are learned not to be deleted even if their dates and times are old.
- Alternatively, the score of each image in the recording medium 221 is sequentially recalculated at predetermined time intervals.
- The shooting date and time information is also input to the neural network at the time of score calculation, and when there are many images shot at short time intervals, files with older shooting dates and times are learned to have lower scores, so that they are preferentially deleted. When images shot at long time intervals are included, learning is performed so that the score is not lowered even for old dates and times, and files with high scores are thus learned not to be deleted even if they are old.
- In addition, although an image to be learned is selected by the method described later, when the dates and times of the selected images are concentrated on relatively new ones, files with older shooting dates and times are preferentially deleted. However, when images with older dates and times are also often selected, learning is performed so that files with high scores are not deleted even if their dates and times are old.
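The date-and-interval-aware scoring described above might be sketched as a recomputed keep-score, where a smaller value means earlier deletion; the weights and field names are illustrative:

```python
import datetime


def deletion_priority(img, now, neighbors_within_minutes,
                      age_weight=0.1, burst_weight=0.5):
    """Recompute a keep-score for automatic deletion.

    Old images lose score faster when they were shot in a burst of many
    near-simultaneous frames; isolated shots keep their score even when
    old. img has hypothetical keys 'score' and 'shot_at';
    neighbors_within_minutes counts frames shot close in time.
    """
    age_days = (now - img["shot_at"]).days
    # Penalize redundant burst shots; a lone shot gets no penalty.
    burst_penalty = burst_weight * max(neighbors_within_minutes - 1, 0)
    return img["score"] - age_weight * age_days - burst_penalty
```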
- If learning has been performed so that the shooting frequency becomes high, files are automatically deleted so as to secure a large amount of free space; if learning has been performed so that the shooting frequency becomes low, automatic file deletion is performed so that less free space is secured. Likewise, if learning has been performed so that the moving image shooting frequency becomes high, files are automatically deleted so as to secure a large amount of free space, and if learning has been performed so that the still image shooting frequency becomes high, automatic file deletion is performed so that less free space is secured.
- Next, the learning for image shake correction will be described.
- The image shake correction is performed by calculating the correction amount in S902 of FIG. 9 and, based on the correction amount, driving the pan and tilt in S905.
- In the image shake correction, learning is performed so that the correction matches the characteristics of the user's shake. The direction and magnitude of blur in a captured image can be estimated by, for example, estimating its PSF (Point Spread Function).
- The estimated direction and magnitude of the blur are added to the image as information.
- In the learning mode processing of step 716 in FIG. 7, the weights of the neural network for shake correction are learned with the estimated direction and magnitude of the blur as the output and with each piece of detection information at the time of shooting as the input.
- The detection information at the time of shooting includes motion vector information of the image during a predetermined time before shooting, motion information of the detected subject (person or object), and vibration information (gyro output, acceleration output, state of the imaging apparatus).
- Environmental information (temperature, barometric pressure, illuminance, humidity), sound information (sound scene determination, specific voice detection, sound level change), time information (elapsed time from startup, elapsed time from the previous shooting), and location information (GPS position information, amount of positional change) may also be added as inputs.
- Further, the pan/tilt driving speed for shooting the subject without blur can be estimated from the detection information before shooting, so that subject blur correction may be performed.
- Here, panning shooting refers to shooting in which a moving subject is captured without blur while the stationary background flows.
- In that case, the drive speed during still image shooting is estimated by inputting the above detection information into the neural network.
- By dividing the image into blocks and estimating the PSF of each block, the direction and magnitude of blur in the block where the main subject is located can be estimated, and learning can be performed based on that information.
- Next, the learning for automatic image transfer will be described.
- For automatic image transfer, learning is performed on the processing for selecting the images to be preferentially transferred from among the images recorded in the recording medium 221, and on the transfer frequency.
- An image to be learned is selected by a method described later, and learning can be performed by changing weights of the neural network based on learning information included in the image.
- As described above, a score reflecting the user's preference is calculated for each image, and images with high scores are preferentially transferred.
- The learning information corresponding to images transferred in the past is also used for the image transfer determination.
- When an image to be learned is selected by a method to be described later, which parts of the learning information (feature amounts) included in the image are regarded as important is set; if many of the images transferred in the past contain similar feature amounts, the apparatus is set to transfer an image that contains other feature amounts and has a high score.
- The image transfer frequency also changes according to each state of the imaging apparatus. It changes with the remaining battery capacity: for example, when the remaining battery level is low, image transfer is made less likely, and when the remaining battery level is high, image transfer is set to occur more easily. Specifically, for example, the elapsed time since the previous automatic transfer is multiplied by the highest score among the images shot during that elapsed time, and an image is transferred when the product exceeds a threshold.
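The transfer trigger just described (elapsed time multiplied by the best score, compared against a threshold that varies with the battery level) can be written directly; the threshold value and the battery scaling are illustrative:

```python
def should_transfer(hours_since_last, best_score, battery_level,
                    base_threshold=1000.0):
    """Trigger an automatic image transfer.

    Transfer when the elapsed time since the last transfer, multiplied
    by the highest score among images shot since then, exceeds a
    threshold. A low battery raises the threshold so that transfers
    happen less often; battery_level is in [0, 1].
    """
    threshold = base_threshold * (2.0 - battery_level)  # 2x when empty
    return hours_since_last * best_score > threshold
```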
- the frequency of automatic image transfer is changed according to the set imaging frequency of the imaging apparatus 101.
- When learning has been performed so that the shooting frequency increases, the frequency of automatic image transfer is also set to increase, and when learning has been performed so that the shooting frequency decreases, the frequency of automatic image transfer is also set to decrease. At this time, by changing the threshold according to the shooting frequency, the image transfer frequency corresponding to the shooting frequency setting can be changed.
- The frequency of automatic image transfer is also changed according to the free space of the recording medium 221.
- When the free space is large, the frequency of automatic image transfer is set low, and when the free space is small, the frequency of automatic image transfer is set high. At this time, by changing the above-mentioned threshold according to the free space, the image transfer frequency corresponding to the free space can be changed.
- As described above, the imaging apparatus 101 can perform two types of shooting: manual shooting and automatic shooting.
- If manual shooting is performed, information indicating that the captured image is a manually shot image is added in step S912. If it is determined in step S909 that automatic shooting is ON and shooting is performed, information indicating that the captured image is an automatically shot image is added in step S912.
- In the case of manual shooting, learning is performed regarding the extraction of feature amounts from the captured image, the registration of personal authentication, the registration of individual facial expressions, and the registration of combinations of persons.
- In addition, learning is performed to change the importance of nearby persons or objects based on the expressions of the individually registered subjects.
- Also, when the angle of view is instructed by "the user rotating the pan/tilt by hand", which will be described later with reference to FIGS. 17 to 22, subjects present within the angle of view after the rotation are learned. This is also part of the learning based on manual operation detection information.
- For example, when the proportion of time during which person A, a subject registered for personal authentication, appears simultaneously with person B, another subject registered for personal authentication, is higher than a predetermined threshold, the importance is judged to be high. Therefore, when person A and person B fall within the angle of view, various kinds of detection information are stored as learning data so as to increase the score of the automatic shooting determination, and learning is performed in the learning mode processing 716.
- The imaging apparatus 101 and the external device 301 have communication means for the communications 302 and 303.
- Image transmission and reception are mainly performed via the communication 302, and the images in the imaging apparatus 101 can be acquired by the external device 301 through communication via a dedicated application in the external device 301.
- Thumbnail images of the image data stored in the imaging apparatus 101 can be browsed through the dedicated application in the external device 301. The user can thus select an image that he or she likes from the thumbnail images, confirm the image, and operate an image acquisition instruction to acquire the image on the external device 301.
- Since the user has chosen and acquired the image, the acquired image is very likely to be an image the user prefers. Therefore, various kinds of learning of the user's preference can be performed by determining that the acquired image is an image to be learned and learning based on the learning information of the acquired image.
- An example in which images in the imaging apparatus 101 are browsed through the dedicated application of the external device 301, which is a smart device, is shown in the figure.
- The display unit 407 displays thumbnail images (1604 to 1609) of the image data stored in the imaging apparatus, and the user can select an image that he or she likes and acquire it.
- At this time, display method change units (1601, 1602, 1603) for changing the display method are provided.
- When 1601 is pressed, the display order is changed to the date-and-time priority display mode, and the images are displayed on the display unit 407 in order of the shooting date and time of the images in the imaging apparatus 101 (for example, 1604 is displayed as having a newer date and 1609 as having an older date).
- When 1602 is pressed, the mode is changed to the recommended image priority display mode.
- In this mode, based on the score determining the user's preference for each image, calculated in step S912 of FIG. 9, the images are displayed on the display unit 407 in descending order of score (for example, 1604 is displayed as having a high score and 1609 as having a low score).
- As described above, the imaging apparatus 101 and the external device 301 have communication means, and the images stored in the imaging apparatus 101 can be browsed through the dedicated application in the external device 301.
- Here, a configuration may be adopted in which the user gives a score to each image.
- The user can give a high score (for example, 5 points) to an image that he or she likes, and a low score (for example, 1 point) to an image that he or she does not like.
- the score of each image is used for relearning along with learning information in the imaging device.
- The learning is performed so that the output of the neural network, when the feature data of the designated image information is input, approaches the score designated by the user.
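Driving the network output toward the user's rating can be illustrated with a linear stand-in trained by gradient descent; a real implementation would update the neural network's weights, and every name and constant here is illustrative:

```python
def train_to_user_scores(weights, samples, lr=0.1, epochs=100):
    """Fit a linear stand-in for the preference network so that its
    output approaches the user's rating (e.g. 1-5) for each image's
    feature vector.

    samples: list of (feature_vector, user_rating) pairs.
    """
    for _ in range(epochs):
        for features, rating in samples:
            pred = sum(w * f for w, f in zip(weights, features))
            err = pred - rating
            # Gradient step on the squared error: w -= lr * err * f
            weights = [w - lr * err * f
                       for w, f in zip(weights, features)]
    return weights
```

After training, feeding the feature data of a rated image produces an output close to the score the user designated.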
- In the present embodiment, the configuration is such that the user inputs a determination value for a captured image via the external device 301, but the imaging apparatus 101 may be operated so that the determination value is input to the image directly.
- In that case, for example, the imaging apparatus 101 is provided with a touch panel display, and the user presses a GUI button displayed on the touch panel display screen display unit to set a mode for displaying captured images. Then, the user can perform similar learning by inputting a determination value to each image while confirming the captured images.
- the external device 301 has a storage unit 404, and images other than those captured by the imaging device 101 are also stored in the storage unit 404. Because the user can easily browse the images stored in the external device 301 and easily upload them to a shared server via the public line control unit 406, they are very likely to include the user's favorite images.
- the external device 301 may be configured so that, through the dedicated application, the control unit 411 can process the images stored in the storage unit 404 with the same learning processing as the learning processing unit 219 in the imaging apparatus 101. In this case, learning can be performed by communicating the processed learning data to the imaging apparatus 101. Alternatively, images or data to be learned may be transmitted to the imaging device 101 so that learning is performed in the imaging device 101.
- the user may select an image to be learned from among the images stored in the storage unit 404 via a dedicated application, and learn.
- SNS (social networking service)
- the dedicated SNS application downloaded into the external device 301 can acquire images uploaded by the user as described above, together with information about those images. It can also obtain the user's favorite images and tag information when the user inputs whether he or she likes an image uploaded by another user. These images and tag information are analyzed so that a learning set can be created in the imaging device 101.
- images uploaded by the user as described above, or images determined to be liked by the user, may be acquired, and the control unit 411 may be capable of performing the same learning processing as the learning processing unit 219 in the imaging apparatus 101.
- the processed learning data may be communicated to the imaging apparatus 101 to learn.
- an image to be learned by the imaging device 101 may be transmitted, and learning may be performed in the imaging device 101.
- learning changes the color conversion filter effect used in the automatic editing mode process 712 of FIG. 7 and in the editing step S911 of FIG. 9.
- learning is performed by estimating subject information which the user prefers from the subject information set in the tag information, and registering it as a subject to be detected which is input to the neural network.
- the subject information may be, for example, subject object information such as a dog or a cat, scene information such as a beach, or expression information such as a smile.
- image information currently in circulation in the world may be estimated from statistical values of tag information (image filter information and subject information) on the SNS, so that learning settings can be made in the imaging apparatus 101.
- the imaging device 101 and the external device 301 have communication means, and the learning parameters currently set in the imaging device 101 can be communicated to the external device 301 and stored in the storage unit 404 of the external device 301.
- as the learning parameters, for example, the weights of the neural network, the selection of objects to be input to the neural network, and the like can be considered.
- learning parameters set on a dedicated server can be acquired via the public line control unit 406 through the dedicated application in the external device 301, and can be set as the learning parameters in the imaging apparatus 101.
- the parameters at a certain point in time can be stored in the external device 301 and later set in the imaging apparatus 101 to restore the learning parameters, or learning parameters possessed by another user can be acquired via the dedicated server and set in the user's own imaging apparatus 101.
- voice commands, authentication registrations, and gestures registered by the user may be registered via the dedicated application of the external device 301, and important places may also be registered.
- the imaging trigger described in the automatic imaging mode process (FIG. 9) is treated as input data for automatic imaging determination.
- the shooting frequency, activation interval, still image to moving image ratio, and favorite images can be set; the activation interval described in <Low Power Consumption Mode Control> and the still image to moving image ratio described in <Auto Edit> may also be set.
- the dedicated application of the external device 301 is provided with a function that allows manual editing by the user's operation, and the contents of the editing work can be fed back to learning. For example, an image effect can be edited, and the neural network for automatic editing can be trained so that the manually edited image effect is determined for the learning information of that image.
- as the image effect, for example, trimming processing, rotation processing, slides, zoom, fades, color conversion filter effects, time, the still image / moving image ratio, and BGM can be considered.
- the data for learning consists of various data (image information, vibration information, environment information, sound information, place information, etc.) recorded as tag information at the time of shooting or searching; when reflected in learning, these various data are stored in the form of a list.
- the number of data in the learning data group is assumed to be a fixed, predetermined number.
- the learning data group is divided into two areas: an area for learning data intentionally provided by the user and an area for learning data not intentionally provided by the user. The ratio of the number of data in the two areas is set so that the area for learning data intentionally provided by the user is larger.
- when a new learning reflection instruction is given, learning data is deleted from the corresponding area and new learning data is added. For example, when adding two pieces of learning data intentionally provided by the user, two pieces of data are deleted from the intentional learning data area, two new pieces are added, and relearning is performed.
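The two fixed-size areas and their delete-then-add replacement can be sketched with bounded deques. The total of 100 samples and the 75% share for user-intentional data are illustrative assumptions; the source only states that the total is fixed and that the intentional area is the larger one.

```python
from collections import deque

class LearningDataStore:
    """Fixed-size learning data group split into two areas.

    Adding to a full area silently drops that area's oldest sample,
    mimicking the delete-then-add replacement described in the text.
    """
    def __init__(self, total=100, intentional_ratio=0.75):
        n_int = int(total * intentional_ratio)
        self.intentional = deque(maxlen=n_int)           # larger area
        self.unintentional = deque(maxlen=total - n_int)

    def add(self, sample, intentional):
        area = self.intentional if intentional else self.unintentional
        area.append(sample)  # deque(maxlen=...) evicts the oldest entry first
```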
- the date and time when the learning data is generated is managed, and a weighting coefficient La according to the elapsed time from the date and time when the learning data is generated is calculated.
- the weighting factor La is updated to be smaller as the elapsed time becomes larger.
- a weighting coefficient Lb, which depends on whether the learning data was intentionally provided by the user or not, is also managed for each piece of learning data.
- the weighting coefficient Lb is set larger for learning data intentionally provided by the user than for learning data the user did not intend.
- the weighting coefficient Lb may also be changed depending on which of the learnings (1) and (3) to (8) the learning data intentionally provided by the user corresponds to.
- when new learning data is to be added, data is preferentially deleted from the learning data whose product of the weighting coefficients La and Lb is smallest in the current learning data group, and the additional data is then inserted. Machine learning is performed based on the updated learning data group.
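The weighted-deletion variant can be sketched as follows. The exponential decay law for La and the concrete Lb values (1.0 for intentional, 0.5 for unintentional) are illustrative assumptions; the source only requires that La shrink as elapsed time grows and that Lb be larger for intentional data.

```python
import time

def update_learning_data(data_group, new_samples, now=None,
                         half_life=30 * 24 * 3600):
    """Replace the lowest-priority samples with new ones.

    Each sample is a dict with 'created_at' (epoch seconds) and
    'intentional' (bool). La decays with elapsed time since creation;
    Lb weights intentional data more heavily.
    """
    now = time.time() if now is None else now

    def priority(sample):
        la = 0.5 ** ((now - sample["created_at"]) / half_life)
        lb = 1.0 if sample["intentional"] else 0.5
        return la * lb

    data_group = sorted(data_group, key=priority)
    del data_group[:len(new_samples)]  # drop the smallest La * Lb first
    data_group.extend(new_samples)
    return data_group
```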
- in step 704 of FIG. 7, it is determined whether or not learning processing should be performed. If learning processing is to be performed, it is determined that the learning mode is set, and the learning mode processing of step 716 is performed.
- FIG. 14 shows a flow of processing for determining whether or not to shift to the learning mode, which is determined in the mode setting determination processing of step 704.
- in step 1401, it is determined whether there is a registration instruction from the external device 301.
- the registration here is a determination as to whether there has been a registration instruction for learning as described above, for example <learning based on image information acquired by the communication device>, <learning based on inputting a determination value to an image through the communication device>, or <learning by analyzing an image stored in the communication device>. If there is a registration instruction from the external device in step S1401, the process advances to step S1410 to set the learning mode determination to TRUE and perform the processing of step S716.
- in step S1402, it is determined whether there is a learning instruction from the external device.
- the learning instruction here is a determination as to whether there has been an instruction to set learning parameters, as in <learning by changing the imaging device parameters with the communication device>. If it is determined in step S1402 that a learning instruction has been issued from the external device, the process advances to step S1410 to set the learning mode determination to TRUE, performs the processing of step S716, and ends the learning mode determination processing. If it is determined in step S1402 that there is no learning instruction from the external device, the process advances to step 1403.
- in step 1403, it is determined whether a scheduled learning condition is satisfied. For example, a learning condition based on a scheduled time, such as learning at 24:00 every day, may be used. Because learning is then executed periodically, the freshness of the learning results can be kept constant.
- the condition may also be that power-off by pressing the power button has been instructed to the imaging apparatus 101; in this case, the power is turned off after the learning processing is completed.
- learning processing generally requires a long processing time, but by performing it at a timing when the user will not use the device for shooting for a while, such as at power-off, it can be performed without disturbing the user. If the scheduled learning condition is satisfied, the process proceeds to step S1410; if not, the process proceeds to step 1404.
- in step 1404, the possibility of shooting is determined. As described above, since the learning processing takes time, it is better to avoid performing it at a timing when shooting may occur. Therefore, it is determined that the possibility of shooting for a while is low based on conditions such as no manual shooting having been instructed for a predetermined time or more and the importance levels of the areas in the automatic shooting mode being below a predetermined level. If it is determined that the shooting possibility is low, the process proceeds to step 1405; if not, the process proceeds to step 1411 and the learning mode determination is set to FALSE. In step S1405, the elapsed time TimeN since the previous learning processing (recalculation of the weights of the neural network) is acquired, and the process proceeds to step S1406.
- in step 1406, the number DN of new data to be learned (the number of images designated for learning during the elapsed time TimeN since the last learning processing) is acquired, and the process proceeds to step 1407.
- in step 1407, a threshold value DT is calculated from TimeN. For example, the threshold value DTa used when TimeN is smaller than a predetermined value is set larger than the threshold value DTb used when TimeN is larger, so that the threshold decreases with the passage of time. As a result, even when the amount of learning data is small, learning is performed again if the elapsed time is large, making it easier for the imaging apparatus to adapt its learning to the usage time.
- after calculating the threshold value DT in step 1407, the process proceeds to step 1408, where it is determined whether the number DN of data to be learned is larger than the threshold value DT. If DN is larger than the threshold DT, the process proceeds to step 1409, where DN is set to 0; the process then proceeds to step 1410 to set the learning mode determination to TRUE, perform the processing of step 716, and end the learning mode determination processing.
- if it is determined in step 1408 that DN is equal to or smaller than the threshold DT, the process proceeds to step 1411. Since there is neither a registration instruction nor a learning instruction from the external device, and the number of learning data is equal to or less than the predetermined value, the learning mode determination is set to FALSE so that the processing of step 716 is not performed, and the learning mode determination processing ends.
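The decision flow of FIG. 14 can be condensed into one function. The concrete values of `t_boundary`, `dt_a`, and `dt_b` are illustrative stand-ins for the predetermined value of TimeN and the thresholds DTa > DTb; the source only states that the threshold shrinks as TimeN grows.

```python
def should_enter_learning_mode(registration_instr, learning_instr, scheduled,
                               shooting_likely, time_n, new_data_count,
                               t_boundary=3600.0, dt_a=50, dt_b=10):
    """Condensed learning-mode determination mirroring FIG. 14."""
    if registration_instr or learning_instr:    # steps 1401 / 1402
        return True
    if scheduled:                               # step 1403
        return True
    if shooting_likely:                         # step 1404: avoid busy periods
        return False
    dt = dt_a if time_n < t_boundary else dt_b  # step 1407: DT shrinks with TimeN
    return new_data_count > dt                  # step 1408
```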
- next, the processing in the learning mode processing (step 716) will be described.
- the detailed flow of the learning mode process is shown in FIG.
- in step 1501, it is determined whether there is a registration instruction from the external device 301. If there is, the process advances to step 1502, where various registration processing is performed.
- the various registrations are registrations of features to be input to the neural network, such as registration of face recognition, registration of general object recognition, registration of sound information, registration of location information, and the like.
- the process then proceeds to step 1503, where the information to be input to the neural network is changed based on the information registered in step 1502.
- when the processing of step 1503 ends, the process proceeds to step 1507.
- if there is no registration instruction from the external device 301 in step 1501, the process proceeds to step 1504, where it is determined whether there is a learning instruction from the external device 301. If there is, the process proceeds to step 1505, where the learning parameters communicated from the external device (such as the weights of the neural network) are set in each determiner, and the process proceeds to step 1507.
- if it is determined in step 1504 that there is no learning instruction from the external device, learning (recalculation of the neural network weights) is performed in step 1506.
- the processing of step 1506 is performed under the condition, described with reference to FIG. 14, that the number DN of data to be learned exceeds the threshold and each determiner can be relearned. Retraining is performed using a method such as error backpropagation or gradient descent, the weights of the neural network are recalculated, and the parameters of each determiner are changed. Once the learning parameters are set, the process proceeds to step 1507.
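As a minimal stand-in for the weight recalculation of step 1506, the sketch below fits a linear scorer to (feature, user score) pairs by plain stochastic gradient descent; an actual implementation would use backpropagation through the full neural network, and the learning rate and epoch count here are illustrative assumptions.

```python
def retrain(weights, samples, lr=0.01, epochs=200):
    """Stochastic gradient descent on a linear scorer.

    `samples` is a list of (feature_vector, user_score) pairs; the squared
    error between the predicted and user-designated score is minimized so
    the output approaches the designated score.
    """
    for _ in range(epochs):
        for x, target in samples:
            pred = sum(w * xi for w, xi in zip(weights, x))
            err = pred - target  # gradient of 0.5 * err**2 w.r.t. pred
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
    return weights
```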
- in step 1507, the images in the file are scored again.
- in this embodiment, all photographed images stored in the file (recording medium 221) are scored based on the learning results, and automatic editing and automatic file deletion are performed according to the assigned scores. Therefore, when relearning or the setting of learning parameters from an external device has been performed, the scores of the captured images must also be updated. Accordingly, in step 1507, recalculation is performed to assign new scores to the photographed images stored in the file, and when this processing ends, the learning mode processing ends.
- above, a method of proposing a user's favorite video by extracting scenes the user seems to like, learning their features, and reflecting them in operations such as automatic shooting and automatic editing has been described.
- the invention is not limited to this application.
- it can also be used for an application that proposes an image different from the user's own preference.
- the realization method is, for example, as follows.
- an image having a feature different from the image acquired by the external communication device may be added to the teacher data, or an image having a feature similar to the image acquired may be deleted from the teacher data.
- in this way, data differing from the user's preference accumulates in the teacher data, and as a result of learning, the neural network can identify situations that differ from the user's preference.
- in automatic shooting, a scene different from the user's preference can then be shot by shooting according to the output value of the neural network.
- in automatic editing, it is likewise possible to propose an edited image that differs from the user's preference.
- the degree of conformity to the user's preference may be changed according to the mode setting, the state of the various sensors, and the state of the detection information.
- if the external device 301 has a learning processing function, the data necessary for learning can be communicated to the external device 301 and learning executed only on the external device; the same learning effect can be realized even with this configuration.
- in that case, learning may be performed by setting parameters such as the neural network weights learned on the external device side in the imaging apparatus 101 by communication.
- both the inside of the imaging apparatus 101 and the inside of the external device 301 may have a learning process.
- learning information possessed by the external device 301 may be communicated to the imaging apparatus 101 at the timing when the learning mode processing 716 is performed in the imaging apparatus 101, and learning may be performed by merging learning parameters.
- FIG. 17 is a block diagram showing the configuration of the lens barrel rotation drive unit 205. Reference numerals 1701 to 1707 in FIG. 17 denote configurations related to the drive of the pan axis, and reference numerals 1708 to 1714 relate to drive control of the tilt axis.
- Reference numeral 1701 denotes an image position-to-pan position conversion unit that calculates a target position for driving the pan shaft 1706 from the difference between the target position and the current position of the subject on the image.
- FIG. 18 is a diagram showing the relationship between the current position of an object and the target position in an image captured by an imaging device.
- Reference numeral 1801 denotes a momentary image obtained by the image processing unit 207 while the imaging device is searching for a subject.
- Reference numeral 1802 denotes the current position (x1, y1) of the subject.
- Reference numeral 1803 denotes the target position (x0, y0) of the subject.
- Kp(f) is a conversion coefficient, which changes according to the focal length f of the imaging device, for calculating the pan target position from the difference between the target position and the current position of the subject on the image.
- kt(f) is a conversion coefficient, which changes according to the focal length f of the imaging device, for calculating the tilt target position from the difference between the target position and the current position of the subject on the image.
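The conversion of the on-image error into pan/tilt target positions might look like the following. The linear forms chosen for `kp(f)` and `kt(f)` are illustrative assumptions; the source only states that the coefficients Kp(f) and kt(f) depend on the focal length f.

```python
def pan_tilt_targets(current_pan, current_tilt, subject_xy, target_xy, f,
                     kp=lambda f: 0.05 * f, kt=lambda f: 0.05 * f):
    """Convert the on-image error of the subject into drive-axis targets.

    subject_xy is the current position (x1, y1) (1802) and target_xy the
    target position (x0, y0) (1803); kp/kt play the role of Kp(f), kt(f).
    """
    x1, y1 = subject_xy
    x0, y0 = target_xy
    pan_target = current_pan + kp(f) * (x0 - x1)
    tilt_target = current_tilt + kt(f) * (y0 - y1)
    return pan_target, tilt_target
```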
- Reference numeral 1702 in FIG. 17 denotes a compensator.
- the compensator 1702 calculates the control output by performing PID control so as to eliminate the difference between the current pan position and the pan target position calculated by the image position-to-pan position conversion unit 1701.
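A discrete PID loop in the role of compensator 1702 can be sketched as below; the gains and sampling period are illustrative assumptions.

```python
class PIDCompensator:
    """Discrete PID controller in the role of compensator 1702."""

    def __init__(self, kp=1.0, ki=0.1, kd=0.05, dt=0.01):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, target, current):
        err = target - current
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv
```

Driving a pure-integrator stand-in for the pan axis with this output makes the position settle on the target, which is how the compensator output is used before the driver 1704 turns it into a drive signal.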
- An imaging direction change operation detection unit 1703 detects an imaging direction change operation from the difference between the pan target position and the current position (hereinafter referred to as position deviation), the control output, and the moving speed of the pan.
- when the shooting direction change operation is detected, the control output is turned off to stop the pan drive; when it is not detected, pan drive control is performed according to the control output calculated by the compensator 1702.
- Reference numeral 1704 denotes a driver for generating a drive signal according to the control output calculated by the compensator 1702.
- An ultrasonic motor (USM) 1705 is an actuator for driving the pan shaft 1706.
- Reference numeral 1707 denotes a moving speed detection unit for calculating the moving speed of the pan from the time change of the pan position. The moving speed detection unit 1707 calculates the moving speed of the pan from the amount of change in the pan position for each control sampling.
- FIG. 19 is a flowchart showing a flow of detecting a shooting direction change operation by a user operation and updating learning information with the shooting area after the shooting direction change operation as an important area.
- in step S1901, it is determined whether the user has performed a shooting direction change operation on the imaging apparatus.
- the shooting direction change operation detection unit 1703 determines that a shooting direction change is in progress when the control output and the position deviation, described later, satisfy predetermined conditions. If the shooting direction change operation is detected in S1901, the process advances to S1902 to stop the position control operation; if the subject is being tracked or searched, that operation is interrupted before position control is stopped. If the shooting direction change operation is not detected in S1901, detection of the shooting direction change operation continues. After position control is stopped in S1902, the process advances to S1903 to determine whether the user's shooting direction change operation has ended.
- the shooting direction change operation detection unit 1703 determines continuation or end of the shooting direction change operation based on the moving speed of pan. If it is determined that the shooting direction change operation has ended, the process advances to step S1904 to store shooting area information after it is determined that the shooting direction change operation has ended.
- the stored area is the divided area closest to the angle of view determined from the position of the imaging apparatus, the pan position, the tilt position, and the focal length. If it is determined in S1903 that the shooting direction change operation is still in progress, detection of its end continues. In S1905, the learning information is updated by treating the area stored in S1904 as more important than the other divided areas.
- in step S1906, subject tracking and position control are enabled again, and the process then advances to step S1901 to resume detection of the shooting direction change operation.
- a special image (image effect) different from that used for the face authentication described above may be displayed on or around the tracking target image.
- as an example, a case will be described in which, while shooting a flower with the imaging device 101, the user rotates the lens barrel 102 so that the optical axis of the imaging device 101 points toward a specific person outside the angle of view, thereby performing a shooting direction change operation.
- FIG. 20 illustrates an example in which, while the flower 2001 is being photographed with the imaging device 101, the user rotates the lens barrel 102 by hand toward the person 2003, after which the learning information is updated with the area where the person 2003 exists as the important area.
- Reference numeral 2002 in FIG. 20 denotes an optical axis of the imaging device 101 during imaging of the flower 2001.
- Reference numeral 2004 denotes an optical axis after the user manually changes the shooting direction.
- Reference numeral 2005 indicates the rotation direction of the lens barrel 102 when the user changes the shooting direction.
- FIGS. 21A, 21B, 21C, and 21D show images at certain moments in the sequence in which the shooting direction is changed, while the flower is being shot, toward the specific person 2003, and the learning information is then updated.
- FIG. 22 shows the time changes of the pan control output 2201, the position deviation 2202, and the moving speed 2203 from when the user changes the shooting direction toward the specific person 2003 while shooting the flower until the area of the angle of view after the change is regarded as an important area and the learning information is updated.
- ta, tb, tc, and td in FIG. 22 are the times at which the images shown in FIGS. 21A, 21B, 21C, and 21D were captured, respectively. ThC in FIG. 22 is a threshold of the control output used to determine that the user has manually rotated the lens barrel 102, and ThDiff is a threshold of the position deviation used for the same determination.
- the control output of the compensator 1702 is turned off, on the assumption that the user is changing the imaging direction, when the control output is ThC or more and the position deviation is ThDiff or more for a predetermined time (t2-t1 in FIG. 22).
- ThV is a threshold value of the moving speed of the pan axis used to determine that the user has finished the shooting direction operation.
- CMax is the maximum value of the control output of the compensator 1702.
- t1 in FIG. 22 indicates the time when, after the user starts the shooting direction change operation, the control output 2201 first becomes ThC or more and the position deviation becomes ThDiff or more.
- t2 indicates the time when the condition that the control output 2201 is ThC or more and the position deviation 2202 is ThDiff or more has continued for the imaging direction change determination time (t2-t1).
- t3 indicates the time when the moving speed of the pan axis is less than ThV for the first time after time t2.
- t4 indicates the time when the elapsed time after the moving speed becomes ThV or less at time t3 reaches the shooting direction change end determination time (t4-t3).
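The two-phase detection described around FIG. 22 — a sustained interval with control output at or above ThC and position deviation at or above ThDiff marking the start, then pan speed staying at or below ThV marking the end — can be sketched as a small state machine. The numeric thresholds and hold times below are illustrative assumptions.

```python
def detect_direction_change(samples, thc=0.8, thdiff=5.0, thv=0.1,
                            start_hold=0.2, end_hold=0.3, dt=0.1):
    """Scan (control_output, position_deviation, pan_speed) samples.

    Returns (start_time, end_time) of a user shooting-direction change,
    with None for whichever event was not observed. start_hold and
    end_hold mirror the determination times (t2 - t1) and (t4 - t3).
    """
    start = end = None
    hold = 0.0
    for i, (out, dev, speed) in enumerate(samples):
        t = i * dt
        if start is None:
            # phase 1: output >= ThC and deviation >= ThDiff, sustained
            hold = hold + dt if (abs(out) >= thc and abs(dev) >= thdiff) else 0.0
            if hold >= start_hold:
                start, hold = t, 0.0   # t2: control output is forced off here
        elif end is None:
            # phase 2: pan speed stays at or below ThV, sustained
            hold = hold + dt if abs(speed) <= thv else 0.0
            if hold >= end_hold:
                end = t                # t4: operation judged finished
                break
    return start, end
```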
- FIG. 21A shows an image captured at the timing of time ta while the flower 2001 is captured.
- Reference numeral 2101 in FIG. 21A denotes a subject frame indicating a subject to be tracked, searched, or photographed.
- Reference numeral 2102 denotes a target point representing the target position on the image, at the center of the subject frame 2101; the point at which the two lines of 2102 intersect is the target position of the subject on the image.
- alignment is performed by drive control of the pan axis or tilt axis such that the center of the subject frame 2101 and the target point 2102 overlap.
- FIG. 21B is an image captured when the user rotates the lens barrel 102 in the right direction with respect to the fixed unit 103 at the timing of time tb in the state of FIG. 21A.
- the black arrows in FIG. 21B indicate the pan drive direction of position control, and the white arrows indicate the rotational direction of the lens barrel 102 due to the user's shooting direction change operation.
- regarding the control output 2201 and the position deviation 2202 at time tb, the position deviation 2202 tends to increase even though the control output is at the maximum value CMax. From this, it can be determined that the user is intentionally rotating the pan axis.
- in this embodiment, the camera waits for a predetermined time (t2-t1) before turning off the control output of the compensator 1702 and judging that a shooting direction change has occurred. This prevents the camera from judging that the shooting direction has been changed when the user has merely touched the lens barrel unintentionally, or when a load change of the pan or tilt axis has occurred during search driving without any direction change operation by the user. If quick determination is prioritized, the time from the start of the user's shooting direction change operation to the shooting direction change determination may be shortened or eliminated.
- FIG. 21C shows the state at time tc when, with the control output of the compensator 1702 turned off, the user has changed the shooting direction by rotating the pan axis to the vicinity of a new subject, and the target subject has entered the angle of view.
- the user needs to continue the shooting direction change operation until the subject to be newly photographed enters the angle of view.
- for that purpose, the user may perform the operation while confirming the image being captured on the smart device, making sure that the subject is within the angle of view.
- alternatively, the user may be notified that the subject is within the angle of view by causing the LED control unit 224 to make the LED emit light, or by making the audio output unit 218 output a sound.
- FIG. 21D shows the image being captured while tracking the new subject after the shooting direction change, with the control output of the compensator 1702 turned ON at the timing of time t4.
- time t4 is the timing at which the shooting direction change operation end determination time (t4-t3) or more has elapsed since the moving speed 2203 of the pan became ThV or less at time t3. If it is determined at time t4 that the user's shooting direction change operation is complete, the learning information is updated with the shooting area at time t4 set as a user-favored area of higher importance than other areas.
- the subject existing in this area may be tracked as an important subject, and one or more operations of photographing and authentication registration may be performed on it.
- the learning information update processing may be performed only when there is a learning instruction from the user, rather than automatically. For example, the learning information may be updated only when, after the imaging apparatus notifies the user that the subject is within the angle of view, a specific voice command for learning instruction registered in advance is input; that is, only when the user gives a learning instruction.
- in the above, the start and end of the user's shooting direction change operation are detected from the control output of the compensator, the position deviation, and the moving speed of the drive axis, but they may be detected by other methods as long as detection is possible. For example, the presence or absence of an imaging direction change by the user may be detected based on the time change of the gyro sensor or acceleration sensor signal from the device shake detection unit 209.
- FIG. 23 shows a change in output of the acceleration sensor of the device shake detection unit 209 when the imaging direction of the imaging device is changed by a user operation. 2301 indicates time change of acceleration.
- ThA1 is a threshold value of acceleration used when it is determined that the user has started the shooting direction change operation.
- ThA2 is an acceleration threshold value that determines that the user has finished the shooting direction change operation.
- the start and end of the shooting direction change operation may be detected by comparing the acceleration with these thresholds.
- alternatively, the time change pattern of the acceleration during a shooting direction change operation may be learned in advance, and it may be determined that the imaging direction has been changed when the similarity between the time change of the detected acceleration and the learned pattern is a predetermined value or more.
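A minimal version of the ThA1/ThA2 comparison might look like this; the threshold values are illustrative assumptions.

```python
def classify_accel_event(accel_samples, th_a1=2.0, th_a2=0.3):
    """Find a direction-change operation in accelerometer magnitudes.

    Returns (start_index, end_index): the sample where |a| first reaches
    ThA1 and the later sample where it falls back to ThA2 or below.
    """
    start = end = None
    for i, a in enumerate(accel_samples):
        if start is None and abs(a) >= th_a1:
            start = i            # operation judged to have started
        elif start is not None and abs(a) <= th_a2:
            end = i              # operation judged to have ended
            break
    return start, end
```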
- the presence or absence of the shooting direction change operation may also be detected from the change in the motion vector of the image captured by the imaging device.
- the processing for learning the imaging area that is within the angle of view after the imaging direction change operation has been described as an important area.
- the present invention is not limited to this, and when there is a change in the shooting area due to a zoom change or a user operation on an external device, processing may be performed to learn the shooting area after the change operation as an important area.
- FIG. 27 shows the processing of the photographing mode in this case.
- Steps S2701 to S2703 are the same as the normal processing described with reference to FIG.
- in step S2704, unlike the normal processing, the search is performed while pan/tilt driving is done so as to cover the entire angle of view.
- in step S2705, it is determined whether a specific authenticated person is within the angle of view. Here, it is desirable that the owner register his or her own face in advance as an authentication face, and that the imaging apparatus search for the owner as the specific authenticated person. If the owner is found within the angle of view, the process advances to step S2706.
- step S2706 pan / tilt zoom driving is performed so that the owner falls within the angle of view, and then the process proceeds to the shooting start operation in step S2712.
- Steps S2707 to S2715 are omitted because they are similar to the processes of S905 to S913 in FIG.
- FIG. 28 shows the processing of the photographing mode in this case.
- Steps S2801 to S2803 are the same as the normal process described with reference to FIG.
- In step S2804, unlike normal processing, pan/tilt driving is performed so that the direction in which the sound was detected is included in the angle of view.
- In step S2805, it is determined whether there is a person within the angle of view in the sound direction. If there is, that person is regarded as the source of the sound or voice command, and the process advances to step S2806 to shoot the person.
- In step S2806, pan/tilt/zoom driving is performed so that the person falls within the angle of view, and the process then proceeds to the shooting start operation in step S2812.
- Descriptions of steps S2807 to S2815 are omitted because they are similar to the processes of S905 to S913 in FIG.
- Automatic sleep is entered according to the scene determination result, and the sleep time is also adjusted.
- The device likewise enters automatic sleep according to its internal state.
- This can solve the problem of a small power-saving effect and the risk of missing a photo opportunity.
- Automatic image transfer: an image is automatically transferred, or the image transfer frequency is automatically determined, according to at least one of the elapsed time, the evaluation value of the captured image, the remaining battery capacity, and the card capacity.
- The image is automatically transferred according to conditions (for example, when a highly evaluated image is captured after a predetermined time has elapsed).
- The image transfer frequency is automatically determined according to conditions (for example, images are less likely to be transferred when the remaining battery level is low; if the shooting frequency is set high, the transfer frequency is also increased; and when the free capacity of the storage medium is small, the transfer frequency is increased).
- The learning mode is entered automatically according to conditions (for example, when new teacher data has accumulated beyond a predetermined amount, when a long time has elapsed since the previous learning, or when there is no notable subject nearby and shooting seems unlikely). As a result, it is possible to avoid the waiting time for learning processing and the wasted power that arise unless the conditions for entering the learning mode are set appropriately.
- Automatic deletion of images: automatic deletion is performed according to conditions. A target number of deletions is set according to the shooting frequency and free space. Images manually shot by the user, images highly evaluated by the user, and images with a high importance score calculated by the imaging apparatus are made difficult to delete. Images already transferred to an external device and images that the user has never viewed are made easy to delete. If the acquired highlight moving images were shot at short intervals, older files may be deleted preferentially; if they were shot at long intervals, old files with high scores may be kept. If learning has increased the moving image shooting frequency, more images than usual may be deleted automatically.
- The editing process is automatically executed according to at least one of the following conditions: the degree of accumulation of captured images, the elapsed time since the last editing, the evaluation value of the captured images, and temporal milestones.
- Shooting is performed in conjunction with the cameras. Before the release button 3203 of the camera 3201 is pressed, the imaging apparatus 101 predicts that the release button will be pressed and starts shooting before the camera 3201 does.
- The imaging apparatus 101 performs automatic shooting in the same manner as the automatic shooting determination described above. At this time, learning is performed to predict the timing at which the camera 3201 shoots, and automatic shooting determination is performed using that network.
- FIG. 33 shows a flowchart of the imaging apparatus 101.
- When the shooting mode process starts, it is first determined in S3301 whether the mode is the camera cooperation mode. If it is the cooperation mode, the process proceeds to S3303; if not, to S3302.
- the cooperation mode may be determined based on whether the camera 3201 and the imaging apparatus 101 are connected by wire or wireless, or may be set by the smart device 301.
- In step S3302, since the camera cooperation mode is not set, the processing described in FIG. 9 is performed, the shooting mode process ends, and the next calculation cycle is awaited.
- In step S3303, the information from the camera 3201 is read. The camera 3201 notifies the imaging apparatus 101 of its release switch press information, power-on state information, subject information from the image, and the like, and the process advances to step S3304.
- In step S3304, it is determined whether the imaging apparatus 101 is currently shooting; if not, the process proceeds to S3305, and if so, to S3306.
- In step S3305, it is determined whether the camera 3201 has started shooting. If shooting has started, the process advances to S3310 to start shooting with the imaging apparatus 101, the shooting mode process ends, and the next calculation cycle is awaited. If it is determined in S3305 that the camera 3201 has not started shooting, the process advances to S3307 to perform automatic shooting determination processing.
- The automatic shooting determination process can be realized by the same method as that described with reference to FIG. The feature-amount input may be determined using both the information from the camera 3201 and the information from the imaging apparatus 101, or based on only one of them.
- In step S3308, it is determined whether the automatic shooting determination process has decided to start shooting. If it has, the process proceeds to S3309 and automatic shooting by the imaging apparatus 101 is started. If not, no shooting is performed, the shooting mode process ends, and the next calculation cycle is awaited.
- The shooting end determination can be realized by the same method as the automatic shooting determination processing described with reference to FIG.
- The feature-amount input may be determined using both the information from the camera 3201 and the information from the imaging apparatus 101, or based on only one of them.
- The imaging apparatus 101 may shoot a moving image, add tags to important time zones, and record them in the final moving image file.
- The automatic shooting timing for cooperation may be learned using the shooting results.
- In that case, the feature amounts to be input in FIG. 12 at that time are stored as learning data labeled as incorrect-answer data.
- When the imaging apparatus 101 is performing automatic shooting or when the camera 3201 has started shooting, the feature amounts to be input in FIG. 12 at that time are stored as learning data labeled as correct-answer data.
- Although the description has used an example in which the camera 3201 captures still images and the imaging apparatus 101 captures moving images, the shooting methods are not limited to this, and the following patterns may be selected manually using the smart device 301 or the like.
- The imaging apparatus 101 may also select the following patterns automatically. When selecting automatically, which pattern to use for shooting is determined automatically.
- the imaging apparatus 101 captures a moving image.
- the imaging apparatus 101 shoots a still image.
- the imaging apparatus 101 performs still image shooting.
- the imaging device 101 captures a moving image.
- The optical axis directions and the angles of view of the camera 3201 and the imaging apparatus 101 may be selected manually or automatically.
- the camera 3201 and the imaging device 101 have the same direction of the optical axis.
- the camera 3201 and the imaging device 101 have different optical axis directions.
- the camera 3201 and the imaging device 101 have the same angle of view.
- the camera 3201 and the imaging device 101 have different angles of view.
- When the imaging apparatus 101 captures still images, the start of shooting can be predicted in advance, and not just one but several images may be captured automatically during the automatic shooting period.
- In the above description, the imaging apparatus 101 is connected to the accessory shoe 3202 of the camera 3201, but the present invention is not limited to this.
- It may be attached to another member of the camera 3201 (for example, a tripod screw hole), or used without being directly attached to the camera 3201 (for example, worn as a wearable device, with information notified via wireless communication).
- In the above description, the imaging apparatus 101 predicts in advance that the camera 3201 will shoot, and then shoots.
- Alternatively, the prediction of shooting may be performed in advance in the camera 3201.
- In that case, camera-cooperative shooting by prior prediction may be performed by having the camera 3201 issue a shooting start instruction to the imaging apparatus 101.
- Information notification between the camera 3201 and the imaging apparatus 101 may be configured so that only the release timing is notified. The detection information of both the camera 3201 and the imaging apparatus 101 may be used for the shooting start determination, or the detection information of the imaging apparatus 101 alone may be used.
- <Learning with Camera 3201> (1) Transferring information from the camera 3201 to the imaging apparatus 101. For example, a main subject is extracted from an image captured by the camera 3201 through a user operation.
- This subject information is notified to the imaging apparatus 101 and set. Thereafter, the imaging apparatus 101 determines from the number of shots of the subject whether it is an important subject, registers the subject, and performs automatic shooting/tracking and the like.
- Alternatively, the imaging apparatus 101 is notified of the timing at which the camera 3201 was made to shoot by a user operation. An important subject is then set from the image of the imaging apparatus 101 at that shooting timing. Thereafter, the imaging apparatus 101 determines from the number of shots of the subject whether it is an important subject, registers the subject, and performs automatic shooting/tracking and the like.
- Subject information notification: subject information detected by the imaging apparatus 101 (for example, a face registered as an individual, a subject such as a dog or cat determined to match the owner's preference, or the result of an aesthetics determination identifying the user's preferred subject) is notified to the camera 3201. It is then notified where the subject is located in the live image of the camera 3201, what subjects are outside the image (for example, "there is a car to the right of the screen"), and whether a subject matching the user's preference is present.
- The imaging apparatus 101 may be configured to issue a shooting instruction to the camera 3201.
- The shooting timing is determined by the method described in the automatic shooting mode processing, and an automatic shooting instruction is issued to the camera 3201.
- This makes it possible to provide an imaging apparatus capable of acquiring video the user likes without the user performing any special operation.
Abstract
Description
<Configuration of the Imaging Apparatus>
FIG. 1 is a diagram schematically showing the imaging apparatus of the first embodiment.
FIG. 3 is a diagram showing a configuration example of a wireless communication system between the imaging apparatus 101 and an external device 301. The imaging apparatus 101 is a digital camera having a shooting function, and the external device 301 is a smart device including a Bluetooth communication module and a wireless LAN communication module.
FIG. 5 is a diagram showing a configuration example of an external device 501 capable of communicating with the imaging apparatus 101. The imaging apparatus 101 is a digital camera having a shooting function, and the external device 501 is a wearable device including various sensing units capable of communicating with the imaging apparatus 101 via, for example, a Bluetooth communication module.
FIG. 7 is a flowchart illustrating an example of the operations handled by the first control unit 223 of the imaging apparatus 101 in this embodiment.
(1) Power-on by manually pressing the power button
(2) Power-on by an instruction via external communication (for example, BLE communication) from an external device (for example, 301)
(3) Power-on from the sub-processor (second control unit 211)
[Mode determination conditions]
When it is determined that automatic shooting should be performed based on each type of detection information set by learning described later (image, sound, time, vibration, location, body change, environmental change), the elapsed time since transitioning to the automatic shooting mode, past shooting information, and the like, the automatic shooting mode is set.
In the automatic shooting mode processing (step 710), the subject is automatically searched for by driving pan/tilt and zoom based on each type of detection information (image, sound, time, vibration, location, body change, environmental change). Then, when it is determined that it is the right timing to perform shooting matching the user's preference, the shooting method is determined from among various methods such as single still image shooting, continuous still image shooting, moving image shooting, panoramic shooting, and time-lapse shooting, and shooting is performed automatically.
[Mode determination conditions]
When it is determined from the elapsed time since the previous automatic editing and past captured image information that automatic editing should be performed, the automatic editing mode is set.
In the automatic editing mode processing (step 712), selection processing of still images and moving images based on learning is performed, and automatic editing processing is performed to create a highlight moving image combined into a single moving image according to image effects, the duration of the edited moving image, and so on, based on learning.
[Mode determination conditions]
When the automatic image transfer mode has been set by an instruction via the dedicated application in the smart device, and it is determined from the elapsed time since the previous image transfer and past captured image information that automatic image transfer should be performed, the automatic image transfer mode is set.
In the automatic image transfer mode processing (step 714), the imaging apparatus 101 automatically extracts images that should match the user's preference and automatically transfers them to the smart device 301. The extraction of the user's preferred images is performed based on a score, described later, added to each image to judge the user's preference.
[Mode determination conditions]
When it is determined that automatic learning should be performed from the elapsed time since the previous learning process and from the amount of information and learning data associated with images usable for learning, the automatic learning mode is set. This mode is also set when there is an instruction via communication from the smart device 301 to set learning data.
In the automatic learning mode processing (step 716), learning matched to the user's preference is performed. Using a neural network, learning matched to the user's preference is performed based on information such as each operation on the smart device 301 and learning information notifications from the smart device 301. Information on operations on the smart device 301 includes, for example, image acquisition information from the imaging apparatus, information on manual editing instructions given via the dedicated application, and judgment values input by the user for images in the imaging apparatus.
[Mode determination conditions]
When it is determined from the elapsed time since the previous automatic file deletion and the remaining capacity of the nonvolatile memory 216 recording the images that automatic file deletion should be performed, the automatic file deletion mode is set.
In the automatic file deletion mode processing (step 718), files to be automatically deleted are designated from among the images in the nonvolatile memory 216 based on each image's tag information, shooting date and time, and the like, and are then deleted.
(1) Determination conditions for specific shake detection
(2) Determination conditions for specific sound detection
(3) Determination conditions for elapsed-time determination
A state in which the user taps the imaging apparatus 101, for example with a fingertip (a tap state), can be detected from the output values of the acceleration sensor attached to the imaging apparatus 101. By passing the output of the three-axis acceleration sensor through a band-pass filter (BPF) set to a specific frequency range at a predetermined sampling rate, the signal range of the acceleration change caused by a tap can be extracted. Tap detection is performed based on whether the number of times the post-BPF acceleration signal exceeds a predetermined threshold ThreshA within a predetermined time TimeA is equal to a predetermined count CountA. For a double tap, CountA is set to 2; for a triple tap, CountA is set to 3. TimeA and ThreshA can also be changed according to learning information.
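A minimal sketch of the crossing-count determination described above, assuming the samples passed in already cover the TimeA window and have been band-pass filtered (the function name and crossing-count logic are illustrative assumptions):

```python
def detect_tap(bpf_signal, thresh_a, count_a):
    """Count rising crossings of ThreshA in the band-pass-filtered
    acceleration signal for one TimeA window; the tap pattern is
    detected when the count equals CountA (2 = double tap, 3 = triple)."""
    crossings = 0
    above = False
    for a in bpf_signal:
        if not above and abs(a) > thresh_a:
            crossings += 1      # new excursion above the threshold
            above = True
        elif above and abs(a) <= thresh_a:
            above = False       # excursion ended; wait for the next one
    return crossings == count_a
```

Tracking the `above` state counts each excursion once, rather than counting every sample above the threshold.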
The shake state of the imaging apparatus 101 can be detected from the output values of a gyro sensor or acceleration sensor attached to the imaging apparatus 101. After the low-frequency components of the gyro or acceleration sensor output are cut with an HPF and the high-frequency components are cut with an LPF, absolute value conversion is performed. Vibration detection is performed based on whether the number of times the calculated absolute value exceeds a predetermined threshold ThreshB within a predetermined time TimeB is equal to or greater than a predetermined count CountB. For example, it is possible to determine whether the imaging apparatus 101 is in a low-shake state, such as when placed on a desk, or in a high-shake state, such as when worn as a wearable while walking. By providing multiple determination thresholds and count conditions, fine-grained shake states corresponding to the shake level can also be detected.
A specific voice command is detected. In addition to several pre-registered commands, the user can register a specific voice in the imaging apparatus.
Sound scene determination is performed by a network trained in advance by machine learning on a large amount of audio data. For example, specific scenes such as "cheering," "applause," and "speaking" are detected. The scenes to be detected change through learning.
Detection by sound level determination is performed by methods such as summing the time during which the magnitude of the sound level exceeds a predetermined level within a predetermined time. The predetermined time, the predetermined level, and so on change through learning.
The direction of a sound on the plane where multiple microphones are installed can be detected, and the sound direction is detected for sound levels of a predetermined magnitude.
(1) Determination conditions for specific shake detection
(2) Determination conditions for specific sound detection
(3) Determination conditions for elapsed-time determination
Details of the automatic shooting mode processing will be described with reference to FIG. 9. As described above, the first control unit 223 of the imaging apparatus 101 in this embodiment is in charge of controlling the following processing.
Area division will be described with reference to FIG. 13. As shown in FIG. 13A, area division is performed over the entire circumference centered on the position of the imaging apparatus (with the origin O taken as the imaging apparatus position). In the example of FIG. 13A, division is made every 22.5 degrees in each of the tilt and pan directions. When divided as in FIG. 13A, as the angle in the tilt direction moves away from 0 degrees, the horizontal circumference becomes smaller and each area becomes smaller. Therefore, as shown in FIG. 13B, when the tilt angle is 45 degrees or more, the horizontal area range is set larger than 22.5 degrees. FIGS. 13C and 13D show examples of area division within the shooting angle of view. Axis 1301 is the direction of the imaging apparatus 101 at initialization, and area division is performed with this direction angle as the reference position. 1302 indicates the angle-of-view area of the image being captured, and an example of the image at that time is shown in FIG. 13D. In the image projected at the angle of view, the image is divided as in 1303 to 1318 of FIG. 13D based on the area division.
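The division rule above can be sketched as a mapping from a pan/tilt direction to an area index; the concrete widening rule for tilt angles of 45 degrees or more (here a simple doubling of the horizontal step) is an assumption for illustration:

```python
def area_index(pan_deg, tilt_deg, step=22.5):
    """Map a pan/tilt direction to a (pan, tilt) area index, dividing
    every 22.5 degrees. At tilt angles of 45 degrees or more, the
    horizontal range per area is widened (doubled here, an assumed
    concrete rule) because the horizontal circumference shrinks."""
    h_step = step * 2 if abs(tilt_deg) >= 45 else step
    return (int(pan_deg // h_step), int(tilt_deg // step))
```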
For each area divided as described above, an importance level indicating the search priority is calculated according to the subjects present in the area and the scene conditions of the area. The importance level based on subject conditions is calculated from, for example, the number of people present in the area, the size of their faces, face orientation, certainty of face detection, facial expressions, and personal authentication results. The importance level according to scene conditions is based on, for example, general object recognition results, scene discrimination results (blue sky, backlight, evening scene, etc.), the sound level and voice recognition results from the direction of the area, and motion detection information within the area. Furthermore, since the vibration state of the imaging apparatus is detected in the imaging apparatus state determination (S903), the importance level can also be made to change according to the vibration state. For example, when a "stationary shooting" state is determined, detection of the face authentication of a specific person raises the importance level so that the subject search centers on high-priority subjects registered by face authentication (for example, the user of the imaging apparatus). Automatic shooting, described later, is also performed with priority on such faces; even if the user of the imaging apparatus normally wears the apparatus and carries it around while shooting, removing it and placing it on a desk or the like allows many images showing the user to be kept. Since search by pan/tilt is possible at this time, images showing the user or group photos with many faces can be kept simply by placing the apparatus casually, without considering its placement angle. Note that under the above conditions alone, as long as nothing changes in each area, the area with the highest importance level would remain the same, and consequently the searched area would never change. Therefore, the importance level is changed according to past shooting information. Specifically, the importance level of an area continuously designated as the search area for a predetermined time may be lowered, or the importance level of an area where shooting was performed in S910, described later, may be lowered for a predetermined time.
Once the importance level of each area has been calculated as described above, the area with a high importance level is determined as the search target area. Then, the pan/tilt search target angles required to capture the search target area in the angle of view are calculated.
Whether to perform automatic shooting is determined based on the following two determinations. First, based on the per-area importance levels obtained in S904, if the importance level exceeds a predetermined value, a determination to perform automatic shooting is made. Second is a determination based on a neural network. As an example of a neural network, FIG. 12 shows a network based on a multilayer perceptron. A neural network is used to predict an output value from input values; by learning in advance input values and model output values for those inputs, it can estimate, for new input values, an output value that follows the learned model. The learning method will be described later. In FIG. 12, 1201 and the circles arranged vertically below it are neurons of the input layer, 1203 and the circles arranged vertically below it are neurons of the intermediate layer, and 1204 is a neuron of the output layer. Arrows such as 1202 indicate the connections linking the neurons. In the determination based on the neural network, feature amounts based on the subjects appearing in the current angle of view, the scene, and the state of the imaging apparatus are given as inputs to the neurons of the input layer, and the value output from the output layer is obtained through computation following the forward-propagation rule of the multilayer perceptron. If the output value is equal to or greater than a threshold, a determination to perform automatic shooting is made. As subject features, the current zoom magnification, general object recognition results at the current angle of view, face detection results, the number of faces in the current angle of view, the smile level and eye-closure level of faces, face angle, face authentication ID number, the gaze angle of the subject person, scene discrimination results, detection results of specific compositions, and the like are used. The elapsed time since the previous shooting, the current time, GPS position information and the amount of change from the previous shooting position, the current sound level, the person speaking, and whether there is applause or cheering may also be used. Vibration information (acceleration information, imaging apparatus state) and environmental information (temperature, atmospheric pressure, illuminance, humidity, ultraviolet amount) may also be used. Furthermore, when there is an information notification from the wearable device 501, the notified information (the user's exercise information, arm action information, biological information such as heart rate, and so on) may also be used as features. These features are converted into numerical values in a predetermined range and given as feature amounts to each neuron of the input layer. Therefore, the input layer requires as many neurons as the number of feature amounts used.
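A minimal sketch of the forward-propagation decision described above, assuming one hidden layer with tanh activation and a linear output; the weights, activation choice, and threshold value are illustrative assumptions, not the patent's trained network:

```python
import math

def mlp_forward(features, w_hidden, w_out):
    """Forward pass of a minimal multilayer perceptron: scaled feature
    amounts feed the input layer, one hidden layer applies tanh, and
    the output is a weighted sum of the hidden activations."""
    hidden = [math.tanh(sum(w * x for w, x in zip(ws, features)))
              for ws in w_hidden]
    return sum(w * h for w, h in zip(w_out, hidden))

def should_shoot(features, w_hidden, w_out, threshold=0.5):
    """Decide automatic shooting when the output meets the threshold."""
    return mlp_forward(features, w_hidden, w_out) >= threshold
```

In practice the input vector would hold the scaled feature amounts listed in the text (zoom magnification, face count, smile level, and so on), one neuron per feature.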
In the determination of the shooting method, it is determined which of still image shooting, moving image shooting, continuous shooting, panoramic shooting, and the like is to be executed, based on the state of the imaging apparatus and the state of surrounding subjects detected in S901 to S904. For example, when the subject (a person) is still, still image shooting is executed; when the subject is moving, moving image shooting or continuous shooting is executed. When multiple subjects surround the imaging apparatus, or when it can be determined from the GPS information described above that the location is a scenic spot, panoramic shooting processing may be executed in which images shot sequentially while operating pan/tilt are combined to generate a panoramic image. As with the determination method in <Determining whether to perform automatic shooting>, the shooting method can also be determined by a judgment based on a neural network using the various information detected before shooting, and the determination conditions of this determination processing can be changed by the learning processing described later.
As described in S907 of FIG. 9 above, a shooting instruction by the user (manual) may also be given by voice command input. Voice command input includes input for when the user wants a shot that includes himself or herself (for example, "take my picture"). Then, in search processing using pan/tilt and zoom, the subject who uttered the voice is searched for, and shooting is executed with the subject who uttered the voice command included in the shooting angle of view.
As a common machine learning means for image recognition processing, main subject detection by a convolutional neural network is known. The convolutional neural network yields the presence or absence of the detected main subject (the subject who called out) and, if present, its position information in the image. Alternatively, based on face detection or human body detection results, main subject determination by a convolutional neural network may be performed on each image in which a person's region has been cut out, to estimate the main subject. This convolutional neural network is prepared as one trained in advance on images of persons who uttered voice commands, but it can also continue to be trained over the course of use by a method described later.
There is also a method of giving the subject's feature amounts as inputs for each person appearing in the current angle of view and performing main subject determination for each person. In that case, in addition to facial features such as facial expression determination results, eye-closure level, face angle, face authentication ID number, and the gaze angle of the subject person, gesture determination results, image scene discrimination results, the current sound level, sound scene determination results, and the like may be used as input features. This neural network is also one trained on subject feature amounts from images of persons who uttered voice commands, and it can also continue to be trained over the course of use by a method described later.
Since the subject has just uttered a voice command toward the imaging apparatus 101, it is very likely that the subject is facing the camera. Therefore, a simple determination may be made by applying weighting coefficients to the detection results of the face authentication ID number, facial expression, face angle, gaze direction, and gesture determination. If the face authentication ID is already registered, the person is likely the main subject. If the smile level of the facial expression is high, the person is likely the main subject. If the face angle or gaze direction faces the camera, the person is likely the main subject. If the person is making a gesture (for example, waving at the camera), the person is likely the main subject. The main subject may be estimated using any one or more of these pieces of information.
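The weighted determination above can be sketched as follows; the cue set follows the text, while the weight values and the field ordering are assumptions for illustration:

```python
def main_subject_score(face_id_registered, smile_level, facing_camera,
                       gesturing, weights=(2.0, 1.0, 1.5, 1.0)):
    """Weighted sum over per-person detection results: registered face
    ID, smile level, whether the face/gaze points at the camera, and
    whether a gesture is detected. The weight values are assumed."""
    cues = (float(face_id_registered), smile_level,
            float(facing_camera), float(gesturing))
    return sum(w * c for w, c in zip(weights, cues))

def pick_main_subject(people):
    """people: list of cue tuples; return the index of the best score."""
    scores = [main_subject_score(*p) for p in people]
    return scores.index(max(scores))
```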
Next, the automatic editing mode processing (highlight moving images) in this embodiment will be described.
Next, the automatic file deletion mode processing in this embodiment will be described.
Next, learning matched to the user's preference in this embodiment will be described.
Learning for automatic shooting will be described. In automatic shooting, learning is performed so that images matching the user's preference are shot automatically. As described above using the flow of FIG. 9, learning information generation processing is performed after shooting (step S912). Images to be learned from are selected by a method described later, and learning is performed by changing the weights of the neural network based on the learning information included in the images. Learning is carried out by changing the neural network that determines the automatic shooting timing and the neural network that determines the shooting method (still image shooting, moving image shooting, continuous shooting, panoramic shooting, etc.).
Learning for automatic editing will be described. For automatic editing, learning is performed both for the editing immediately after shooting in step 911 of FIG. 9 and for the editing of highlight moving images described in FIG. 11. The editing immediately after shooting will be described first. Images to be learned from are selected by a method described later, and learning is performed by changing the weights of the neural network based on the learning information included in the images. Various detection information obtained at shooting or immediately before shooting is input to the neural network, and the editing method (trimming, image rotation, HDR (high dynamic range) effect, blur effect, color conversion filter effect, etc.) is determined. The editing of highlight moving images will now be described. For highlight moving images, learning is performed to automatically create album moving images matched to the user's preference. Images to be learned from are selected by a method described later, and learning is performed by changing the weights of the neural network based on the learning information included in the images. Various detection information obtained at shooting or immediately before shooting is input to the neural network, and the application of image effects (trimming, rotation, HDR effect, blur effect, slide, zoom, fade, color conversion filter effect, BGM, duration, still image/moving image ratio) is determined.
Learning for subject search will be described. In subject search, learning is performed so that subjects matching the user's preference are searched for automatically. As described above using the flow of FIG. 9, in the subject search processing (step S904), the importance level of each area is calculated, pan/tilt and zoom are driven, and subject search is performed. Learning is based on captured images and on detection information obtained during search, and is performed by changing the weights of the neural network. Subject search reflecting the learning is performed by inputting the various detection information obtained during the search operation to the neural network, calculating the importance level, and setting the pan/tilt angles based on the importance level. Besides setting pan/tilt angles based on the importance level, pan/tilt driving (speed, acceleration, frequency of movement), for example, is also learned.
Learning for subject registration will be described. In subject registration, learning is performed so that subjects matching the user's preference are registered and ranked automatically. As learning, for example, face authentication registration, registration for general object recognition, and registration of gestures, voice recognition, and sound-based scene recognition are performed. Authentication registration is performed for persons and objects, and ranks are set based on the number and frequency of image acquisitions, the number and frequency of manual shootings, and the frequency with which the subject appears during search. The registered information is used as input to the determinations performed with each neural network.
Learning for shooting notification will be described. As described in S910 of FIG. 9, when predetermined conditions are satisfied immediately before shooting, the imaging apparatus may notify the person to be photographed that shooting will take place, and then shoot. For example, a motion that visually guides the subject's gaze by driving pan/tilt, a speaker sound emitted from the audio output unit 218, or LED illumination by the LED control unit 224 is used. Whether the detection information is used for learning is determined by whether subject detection information (for example, smile level, gaze detection, gesture) was obtained immediately after the notification, and learning is performed by changing the weights of the neural network. Alternatively, images to be learned from are selected by a method described later, and learning is performed by changing the weights of the neural network based on the learning information included in the images. Information on how the notification operation was performed immediately before shooting is embedded in the image, and the detection information added to the selected image and the notification operation information immediately before shooting are learned as teacher data. Detection information immediately before shooting is input to the neural network, and determinations are made of whether to notify and of each notification operation (sound (sound level / sound type / timing), LED light (color, lighting time, blinking interval), pan/tilt motion (manner of movement, driving speed)). The learning of each notification operation may be a method of learning which notification to select from among notification methods prepared in advance (combined operations of sound, LED light, and pan/tilt motion). A method of providing a separate neural network for each of the sound, LED light, and pan/tilt motion notification operations and learning each operation individually is also possible.
As described with reference to FIGS. 7 and 8, control is performed to switch the power supply of the Main processor (first control unit 223) ON and OFF, and the conditions for returning from the low power consumption mode and the conditions for transitioning to the low power consumption state are learned.
As described above, the predetermined time TimeA and predetermined threshold ThreshA are changed by learning. Provisional tap detection is also performed with the tap detection threshold lowered, and the TimeA and ThreshA parameters are set to make detection easier or harder depending on whether provisional tap detection had been determined before the tap detection. If it is determined from the detection information after tap detection that the tap was not an activation factor (there is no shooting target, as a result of the subject search and automatic shooting determination described above), the TimeA and ThreshA parameters are set to make detection harder. The determination of whether there is a shooting target at activation changes according to the subject detection information embedded in images learned by the learning method described later.
As described above, the predetermined time TimeB, predetermined threshold ThreshB, predetermined count CountB, and so on are changed by learning. When the activation condition is met by the shake state, activation is performed; however, if it is determined from the detection information during a predetermined time after activation that the shake was not an activation factor (there is no shooting target, as a result of the subject search and automatic shooting determination described above), the parameters of the shake state determination are changed so that activation becomes harder. If it is determined that the shooting frequency in a high-shake state is high, settings are made so that activation by shake state determination becomes easier. The determination of whether there is a shooting target at activation, and the determination of whether shooting is frequent in a high-shake state, change according to the subject detection information embedded in images learned by the learning method described later, the shake information at shooting, and so on.
Learning is possible by the user manually setting, for example via communication with the dedicated application of the external device 301, the specific voice, specific sound scene, or specific sound level that the user wants to detect. Alternatively, multiple detections are set in advance in the audio processing unit, images to be learned from are selected by a method described later, and learning is performed based on learning information such as the sound information before and after shooting included in the images. This makes it possible to set the sound determination used as an activation factor (a specific sound command, or sound scenes such as "cheering" and "applause"), and to learn activation by sound detection.
Learning is also possible by the user manually setting, for example via communication with the dedicated application of the external device 301, the environmental information change conditions on which the user wants the apparatus to activate. For example, activation can be triggered by specific conditions on the absolute amount or the amount of change of temperature, atmospheric pressure, brightness, humidity, or ultraviolet amount. Determination thresholds based on each type of environmental information can also be learned. If it is determined from the detection information after activation by environmental information that it was not an activation factor (there is no shooting target, as a result of the subject search and automatic shooting determination described above), the parameters of each determination threshold are set to make detection harder. Alternatively, activation by environmental information can be learned from the environmental information embedded in images learned by the learning method described later. For example, if many images shot while the temperature was rising are learned, learning is performed so that activation becomes easier when the temperature rises. The above parameters also change according to the remaining battery capacity. For example, when the battery level is low, it becomes harder to enter the various determinations; when the battery level is high, it becomes easier. Specifically, for conditions that are not factors for which the user necessarily wants the imaging apparatus activated, such as the shake state detection result or the sound scene detection of sound detection, the ease of each detection determination changes according to the remaining battery level.
Learning for automatic file deletion will be described. In automatic file deletion, learning is performed regarding the free file capacity, the selection of images to delete preferentially, and so on. Images to be learned from can be selected by a method described later, and learning can be performed by changing the weights of the neural network based on the learning information included in the images. As described above for automatic shooting, a score judging the user's preference is computed for each image, and images with low scores are preferentially deleted from the recording medium 221. Learning is also based not only on the score but on the shooting date and time embedded in each image on the recording medium 221 and on the editing content of highlight moving images (automatically edited moving images) selected by a method described later. For example, if the acquired highlight moving images contain many images shot at short time intervals, files with older shooting dates are preferentially deleted; but when they contain images shot at long time intervals, learning is performed so that files with high scores are not deleted even if their dates are old. Alternatively, the score of each image on the recording medium 221 is recalculated sequentially at predetermined time intervals. Shooting date and time information is also input to the neural network at score calculation; when there are many images shot at short time intervals, files with older shooting dates are learned to have lower scores and are thus preferentially deleted, whereas when images shot at long time intervals are included, learning ensures that a score does not become low merely because the date is old, so high-score files are not deleted even if old. In another example, images to be learned from are selected by the method described later; when the selected images are concentrated on relatively recent dates, files with older shooting dates are preferentially deleted, but when images with older dates are also often selected, learning is performed so that high-score files are not deleted even if old. In another example, when learning has increased the shooting frequency, files are automatically deleted so as to secure a large free file area, and when learning has decreased the shooting frequency, automatic file deletion may leave a smaller free area. In another example, when learning has increased the moving image shooting frequency, files are automatically deleted so as to secure a large free file area, and when learning has increased the still image shooting frequency, automatic file deletion is performed so that the free area may be smaller.
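A minimal sketch of the deletion-priority rule described above; the dict field names and the exact form of the high-score protection are assumptions for illustration:

```python
def deletion_order(images, short_interval):
    """Order images for automatic deletion. Each image is a dict with
    'score' (user-preference score) and 'timestamp' (shooting time).
    When shooting happened at short intervals, older files go first;
    otherwise low-score files go first, so old high-score files are
    protected toward the end of the deletion queue."""
    if short_interval:
        key = lambda im: im['timestamp']                  # oldest first
    else:
        key = lambda im: (im['score'], im['timestamp'])   # low score first
    return sorted(images, key=key)
```

The caller would then delete from the front of the returned list until the target number of deletions (set from shooting frequency and free space) is reached.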
Learning for image shake correction will be described. Image shake correction is performed by calculating the correction amount in S902 of FIG. 9 and driving pan/tilt in S905 based on the correction amount. In image shake correction, learning is performed so that corrections match the characteristics of the user's shake. The direction and magnitude of blur can be estimated for a captured image by, for example, estimating a PSF (Point Spread Function). In the learning information generation of S912 in FIG. 9, the estimated direction and magnitude of the blur are added to the image as information. In the learning mode processing of step 716 in FIG. 7, the weights of the neural network for shake correction are learned with the estimated direction and magnitude of the blur as outputs and the detection information at shooting as inputs. The detection information at shooting includes motion vector information of the image during a predetermined time before shooting, motion information of detected subjects (people or objects), and vibration information (gyro output, acceleration output, imaging apparatus state). Environmental information (temperature, atmospheric pressure, illuminance, humidity), sound information (sound scene determination, specific voice detection, sound level change), time information (elapsed time since activation, elapsed time since the previous shooting), location information (GPS position information, amount of positional movement), and so on may also be added to the inputs for the determination. By inputting the above detection information to the neural network when calculating the correction amount in S902, the magnitude of the blur at the moment of shooting can be estimated; when the estimated blur is large, control such as shortening the shutter speed becomes possible. Since a large estimated blur would result in a blurred image, a method of prohibiting shooting in that case can also be taken. Since the pan/tilt drive angles are limited, no further correction is possible once the drive end is reached; however, by estimating the magnitude and direction of the blur at shooting as described above, the pan/tilt driving range required for shake correction during exposure can be estimated. If there is no margin in the movable range during exposure, large blur can also be suppressed by increasing the cutoff frequency of the filter that calculates the shake correction amount so that the movable range is not exceeded. When the movable range is likely to be exceeded, blur-free shooting can also be achieved by rotating the pan/tilt angle immediately before exposure in the direction opposite to the direction in which the movable range is likely to be exceeded, and then starting exposure, thereby securing the movable range. This allows shake correction to be learned according to the user's shooting characteristics and usage, so blur-free images can be shot. In the "determination of the shooting method" described above, it may also be determined whether to perform panning shooting; the pan/tilt driving speed needed to shoot the subject without blur may be estimated from the detection information obtained up to just before shooting, and subject blur correction may be performed. Here, panning shooting is shooting in which a moving subject is captured without blur while the stationary background flows. In that case, the driving speed during still image shooting is estimated by inputting the above detection information to the neural network. Learning can be performed by dividing the image into blocks and estimating the PSF of each block, thereby estimating the direction and magnitude of the blur in the block where the main subject is located, and learning based on that information. The amount of background flow can also be learned based on the background flow amount of images selected by the learning method described later. In that case, the magnitude of the blur in blocks where the main subject is not located is estimated in the selected images, and the user's preference can be learned based on that information. By setting the shutter speed at shooting based on the learned preferred background flow amount, shooting with a panning effect matched to the user's preference can be performed automatically.
Learning for automatic image transfer will be described. In automatic image transfer, learning is performed regarding the selection processing of images to transfer preferentially from among the images recorded on the recording medium 221, the transfer frequency, and so on. Images to be learned from can be selected by a method described later, and learning can be performed by changing the weights of the neural network based on the learning information included in the images. As described above for automatic shooting, a score judging the user's preference is computed for each image, and images with high scores are transferred preferentially. Learning information corresponding to images transferred in the past is also used for the image transfer determination. When images to be learned from are selected by the method described later, which parts of the learning information (feature amounts) included in the images should be emphasized is set; when many previously transferred images contain similar feature amounts, the setting is made to transfer images that contain different feature amounts and also have high scores. The image transfer frequency also changes according to the state of the imaging apparatus. It changes with the remaining battery capacity: for example, when the battery level is low, images are transferred less readily, and when the battery level is high, images are transferred more readily. Specifically, this can be realized, for example, by multiplying the elapsed time since the last automatic transfer by the highest score among the images shot during that elapsed time, transferring images when the multiplied value exceeds a threshold, and making the threshold vary with the remaining battery level. In another example, the frequency of automatic image transfer is changed according to the shooting frequency set for the imaging apparatus 101: when learning has increased the shooting frequency, the automatic image transfer frequency is also set higher, and when learning has decreased the shooting frequency, the automatic image transfer frequency is also set lower. By varying the threshold according to the shooting frequency, the image transfer frequency corresponding to the shooting frequency setting can be changed. In another example, the frequency of automatic image transfer is also changed according to the free capacity of the file storage (recording medium 221): when the free capacity is large, the automatic image transfer frequency is low, and when the free capacity is small, it is set higher. By varying the threshold according to the free file capacity, the image transfer frequency corresponding to the free capacity can be changed.
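The threshold rule described above (elapsed time multiplied by the highest score since the last transfer, compared against a battery-dependent threshold) can be sketched as follows; the concrete threshold values are assumptions for illustration:

```python
def should_transfer(elapsed_s, best_score, battery_pct):
    """Transfer when (elapsed time since the last automatic transfer)
    x (highest image score during that time) exceeds a threshold that
    is raised when the battery is low, making transfer less likely."""
    threshold = 3600.0 if battery_pct >= 50 else 7200.0  # assumed values
    return elapsed_s * best_score > threshold
```

The same threshold could additionally be scaled by the shooting-frequency setting and the free capacity of the recording medium, as the text describes.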
As described in steps S907 to S913 of FIG. 9, in this embodiment the imaging apparatus 101 can perform two kinds of shooting: manual shooting and automatic shooting. When there is a shooting instruction by manual operation in step S907 (performed based on the three determinations described above), information that the captured image was shot manually is added in step S912. When shooting is performed after automatic shooting is determined to be ON in step S909, information that the captured image was shot automatically is added in step S912.
During the subject search operation, it is determined with what persons, objects, and scenes a subject registered by personal authentication appears simultaneously, and the time ratio during which they appear together within the angle of view is calculated.
As described in FIG. 3, the imaging apparatus 101 and the external device 301 have the communication means of communications 302 and 303. Images are sent and received mainly via communication 302, and images in the imaging apparatus 101 can be acquired by the external device 301 via the dedicated application in the external device 301. Thumbnail images of the image data stored in the imaging apparatus 101 can also be browsed via the dedicated application in the external device 301. This allows the user to select an image he or she likes from among the thumbnail images, confirm it, and acquire it into the external device 301 by operating an image acquisition instruction.
As described above, the imaging apparatus 101 and the external device 301 have communication means, and the images stored in the imaging apparatus 101 can be browsed via the dedicated application in the external device 301. Here, the configuration may be such that the user assigns a score to each image. The user can give a high score (for example, 5 points) to an image he or she likes and a low score (for example, 1 point) to an image he or she does not like, and the imaging apparatus learns through these user operations. The score of each image is used together with the learning information for re-learning within the imaging apparatus. Learning is performed so that the output of the neural network, when feature data from the designated image information is input, approaches the score designated by the user.
The external device 301 has a storage unit 404, and the configuration is such that images other than those shot by the imaging apparatus 101 are also recorded in the storage unit 404. In this case, the images stored in the external device 301 are easy for the user to browse and easy to upload to a shared server via the public line control unit 406, so it is very likely that many of them are images the user likes.
A method of using, for learning, information from social networking services (SNS), which are services and websites that build social networks focused on connections between people, will be described. There is a technology in which, when uploading an image to an SNS, tags related to the image are input from a smart device and sent together with the image. There is also a technology for inputting likes and dislikes for images uploaded by other users, so it can also be determined whether an image uploaded by another user is a photograph the user who owns the external device 301 likes.
As described above, the imaging apparatus 101 and the external device 301 have communication means, and the learning parameters currently set in the imaging apparatus 101 can be communicated to the external device 301 and stored in its storage unit 404. As learning parameters, for example, the weights of the neural network and the selection of subjects input to the neural network can be considered. The configuration may also be such that learning parameters set on a dedicated server can be acquired via the public line control unit 406 through the dedicated application in the external device 301 and set as the learning parameters in the imaging apparatus 101. This allows the parameters at a certain point in time to be saved in the external device 301 and restored by setting them in the imaging apparatus 101, and also allows learning parameters held by other users to be acquired via the dedicated server and set in one's own imaging apparatus 101.
The dedicated application of the external device 301 may be given a function that allows manual editing by user operation, and the content of the editing work may be fed back into learning. For example, editing that applies image effects is possible, and the neural network for automatic editing is trained so that the manually edited image effect application is determined for the image's learning information. Conceivable image effects include, for example, trimming, rotation, slide, zoom, fade, color conversion filter effects, duration, still image/moving image ratio, and BGM.
As for learning, learning of the user's preference is carried out as described above. Then, in S908 of "automatic shooting," automatic shooting is performed when the output value of the neural network is a value indicating that the situation differs from the user's preference, which is the teacher data. For example, when the images the user liked are used as teacher images and learning is performed so that a high value is output for features similar to the teacher images, automatic shooting is conversely performed on the condition that the output value is lower than a predetermined value. Similarly, in the subject search processing and automatic editing processing, processing is executed such that the output value of the neural network becomes a value indicating that the result differs from the user's preference, which is the teacher data.
In this method, at the time of the learning processing, learning is executed with situations different from the user's preference as teacher data. For example, the learning method described above takes manually shot images as scenes the user preferred to shoot and uses them as teacher data. In this embodiment, conversely, manually shot images are not used as teacher data; instead, scenes in which no manual shooting was performed for a predetermined time or more are added as teacher data. Alternatively, if the teacher data contains scenes whose features are similar to those of manually shot images, they may be deleted from the teacher data. Also, images whose features differ from images acquired by the external communication device may be added to the teacher data, or images whose features are similar to acquired images may be deleted from the teacher data. In this way, data different from the user's preference accumulates in the teacher data, and as a result of learning, the neural network becomes able to discriminate situations that differ from the user's preference. In automatic shooting, by shooting according to the output value of that neural network, scenes different from the user's preference can be shot. In automatic editing, it similarly becomes possible to propose edited images different from the user's preference.
kp(f) × (x1 - x0) (Equation 1)
kt(f) × (y1 - y0) (Equation 2)
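A minimal reading of Equations 1 and 2 as pan and tilt drive amounts; the interpretation of (x0, y0) and (x1, y1) as current and target image positions, and of kp(f) and kt(f) as gains depending on a coefficient f, is an assumption for illustration:

```python
def drive_amounts(x0, y0, x1, y1, kp, kt):
    """Pan and tilt drive amounts per Equations 1 and 2:
    pan  = kp(f) x (x1 - x0)
    tilt = kt(f) x (y1 - y0)
    where kp and kt are gains (assumed here to be precomputed for a
    given f) converting a position difference into a drive amount."""
    return kp * (x1 - x0), kt * (y1 - y0)
```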
The basic processing sequence of the shooting mode in this embodiment was described in FIG. 9; however, if processing always follows this sequence, it takes time until a subject is captured and automatic shooting is performed. In that case, there is a risk of missing a photo opportunity or of shooting a subject different from the user's intention. In particular, when the low power consumption mode is canceled (hereinafter referred to as wake-up), the optimal processing sequence differs depending on the conditions under which it was canceled. Here, examples of wake-up conditions and the processing sequences suited to them are shown.
It was described above that wake-up by tap detection is possible. In such a case, the owner of the imaging apparatus 101 is considered to have instructed wake-up with the intention to shoot. Therefore, processing that searches the surroundings to find the owner and immediately shoots automatically so that the owner appears in the image is preferable.
It was described above that wake-up by sound detection and voice command recognition is possible. In the case of sound detection, there is a high possibility that the person of interest is in the direction of the sound. In the case of voice command recognition, the person who uttered the voice command is considered to want to be photographed. Therefore, processing that finds the person in the direction in which the sound was detected and immediately shoots automatically is preferable.
On wake-up based on other conditions (for example, the elapsed-time determination described in FIG. 8), processing follows the basic sequence of FIG. 9. By doing so, automatic shooting is performed only when an important subject is present, and the consumption of power and of free space in the storage device can be suppressed.
The search and shooting processing after activation is changed according to the condition that triggered activation.
[Example 1] When woken by a voice, turn toward the direction of the voice, then start the search and shooting determination.
[Example 2] When woken by a tap, search for the owner (authenticated face).
The apparatus has subject scene determination means and decides to enter automatic sleep according to the scene determination result. The sleep time is adjusted according to the determination result. The apparatus also has means for determining the internal state of the imaging apparatus and enters automatic sleep according to the internal state determination.
[Example 1] If there is no subject, transition to power saving.
[Example 2] If the scene changes little, sleep longer.
[Example 3] Sleep when none of the automatic shooting, learning, editing, and transfer modes applies.
[Example 4] Remaining battery level.
An image is automatically transferred, or the image transfer frequency is automatically determined, according to at least one of the following conditions: elapsed time, evaluation value of captured images, remaining battery level, and card capacity.
The learning mode is entered automatically according to at least one of the following conditions: elapsed time, degree of accumulation of teacher data, discrimination results for the current scene or subject, scheduled time, future shooting possibility, and power-off.
Automatic deletion is performed according to conditions. A target number of deletions is set according to the shooting frequency and free space. Images manually shot by the user, images highly evaluated by the user, and images with a high importance score calculated by the imaging apparatus are set to be difficult to delete. Images already transferred to an external device and images never viewed by the user are set to be easy to delete. If the acquired highlight moving images were shot at short intervals, older files may be deleted preferentially; if they were shot at long intervals, old files with high scores may be kept. If learning has increased the moving image shooting frequency, more files than usual may be deleted automatically.
The editing process is automatically executed according to at least one of the following conditions: degree of accumulation of captured images, elapsed time since the last editing, evaluation value of captured images, and temporal milestones.
(1) Transferring information from the camera 3201 to the imaging apparatus 101
For example, a main subject is extracted from an image shot with the camera 3201 by a user operation.
For example, the imaging apparatus 101 is notified of the timing at which the camera 3201 shot by a user operation. An important subject is then set from the image of the imaging apparatus 101 at that shooting timing. Thereafter, the imaging apparatus 101 determines from the number of shots of the subject whether it is an important subject, registers the subject, and performs automatic shooting/tracking and the like.
An example will be described in which, when the imaging apparatus 101 and the separate camera 3201 shoot in cooperation, the camera 3201 is assisted by information from the imaging apparatus 101.
Subject information detected by the imaging apparatus 101 (for example, a face registered as an individual, a subject such as a dog or cat determined to match the owner's preference, or the result of an aesthetics determination identifying the user's preferred subject) is notified to the camera 3201. It is then notified where the subject is located in the live image of the camera 3201 and what subjects are outside the image (for example, "there is a car to the right of the screen"), and whether a subject matching the user's preference is present.
The imaging apparatus 101 may be configured to issue a shooting instruction to the camera 3201.
Claims (22)
- An imaging apparatus comprising: an acquisition unit configured to acquire data relating to a captured image captured by an imaging unit; and a change unit configured to change imaging processing of the imaging unit based on the data acquired by the acquisition unit, wherein, when changing the imaging processing, the change unit weights the data acquired by the acquisition unit for a captured image for which an instruction was given by a user more heavily than the data acquired by the acquisition unit for an automatically processed captured image.
- The imaging apparatus according to claim 1, wherein the automatically processed captured image is at least one of: a captured image shot automatically, a captured image edited automatically, a captured image automatically transferred to an external device, and a captured image not deleted by automatic file deletion.
- The imaging apparatus according to claim 1, wherein the captured image for which an instruction was given by the user is at least one of: a captured image whose shooting was instructed by the user; a captured image to which a score was added by an instruction of the user; a captured image acquired after transmission to an external device capable of mutual communication with the imaging apparatus was instructed by the user; a captured image stored in an external device capable of mutual communication with the imaging apparatus; a captured image uploaded to a server by an instruction of the user; a captured image whose parameters were changed by an instruction of the user; a captured image whose editing was instructed by the user; and a captured image whose shooting area was changed by an instruction of the user.
- The imaging apparatus according to claim 1, wherein the imaging processing includes detection processing for a shooting trigger.
- The imaging apparatus according to claim 1, wherein the imaging processing includes determination processing for a shooting method.
- The imaging apparatus according to claim 5, wherein the determination processing for the shooting method determines one of: single still-image shooting, continuous still-image shooting, moving-image shooting, panoramic shooting, and time-lapse shooting.
- The imaging apparatus according to claim 4, wherein the shooting trigger is detected based on at least one of: a specific subject, a specific composition, a specific sound, time, a magnitude of vibration, a change of place, a change in the user's body, a change in the environment of the imaging apparatus, and a detection result of a state of the imaging apparatus.
- The imaging apparatus according to any one of claims 1 to 7, wherein the imaging processing includes search processing for a specific subject.
- The imaging apparatus according to claim 8, further comprising a rotation mechanism capable of rotationally driving a housing, which includes an imaging lens and an image sensor, about at least one axis, wherein the search processing for the specific subject is performed by rotating the rotation mechanism.
- The imaging apparatus according to claim 8, wherein the search processing for the specific subject is performed by controlling zoom driving of a zoom lens.
- The imaging apparatus according to claim 8, wherein the search processing for the specific subject is performed by cropping a part of a captured image.
- The imaging apparatus according to any one of claims 7 to 11, wherein the specific subject is a person's face, and the automatically shot captured image is a captured image shot according to the frequency of appearance of the subject being searched for and the person's facial expression.
- The imaging apparatus according to any one of claims 7 to 11, wherein the specific subject is an object, and the automatically shot captured image is a captured image shot according to object recognition.
- An imaging apparatus comprising: an editing unit configured to edit a captured image; an acquisition unit configured to acquire data relating to a captured image; and a change unit configured to change editing processing of the editing unit based on the data acquired by the acquisition unit, wherein, when changing the editing processing, the change unit weights the data acquired by the acquisition unit for a captured image for which an instruction was given by a user more heavily than the data acquired by the acquisition unit for an automatically processed captured image.
- An imaging apparatus comprising: a recording unit configured to store captured images; an automatic image transfer unit configured to transfer captured images stored in the recording unit to an external device capable of mutual communication; an acquisition unit configured to acquire data relating to a captured image; and a change unit configured to change selection processing for images to be transmitted by the automatic image transfer unit based on the data acquired by the acquisition unit, wherein, when changing the selection processing, the change unit weights the data acquired by the acquisition unit for a captured image for which an instruction was given by a user more heavily than the data acquired by the acquisition unit for an automatically processed captured image.
- An imaging apparatus comprising: a recording unit configured to store captured images; a deletion unit configured to automatically delete captured images stored in the recording unit; an acquisition unit configured to acquire data relating to a captured image; and a change unit configured to change selection processing for captured images to be deleted by the deletion unit based on the data acquired by the acquisition unit, wherein, when changing the selection processing, the change unit weights the data acquired by the acquisition unit for a captured image for which an instruction was given by a user more heavily than the data acquired by the acquisition unit for an automatically processed captured image.
- An imaging apparatus comprising: a display unit configured to display captured images; an acquisition unit configured to acquire data relating to a captured image; and a change unit configured to change a display order of captured images displayed on the display unit based on the data acquired by the acquisition unit, wherein, when changing the display order, the change unit weights the data acquired by the acquisition unit for a captured image for which an instruction was given by a user more heavily than the data acquired by the acquisition unit for an automatically processed captured image.
- An imaging apparatus comprising: a notification unit configured to notify, before shooting, a subject determined to be shot that shooting will take place; an acquisition unit configured to acquire data relating to a captured image; and a change unit configured to change notification processing of the notification unit based on the data acquired by the acquisition unit, wherein, when changing the notification processing, the change unit weights the data acquired by the acquisition unit for a captured image for which an instruction was given by a user more heavily than the data acquired by the acquisition unit for an automatically processed captured image.
- An imaging apparatus comprising: a setting unit configured to transition to a low-power-consumption mode; a cancellation determination unit configured to determine cancellation of the low-power-consumption mode; an acquisition unit configured to acquire data relating to a captured image; and a change unit configured to change determination processing of at least one of the setting unit and the cancellation determination unit based on the data acquired by the acquisition unit, wherein, when changing the determination processing, the change unit weights the data acquired by the acquisition unit for a captured image for which an instruction was given by a user more heavily than the data acquired by the acquisition unit for an automatically processed captured image.
- An imaging apparatus comprising: a shake correction unit configured to correct shake; an acquisition unit configured to acquire data relating to a captured image; and a change unit configured to change shake correction processing of the shake correction unit based on the data acquired by the acquisition unit, wherein, when changing the shake correction processing, the change unit weights the data acquired by the acquisition unit for a captured image for which an instruction was given by a user more heavily than the data acquired by the acquisition unit for an automatically processed captured image.
- An imaging apparatus comprising a change unit configured to change processing of the imaging apparatus based on first data relating to a captured image captured by an imaging unit, wherein, when changing the processing of the imaging apparatus, the change unit weights the first data of a captured image for which an instruction was given by a user more heavily than the first data of an automatically processed captured image.
- A control method for an imaging apparatus, comprising a change step of changing processing of the imaging apparatus based on first data relating to a captured image captured by an imaging unit, wherein, in the change step, when the processing of the imaging apparatus is changed, the first data of a captured image for which an instruction was given by a user is weighted more heavily than the first data of an automatically processed captured image.
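The feature common to all of the independent claims — weighting data from user-instructed images more heavily than data from automatically processed images when changing the apparatus's processing — can be sketched as sample weighting for retraining; the weight values and the interface below are illustrative assumptions, not the claimed implementation:

```python
def build_training_set(samples, user_weight=4.0, auto_weight=1.0):
    """Return (features, label, weight) triples for retraining.

    `samples` are (features, label, user_instructed) triples. A sample from an
    image the user explicitly shot, rated, or edited gets a larger weight than
    one from an automatically processed image, so user intent dominates how the
    shooting/editing/transfer/deletion decisions are changed. The 4:1 ratio is
    an assumption for the sketch.
    """
    return [
        (features, label, user_weight if user_instructed else auto_weight)
        for features, label, user_instructed in samples
    ]
```

The resulting weights would typically be passed to a learner's per-sample weight argument so that user-instructed examples pull the decision boundaries harder than automatically collected ones.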
Priority Applications (10)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201880072663.0A CN111345028B (zh) | 2017-09-28 | 2018-09-20 | 图像拾取装置及其控制方法 |
KR1020207010743A KR102405171B1 (ko) | 2017-09-28 | 2018-09-20 | 촬상장치 및 그 제어방법 |
KR1020227018268A KR102475999B1 (ko) | 2017-09-28 | 2018-09-20 | 화상 처리장치 및 그 제어방법 |
BR112020006277-4A BR112020006277A2 (pt) | 2017-09-28 | 2018-09-20 | aparelho de captura de imagem e método de controle para o mesmo |
DE112018005025.4T DE112018005025T5 (de) | 2017-09-28 | 2018-09-20 | Bildaufnahmevorrichtung und Steuerungsverfahren für diese |
GB2005228.8A GB2581621B (en) | 2017-09-28 | 2018-09-20 | Image pickup apparatus and control method therefor |
RU2020114752A RU2741480C1 (ru) | 2017-09-28 | 2018-09-20 | Оборудование фиксации изображений и способ управления для него |
US16/830,028 US11102389B2 (en) | 2017-09-28 | 2020-03-25 | Image pickup apparatus and control method therefor |
US17/371,437 US11729487B2 (en) | 2017-09-28 | 2021-07-09 | Image pickup apparatus and control method therefor |
US18/352,176 US20230362472A1 (en) | 2017-09-28 | 2023-07-13 | Image pickup apparatus and control method therefor |
Applications Claiming Priority (6)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2017-188938 | 2017-09-28 | ||
JP2017188938 | 2017-09-28 | ||
JP2017254231 | 2017-12-28 | ||
JP2017-254231 | 2017-12-28 | ||
JP2018-053078 | 2018-03-20 | ||
JP2018053078A JP6766086B2 (ja) | 2017-09-28 | 2018-03-20 | 撮像装置およびその制御方法 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US16/830,028 Continuation US11102389B2 (en) | 2017-09-28 | 2020-03-25 | Image pickup apparatus and control method therefor |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019065454A1 true WO2019065454A1 (ja) | 2019-04-04 |
Family
ID=65902447
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2018/034818 WO2019065454A1 (ja) | 2017-09-28 | 2018-09-20 | 撮像装置およびその制御方法 |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230362472A1 (ja) |
KR (1) | KR102475999B1 (ja) |
CN (1) | CN114019744A (ja) |
GB (3) | GB2581621B (ja) |
RU (1) | RU2762998C2 (ja) |
WO (1) | WO2019065454A1 (ja) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111627048A (zh) * | 2020-05-19 | 2020-09-04 | 浙江大学 | 多摄像机的协同目标搜索方法 |
JP2020195099A (ja) * | 2019-05-29 | 2020-12-03 | キヤノン株式会社 | 画像処理装置及び画像処理方法、撮像装置、プログラム、記憶媒体 |
CN112040115A (zh) * | 2019-06-03 | 2020-12-04 | 佳能株式会社 | 图像处理设备及其控制方法和存储介质 |
WO2022030274A1 (ja) * | 2020-08-05 | 2022-02-10 | キヤノン株式会社 | 振動型アクチュエータの制御装置及びそれを有する振動型駆動装置、交換用レンズ、撮像装置、自動ステージ |
CN114071106A (zh) * | 2020-08-10 | 2022-02-18 | 合肥君正科技有限公司 | 一种低功耗设备冷启动快速白平衡方法 |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110992993B (zh) * | 2019-12-17 | 2022-12-09 | Oppo广东移动通信有限公司 | 视频编辑方法、视频编辑装置、终端和可读存储介质 |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2005347985A (ja) * | 2004-06-02 | 2005-12-15 | Seiko Epson Corp | 学習機能付きカメラ |
JP2007520934A (ja) * | 2003-12-24 | 2007-07-26 | ウオーカー ディジタル、エルエルシー | 画像を自動的に捕捉し、管理する方法および装置 |
JP2013115673A (ja) * | 2011-11-30 | 2013-06-10 | Casio Comput Co Ltd | 画像処理装置、画像処理方法、及びプログラム |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE19511713A1 (de) * | 1995-03-30 | 1996-10-10 | C Vis Computer Vision Und Auto | Verfahren und Vorrichtung zur automatischen Bildaufnahme von Gesichtern |
US6301440B1 (en) * | 2000-04-13 | 2001-10-09 | International Business Machines Corp. | System and method for automatically setting image acquisition controls |
US6606458B2 (en) * | 2001-09-05 | 2003-08-12 | Nisca Corporation | Automatic framing camera |
EP1793580B1 (en) * | 2005-12-05 | 2016-07-27 | Microsoft Technology Licensing, LLC | Camera for automatic image capture having plural capture modes with different capture triggers |
JP4458151B2 (ja) * | 2007-11-06 | 2010-04-28 | ソニー株式会社 | 自動撮像装置、自動撮像制御方法、画像表示システム、画像表示方法、表示制御装置、表示制御方法 |
US20110096149A1 (en) * | 2007-12-07 | 2011-04-28 | Multi Base Limited | Video surveillance system with object tracking and retrieval |
JP5504887B2 (ja) * | 2009-12-28 | 2014-05-28 | ソニー株式会社 | 撮像制御装置、撮像制御方法、プログラム |
RU2541353C2 (ru) * | 2013-06-19 | 2015-02-10 | Общество с ограниченной ответственностью "Аби Девелопмент" | Автоматическая съемка документа с заданными пропорциями |
US9800782B2 (en) * | 2013-10-14 | 2017-10-24 | Narrative AB | Method of operating a wearable lifelogging device |
JP2016033571A (ja) * | 2014-07-31 | 2016-03-10 | キヤノン株式会社 | カメラ雲台装置 |
JP6374536B2 (ja) * | 2015-02-02 | 2018-08-15 | 富士フイルム株式会社 | 追尾システム、端末装置、カメラ装置、追尾撮影方法及びプログラム |
CN105657257B (zh) * | 2015-12-29 | 2018-07-17 | 广东欧珀移动通信有限公司 | 全景照片的拍摄方法、装置、系统、移动终端及自拍杆 |
JP6740641B2 (ja) * | 2016-03-03 | 2020-08-19 | ソニー株式会社 | ウェアラブル端末、制御方法、およびプログラム |
- 2018
- 2018-09-20 KR KR1020227018268A patent/KR102475999B1/ko active IP Right Grant
- 2018-09-20 CN CN202111332333.0A patent/CN114019744A/zh active Pending
- 2018-09-20 GB GB2005228.8A patent/GB2581621B/en active Active
- 2018-09-20 GB GB2200244.8A patent/GB2604029B/en active Active
- 2018-09-20 WO PCT/JP2018/034818 patent/WO2019065454A1/ja active Application Filing
- 2018-09-20 RU RU2021100640A patent/RU2762998C2/ru active
- 2018-09-20 GB GB2200242.2A patent/GB2603295B/en active Active
- 2023
- 2023-07-13 US US18/352,176 patent/US20230362472A1/en active Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2007520934A (ja) * | 2003-12-24 | 2007-07-26 | ウオーカー ディジタル、エルエルシー | 画像を自動的に捕捉し、管理する方法および装置 |
JP2005347985A (ja) * | 2004-06-02 | 2005-12-15 | Seiko Epson Corp | 学習機能付きカメラ |
JP2013115673A (ja) * | 2011-11-30 | 2013-06-10 | Casio Comput Co Ltd | 画像処理装置、画像処理方法、及びプログラム |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2020195099A (ja) * | 2019-05-29 | 2020-12-03 | キヤノン株式会社 | 画像処理装置及び画像処理方法、撮像装置、プログラム、記憶媒体 |
JP7393133B2 (ja) | 2019-05-29 | 2023-12-06 | キヤノン株式会社 | 画像処理装置及び画像処理方法、撮像装置、プログラム、記憶媒体 |
CN112040115A (zh) * | 2019-06-03 | 2020-12-04 | 佳能株式会社 | 图像处理设备及其控制方法和存储介质 |
JP2020198556A (ja) * | 2019-06-03 | 2020-12-10 | キヤノン株式会社 | 画像処理装置及びその制御方法、プログラム、記憶媒体 |
JP7348754B2 (ja) | 2019-06-03 | 2023-09-21 | キヤノン株式会社 | 画像処理装置及びその制御方法、プログラム、記憶媒体 |
CN112040115B (zh) * | 2019-06-03 | 2024-02-20 | 佳能株式会社 | 图像处理设备及其控制方法和存储介质 |
CN111627048A (zh) * | 2020-05-19 | 2020-09-04 | 浙江大学 | 多摄像机的协同目标搜索方法 |
WO2022030274A1 (ja) * | 2020-08-05 | 2022-02-10 | キヤノン株式会社 | 振動型アクチュエータの制御装置及びそれを有する振動型駆動装置、交換用レンズ、撮像装置、自動ステージ |
CN114071106A (zh) * | 2020-08-10 | 2022-02-18 | 合肥君正科技有限公司 | 一种低功耗设备冷启动快速白平衡方法 |
CN114071106B (zh) * | 2020-08-10 | 2023-07-04 | 合肥君正科技有限公司 | 一种低功耗设备冷启动快速白平衡方法 |
Also Published As
Publication number | Publication date |
---|---|
GB2603295B (en) | 2023-03-29 |
GB2603295A (en) | 2022-08-03 |
GB2604029B (en) | 2023-02-15 |
CN114019744A (zh) | 2022-02-08 |
KR102475999B1 (ko) | 2022-12-09 |
GB202005228D0 (en) | 2020-05-20 |
GB2581621A (en) | 2020-08-26 |
KR20220079695A (ko) | 2022-06-13 |
GB2581621B (en) | 2022-04-06 |
RU2021100640A3 (ja) | 2021-07-02 |
RU2762998C2 (ru) | 2021-12-24 |
RU2021100640A (ru) | 2021-01-26 |
GB2604029A (en) | 2022-08-24 |
US20230362472A1 (en) | 2023-11-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7077376B2 (ja) | 撮像装置およびその制御方法 | |
JP6799660B2 (ja) | 画像処理装置、画像処理方法、プログラム | |
KR102475999B1 (ko) | 화상 처리장치 및 그 제어방법 | |
CN109981976B (zh) | 摄像设备及其控制方法和存储介质 | |
JP7233162B2 (ja) | 撮像装置及びその制御方法、プログラム、記憶媒体 | |
JP2022070684A (ja) | 撮像装置およびその制御方法、プログラム | |
WO2019124055A1 (ja) | 撮像装置及びその制御方法、プログラム、記憶媒体 | |
JP2019118097A (ja) | 画像処理方法、画像処理装置、撮像装置、プログラム、記憶媒体 | |
JP7267686B2 (ja) | 撮像装置及びその制御方法 | |
JP2023057157A (ja) | 撮像装置及びその制御方法、プログラム | |
JP7403218B2 (ja) | 撮像装置及びその制御方法、プログラム、記憶媒体 | |
CN111105039A (zh) | 信息处理设备及其控制方法和存储器 | |
JP7199808B2 (ja) | 撮像装置およびその制御方法 | |
JP2020071873A (ja) | 情報処理装置、情報処理方法、及び、プログラム | |
JP2021057815A (ja) | 撮像装置及びその制御方法、プログラム、記憶媒体 | |
JP2023127983A (ja) | 撮像装置およびその制御方法、プログラム | |
JP2020145556A (ja) | 撮像装置及びその制御方法、プログラム、記憶媒体 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18863446 Country of ref document: EP Kind code of ref document: A1 |
ENP | Entry into the national phase |
Ref document number: 202005228 Country of ref document: GB Kind code of ref document: A Free format text: PCT FILING DATE = 20180920 |
ENP | Entry into the national phase |
Ref document number: 20207010743 Country of ref document: KR Kind code of ref document: A |
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112020006277 Country of ref document: BR |
ENP | Entry into the national phase |
Ref document number: 112020006277 Country of ref document: BR Kind code of ref document: A2 Effective date: 20200327 |
122 | Ep: pct application non-entry in european phase |
Ref document number: 18863446 Country of ref document: EP Kind code of ref document: A1 |