JP5155092B2

JP5155092B2 - Camera, playback device, and playback method

Info

Publication number: JP5155092B2
Application number: JP2008263490A
Authority: JP
Inventors: 博飯塚; 浩輔松原; 修野中
Original assignee: Olympus Imaging Corp
Current assignee: Olympus Imaging Corp
Priority date: 2008-10-10
Filing date: 2008-10-10
Publication date: 2013-02-27
Anticipated expiration: 2028-10-10
Also published as: CN101729771B; CN101729771A; JP2010093671A

Description

The present invention relates to a camera and a playback device, and more particularly to a camera capable of recording sound during shooting, and to a playback device and playback method for images shot with such a camera.

In recent years, large-screen televisions have become widespread, and people enjoy playing back captured images on them. Because television image quality has improved and power consumption has fallen, captured images are also displayed like posters and enjoyed as part of the interior decor. Digital photo frames for displaying digital images have become common as well. In short, people now decorate their daily lives with photographs.

For display as interior decor, images should not be obtrusive; soothing subjects such as majestic landscapes and the beauties of nature are called for. This requires shooting methods and display methods different from those used for conventional movies.

For display, combining a plurality of images into a panoramic image has been proposed. For example, Patent Document 1 discloses an image display device that represents the entire panoramic image by using the full size of one of the images to be combined, producing an impressive display. It has also been proposed to combine a plurality of images into a panoramic image and display it as a still image on a large screen with a 16:9 aspect ratio.
[Patent Document 1] Japanese Patent Laid-Open No. 5-308553

When captured images are displayed as interior decor, playing back the sound recorded at the time of shooting is soothing and lets the viewer relive past memories. Various proposals have been made for recording sound during shooting. For example, Patent Document 2 discloses a sound source stabilization device that converts video with monaural sound into video with stereo sound. Using information from an image knowledge database, this device analyzes, from the divided image, the objects in the image, their movement (position), camera operations, and so on; it separates from the sound information the sound sources that the objects are thought to be emitting, and rearranges the separated sources in a sound field suited to the video.
[Patent Document 2] JP 2000-295700 A

Combining images in this way yields an image suited to a large screen, and playing back the sound recorded at the time of shooting lets the viewer relax and relive memories. However, the sound for each constituent image is recorded with its own directivity, so if the images are simply combined, the apparent position of the sound source moves during playback, which does not suit viewing of the image.

On this point, the image display device disclosed in Patent Document 1 says nothing about sound playback. The device disclosed in Patent Document 2 converts video with monaural sound into video with stereo sound, but it requires an image knowledge database and therefore becomes large. Moreover, conventional cameras assume uses such as recording one's own child's voice at a sports day or school play; they match the sound pickup to the field of view and do not consider combining a plurality of images, as in panoramic shooting.

The present invention has been made in view of these circumstances. Its object is to provide a camera that records sound so that the acoustic effect feels natural when a composite still image or a movie is played back from a plurality of images, and to provide a playback device and a playback method that play back sound with a natural acoustic effect when a composite still image or a movie is played back from a plurality of images.

To achieve the above object, a camera according to a first invention comprises: an imaging unit that continuously shoots a subject; a sound collection changing unit that can change the range from which sound from the subject direction is collected; an image combining unit that combines a plurality of images obtained continuously by the imaging unit to generate a composite image; and a control unit that changes the sound collection range of the sound collection changing unit while the plurality of images are shot. The control unit shifts the sound collection range from left to right when the images forming the composite image are obtained from right to left, and from right to left when they are obtained from left to right.

A camera according to a second invention comprises: an imaging unit that shoots continuous images while the camera's field of view is moved left or right; a sound acquisition unit that records sound from a plurality of directions during the shooting; an image combining unit that combines a plurality of images obtained continuously by the imaging unit to generate a composite image; and a control unit that changes how the sounds from the plurality of directions obtained by the sound acquisition unit are combined when the plurality of images are combined. When the composite image is generated, the control unit changes the sound combination according to the change in position of a predetermined subject in each image, so that the sound source is located in the direction of a specific position in the composite image.

A camera according to a third invention comprises: an imaging unit that continuously shoots a subject; a sound collection changing unit that can change the range from which sound from the direction of the subject is collected; a motion determination unit that determines the movement of the camera; and a control unit that changes the sound collection range of the sound collection changing unit during the continuous shooting, based on the determination result of the motion determination unit. During the continuous shooting, the control unit shifts the sound collection range from left to right when the motion determination unit determines that the camera is moving from right to left, and from right to left when it determines that the camera is moving from left to right.

In a camera according to a fourth invention, in the third invention, the motion determination unit makes its determination based on image data output from the imaging unit.
A camera according to a fifth invention, in the third invention, further comprises a face detection unit that determines, based on image data output from the imaging unit, whether a face of the subject is present. When a face is detected by the face detection unit, the control unit controls the sound collection range of the sound collection changing unit based on that face.

A playback device according to a sixth invention comprises: a storage unit that stores continuously shot image data and stereo sound data recorded in stereo during the continuous shooting; a display unit that plays back and displays images based on the image data; a sound playback unit that can play back the stereo sound data with the left/right balance changed; a motion determination unit that determines the movement of the camera; and a control unit that, when the image data and the stereo sound data are played back, controls the left/right balance of the stereo sound data based on the movement of the camera so that the sound source is located at a specific position.

In a playback device according to a seventh invention, in the sixth invention, the control unit obtains a correction angle from the angular velocity of the camera and the timing of each frame of the continuous shooting, and controls the left/right balance of the stereo sound data according to this correction angle.
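The correction-angle idea in the seventh invention can be illustrated with a minimal sketch. It assumes a constant angular velocity, evenly spaced frames, and a constant-power pan law for the left/right balance; the function names, the 30-degree half field of view, the sign convention (positive angles to the right), and the pan law itself are illustrative assumptions, since the text does not specify how the balance is derived from the angle.

```python
import math

def correction_angle(angular_velocity_dps, frame_index, frame_interval_s):
    """Angle in degrees the camera has turned since frame 0.

    Assumes a constant angular velocity (degrees per second) and a fixed
    interval between frames; both are simplifications for illustration.
    """
    return angular_velocity_dps * frame_index * frame_interval_s

def pan_gains(source_angle_deg, half_view_deg=30.0):
    """Constant-power (left, right) gains for a source at the given angle.

    Negative angles are to the left, positive to the right. The angle is
    clamped to the assumed half field of view.
    """
    x = max(-1.0, min(1.0, source_angle_deg / half_view_deg))
    theta = (x + 1.0) * math.pi / 4.0  # maps [-1, 1] onto [0, pi/2]
    return math.cos(theta), math.sin(theta)

def balance_for_frame(angular_velocity_dps, frame_index, frame_interval_s,
                      half_view_deg=30.0):
    """Gains that offset the camera rotation accumulated up to this frame."""
    angle = correction_angle(angular_velocity_dps, frame_index, frame_interval_s)
    # A fixed source appears to move opposite to the camera rotation.
    return pan_gains(-angle, half_view_deg)
```

Under these assumptions, a camera turning right at 20 degrees per second with frames every 0.5 s has turned 30 degrees by the third frame, so the balance for that frame is pushed fully to the left channel.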

A playback method according to an eighth invention stores continuously shot image data and stereo sound data recorded in stereo during the continuous shooting, determines the movement of the camera, and, when the image data and the stereo sound data are played back, controls the left/right balance of the stereo sound data based on the movement of the camera so that the sound source is located at a specific position.

A program according to a ninth invention causes a computer to store continuously shot image data and stereo sound data recorded in stereo during the continuous shooting, to determine the movement of the camera, and, when the image data and the stereo sound data are played back, to control the left/right balance of the stereo sound data based on the movement of the camera so that the sound source is located at a specific position.

According to the present invention, it is possible to provide a camera that records sound so that the acoustic effect feels natural when a composite still image or a movie is played back from a plurality of images. It is also possible to provide a playback device and a playback method that play back sound with a natural acoustic effect when a composite still image or a movie is played back from a plurality of images.

Preferred embodiments using a digital camera to which the present invention is applied are described below with reference to the drawings. The digital camera of this embodiment shoots continuously while recording sound with a directivity suited to the situation, so that the acoustic effect feels natural on playback.

FIG. 1 is a block diagram showing the configuration of a camera 10 and an external device 20 according to a first embodiment of the present invention. The camera 10 is a digital camera and includes a signal processing and control unit 1, an imaging unit 2, a change determination unit 3, a recording unit 4, an operation determination unit 6, a left/right sound recording unit 7, a display unit 8, a clock unit 9, and a communication unit 12.

The signal processing and control unit 1 in the camera 10 consists of a signal processing LSI dedicated to the camera 10 and related components. It controls the camera 10 as a whole and performs image processing on the image data output from the imaging unit 2. The imaging unit 2 consists of a photographing lens, an image sensor that converts the subject image formed by the photographing lens into image data, and related components.

The recording unit 4 records the image data output from the imaging unit 2 after the signal processing and control unit 1 has applied image processing and compression. The change determination unit 3 uses the image data output from the imaging unit 2 to determine changes in the camera's field of view. That is, when the camera 10 moves from right to left the image moves from left to right, and when the camera 10 moves from left to right the image moves from right to left; the unit detects this movement. Based on the determination result of the change determination unit 3, sound is recorded taking the movement of the camera 10 into account. The change determination unit 3 also uses the image data to detect the presence of a human face and, when a face is present, determines its position and the positions of facial parts such as the mouth.
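The direction determination described above can be illustrated with a minimal sketch. It assumes each frame has been reduced to a one-dimensional brightness profile (one value per image column) and searches for the horizontal shift that best aligns two consecutive profiles. The profile representation, the mean-absolute-difference cost, and all names are illustrative assumptions; the text does not say how the change determination unit 3 detects image movement.

```python
def estimate_shift(prev_profile, curr_profile, max_shift=3):
    """Return the shift s for which curr_profile[i + s] best matches prev_profile[i].

    A positive s means the image content moved to the right between frames.
    The cost is the mean absolute difference over the overlapping columns.
    """
    n = len(prev_profile)
    best_shift, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        total, count = 0.0, 0
        for i in range(n):
            j = i + s
            if 0 <= j < n:
                total += abs(prev_profile[i] - curr_profile[j])
                count += 1
        if count == 0:
            continue
        cost = total / count
        if cost < best_cost:
            best_cost, best_shift = cost, s
    return best_shift

def camera_direction(shift, threshold=1):
    """Map the image shift to a camera movement, as described in the text."""
    if shift >= threshold:
        return "right_to_left"   # image moved left to right
    if shift <= -threshold:
        return "left_to_right"   # image moved right to left
    return "still"
```

For example, if the content of one profile reappears two columns further right in the next profile, `estimate_shift` returns 2 and `camera_direction` reports that the camera moved from right to left.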

The left/right sound recording unit 7 has a stereo microphone 7a and records the left and right sound separately. The unit can also process the sound signals from the stereo microphone to change the sound recording range, that is, the directivity used when collecting sound. The sound data output from the left/right sound recording unit 7 is processed by the signal processing and control unit 1 and then recorded in the recording unit 4 together with the image data.

The operation determination unit 6 has operation members such as a release button and switches linked to them. The operation state determined by the operation determination unit 6 is sent to the signal processing and control unit 1, which executes processing according to that state. The clock unit 9 has calendar and timekeeping functions and outputs information such as the date and time of shooting. The shooting date and time information is recorded in the recording unit 4 together with the image data.

The display unit 8 shows a live view of the subject for framing, based on the image data output from the imaging unit 2, and plays back image data recorded in the recording unit 4. The communication unit 12 transmits to and receives from an external device 20 such as a television. Communication may use wireless LAN, proximity wireless communication, infrared communication, or wired communication such as a USB cable, and allows image data and sound data captured by the camera 10 to be transmitted. In recent years HDMI has also come into use for sending images and sound to high-definition displays, so the communication unit 12 may have an HDMI terminal and communicate over that wired connection.

The external device 20, such as a television or photo stand, includes a signal processing and control unit 21, a communication unit 22, a display/playback unit 23, a display priority unit 24, and a remote control receiving unit 25. Like the signal processing and control unit 1 of the camera 10, the signal processing and control unit 21 consists of a signal processing LSI dedicated to the external device 20 and related components. It controls the external device 20 as a whole and controls the playback and display of image data and sound data received via the communication unit 22.

The communication unit 22 communicates with the camera 10 and receives image data and sound data from it. Like the communication unit 12 of the camera 10, it can communicate by wireless LAN, proximity wireless communication, infrared communication, or wired communication such as a USB cable or HDMI cable. The display priority unit 24 determines the priority of an image, that is, whether the image is a priority image to be displayed first on the display unit 8 built into the camera 10.

The display/playback unit 23 has a thin large-screen monitor and speakers, and plays back and displays the image data and sound data received from the camera 10. During playback, the signal processing and control unit 21 controls playback according to the display priority unit 24's determination of whether an image is a priority image. When the external device 20 is a television, it also displays ordinary television broadcasts and the like.

The remote control receiving unit 25 receives instruction signals from a remote control device by infrared communication. With the remote control device, the user can, for example, have the external device receive designated images and sound from the camera 10, play them back, or stop playback.

Next, how the camera 10 is used is described with reference to FIG. 2. As shown in FIG. 2(a), the user 15 holds the camera 10 and shoots a subject image through the photographing lens of the imaging unit 2, while the stereo microphone 7a records sound from the front. As shown in FIG. 2(b), the images and sound captured in this way are transmitted to the external device 20 via the communication unit 12 of the camera 10 and the communication unit 22 of the external device 20. The external device 20 plays back and displays the received images and sound on the display/playback unit 23.

Next, shooting and sound recording by the camera 10 in this embodiment are described with reference to FIGS. 3 and 4. FIG. 3 shows the camera 10 shooting and recording sound. The user starts shooting at the position of camera 10a and moves the camera 10 toward the position of camera 10b, shooting continuously throughout. By overlaying and combining the similar portions of the resulting images, a still panoramic image can be obtained, as shown in FIG. 4(a). In this example, three images 51a, 51b, and 51c are combined to generate the still panoramic image.

FIG. 4(b), like FIG. 4(a), is a still panoramic image obtained by combining the three images 51a, 51b, and 51c. If, during shooting, the sound from the direction of the screen center is recorded for each of the images 51a, 51b, and 51c, then on playback the positions of the sound sources 53a, 53b, and 53c move from left to right, and the sound playback is unnatural.

In this embodiment, therefore, when shooting at the position of camera 10a, sound is collected in a sound collection range 33a shifted toward the right within the angle of view 31a, and when shooting at the position of camera 10b, sound is collected in a sound collection range 33b shifted toward the left within the angle of view 31b, as shown in FIG. 3. When the images shot in this way are combined into a panoramic image and the sound is played back, the sound source position 52a for the sound recorded with image 51a and the sound source position 52b for the sound recorded with image 51c both fall approximately at the center of the panoramic image. That is, as shown in FIG. 4(c), the sound is heard as if collected from a sound source position 52c at approximately the center of the panoramic image. The sound source position hardly moves, unlike the case described for FIG. 4(b), so the sound plays back naturally.
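The shifting sound collection range can be pictured with a minimal geometric sketch. It assumes the camera axis angle is known for each frame (in degrees, negative to the left) and aims the pickup at the midpoint of the sweep, which stands in for the center of the panorama. The function name, the angle representation, and the clamping to the half field of view are illustrative assumptions rather than details given in the text.

```python
def pickup_offsets(camera_angles_deg, half_view_deg):
    """Pickup direction for each frame, relative to the camera axis.

    camera_angles_deg lists the camera axis angle for each frame of the
    sweep. The target is the midpoint between the first and last angles.
    Positive offsets point right of the axis, negative offsets left, and
    each offset is clamped so that it stays inside the angle of view.
    """
    center = (camera_angles_deg[0] + camera_angles_deg[-1]) / 2.0
    offsets = []
    for angle in camera_angles_deg:
        offset = center - angle
        offsets.append(max(-half_view_deg, min(half_view_deg, offset)))
    return offsets
```

For a sweep through -20, 0, and 20 degrees, the first frame collects sound 20 degrees to the right of the axis and the last frame 20 degrees to the left, which matches the right-shifted range 33a and left-shifted range 33b described for FIG. 3.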

In this embodiment, then, changes in the sound are suppressed by correction, so that the viewer does not notice that the image was created by stitching together images shot at different times. Although three images are stitched together in this embodiment, it is of course also possible to record many images over time, select several of them to combine, and play back the sound.

Next, the operation of this embodiment is described using the flowchart shown in FIG. 5. This flow is controlled by the signal processing and control unit 1 of the camera 10.

On entering the camera control flow shown in FIG. 5, it is first determined whether the camera is in shooting mode (S101). The camera 10 has a shooting mode and a playback mode. If the result of determination in step S101 is shooting mode, an image is captured and face detection is performed (S102). In this step, the image data output from the imaging unit 2 for live view display is acquired, and the change determination unit 3 performs face detection on it. Next, the image is displayed (S103): the subject image is shown on the display unit 8 based on the image data acquired in step S102, and the photographer can frame the shot while viewing it.

After the image is displayed, it is determined whether a face has been detected (S104). Face detection was performed in step S102, and this step determines whether a face portion was actually found in the image. If a face was detected, the position of the face and the positions of facial parts such as the mouth are detected (S105). The detected face position is used for focusing and exposure control.

After the face and mouth positions have been determined, or if it was determined in step S104 that no face is present, it is next determined whether to start recording (S106). Here the operation state of the release button is detected to determine whether to start movie shooting, panoramic shooting, or the like. If recording is not to start, processing returns to step S101 and the operations described above are executed.

If the result of determination in step S106 is that recording is to start, shooting and sound recording are performed (S107). In this subroutine, images and sound are recorded continuously while screen motion is detected as needed, and the sound collection range is changed according to the motion detection result. Shooting and sound collection continue until an end determination is made within the subroutine. The subroutine is described later using the flow shown in FIG. 6.

If the result of determination in step S101 is that shooting mode is not set, it is determined whether playback mode is set (S111). If playback mode is not set, processing returns to step S101. If, on the other hand, the result of determination in step S111 is that playback mode is set, playback is performed (S112).

In step S112, the recorded images are read from the recording unit 4 and displayed on the display unit 8 as thumbnails; when an image is selected with the operation unit, that image is displayed enlarged. If sound data was recorded with the image, it is played back as the image is displayed. If the camera 10 has no speaker, only the image is played back and the sound is not.

After playback, it is determined whether to transmit (S113). Here it is determined whether the operation member for instructing transmission has been operated in order to send an image to the external device 20, such as a television. If transmission is requested, the displayed image is transmitted (S114): the image being displayed in step S112 is sent to the external device 20. If a plurality of images have been selected, they may be transmitted together. After the displayed image has been transmitted, or if the result of determination in step S113 is that no transmission is requested, or when shooting and sound recording in step S107 ends, the camera control flow ends; if the power remains on, processing returns to step S101 and the operations described above are executed.
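The branching of the FIG. 5 flow can be summarized in a minimal sketch that lists the step labels visited in one pass. The function name, the string mode values, and the boolean inputs are hypothetical stand-ins for the camera state; the playback-mode check is labeled S111 here, following the reference to "step S111" in the text.

```python
def camera_control_pass(mode, face_found=False, start_recording=False,
                        send_requested=False):
    """Return the step labels visited in one pass through the control flow."""
    steps = ["S101"]                      # shooting mode?
    if mode == "shooting":
        steps += ["S102", "S103", "S104"]  # capture, display, face detected?
        if face_found:
            steps.append("S105")          # face and facial part positions
        steps.append("S106")              # start recording?
        if start_recording:
            steps.append("S107")          # shooting and sound recording
        return steps
    steps.append("S111")                  # playback mode?
    if mode == "playback":
        steps += ["S112", "S113"]         # playback, transmit?
        if send_requested:
            steps.append("S114")          # transmit the displayed image
    return steps
```

Each call models a single pass; in the described flow, processing then returns to step S101 as long as the power remains on.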

Next, the shooting and sound recording subroutine of step S107 is described using the flowchart shown in FIG. 6.

On entering this flow, it is first determined whether panoramic shooting is selected (S1), that is, whether the user has set the panoramic shooting mode with an operation member. The camera could instead determine after shooting whether the shots are suitable for a panorama and then create one, but to keep the flow simple this embodiment is described with the panoramic shooting mode set by the user.

If the result of determination in step S1 is that the panoramic shooting mode is not set, continuous shooting of an ordinary movie or the like is performed (S11). Next, it is determined whether the shot is telephoto (S12), that is, whether a zooming operation has moved the lens to the telephoto side. If it is telephoto, center-focused sound collection is performed (S13); if it is not, sound is collected so as to emphasize the left/right stereo effect (S14). In other words, depending on the angle of view of the photographing lens 2a of the camera 10, the camera switches between recording focused on the center and recording with a wide sound collection range that emphasizes the stereo effect. This change of sound collection range is performed by the left/right sound recording unit 7.

Once center-focused recording or stereo-emphasized recording has been performed, it is determined whether normal shooting is to end (S15), based on whether an end operation has been made with the release button. If shooting is not to end, processing returns to step S11 and shooting continues. If it is to end, the shooting and sound-recording subroutine terminates and processing returns to the original flow.

If the result of determination in step S1 is that the panoramic shooting mode is set, an image is first captured and the image at the edge of the frame is recorded (S2). The edge image is recorded so that it can be used later, for example by storing it in a separate memory area or by leaving the features of the edge image in a tag. With the edge image recorded, when the camera 10 is moved in the direction of the arrow, the image at the edge of the screen (the tree 55 in the example of FIG. 9(b)) moves successively through image 56a, image 56b, and image 56c as shown in FIG. 9(a), and a panoramic image can be generated with the edge image (tree 55) at its center. Even while the camera 10 is moved, recording is performed with the sound-recording directivity always pointed toward the edge image (tree 55) (recorded-image-direction recording), which reduces unnaturalness during audio playback.

Edge image recording is started and continuous shooting is performed (S3). Next, it is determined whether a face of a predetermined size is initially present at the center (S4). Here, the change determination unit 3 determines whether a face of a predetermined size, for example a face one fifth of the screen width, is present at approximately the center of the screen. This is because, in a scene such as that shown in FIG. 7, when shooting starts from the left side and the person 57 occupies at least a predetermined size, the person 57 is likely to be the main subject.
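The check in step S4 can be illustrated with a minimal Python sketch. The `FaceBox` type, the function name, and the `center_tolerance` parameter are hypothetical; the description only gives "one fifth of the screen width" as an example size and "approximately the center" as the position, so the tolerance value here is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FaceBox:
    """Hypothetical face-detection result, in pixels."""
    x: float      # left edge of the detected face
    width: float  # width of the detected face


def has_central_face(face: Optional[FaceBox], screen_width: float,
                     min_ratio: float = 1 / 5,
                     center_tolerance: float = 0.1) -> bool:
    """Return True if a sufficiently large face sits near the screen center.

    min_ratio follows the example in the text (1/5 of the screen width);
    center_tolerance (fraction of screen width) is an assumed value.
    """
    if face is None:
        return False
    face_center = face.x + face.width / 2
    offset = abs(face_center - screen_width / 2)
    large_enough = face.width >= min_ratio * screen_width
    near_center = offset <= center_tolerance * screen_width
    return large_enough and near_center
```

If the function returns True, the flow would branch to face-direction tracking emphasis recording (S21); otherwise it would continue to the screen motion determination (S5).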

In this case, the person 57 may talk while looking at the scenery. Therefore, if a face is determined to be present at the start of shooting (the timing of step S105), processing branches from step S4 to step S21 and face-direction tracking emphasis recording is performed (S21). In face-direction tracking emphasis recording, the sound collection ranges 58a to 58c are changed successively in accordance with the movement of the camera 10 so that sound is collected from the direction of the person 57. This yields sound collection suited to shooting centered on the person 57.

If the result of determination in step S4 is that no face of the predetermined size was initially at the center, screen motion is determined next (S5). As shown in FIG. 9, it is determined from the change in the direction in which the camera 10 is aimed whether the captured image is moving from left to right or from right to left. It is then determined whether the screen motion is from left to right (S6). If it is, the emphasis direction of the sound recording is shifted from right emphasis to left emphasis (S8). If the screen motion is from right to left, recording shifts from left emphasis to right emphasis (S7).

The right-emphasis to left-emphasis recording operation of step S8 is described with reference to the flowchart shown in FIG. 10. When this flow is entered, the image at the right edge of the screen is first read out (S31). Recording in the recorded-image direction is then performed (S32). As described with reference to FIGS. 4 and 9, sound is initially collected toward the right side of the screen, and as the screen moves the collection direction gradually shifts toward the left side of the screen. This sound collection operation is performed by the left and right audio recording unit 7.

The left-emphasis to right-emphasis recording of step S7 simply performs the reverse of the right-emphasis to left-emphasis recording. In the flow shown in FIG. 10, the sound collection direction follows the movement of the image within the screen; alternatively, only the direction of screen motion may be detected, and the sound collection range may simply be swept in that direction, on the order of 120° over several seconds. That is, assuming that the user changes the field of view at a predetermined speed, the sound recording direction can also be switched without using the image from the imaging unit 2.
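The simplified alternative, sweeping the pickup direction at a fixed rate without image tracking, might look like the following sketch. The function name, the sign convention (positive angles point to the right of the camera axis), and the 3-second default are assumptions; the text only says "about 120° over several seconds".

```python
def swept_pickup_angle(elapsed_s: float, right_to_left: bool,
                       total_angle_deg: float = 120.0,
                       duration_s: float = 3.0) -> float:
    """Pickup direction in degrees after elapsed_s seconds of a fixed-rate sweep.

    Assumed convention: +angle is right of the camera axis, -angle is left.
    right_to_left=True corresponds to right-emphasis to left-emphasis (S8);
    False corresponds to the reverse operation (S7).
    """
    # Clamp progress to [0, 1] so the sweep stops at the far end.
    progress = min(max(elapsed_s / duration_s, 0.0), 1.0)
    half = total_angle_deg / 2
    angle = half - total_angle_deg * progress  # +half (right) down to -half (left)
    return angle if right_to_left else -angle
```

With the defaults, a right-to-left sweep starts at +60°, passes 0° after 1.5 s, and holds at -60° from 3 s onward.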

Once the emphasized recording of step S7, S8, or S21 has been performed, it is determined whether shooting is to end (S10), based on the operation state of the release button. If shooting is not to end, processing returns to step S3 and shooting continues. If it is to end, the shooting and sound-recording subroutine terminates and processing returns to the original flow.
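The branching of the subroutine of FIG. 6, as described above, can be summarized in a short Python sketch. The function name and the returned labels are hypothetical; only the step numbers and the branch conditions come from the description.

```python
def choose_sound_mode(panorama: bool, telephoto: bool = False,
                      central_face: bool = False,
                      motion_left_to_right: bool = True) -> str:
    """Pick the sound collection mode for one pass through the subroutine."""
    if not panorama:                       # S1: panoramic mode not set
        if telephoto:                      # S12
            return "center_focus"          # S13
        return "stereo_emphasis"           # S14
    if central_face:                       # S4
        return "face_tracking"             # S21
    if motion_left_to_right:               # S5, S6
        return "right_to_left_emphasis"    # S8
    return "left_to_right_emphasis"        # S7
```

For example, `choose_sound_mode(panorama=True, central_face=False, motion_left_to_right=True)` selects the right-emphasis to left-emphasis recording of step S8.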

As described above, the camera 10 according to this embodiment moves the sound collection range in accordance with the movement of the camera 10 when capturing images continuously, as in the panoramic shooting mode. Consequently, when the images are combined and displayed, the apparent sound source position does not move unnaturally, and the audio can be played back with a natural acoustic effect.

Next, the configuration and operation of the left and right audio recording unit 7, which changes the sound collection range, are described. As shown in FIG. 8, the left and right audio recording unit 7 comprises a stereo microphone 7a, an AD converter 42, and an adder/multiplier 43.

The stereo microphone 7a comprises a right microphone 41a and a left microphone 41b and is arranged on the front side of the camera body 10. The stereo microphone 7a is connected to the AD converter 42, which digitizes the audio signal. Specifically, the right microphone 41a is connected to the AD converter 42a and the left microphone 41b to the AD converter 42b, each of which outputs digital audio data.

The output of the AD converter 42 is connected to the adder/multiplier 43, which computes the difference between the left and right audio. Specifically, the AD converter 42a, which outputs the audio data of the right microphone 41a, is connected to the plus-side input of the adder 43a and the minus-side input of the adder 43d. The AD converter 42b, which outputs the audio data of the left microphone 41b, is connected to the minus-side input of the adder 43a and the plus-side input of the adder 43d.

The output of the adder 43a is connected to the input of the multiplier 43b, and the output of the adder 43d is connected to the input of the multiplier 43e. The control terminals of the multipliers 43b and 43e are connected to the signal processing and control unit 1, from which they receive their gains. The inputs of the adder 43c are connected to the output of the AD converter 42a and the output of the multiplier 43b. The inputs of the adder 43f are connected to the output of the AD converter 42b and the output of the multiplier 43e.

The output of the adder/multiplier 43 serves as the output of the left and right audio recording unit 7 and is connected to the recording unit 4. Specifically, the outputs of the adder 43c and the adder 43f deliver the right audio data and the left audio data, respectively, and each is recorded in the recording unit 4 via these outputs.

With this configuration, the left and right audio recording unit 7 can emphasize either the left or the right side of the stereo audio input. The audio signals picked up by the two microphones 41a and 41b are converted into digital audio data by the AD converters 42a and 42b. The adder 43a computes (right audio data) - (left audio data), and the adder 43d computes (left audio data) - (right audio data). That is, the adders 43a and 43d compute the differences between the left and right audio data. Each computed difference represents what differs between the left and right sound; emphasizing it yields an audio output whose spread toward the right or the left is emphasized, and this addition is the preprocessing for that.

The differences obtained by the adders 43a and 43d are multiplied in the multipliers 43b and 43e, respectively, by gains supplied from the signal processing and control unit 1, and the products are added to the right audio data and the left audio data in the adders 43c and 43f, respectively. Since the outputs of the adders 43a and 43d enter with a positive sign, this amounts in effect to an addition. If the gain of either the multiplier 43b or the multiplier 43e is increased, the audio on the side with the larger gain is emphasized, yielding an audio output whose spread toward the right or the left is emphasized. Increasing both gains yields an audio output that emphasizes the center. By controlling the gains of the multipliers 43b and 43e at the timings of steps S7, S8, S13, S14, and S21, the signal processing and control unit 1 can change the sense of spread.
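The signal path through the adder/multiplier 43 can be traced with a minimal Python sketch operating on lists of samples. The function name and the use of plain floating-point lists are assumptions for illustration; the arithmetic follows the wiring described above (43a and 43d form the differences, 43b and 43e apply the gains, 43c and 43f add the result back to each channel).

```python
from typing import List, Tuple


def emphasize(right: List[float], left: List[float],
              gain_right: float, gain_left: float
              ) -> Tuple[List[float], List[float]]:
    """Apply difference-based emphasis to one block of stereo samples."""
    diff_r = [r - l for r, l in zip(right, left)]   # adder 43a: right - left
    diff_l = [l - r for r, l in zip(right, left)]   # adder 43d: left - right
    # Multiplier 43b scales diff_r; adder 43c adds it to the right channel.
    out_r = [r + gain_right * d for r, d in zip(right, diff_r)]
    # Multiplier 43e scales diff_l; adder 43f adds it to the left channel.
    out_l = [l + gain_left * d for l, d in zip(left, diff_l)]
    return out_r, out_l
```

With both gains at zero the channels pass through unchanged, which corresponds to plain stereo recording; raising only `gain_right` boosts whatever is present in the right channel but not in the left.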

Thus, the left and right audio recording unit 7 of this embodiment can shift the direction of the sound collection range from right to left or from left to right using a pair of microphones of identical performance. In addition, because the emphasized recording in the left and right audio recording unit 7 is controlled according to the determined screen motion, the audio can be made to sound as if it were played back from a fixed sound source position even when the shooting field of view changes.

As described above, according to the first embodiment of the present invention, the sound collection range is changed in accordance with the screen motion, so that audio can be played back with a natural acoustic effect when a composite still image is reproduced from a plurality of images.

Next, a second embodiment of the present invention is described with reference to FIGS. 11 to 14. In the first embodiment, the sound collection range was changed in accordance with the camera movement while shooting with the camera 10. In the second embodiment, stereo recording is performed without changing the sound collection range during shooting, and the sound source position is shifted in accordance with the camera movement when a composite image such as a panoramic image is played back. The configuration of this embodiment is the same as that of the first embodiment shown in FIG. 1, so its description is omitted.

The operation of this embodiment is described with reference to the camera control flowchart shown in FIG. 11. When the camera control flow is entered, it is first determined whether the shooting mode is set (S201). Steps S201 to S203 are the same as in the camera control flow of the first embodiment shown in FIG. 5, and their detailed description is omitted. Note, however, that the face detection performed in the first embodiment during image capture is omitted from the image capture of step S202 in this embodiment. Face detection may of course still be performed for exposure control or automatic focus adjustment.

Once the image display (live view display) of step S203 has been performed, it is determined, as in step S106, whether recording is to start (S204), that is, whether the release button has been operated. If recording is not to start, processing returns to step S201 and the operations described above are repeated.

If the result of determination in step S204 is that recording is to start, continuous shooting is started and stereo recording is started at the same time (S206). Motion determination is then performed (S207). In this step, the movement of the camera 10 is determined from changes in the image, based on the image data. If there is movement, the motion features are recorded (S208); they are recorded in the recording unit 4 together with the image data.

Once the motion features have been recorded, or if the result of determination in step S207 is that there was no movement, it is determined whether recording is to end (S209), based on the operation state of the release button. If recording is not to end, processing returns to step S206 and continuous shooting continues. If recording is to end, this flow terminates and processing returns to step S201.

If the result of determination in step S201 is that the shooting mode is not set, it is determined whether the playback mode is set (S211), that is, whether an operation member such as a playback button has been operated and the user has instructed playback to start. If the playback mode is not set, processing returns to step S201. If the playback mode is set, it is next determined whether panoramic playback is to be performed (S212).

In step S212, it is determined whether motion features were recorded together with the image data in step S208. When motion features have been recorded, a panoramic image can be obtained by combining a plurality of frames. The determination of whether to perform panoramic playback may also take into account whether the user has set a panoramic playback mode.

If the result of determination in step S212 is that panoramic playback is not to be performed, normal movie playback is performed (S213) together with stereo playback (S214). Here, the movie and the audio recorded in step S206 are played back.

If the result of determination in step S212 is that panoramic playback is to be performed, stitched image playback is performed (S221). In this step, the overlapping portions of the successive images captured in step S206 are joined to generate a still panoramic image. The audio is then played back with emphasis toward the direction of the image at the center of the panorama (S222). In this step, the image at the center of the completed panoramic image is determined, and from its position and the motion feature data recorded in step S208, the audio obtained at each position is corrected and played back so that the sound source appears to be located at the center of the panoramic image.

The audio playback with emphasis in step S222 is controlled by the left and right audio recording unit 7 shown in FIG. 12. During the stereo recording of step S206, the left and right audio recording unit 7 applies no emphasis and records the left and right stereo sound as is in the recording unit 4; the emphasis is applied during playback in step S222. Audio playback in the left and right audio recording unit 7 is described later with reference to FIG. 12.

For emphasized playback, the first embodiment balanced the direction of the audio using the image at the edge of the screen (the tree 55 in the example of FIG. 9(b)) as the target, as shown in FIG. 9. The edge image therefore had to be included in the central portion of the panoramic image. In this embodiment, emphasized playback in step S222 is possible even when there is no target edge image at the center of the panoramic image, by using a sensor (for example, a gyro) that detects the angular velocity with which the camera is handled. This emphasized playback is described later with reference to FIGS. 13 and 14.

Once the emphasized playback of step S222 or the stereo playback of step S214 has been performed, it is determined whether playback is to end (S215), based on, for example, whether the user has instructed the end of playback by operating an operation member such as the playback button again. If playback is not to end, processing returns to step S212 and playback continues.

If the result of determination in step S215 is that playback is to end, it is determined whether to transmit (S226), that is, whether an operation member such as a transmission button has been operated to send the displayed playback image to the external device 20. If so, the displayed image is transmitted to the external device 20 (S277). Once the displayed image has been transmitted, or if the result of determination in step S226 is that no transmission is requested, the camera control flow ends and is executed again from step S201.

Next, the configuration of the left and right audio recording unit 7, which performs the emphasis toward the direction of the image at the center of the panorama in step S222, is described with reference to FIG. 12.

FIG. 12 is a block diagram showing the configuration of the left and right audio recording unit 7. During playback, this unit adjusts the balance between left and right audio and performs the emphasis processing. It differs from the configuration of the first embodiment shown in FIG. 8 in that the recording unit 4 is connected between the AD converter 42 and the adder/multiplier 43; the internal configuration of each circuit is the same as in the left and right audio recording unit 7 of the first embodiment.

Specifically, the output of the AD converter 42a, which AD-converts the audio signal of the right microphone 41a, is connected to the recording unit 4, and the audio data converted by the AD converter 42a and recorded in the recording unit 4 is output to the adders 43a, 43c, and 43d. Likewise, the output of the AD converter 42b, which AD-converts the audio signal of the left microphone 41b, is connected to the recording unit 4, and the audio data converted by the AD converter 42b and recorded in the recording unit 4 is output to the adders 43a, 43d, and 43f.

In the first embodiment described above, the audio recording unit 7 performed emphasized sound collection during shooting and changed the sound collection range. In this embodiment, during shooting the audio signal from the stereo microphone 7a is digitized by the AD converters and recorded as is in the recording unit 4, without changing the sound collection range of the audio data. During playback, the adder/multiplier 43 controls the left-right balance of the audio based on the audio data read from the recording unit 4. That is, the gains applied to the multipliers 43b and 43e are varied according to the screen movement, and the audio is played back as if the sound source were in the direction of the image at the center of the panoramic image.

Next, the subroutine for emphasis toward the direction of the image at the center of the panorama in step S222 is described with reference to FIGS. 13 and 14. As mentioned above, the first embodiment used the image at the edge of the screen (the tree 55) as the reference for detecting the direction of the image at the center of the panoramic image. In this embodiment, the direction of the image at the center of the panoramic image can be detected even when there is no such edge image.

FIG. 13(a) shows how images are captured continuously to generate a panoramic image. Shooting starts at the position of the camera 10a (timing T1). Shooting with the camera 10 at this point yields an image 61a with angle of view θ, as shown in FIG. 13(b). Shooting at an intermediate timing T2 captures the target tree 59 and yields an image 61b. When the end of the sweep is reached and the last image is captured at the position of the camera 10b (timing T3), an image 61c is obtained. These images 61a to 61c are acquired and recorded in step S206 (see FIG. 11).

Arranging the images 61a to 61c acquired continuously by the camera 10 gives the result shown in FIG. 13(b), and combining them yields the panoramic image 62. The tree 59 appears at the center of this panoramic image as the target image. In this embodiment, even though the first and last images of the panorama contain no common image (tree 59), the tree 59 lies at the center of the combined panoramic image, so the direction can be corrected during audio playback from the difference in timing until this tree enters the field of view.

That is, as shown in FIG. 13(a), the camera 10a starts shooting at timing T1 by obtaining an image with angle of view θ, and at the position of the camera 10b (timing T3) obtains the last image for the panorama, also with angle of view θ. At timing T2 between them, the image at the center of the panorama is obtained. The angle Φ through which the camera moved between timing T1 and timing T2 can be determined from the angle of view θ and the speed at which the image moves within the angle of view.

If the image moves from one edge of the angle of view to the other in a time ΔT, the angle θ through which the camera 10 has been moved can be found from the angular velocity v acquired by the angular velocity sensor. This angular velocity v is determined as a motion feature in step S208 during shooting and is recorded together with the images. The sound collection angle to be corrected can therefore be determined by multiplying the difference T2 - T1 between timing T1 and timing T2 by the angular velocity v.

Next, the subroutine for emphasis toward the direction of the image at the center of the panorama is described with reference to the flowchart shown in FIG. 14. When this flow is entered, the center frame for combining the panoramic image is first determined (S501). In the example of FIG. 13, the image 61b, which corresponds to the center of the panoramic image, is determined to be the center frame. The shooting timing of the center frame is then set as T2 (S502).

Next, the audio obtained at time T is read out (S503). In this step, the audio data corresponding to the image data being read is read from the recording unit 4, and the timing of this audio is denoted T. The correction angle is then obtained as (T2 - T) × θ/ΔT (S504). As described with reference to FIG. 13, the correction angle at each timing T is found by multiplying the difference from the center (T2 - T) by the angular velocity (v = θ/ΔT).
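The calculation of step S504 is simple enough to write out directly. The following Python sketch uses hypothetical parameter names and assumes times in seconds and angles in degrees; the formula itself, (T2 - T) × θ/ΔT, is the one given in the description.

```python
def correction_angle(t: float, t_center: float,
                     view_angle_deg: float, delta_t: float) -> float:
    """Correction angle for audio recorded at time t (step S504).

    t_center       : shooting timing T2 of the center frame
    view_angle_deg : angle of view (theta)
    delta_t        : time for the image to cross the angle of view (delta T)
    """
    if delta_t <= 0:
        raise ValueError("delta_t must be positive")
    angular_velocity = view_angle_deg / delta_t   # v = theta / delta T
    return (t_center - t) * angular_velocity      # (T2 - T) * v
```

For example, with a 60° angle of view crossed in 4 s (v = 15°/s) and a center frame at T2 = 2 s, audio recorded at T = 0 s gets a correction angle of 30°, audio at T = 2 s gets 0°, and audio at T = 3 s gets -15°.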

Once the correction angle has been calculated, correction-angle audio emphasis is performed using it (S505). In this step, the gains applied to the multipliers 43b and 43e of the left and right audio recording unit 7 shown in FIG. 12 are varied according to the correction angle. This enables emphasized playback in which the sound source appears to lie in the direction of approximately the center of the panoramic image.
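The description does not specify how the correction angle maps to the gains of the multipliers 43b and 43e. Purely as an illustration, the sketch below assumes a clamped linear mapping in which a positive correction angle raises the right-side gain and a negative one raises the left-side gain; the function name, the sign convention, `max_angle_deg`, and `max_gain` are all hypothetical.

```python
from typing import Tuple


def gains_for_angle(angle_deg: float, max_angle_deg: float = 60.0,
                    max_gain: float = 2.0) -> Tuple[float, float]:
    """Return (gain_right, gain_left) for a given correction angle.

    Assumed mapping, not taken from the description: linear in the
    angle, clamped at +/- max_angle_deg, with only one side boosted.
    """
    ratio = max(-1.0, min(1.0, angle_deg / max_angle_deg))
    if ratio >= 0:
        return (max_gain * ratio, 0.0)    # emphasize the right side
    return (0.0, max_gain * -ratio)       # emphasize the left side
```

Under these assumptions a correction angle of 30° gives gains (1.0, 0.0), and any angle beyond ±60° saturates at the maximum gain on the corresponding side.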

After the correction-angle audio enhancement has been performed, it is determined whether or not playback has ended (S506). This determination is made according to the operation state of an operation member such as the playback button. If playback has not ended, processing returns to step S503 and audio playback with audio enhancement continues. If the determination in step S506 is that playback has ended, processing returns to the original flow.
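The flow of steps S501 to S506 can be summarized in a short Python sketch. All names (`enhance_toward_center`, `apply_gain`, `is_finished`) are hypothetical, and choosing the middle element of the frame list as the center frame is an assumption; the specification says only that the frame at the center of the panoramic image is selected. The sketch also stops when the audio timings run out, which the flowchart does not address.

```python
from typing import Callable, List, Sequence


def enhance_toward_center(
    frame_times: Sequence[float],
    audio_times: Sequence[float],
    theta: float,
    delta_t: float,
    apply_gain: Callable[[float], None],
    is_finished: Callable[[], bool],
) -> List[float]:
    """Sketch of S501-S506; returns the correction angles that were applied."""
    # S501, S502: pick the center frame (assumed: middle index) and take its timing as T2.
    t2 = frame_times[len(frame_times) // 2]
    applied: List[float] = []
    for t in audio_times:                    # S503: read the audio at timing T
        angle = (t2 - t) * theta / delta_t   # S504: correction angle
        apply_gain(angle)                    # S505: adjust the left/right gains
        applied.append(angle)
        if is_finished():                    # S506: stop when playback has ended
            break
    return applied


# Illustrative run: three frames, 60 degrees swept in 2 s, playback never interrupted.
log: List[float] = []
print(enhance_toward_center([1.0, 3.0, 5.0], [1.0, 3.0, 5.0], 60.0, 2.0,
                            log.append, lambda: False))
```

Passing the gain update and the end-of-playback check as callbacks keeps the sketch independent of any particular audio hardware or user interface.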

As described above, in the second embodiment of the present invention, stereo recording is performed at the time of shooting, and at the time of playback, audio-enhanced playback is performed according to the movement of the screen so that the sound source appears to be located at the center. This prevents the sound source position from shifting and producing unnatural audio playback when the panoramic image is reproduced and displayed as a still image.

Further, in the present embodiment, the correction angle can be obtained by calculation even if the edge image (tree 55) is not present in the first and last frames. The correction angle can therefore be calculated even for panoramic shooting over a wider range than in the first embodiment, and changes in sound are not distracting even when the shooting direction changes greatly.

In the present embodiment, audio enhancement is performed so that the sound source appears to lie in the direction of the central image, but this direction may be made changeable, as appropriate, to a direction other than the center. In that case, the timing T2 may be set manually.

Next, a modification of the first and second embodiments of the present invention will be described with reference to FIGS. 15 and 16. In the first and second embodiments, continuous shooting is performed, a panoramic image is synthesized from the plurality of images thus obtained, and the result is displayed as a still image. In this modification, by contrast, it is assumed that the continuously shot images are reproduced in sequence, in the manner of a moving image. According to this modification, unnecessary changes in sound caused by changes in the camera's field of view can be reduced.

Shooting and audio recording by the camera 10 in this modification will be described with reference to FIG. 15. FIG. 15(a) shows shooting and audio recording being performed with the camera 10. The user 15 first starts shooting at the position of the camera 10b and moves the camera 10 toward the position of the camera 10a. An image shot at the position of the camera 10a is as shown in FIG. 15(b), and an image shot at the position of the camera 10b is as shown in FIG. 15(c).

When a spacious seaside scene such as that shown in FIGS. 15(b) and 15(c) is shot as a sequence of continuous shots or as a moving image and sound is recorded at the same time, only sound from the narrow range corresponding to the angle of view in front of the camera 10 at each position is recorded. Within this range, however, a person can look around by moving only the eyes 15a and 15b without moving the face. In other words, while the camera 10 collects the sound in front of it as the screen changes, the photographer's ear 15c may be hearing the sound of the audible range 35.

If audio recorded under such conditions changes restlessly with the movement of the camera 10 during playback, it is not suited to enjoying the image and sound. Therefore, in this modification, even if the screen moves, the sound source is treated as being at a predetermined location in the screen, and the sound is heard from there.

This modification can also be applied at the time of shooting, as in the first embodiment, but here an example applied to the second embodiment will be described. In this case, in the camera control flow shown in FIG. 11, audio-enhanced playback may be performed when a moving image, rather than a still panoramic image, is played back in steps S221 and S222. That is, while the moving image is reproduced and displayed, playback is performed as if the sound heard when the camera was facing the center of the panorama had been recorded throughout.

For example, as shown in FIG. 16, consider a case where a moving image is shot while slowly looking up along the trunk of a large tree from the bottom to the top. If a cicada is perched on the trunk and singing, audio-enhanced playback is performed so that the cicada's song is heard from the center of the screen while the bottom of the tree is being shot, and from the lower side of the screen as the screen moves toward the top. Because the cicada's song is thus heard from the same sound source position, it does not give an unnatural impression. Of course, the same effect can be obtained in continuous shooting in which the camera is moved left and right, as in the first and second embodiments.

By adopting such a configuration and operating in this way, stable reproduction of ambient sound becomes possible, as does audio recording and playback with an acoustic effect rich in atmosphere. The user of the camera 10 is not necessarily listening only to sound from the direction in which he or she is looking. From the viewpoint of recreating memories, audio playback that lets the listener become absorbed in recollection is preferable to strictly accurate sound reproduction. What is desired is sound collection that can naturally reproduce the sound that remains in memory. This modification therefore emphasizes recording and reproducing the ambient sound that the photographer heard at the time of shooting and that remains in memory, while performing optimum sound collection. As a result, the audio does not switch hurriedly, the viewer can calmly look back on memories, and image and audio playback with a soothing effect is made possible.

As described above, in each embodiment of the present invention, the sound collection range is changed during continuous shooting so that the position of the sound source remains constant when a composite still image or moving image is reproduced based on a plurality of images, enabling audio recording with a natural acoustic effect. In addition, when a composite still image or moving image is reproduced based on a plurality of images, the left-right balance of the audio is changed so that the position of the sound source remains constant, enabling audio playback with a natural acoustic effect.

In each embodiment of the present invention, the movement of the screen is detected based on image data; however, the invention is not limited to this, and the movement of the camera 10 may of course be detected directly by, for example, an angular velocity sensor or an acceleration sensor provided in the camera 10.

Each embodiment of the present invention has been described on the premise of two-channel (left and right) stereo recording; however, the invention is not limited to this and can of course also be applied to recording with a larger number of channels.

Furthermore, in each embodiment of the present invention, playback display is performed on the display unit 8 of the camera 10 or by transmission from the camera 10 to the external device 20. However, the invention is not limited to this; for example, the recording medium recorded by the recording unit 4 may be loaded directly into a television or a personal computer.

In each embodiment of the present invention, a digital camera has been described as the device used for shooting. The camera may be a digital single-lens reflex camera or a compact digital camera, may be a camera for moving images such as a video camera or a movie camera, and may also be a camera built into a mobile phone, a personal digital assistant (PDA), or the like. In any case, the present invention can be applied to any shooting device capable of recording sound together with images.

The present invention is not limited to the above embodiments as they stand; at the implementation stage, the constituent elements can be modified and embodied without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the plurality of constituent elements disclosed in the above embodiments. For example, some of the constituent elements shown in an embodiment may be omitted. Furthermore, constituent elements from different embodiments may be combined as appropriate.

FIG. 1 is a block diagram showing the configuration of a camera and an external device according to the first embodiment of the present invention.
FIG. 2 is a diagram explaining how the camera according to the first embodiment of the present invention is used, in which (a) shows the camera during shooting and (b) shows an image being transferred to an external device such as a television.
FIG. 3 is a diagram explaining the recording of images and sound in the camera according to the first embodiment of the present invention.
FIG. 4 is a diagram showing how a panoramic image is generated from continuously shot images in the first embodiment of the present invention, in which (a) shows a panoramic image generated from three frames, (b) shows audio playback when directional sound collection is aimed at the center of each frame, and (c) shows the case where sound is collected in the direction of the image at the center of the panoramic image.
FIG. 5 is a flowchart showing the camera control operation of the camera according to the first embodiment of the present invention.
FIG. 6 is a flowchart showing the shooting and sound collection recording operation in the first embodiment of the present invention.
FIG. 7 is a diagram showing an image in which the main subject is a person in the first embodiment of the present invention.
FIG. 8 is a block diagram showing the configuration of the left-right audio recording unit in the camera according to the first embodiment of the present invention.
FIG. 9 is a diagram showing image edge recording being performed in the camera according to the first embodiment of the present invention, in which (a) shows the positional relationship between the camera and the subject and (b) shows the frames of the continuous images shot by the camera.
FIG. 10 is a flowchart showing the operation of recording from right emphasis to left emphasis in the first embodiment of the present invention.
FIG. 11 is a flowchart showing the camera control operation in the second embodiment of the present invention.
FIG. 12 is a block diagram showing the configuration of the left-right audio recording unit in the camera according to the second embodiment of the present invention.
FIG. 13 is a diagram showing audio-enhanced playback in the direction of the panorama central image being performed in the camera according to the second embodiment of the present invention, in which (a) shows the positional relationship between the camera and the subject and (b) shows the frames of the continuous images shot by the camera together with the synthesized panoramic image.
FIG. 14 is a flowchart showing the operation of audio enhancement in the direction of the panorama central image in the second embodiment of the present invention.
FIG. 15 is a diagram explaining the recording of images and sound in a camera according to a modification of the first and second embodiments of the present invention, in which (a) shows shooting and audio recording being performed with the camera, (b) shows an image shot at the position of the camera 10a, and (c) shows an image shot at the position of the camera 10b.
FIG. 16 is a diagram showing continuous shooting performed while the camera is moved in the vertical direction, in the camera according to the modification of the first and second embodiments of the present invention.

Explanation of symbols

1 ... signal processing and control unit, 2 ... imaging unit, 3 ... change determination unit, 4 ... recording unit, 6 ... operation determination unit, 7 ... left-right audio recording unit, 8 ... display unit, 9 ... clock unit, 10 ... camera, 10a ... camera, 10b ... camera, 12 ... communication unit, 15 ... user (photographer), 15a ... eye, 15b ... eye, 15c ... ear, 20 ... external device, 21 ... signal processing and control unit, 22 ... communication unit, 23 ... display and playback unit, 24 ... display priority unit, 25 ... remote control receiving unit, 31a ... angle of view, 31b ... angle of view, 33a ... sound collection range, 33b ... sound collection range, 35 ... audible range, 41a ... right microphone, 41b ... left microphone, 42a ... AD converter, 42b ... AD converter, 43a ... adder, 43b ... adder, 43c ... multiplier, 43d ... adder, 43e ... multiplier, 43f ... adder, 51a to 51c ... images, 52a to 52c ... sound source positions, 53a to 53c ... sound sources, 55 ... tree, 56a to 56c ... images, 57 ... person, 58a to 58c ... sound collection ranges, 59 ... tree, 61a to 61c ... images, 62 ... panoramic image

Claims

A camera comprising:
an imaging unit that continuously shoots a subject;
a sound collection change unit capable of changing the sound collection range of sound from the direction of the subject;
an image combining unit that combines a plurality of images continuously obtained by the imaging unit to generate a composite image; and
a control unit that changes the sound collection range of the sound collection change unit when the plurality of images are shot,
wherein the control unit changes the sound collection range from left to right or from right to left according to whether the images for generating the composite image are obtained from right to left or from left to right, respectively.

A camera comprising:
an imaging unit that shoots continuous images while the field of view of the camera is moved to the left or right;
an audio acquisition unit that records audio from a plurality of directions at the time of shooting;
an image combining unit that combines a plurality of images continuously obtained by the imaging unit to generate a composite image; and
a control unit that, when the plurality of images are combined, changes the synthesis of the audio from the plurality of directions obtained by the audio acquisition unit,
wherein, when the composite image is generated, the control unit changes the audio synthesis in accordance with a change in the position of a predetermined subject in each image so that the sound source position lies in the direction of a specific position in the composite image.

A camera comprising:
an imaging unit that continuously shoots a subject;
a sound collection change unit capable of changing the sound collection range of sound from the direction of the subject;
a motion determination unit that determines the motion of the camera; and
a control unit that, when the continuous shooting is performed, changes the sound collection range of the sound collection change unit based on the determination result of the motion determination unit,
wherein, when the continuous shooting is performed, the control unit changes the sound collection range from left to right or from right to left according to whether the determination result of the motion determination unit indicates movement from right to left or from left to right, respectively.

The camera according to claim 3, wherein the motion determination unit makes the determination based on image data output from the imaging unit.

The camera according to claim 3, further comprising a face detection unit that determines, based on the image data output from the imaging unit, whether or not a face of the subject is present, wherein the control unit controls the sound collection range of the sound collection change unit based on the face detected by the face detection unit.

A playback device comprising:
a storage unit that stores continuously shot image data and stereo audio data recorded in stereo during the continuous shooting;
a display unit that reproduces and displays images based on the image data;
an audio playback unit capable of playing back audio based on the stereo audio data with the left-right balance changed;
a motion determination unit that determines the motion of the camera; and
a control unit that, when the image data and the stereo audio data are reproduced, controls the left-right balance of the stereo audio data based on the motion of the camera so that the sound source position is at a specific position.

The playback device according to claim 6, wherein the control unit obtains a correction angle from the angular velocity of the camera and the timing of each frame of the continuous shooting, and controls the left-right balance of the stereo audio data according to the correction angle.

A playback method comprising:
storing continuously shot image data and stereo audio data recorded in stereo during the continuous shooting;
determining the motion of the camera; and
when reproducing the image data and the stereo audio data, controlling the left-right balance of the stereo audio data based on the motion of the camera so that the sound source position is at a specific position.

A program causing a computer to execute:
storing continuously shot image data and stereo audio data recorded in stereo during the continuous shooting;
determining the motion of the camera; and
when reproducing the image data and the stereo audio data, controlling the left-right balance of the stereo audio data based on the motion of the camera so that the sound source position is at a specific position.