JP2010041167A

JP2010041167A - Voice output controller, sound output device, voice output control method, and program

Info

Publication number: JP2010041167A
Application number: JP2008199300A
Authority: JP
Inventors: Koji Koseki; 浩次小関
Original assignee: Seiko Epson Corp
Current assignee: Seiko Epson Corp
Priority date: 2008-08-01
Filing date: 2008-08-01
Publication date: 2010-02-18

Abstract

PROBLEM TO BE SOLVED: To effectively attract people's attention to an advertisement.
SOLUTION: A control device 10 is connected to a superdirective speaker 40, which outputs sound in a specific direction, and to a speaker pedestal 41, which adjusts the sound output direction of the superdirective speaker 40. The control device 10 causes a camera 50 to photograph the area in front of the display surface of an advertisement, detects the persons appearing in the captured image, detects as a target person a person who is facing a predetermined direction, controls the speaker pedestal 41 so that the sound output direction of the superdirective speaker 40 is directed toward the target person, and causes the superdirective speaker 40 to output sound.
COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an audio output control device, an audio output device, an audio output control method, and a program for controlling audio output from a superdirective speaker.

Conventionally, advertisements have been displayed by posting a poster or by installing a display device on a wall surface. To enhance the advertising effect, a method has also been proposed in which a horizontally long display device is rotated 90 degrees and installed in portrait orientation, and an advertisement is displayed on the screen of this vertically long display device (see, for example, Patent Document 1).
Patent Document 1: JP 2004-226494 A

To increase the advertising effect, it is effective to attract attention to the advertisement. However, there is a limit to how much attention can be attracted by visual effects alone, and it has been very difficult to draw the attention of a person who has not noticed the advertisement at all.
The present invention has been made in view of the above circumstances, and its object is to effectively attract attention to advertisements.

To solve the above problem, the present invention provides an audio output control device that is connected to a superdirective speaker that outputs sound in a specific direction and to a sound direction adjustment mechanism that adjusts the sound output direction of the superdirective speaker, the audio output control device comprising: photographing means for photographing a range from which the display surface of an advertisement is visible; target person detection means for detecting, as a target person, a person whose face is turned in a predetermined direction among the persons appearing in the image captured by the photographing means; and audio output control means for causing the sound direction adjustment mechanism to direct the sound output direction of the superdirective speaker toward the target person detected by the target person detection means and for causing the superdirective speaker to output sound.
According to this configuration, among the people within the range from which the display surface of the advertisement is visible, a person facing the predetermined direction is treated as the target person, and sound is output from the superdirective speaker to that target person. A target person can therefore be selected according to his or her state of attention to the advertisement, and sound can be output so that only the target person hears it. As a result, voice guidance or the like can be given to those people, among the people who can see the advertisement, whose state of attention to the advertisement is a specific state, and attention to the advertisement can be attracted effectively.

In the above configuration, the photographing means may photograph, as the display surface of the advertisement, a range from which the display screen of a display device that displays an advertisement image is visible, and the audio output control means may cause the superdirective speaker to output sound related to the advertisement image currently being displayed by the display device.
In this case, sound related to the advertisement image is output to those people, among the people within the range from which the display screen showing the advertisement image is visible, whose state of attention to the advertisement is a specific state. Attention to the advertisement displayed on the display screen can therefore be attracted more effectively. It also becomes possible to provide more information about the content of the advertisement, so a further increase in the advertising effect can be expected.

In the above configuration, the device may further comprise attribute determination means for determining an attribute of the target person based on the image captured by the photographing means, and the audio output control means may direct the sound output direction of the superdirective speaker toward a person, among the target persons, who has been determined by the attribute determination means to have a specific attribute.
In this case, among the people whose state of attention to the advertisement is a specific state, those whose attribute determined from the captured image is a specific attribute are treated as target persons, and sound is output to them by the superdirective speaker. People with a specific attribute can thus be selected and made to pay attention to the advertisement, or be given further information about it. As a result, the attention of people with a specific attribute can be attracted to the advertisement effectively, and a further increase in the advertising effect can be expected.

In the above configuration, the device may further comprise attribute determination means for determining an attribute of the target person based on the image captured by the photographing means, and the audio output control means may cause the superdirective speaker to output sound corresponding to the attribute of the target person determined by the attribute determination means.
In this case, a person whose state of attention to the advertisement is a specific state is treated as the target person, and sound corresponding to the attribute of the target person determined from the captured image is output. By outputting sound suited to the target person's attribute, the target person can be addressed strongly and effectively, so attention to the advertisement can be attracted effectively and the advertising effect can be increased.
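The attribute-dependent audio selection described above can be sketched as a simple table lookup. This is only an illustration under stated assumptions: the function name `pick_audio`, the attribute labels, and the file names are hypothetical and do not come from the patent, which does not specify how audio is associated with attributes.

```python
from typing import Dict, Tuple

# Hypothetical table: (gender, age group) -> audio clip to play.
AudioTable = Dict[Tuple[str, str], str]


def pick_audio(gender: str, age_group: str, table: AudioTable, default: str) -> str:
    """Return the clip registered for the attribute pair, or a default clip."""
    return table.get((gender, age_group), default)


# Illustrative entries only; none of these names appear in the patent.
table: AudioTable = {
    ("female", "20s"): "guide_a.wav",
    ("male", "40s"): "guide_b.wav",
}

print(pick_audio("female", "20s", table, "guide_default.wav"))
print(pick_audio("male", "60s", table, "guide_default.wav"))
```

Falling back to a default clip keeps the system usable when the attribute determination yields a combination that has no dedicated audio.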

The present invention also provides an audio output device comprising: a superdirective speaker that outputs sound in a specific direction; a sound direction adjustment mechanism that adjusts the sound output direction of the superdirective speaker; photographing means for photographing a range from which the display surface of an advertisement is visible; target person detection means for detecting, based on the image captured by the photographing means, the persons appearing in the captured image and for detecting, as a target person, a person whose face is turned in a predetermined direction; and audio output control means for causing the sound direction adjustment mechanism to adjust the sound output direction of the superdirective speaker so that it is directed toward the target person detected by the target person detection means and for causing the superdirective speaker to output sound.
According to this configuration, among the people within the range from which the display surface of the advertisement is visible, a person facing the predetermined direction is treated as the target person and sound is output from the superdirective speaker. Voice guidance or the like can therefore be given to those people, among the people who can see the advertisement, whose state of attention to the advertisement is a specific state, and attention to the advertisement can be attracted effectively.

The present invention also provides an audio output control method in which an audio output control device, connected to a superdirective speaker that outputs sound in a specific direction and to a sound direction adjustment mechanism that adjusts the sound output direction of the superdirective speaker, photographs the area in front of the display surface of an advertisement, detects, based on the captured image, the persons appearing in the captured image, detects as a target person a person whose face is turned in a predetermined direction, causes the sound direction adjustment mechanism to adjust the sound output direction of the superdirective speaker so that it is directed toward the target person, and causes the superdirective speaker to output sound.
According to this method, among the people within the range from which the display surface of the advertisement is visible, a person facing the predetermined direction is treated as the target person and sound is output from the superdirective speaker. Voice guidance or the like can therefore be given to those people, among the people who can see the advertisement, whose state of attention to the advertisement is a specific state, and attention to the advertisement can be attracted effectively.

The present invention also provides a program executed by a computer connected to a superdirective speaker that outputs sound in a specific direction and to a sound direction adjustment mechanism that adjusts the sound output direction of the superdirective speaker, the program causing the computer to function as: photographing means for photographing the area in front of the display surface of an advertisement; target person detection means for detecting, based on the image captured by the photographing means, the persons appearing in the captured image and for detecting, as a target person, a person whose face is turned in a predetermined direction; and audio output control means for causing the sound direction adjustment mechanism to adjust the sound output direction of the superdirective speaker so that it is directed toward the target person detected by the target person detection means and for causing the superdirective speaker to output sound.
According to a computer that executes this program, among the people within the range from which the display surface of the advertisement is visible, a person facing the predetermined direction is treated as the target person and sound is output from the superdirective speaker. Voice guidance or the like can therefore be given to those people, among the people who can see the advertisement, whose state of attention to the advertisement is a specific state, and attention to the advertisement can be attracted effectively.

According to the present invention, voice guidance or the like can be given to those people, among the people who can see the advertisement, whose state of attention to the advertisement is a specific state, and attention to the advertisement can be attracted effectively.

Hereinafter, an embodiment of the present invention will be described with reference to the drawings.
FIG. 1 is a block diagram showing the functional configuration of an audio output system 1 according to the present embodiment.
The audio output system 1 is configured by connecting a superdirective speaker 40, a camera 50, and a display device 60 to a control device 10.
The audio output system 1, which serves as the audio output device, displays an advertisement image for a product, service, or the like on the display device 60, photographs with the camera 50 the range from which the advertisement displayed on the display device 60 is visible, detects people within this range based on the captured image, and outputs sound from the superdirective speaker 40 toward the detected people.

The superdirective speaker 40 is a highly directional speaker of the type called a parametric speaker. It outputs sound so that it is heard only by a person located in its sound output direction, or only by a small number of people including those near that person. As a specific example, an ultrasonic speaker that includes an ultrasonic transducer and uses it to output a modulated wave, obtained by modulating an ultrasonic-band carrier with an audible-band audio signal, can be used as the superdirective speaker 40.
The superdirective speaker 40 is supported by a speaker pedestal 41. The speaker pedestal 41 is the base on which the superdirective speaker 40 is installed, and it functions as the sound direction adjustment mechanism that adjusts the sound output direction of the superdirective speaker 40. As an example, the speaker pedestal 41 of the present embodiment includes one to three movable axes (one, two, or three axes; not shown) and motors (not shown) that change the orientation of the superdirective speaker 40 about these movable axes. As described later, operating the speaker pedestal 41 under the control of the control device 10 makes it possible to change the sound output direction of the superdirective speaker 40 to an arbitrary direction.
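The modulation of an ultrasonic carrier by an audible signal can be illustrated with a minimal sketch. It assumes plain amplitude modulation, a 40 kHz carrier, a 192 kHz sample rate, and a modulation depth of 0.8; none of these choices is stated in the patent, which says only that the carrier is modulated by the audio signal. The function names `tone` and `modulate` are hypothetical.

```python
import math


def tone(freq_hz: float, n_samples: int, sample_rate: int = 192_000) -> list:
    """Generate a test sine tone in the audible band."""
    return [math.sin(2.0 * math.pi * freq_hz * n / sample_rate)
            for n in range(n_samples)]


def modulate(audio, carrier_hz: float = 40_000.0,
             sample_rate: int = 192_000, depth: float = 0.8) -> list:
    """Amplitude-modulate an ultrasonic carrier with audio samples in [-1, 1].

    All default values are illustrative assumptions, not values from the patent.
    """
    out = []
    for n, a in enumerate(audio):
        carrier = math.sin(2.0 * math.pi * carrier_hz * n / sample_rate)
        # The audio signal shapes the envelope of the ultrasonic carrier.
        out.append((1.0 + depth * a) * carrier)
    return out


drive_signal = modulate(tone(1_000.0, 192))  # 1 ms of a 1 kHz tone
```

The sample rate must exceed twice the highest frequency in the modulated signal, which is why a rate well above ordinary audio rates is assumed here.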

The camera 50 captures still images and/or moving images. It shoots under the control of the control device 10 and outputs captured image data to the control device 10.
The camera 50 includes an interface unit 51 connected to the control device 10, a shooting control unit 52, and a shooting unit 53.
The shooting unit 53 includes an image sensor (not shown) such as a CCD or CMOS image sensor, a shooting lens group (not shown), a lens driving unit (not shown) that drives the lens group to adjust zoom, focus, and the like, and shoots under the control of the shooting control unit 52. In accordance with control signals input via the interface unit 51, the shooting control unit 52 operates the lens driving unit of the shooting unit 53 to establish predetermined shooting conditions, converts the data output from the image sensor of the shooting unit 53 under those conditions into data of a predetermined format, and outputs it as captured image data via the interface unit 51. The interface unit 51 is connected to the control device 10 via a wired cable or a wireless communication link. It receives control signals input from the control device 10 and passes them to the shooting control unit 52, and it outputs the captured image data and other data input from the shooting control unit 52 to the control device 10.

The display device 60 displays advertisement images (including still images and moving images) under the control of the control device 10.
The display device 60 includes an interface unit 61 connected to the control device 10, a drawing control unit 62 that acquires display signals input via the interface unit 61, a drawing memory 63 connected to the drawing control unit 62, a display drive circuit 64 that drives a display panel 65 under the control of the drawing control unit 62, and the display panel 65.
The drawing control unit 62 draws an image for display in the drawing memory 63 based on the display signal input from the control device 10 via the interface unit 61. The drawing control unit 62 then reads the image from the drawing memory 63 in time with the drawing timing of the display panel 65 and outputs it to the display drive circuit 64. The display drive circuit 64 drives the display panel 65 based on the image input from the drawing control unit 62 and causes the image to be displayed.

The display panel 65 is a flat display panel such as a liquid crystal display panel, a plasma display panel, or an organic EL panel. When the display panel 65 is a transmissive liquid crystal display panel, the display device 60 includes a backlight device (not shown), and the display drive circuit 64 drives the display panel 65 and also controls the backlight device so that it is lit at predetermined timings. When the display panel 65 is a self-luminous type such as a plasma display panel or an organic EL panel, no backlight device is required.

FIG. 2 is a perspective view showing how the superdirective speaker 40 and the camera 50 are installed. FIG. 3 is a plan view showing the shooting range of the camera 50.
As shown in FIG. 2, the superdirective speaker 40 and the camera 50 are attached to the upper end of the display device 60.
The superdirective speaker 40 is attached so that its sound output direction points mainly toward the front of the display panel 65. In the present embodiment, the speaker pedestal 41 has two orthogonal movable axes and changes the sound output direction of the superdirective speaker 40 in the direction indicated by arrow MH (horizontal direction) and the direction indicated by arrow MV (vertical direction) in the figure. The movable range of the speaker pedestal 41 is not particularly limited. In a typical arrangement, as indicated by arrows MH and MV, the sound output direction of the superdirective speaker 40 is changed left and right and up and down about the direction directly in front of the display panel 65. The sound output direction of the superdirective speaker 40 changes in the arrow MV direction according to the distance from the display panel 65 to the target person who is to hear the sound, and in the arrow MH direction according to the target person's left-right position.
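The dependence of the MH and MV directions on the target person's position can be sketched with basic trigonometry. This is a minimal illustration under assumed geometry: the coordinate convention (lateral offset `x_m` and forward distance `d_m` measured from the speaker), the ear height of 1.6 m, and the function name `aim_angles` are all hypothetical and are not taken from the patent.

```python
import math


def aim_angles(x_m: float, d_m: float,
               speaker_height_m: float, ear_height_m: float = 1.6):
    """Return (pan_deg, tilt_deg) for aiming the speaker at a listener.

    x_m: lateral offset of the listener from the panel's front direction.
    d_m: distance of the listener in front of the panel.
    pan_deg corresponds to the horizontal (MH) axis; tilt_deg is the
    downward angle on the vertical (MV) axis. The geometry is an assumption.
    """
    pan = math.degrees(math.atan2(x_m, d_m))
    ground_distance = math.hypot(x_m, d_m)
    tilt = math.degrees(math.atan2(speaker_height_m - ear_height_m,
                                   ground_distance))
    return pan, tilt


near = aim_angles(0.0, 1.0, speaker_height_m=2.6)
far = aim_angles(0.0, 4.0, speaker_height_m=2.6)
print(near, far)  # the downward tilt shrinks as the listener moves away
```

In this sketch a listener standing directly in front of the panel gives a pan of zero, and only the tilt varies with distance, which matches the behavior described for arrows MH and MV.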

As shown in FIGS. 2 and 3, the camera 50 serving as the photographing means is installed so as to photograph the area in front of the display panel 65, including the area directly facing the display panel 65, which serves as the advertisement display surface. The shooting range of the camera 50 is the region indicated by reference sign G in FIG. 3, which is the range from which the advertisement image displayed on the display panel 65 is visible. In other words, the camera 50 is arranged so that it can photograph people at positions from which the display panel 65 can be seen. When the audio output system 1 uses only one camera 50, the camera 50 preferably photographs a wide range using a wide-angle lens with a focal length of 35 to 24 mm, an ultra-wide-angle lens with a focal length of 21 mm or less (all focal lengths are 35 mm film equivalents), or a fisheye lens, and the camera 50 is preferably installed at approximately the center of the display panel 65, as shown in FIGS. 2 and 3. When the audio output system 1 includes a plurality of cameras 50 and the control device 10 uses their captured images after processing such as removing overlaps, each camera 50 only needs to cover part of region G in FIG. 3. In this case, the cameras 50 can function adequately with ordinary wide-angle lenses.
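To give a feel for the focal lengths mentioned above, the following sketch computes the horizontal angle of view of an ideal rectilinear lens on a 35 mm film frame (36 mm wide) using the standard relation 2·atan(w / 2f). The function name is hypothetical, and the formula does not apply to fisheye lenses, which use a different projection.

```python
import math

FRAME_WIDTH_MM = 36.0  # width of a 35 mm film frame


def horizontal_fov_deg(focal_length_mm: float) -> float:
    """Horizontal angle of view of an ideal rectilinear lens, in degrees."""
    return math.degrees(2.0 * math.atan(FRAME_WIDTH_MM / (2.0 * focal_length_mm)))


for f in (35.0, 24.0, 21.0):
    print(f"{f:.0f} mm -> {horizontal_fov_deg(f):.1f} degrees")
```

Under this idealized model the angle of view widens from roughly 54 degrees at 35 mm to roughly 81 degrees at 21 mm, which is why shorter focal lengths suit a single camera that must cover the whole of region G.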

When the audio output system 1 uses only one superdirective speaker 40, the horizontal movable range of the speaker pedestal 41 may be made wide, for example 180 degrees or more in the horizontal direction indicated by arrow MH.
The height at which the superdirective speaker 40 and the camera 50 are installed is not limited to the upper end of the display device 60 and may be higher. To photograph the whole of region G efficiently with the camera 50 and to ensure that a specific person within region G hears the sound of the superdirective speaker 40, it is generally preferable to install the superdirective speaker 40 and the camera 50 at a high location.

The control device 10, which controls each part of the audio output system 1, is implemented as, for example, a personal computer and functions as the audio output control device. As shown in FIG. 1, the control device 10 includes an audio output unit 11, a pedestal drive unit 12, an input unit 13, a display unit 14, a recording medium reading unit 15, an interface unit 16, a control unit 20 that controls each part of the control device 10, and a storage unit 30.

The audio output unit 11 is connected to the superdirective speaker 40. Under the control of the control unit 20, it generates an audio signal for outputting the sound represented by audio data stored in the storage unit 30 and outputs this audio signal to the superdirective speaker 40.
Under the control of the control unit 20, the pedestal drive unit 12 supplies drive signals and power for driving the motors (not shown) of the speaker pedestal 41. The drive signals and power that the pedestal drive unit 12 outputs to the speaker pedestal 41 rotate the motors by a predetermined angle, so that the sound output direction of the superdirective speaker 40 becomes the direction determined by the control unit 20.

The input unit 13 is connected to input devices such as a mouse and a keyboard, detects operations on these input devices, and outputs operation signals corresponding to the operations to the control unit 20.
The display unit 14 displays various kinds of information under the control of the control unit 20 and is configured using, for example, a liquid crystal display panel.
The recording medium reading unit 15 is a device that reads programs and data from optical disc recording media such as CDs, DVDs, and next-generation DVDs, magneto-optical recording media such as MOs, magnetic recording media, storage devices using semiconductor memory elements, recording devices using magnetic recording media, and the like. Under the control of the control unit 20, the recording medium reading unit 15 reads data for images to be displayed on the display panel 65, data for sound to be output from the superdirective speaker 40, programs to be executed by the control unit 20, data to be processed, and so on, and outputs them to the control unit 20. The data and programs read by the recording medium reading unit 15 are stored in the storage unit 30 under the control of the control unit 20.

The interface unit 16 is connected, by wire or wirelessly, to the interface unit 51 of the camera 50 and the interface unit 61 of the display device 60. The interface unit 16 exchanges control signals, display information, captured image data, and the like with the interface units 51 and 61.

The control unit 20 centrally controls each part of the control device 10. It includes a CPU, a ROM that stores in a nonvolatile manner the basic control program executed by the CPU and the data it processes, a RAM that temporarily stores the programs executed by the CPU and the data being processed, and other peripheral circuits. The control unit 20 controls each part of the control device 10 by reading and executing the basic control program stored in the ROM. The control unit 20 also reads and executes programs stored in the ROM or the storage unit 30 to control the parts connected to the control device 10, thereby implementing the various functions of the control device 10.
Specifically, the control unit 20 has the following functional units: a face direction determination unit 21 (target person detection means), an attribute determination unit 22 (attribute determination means), an audio output control unit 23 (audio output control means), and a speaker pedestal control unit 24. These functional units are implemented by the CPU of the control unit 20 executing predetermined programs.

The face direction determination unit 21 analyzes the captured image data input from the camera 50 and determines the face orientation of each person appearing in the captured image. At a minimum, it determines whether or not each person's face is turned toward the display panel 65.
As shown in FIG. 3, the camera 50 can photograph people in region G. For example, when three people U1, U2, and U3 are in region G, the faces of all three appear in the image captured by the camera 50. In FIG. 3, the face directions of persons U1, U2, and U3 are denoted A1, A2, and A3, respectively. In the example of FIG. 3, face direction A1 of person U1 is sideways relative to the display panel 65, and face direction A3 of person U3 points obliquely away from the display panel 65. By contrast, face direction A2 of person U2 faces the display panel 65 head-on.
The camera 50 photographs the area in front of the display panel 65, that is, the range from which the display panel 65 is visible (region G), from the same side as the display surface of the display panel 65. Consequently, in the captured image, the face of person U2, who is facing the display panel 65, appears as a frontal face.
The face direction determination unit 21 detects human figures in the image captured by the camera 50 and determines each face's orientation by checking whether it is a frontal face. The unit need not be limited to determining whether a person is viewing the display panel 65 from the front. For faces turned sideways or obliquely to the display panel 65, or turned away from it, the unit may also be able to determine the orientation and approximate angle.
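The orientation categories named above (frontal, oblique, sideways, facing away) can be illustrated with a small classifier. This is only a sketch under stated assumptions: the text does not say how the angle is estimated, so the function below takes an already-estimated yaw angle as input, and the function name and all threshold values are hypothetical.

```python
def classify_face_direction(yaw_deg, front_tolerance_deg=20.0):
    """Classify a face by its yaw relative to the camera axis.

    Assumption: yaw 0 means the face looks straight at the camera
    (and therefore at the display panel); +/-180 means the back of
    the head. All thresholds are illustrative, not from the source.
    """
    # Normalize any angle into the range [-180, 180).
    yaw = (yaw_deg + 180.0) % 360.0 - 180.0
    magnitude = abs(yaw)
    if magnitude <= front_tolerance_deg:
        return "front"
    if magnitude <= 60.0:
        return "oblique"
    if magnitude <= 120.0:
        return "side"
    return "away"
```

Only the "front" versus "not front" distinction is required by the embodiment; the finer categories correspond to the optional capability described in the last paragraph above.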

The attribute determination unit 22 analyzes the captured image data input from the camera 50 and determines the attributes of each person appearing in the captured image. At a minimum, the attribute determination unit 22 determines whether or not each person's face is turned toward the display panel 65.
The attribute determination unit 22 detects the human-figure portion of the image captured by the camera 50 and detects features of that portion. The features detected here include the proportion of the image occupied by hair, the color tone of the hair and skin, height and body width and their ratio, facial features, and the color tone of clothing. Based on the detected image features, the attribute determination unit 22 then determines attributes of the person, for example gender and age group.

Based on the face direction determined for each person by the face direction determination unit 21 and the attributes determined for each person by the attribute determination unit 22, the audio output control unit 23 selects, from among the people appearing in the image captured by the camera 50, the target person who is to hear audio from the superdirective speaker 40. The audio output control unit 23 then selects audio suited to the target person by referring to the audio selection table 33 stored in the storage unit 30, reads the selected audio data from the advertisement audio data 32, and causes the audio output unit 11 to output the corresponding audio signal to the superdirective speaker 40.
So that the target person selected by the audio output control unit 23 hears the audio from the superdirective speaker 40, the speaker pedestal control unit 24 calculates the direction and amount by which to drive the speaker pedestal 41 from the target person's position in the captured image. It then controls the pedestal drive unit 12 according to the result to move the speaker pedestal 41.
Through these operations of the audio output control unit 23 and the speaker pedestal control unit 24, audio is output from the superdirective speaker 40 to a specific person (the target person) among the people photographed by the camera 50.
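The calculation of a drive direction and drive amount from the target person's position in the image can be sketched as follows. The text does not give the formula, so this example assumes a simple pinhole camera model, a speaker mounted at the camera position, and horizontal (pan) movement only; the function names and the field-of-view parameter are hypothetical.

```python
import math


def pan_angle_deg(x_px, image_width_px, hfov_deg):
    """Horizontal angle of an image column relative to the camera axis.

    Assumes a pinhole camera whose horizontal field of view is hfov_deg.
    Negative angles are to the left of the image center.
    """
    center = image_width_px / 2.0
    focal_px = center / math.tan(math.radians(hfov_deg) / 2.0)
    return math.degrees(math.atan((x_px - center) / focal_px))


def pedestal_command(current_deg, target_deg):
    """Return (direction, amount) needed to turn from current to target."""
    delta = target_deg - current_deg
    if delta > 0:
        direction = "right"
    elif delta < 0:
        direction = "left"
    else:
        direction = "hold"
    return direction, abs(delta)
```

For instance, with a 640-pixel-wide image and an assumed 60-degree field of view, a person at the right edge of the image lies about 30 degrees to the right of the camera axis. A real installation with the speaker and camera at different positions would also need the distance to the person, which this sketch ignores.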

The storage unit 30 includes a storage device using a magnetic or optical recording medium or a semiconductor storage element, and stores various programs and data in a nonvolatile manner. The storage unit 30 also stores the advertisement image data 31, the advertisement audio data 32, the audio selection table 33, and the personal identification information 34.
The advertisement image data 31 is the data of the images displayed by the display device 60, namely still or moving images advertising products, services, and the like. The advertisement image data 31 contains the data of multiple images.
The advertisement audio data 32 is the data of the audio output from the superdirective speaker 40. It contains multiple audio data items corresponding to the type of each image in the advertisement image data 31, the attributes of the target person who is to hear the audio, and so on.
The audio selection table 33 is a table for selecting, from the advertisement audio data 32, the audio to be output by the superdirective speaker 40. It holds the conditions and other settings used to determine a single audio data item.
The personal identification information 34 is information for telling apart the individual people appearing in images captured by the camera 50. One example is the captured-image features that the attribute determination unit 22 detects when determining attributes.

FIG. 4 is a diagram schematically showing a configuration example of the audio selection table 33.
In the audio selection table 33 of the example shown in FIG. 4, the audio data is determined from the type of advertisement image displayed on the display device 60, the number of times the target person has been detected in images captured by the camera 50, and the target person's attributes.
That is, the audio selection table 33 sets the corresponding audio data for each combination of the type of advertisement image displayed on the display device 60, the number of times the target person has been detected in the captured images, and the target person's attributes (age group, gender). For example, when the advertisement image being displayed on the display device 60 is advertisement image A, the target person detected by the audio output control unit 23 has been detected in the captured image for the first time, and the target person is a man in his 20s to 30s, audio data A1 is set as the audio data.
Accordingly, the audio output control unit 23 can select appropriate audio data from the multiple audio data items in the advertisement audio data 32 based on the type of advertisement image that the control unit 20 is causing the display device 60 to display, the number of times the attribute determination unit 22 has detected the target person in the captured images, and the target person's attributes (age group, gender) determined by the attribute determination unit 22.
Furthermore, in the example of FIG. 4, the audio selection table 33 associates different audio data with each face orientation when the detection is the second or a later one. In this way, the audio selection table 33 can also be set so that different audio data is selected for each face orientation determined by the face direction determination unit 21.
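The lookup described for FIG. 4 can be sketched as a keyed table. Only one row is taken from the text (advertisement image A, first detection, man in his 20s to 30s, audio data A1); the other rows, the key layout, and the function name are hypothetical and serve only to show how face orientation might enter the key for second and later detections.

```python
# Key: (ad image, "first" or "repeat", age band, gender, face direction).
# Face direction is consulted only for repeat detections, so it is None
# in first-detection rows.
AUDIO_SELECTION_TABLE = {
    ("A", "first", "20-30s", "male", None): "A1",  # the row given in the text
    ("A", "repeat", "20-30s", "male", "front"): "A1-detail",  # hypothetical
    ("A", "repeat", "20-30s", "male", "not_front"): "A1-reminder",  # hypothetical
}


def select_audio(ad_image, detection_count, age_band, gender, face_direction):
    """Return the audio data name for the given conditions, or None."""
    occasion = "first" if detection_count <= 1 else "repeat"
    direction = face_direction if occasion == "repeat" else None
    key = (ad_image, occasion, age_band, gender, direction)
    return AUDIO_SELECTION_TABLE.get(key)
```

Returning None for an unmatched combination is a design choice of this sketch; the text does not say what the system does when no row applies.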

Here, the number of times the target person has been detected in images captured by the camera 50 can be obtained, for example, by using the personal identification information 34.
That is, after the attribute determination unit 22 detects the features of a human-figure portion in the image captured by the camera 50, the control unit 20 assigns a unique ID to the detected features and registers them in the personal identification information 34. Thereafter, once the attribute determination unit 22 has detected the features of a human figure in a captured image, the control unit 20 determines whether information on a person image with similar features is already registered in the personal identification information 34. This determination shows whether the figure detected in the captured image was also detected in an earlier captured image, and thus whether the person is appearing in the camera 50's images for the first time.
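The register-then-match scheme above can be sketched as a small in-memory registry. The text does not specify how features are represented or compared, so this example assumes numeric feature vectors matched by Euclidean distance against a threshold; the class name, method names, and threshold are all hypothetical.

```python
import math


class PersonRegistry:
    """Toy stand-in for the personal identification information 34."""

    def __init__(self, match_threshold=0.5):
        self.match_threshold = match_threshold
        self._entries = {}  # person ID -> [feature vector, detection count]
        self._next_id = 1

    def find(self, features):
        """Return the ID of a registered person with similar features."""
        for person_id, (stored, _count) in self._entries.items():
            if math.dist(stored, features) <= self.match_threshold:
                return person_id
        return None

    def register(self, features):
        """Store new features under a fresh unique ID (first detection)."""
        person_id = self._next_id
        self._next_id += 1
        self._entries[person_id] = [tuple(features), 1]
        return person_id

    def record_sighting(self, person_id):
        """Increment and return the detection count for a known person."""
        self._entries[person_id][1] += 1
        return self._entries[person_id][1]
```

A caller would invoke `find` first and fall back to `register` when it returns None, which mirrors the "already registered or not" decision described above.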

FIG. 5 is a flowchart showing the operation of the audio output system 1.
The operation shown in FIG. 5 is performed each time the control unit 20 of the control device 10 samples the image captured by the camera 50, which it does at predetermined intervals. While executing the operation of FIG. 5, the control unit 20 functions as the target person detection means, the attribute determination means, and the audio output control means.
The control unit 20 first acquires the image captured by the camera 50 via the interface unit 16 (step S11). The image acquired here may be still image data or a single frame extracted from moving image data.
The control unit 20 performs detection with the face direction determination unit 21 and the attribute determination unit 22 to determine whether the captured image contains an image of a human figure (a person image) (step S12). If the captured image contains no person image (step S12; No), the control unit 20 ends the process.

On the other hand, if the captured image contains a person image, that is, if a person appears in it (step S12; Yes), the control unit 20 selects one person image to process from among all the person images detected in the captured image (step S13) and determines whether it is the image of a person registered in the personal identification information 34 (step S14). As described above, this determination is made by detecting the features of the person image being processed and checking whether an image with the same features is registered in the personal identification information 34.
If the person image being processed is not yet registered in the personal identification information 34 (step S14; No), the control unit 20 uses the face direction determination unit 21 to determine the face orientation from that person image (step S15).

If the face orientation determined from the person image being processed is toward the front of the display panel 65 (step S16; Yes), the control unit 20 proceeds to step S23 without performing audio output or related processing. In other words, in this embodiment the control unit 20 does not output audio from the superdirective speaker 40 to a person who is already looking at the advertisement image displayed on the display panel 65. The aim is to draw attention to the advertisement by having people who are not looking at the advertisement image hear the audio from the superdirective speaker 40, which is particularly effective when attracting attention to the advertisement is the top priority.

If the face orientation determined from the person image being processed is not toward the front of the display panel 65 (step S16; No), the control unit 20 designates this person as the target person for audio output from the superdirective speaker 40 (step S17), has the attribute determination unit 22 determine the person's attributes from the person image (step S18), and uses the audio output control unit 23 to select audio data according to the audio selection table 33 and acquire the selected audio data from the advertisement audio data 32 (step S19).
Next, to align the audio output direction of the superdirective speaker 40 with the direction of the target person designated in step S17, the control unit 20 uses the speaker pedestal control unit 24 to control the pedestal drive unit 12 and adjust the orientation of the superdirective speaker 40 (step S20). The control unit 20 then uses the audio output control unit 23 to output audio from the superdirective speaker 40 (step S21), registers the features detected for this person image in step S18 in the personal identification information 34 (step S22), and proceeds to step S23.
In step S23, the control unit 20 determines whether all the person images detected in the image captured by the camera 50 have been processed. If all of them have been processed (step S23; Yes), the process ends. If any person image remains unprocessed (step S23; No), the process returns to step S13 and takes another person image as the processing target.

If, on the other hand, the person image selected for processing in step S13 is already registered in the personal identification information 34 (step S14; Yes), the control unit 20 uses the face direction determination unit 21 to determine the face orientation (step S24), selects audio data based on the audio selection table 33 while also taking this face orientation into account, and acquires the selected audio data from the advertisement audio data 32 (step S25). Next, to align the audio output direction of the superdirective speaker 40 with the direction of the target person designated in step S17, the control unit 20 uses the speaker pedestal control unit 24 to control the pedestal drive unit 12 and adjust the orientation of the superdirective speaker 40 (step S26). The control unit 20 then uses the audio output control unit 23 to output audio from the superdirective speaker 40 (step S27) and proceeds to step S23.
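The per-image loop of FIG. 5 (steps S13 to S27) can be condensed into the following sketch. It is an illustration under assumptions, not the patented implementation: each person is represented as a dictionary whose "id" stands for the result of the feature matching in step S14, "facing_panel" stands for the result of steps S15 and S24, and `choose_audio` is a hypothetical callback standing in for the table lookup. Speaker aiming and playback are reduced to recording an output.

```python
def process_frame(persons, registered, choose_audio):
    """Process one sampled image and return the audio outputs made.

    persons: list of dicts with keys "id" and "facing_panel".
    registered: set of IDs already in the identification information.
    choose_audio: callable(person, repeat) -> audio name (hypothetical).
    """
    outputs = []
    for person in persons:  # S13, repeated until S23 says all are done
        if person["id"] in registered:  # S14; Yes
            audio = choose_audio(person, repeat=True)  # S24-S25
            outputs.append((person["id"], audio))  # S26-S27
            continue
        if person["facing_panel"]:  # S15-S16; Yes
            continue  # already looking at the ad: no audio, not registered
        audio = choose_audio(person, repeat=False)  # S17-S19
        outputs.append((person["id"], audio))  # S20-S21
        registered.add(person["id"])  # S22
    return outputs
```

Note that, following the flowchart as described, a first-time person who is already facing the panel is skipped without being registered, so that person is treated as new again in the next sampled image.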

As described above, the audio output system 1 according to the embodiment to which the present invention is applied uses the camera 50 to photograph people within the range from which the display panel 65 of the display device 60 displaying the advertisement image is visible, determines each person's face orientation from the captured image, treats a person facing a predetermined direction (for example, a person not facing the display panel 65) as the target person, and outputs audio from the superdirective speaker 40 to that person. The system can therefore choose a target person according to the person's state of attention to the advertisement and output audio that only that person hears. Among the people in the range G from which the advertisement is visible, the system can thus give spoken guidance to those whose attention to the advertisement is in a particular state, for example those not looking at the advertisement image on the display panel 65 or those looking at it, and so attract attention to the advertisement effectively. In addition, using the superdirective speaker 40 allows audio to be output so that only a few people hear it, which draws attention to the advertisement displayed on the display panel 65 more strongly than audio output so as to be heard over a wide area.

In the audio output system 1, the camera 50 photographs the range G from which the display panel 65 displaying the advertisement image is visible, and the control device 10 causes the superdirective speaker 40 to output audio related to the advertisement image being displayed by the display device 60. By outputting audio related to the advertisement image to people whose attention to the advertisement image on the display panel 65 is in a particular state, the system can attract attention to the advertisement displayed on the display screen more effectively. It also becomes possible to provide more information about the content of the advertisement being displayed by the display device 60, which can be expected to further enhance the advertising effect.

Furthermore, the control device 10 uses the attribute determination unit 22 to determine the attributes of the people appearing in the image captured by the camera 50, acquires audio data matching those attributes from the advertisement audio data 32 based on the audio selection table 33, and outputs it from the superdirective speaker 40. Outputting audio suited to the target person's attributes appeals to the target person strongly and effectively, making it possible both to attract attention to the advertisement and to enhance the advertising effect.

In addition, based on the audio selection table 33, the control device 10 selects and outputs different audio data depending on whether the person image detected in the image captured by the camera 50 is appearing for the first time or for the second or a later time. This allows fine-grained responses such as playing different audio to the same person, which makes it possible to attract attention to the advertisement displayed on the display panel 65 more effectively and to enhance the advertising effect.
Moreover, different audio data is selected and output according not only to the number of times the person image has appeared in the captured images but also to the face orientation at the second or later appearance. Audio can therefore be matched to the target person's behavior, which makes it possible to attract attention to the advertisement displayed on the display panel 65 more effectively and to enhance the advertising effect.
For example, if the target person's face is turned toward the display panel 65 after the person has heard audio prompting them to look at the display panel 65, the person is paying attention to the advertisement image on the display panel 65 as the audio from the superdirective speaker 40 suggested. When this person is detected in the captured image for the second time, the system can therefore play audio that explains the content of the advertisement image being displayed in more detail, or audio thanking the person for looking at the display panel 65. Changing the audio to reflect the target person's behavior in this way can be expected to yield a higher advertising effect.

The above embodiment was described using, as an example, a configuration in which audio data is selected from the audio selection table 33 according to the target person's attributes. The present invention is not limited to this. For example, the attributes of the intended target person may be set in advance, and audio may be output only to a person whose image matches those attributes.
This case is described below as a modification.

[Modification]
FIG. 6 is a flowchart showing the operation of the audio output system 1 in a modification of the embodiment to which the present invention is applied.
In this modification, the configuration of the audio output system 1 is the same as in the above embodiment, so its components are denoted by the same reference numerals, and their illustration and description are omitted. Some of the processing in the flowchart of FIG. 6 is the same as the operation shown in FIG. 5. These common steps are given the same step numbers and described only in outline.

The operation shown in FIG. 6 is performed each time the control unit 20 of the control device 10 samples the image captured by the camera 50, which it does at predetermined intervals. The control unit 20 acquires the image captured by the camera 50 (step S11) and determines whether the captured image contains a person image (step S12). If it contains no person image (step S12; No), the process ends.
If the captured image contains a person image (step S12; Yes), the control unit 20 selects one person image to process from among all the person images detected in the captured image (step S13) and determines whether it is the image of a person registered in the personal identification information 34 (step S14).
If the person image being processed is not registered in the personal identification information 34 (step S14; No), the control unit 20 determines the face orientation from the person image (step S15).
If the face is turned toward the front of the display panel 65 (step S16; Yes), the control unit 20 proceeds to step S23 without performing audio output or related processing. If the face is not turned toward the front of the display panel 65 (step S16; No), the control unit 20 has the attribute determination unit 22 determine the person's attributes from the person image (step S18) and determines whether they match the attributes set in advance as the audio output target (step S31).

If the attributes determined by the attribute determination unit 22 match the attributes set in advance as the audio output target (step S31; Yes), the control unit 20 designates this person as the target person for audio output from the superdirective speaker 40 (step S17) and uses the audio output control unit 23 to select audio data according to the audio selection table 33 and acquire the selected audio data from the advertisement audio data 32 (step S32).
Next, the control unit 20 uses the speaker pedestal control unit 24 to control the pedestal drive unit 12 and adjust the orientation of the superdirective speaker 40 (step S20), outputs audio from the superdirective speaker 40 (step S21), registers the features detected for this person image in step S18 in the personal identification information 34 (step S22), and proceeds to step S23.
In step S23, the control unit 20 determines whether all the person images detected in the image captured by the camera 50 have been processed. If all of them have been processed (step S23; Yes), the process ends. If any person image remains unprocessed (step S23; No), the process returns to step S13 and takes another person image as the processing target.

If the person image selected for processing in step S13 is already registered in the personal identification information 34 (step S14; Yes), the control unit 20 uses the face direction determination unit 21 to determine the face orientation (step S24), selects and acquires audio data based on the audio selection table 33 while also taking this face orientation into account (step S25), adjusts the orientation of the superdirective speaker 40 (step S26), outputs audio from the superdirective speaker 40 (step S27), and proceeds to step S23.

With the operation of this modification, the superdirective speaker 40 outputs audio only when the person image detected in the image captured by the camera 50 has attributes that were set in advance as the target person's attributes. People with any other attributes do not hear audio from the superdirective speaker 40. There is little point in drawing the attention of people who fall outside the attributes the advertisement is aimed at, such as the target age group or target gender of the advertisement image displayed on the display panel 65. In this modification, people who do not match the attributes set on the basis of the advertisement image's target attributes and the like do not hear the audio from the superdirective speaker 40, and only people with the set attributes do. This attracts attention to the advertisement effectively, and a higher advertising effect can be expected.
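The extra gate that the modification adds for first-time persons (steps S16 and S31) can be sketched as a single predicate. The attribute representation as an (age band, gender) pair, the function name, and the example target set are hypothetical. The text does not spell out the "step S31; No" branch, so the sketch assumes that such a person simply receives no audio.

```python
def should_output_audio(facing_panel, attributes, target_attributes):
    """Decide whether an unregistered person becomes the target person.

    facing_panel: True if the face is turned toward the display panel.
    attributes: (age_band, gender) pair determined for the person.
    target_attributes: set of pairs configured in advance for the ad.
    """
    if facing_panel:  # S16; Yes: already looking at the advertisement
        return False
    return attributes in target_attributes  # S31


# Example target setting for an ad aimed at people in their 20s to 30s.
EXAMPLE_TARGETS = {("20-30s", "male"), ("20-30s", "female")}
```

For people already registered in the personal identification information 34, the modification as described performs no such attribute check and goes straight to steps S24 to S27.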

The embodiment and modification described above merely illustrate one aspect of the present invention, and can be modified and applied as desired within the scope of the present invention.
In the above embodiment and modification, the attribute determination unit 22 determines the gender and age group of a person as attributes. However, the attributes determined are not limited to gender; the unit may instead determine whether the person is Japanese or a foreigner, so that Japanese-language audio is output to a Japanese person and foreign-language audio is output to a foreigner.
Also, the above embodiment and modification describe, as an example, a configuration in which the superdirective speaker 40 and the camera 50 are mounted on a display device 60 that is, for example, wall-mounted. The present invention is not limited to this: the superdirective speaker 40 and the camera 50 can of course be installed away from the display device 60, and the superdirective speaker 40 and the camera 50 can also be installed apart from each other. Furthermore, the display surface for the advertisement is not limited to the display panel 65 of the display device 60. For example, a bulletin board on which a poster made of paper or a synthetic resin sheet is posted also corresponds to a display surface for an advertisement, and when an advertisement is drawn directly on a wall, the wall itself can be treated as the display surface. That is, the camera 50 photographs the range from which the wall is visible, a target person to hear the audio is selected based on the captured image, and the superdirective speaker 40 then outputs audio toward that target person.
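The language-selection variant mentioned above (Japanese audio for a Japanese person, foreign-language audio for a foreigner) might look like the following. This is a hedged sketch: the `origin` labels, the choice of English as the foreign language, and the clip naming scheme are assumptions made for illustration and do not appear in the source.

```python
# Hypothetical mapping from the determined attribute to an audio language.
# The source only says "foreign language"; English is assumed here.
LANGUAGE_BY_ORIGIN = {
    "japanese": "ja",
    "foreign": "en",
}


def select_language(origin, default="ja"):
    """Return the language code for the determined attribute."""
    return LANGUAGE_BY_ORIGIN.get(origin, default)


def clip_for(ad_id, origin):
    """Build the (assumed) file name of the clip to play for this person."""
    return f"{ad_id}_{select_language(origin)}.wav"
```

Falling back to a default language when the attribute cannot be determined is a design choice of this sketch, not something the source specifies.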

Furthermore, the number of superdirective speakers 40 and cameras 50 in the above embodiment and modification is arbitrary. The program executed by the control unit 20 may be recorded in the storage unit 30 or on a recording medium readable by the recording medium reading unit 15, or may be downloaded via a communication line (not shown). The detailed configuration of the other parts of the audio output system 1 can, of course, also be changed as desired.

A functional block diagram showing the configuration of the audio output system.
A perspective view showing how the superdirective speaker and the camera are installed.
A plan view showing the imaging range of the camera.
A diagram schematically showing a configuration example of the audio selection table.
A flowchart showing the operation of the audio output system.
A flowchart showing a modification of the operation of the audio output system.

Explanation of symbols

1 ... audio output system (audio output device), 10 ... control device (audio output control device), 11 ... audio output unit, 12 ... pedestal drive unit, 20 ... control unit (target person detecting means), 21 ... face direction determination unit (target person detecting means), 22 ... attribute determination unit (attribute determination means), 23 ... audio output control unit (audio output control means), 24 ... speaker pedestal control unit, 30 ... storage unit, 31 ... advertisement image data, 32 ... advertisement audio data, 33 ... audio selection table, 34 ... personal identification information, 40 ... superdirective speaker, 41 ... speaker pedestal (sound direction adjusting mechanism), 50 ... camera (photographing means), 60 ... display device, 65 ... display panel.

Claims

An audio output control device connected to a superdirective speaker that outputs sound in a specific direction and to a sound direction adjusting mechanism that adjusts the sound output direction of the superdirective speaker, the device comprising:
photographing means for photographing a range from which the display surface of an advertisement is visible;
target person detecting means for detecting, as a target person, a person whose face is facing a predetermined direction among the persons appearing in the image captured by the photographing means; and
audio output control means for causing the sound direction adjusting mechanism to direct the sound output direction of the superdirective speaker toward the target person detected by the target person detecting means, and for causing the superdirective speaker to output sound.

The audio output control device according to claim 1, wherein
the photographing means photographs, as the display surface of the advertisement, a range from which the display screen of a display device that displays an advertisement image is visible, and
the audio output control means causes the superdirective speaker to output sound related to the advertisement image being displayed by the display device.

The audio output control device according to claim 1 or 2, further comprising
attribute determination means for determining an attribute of the target person based on the image captured by the photographing means, wherein
the audio output control means directs the sound output direction of the superdirective speaker toward a target person determined by the attribute determination means to have a specific attribute.

The audio output control device according to claim 1 or 2, further comprising
attribute determination means for determining an attribute of the target person based on the image captured by the photographing means, wherein
the audio output control means causes the superdirective speaker to output sound corresponding to the attribute of the target person determined by the attribute determination means.

An audio output device comprising:
a superdirective speaker that outputs sound in a specific direction;
a sound direction adjusting mechanism that adjusts the sound output direction of the superdirective speaker;
photographing means for photographing a range from which the display surface of an advertisement is visible;
target person detecting means for detecting persons appearing in the image captured by the photographing means and detecting, as a target person, a person whose face is facing a predetermined direction; and
audio output control means for causing the sound direction adjusting mechanism to direct the sound output direction of the superdirective speaker toward the target person detected by the target person detecting means, and for causing the superdirective speaker to output sound.

An audio output control method performed by an audio output control device connected to a superdirective speaker that outputs sound in a specific direction and to a sound direction adjusting mechanism that adjusts the sound output direction of the superdirective speaker, the method comprising:
photographing the area in front of the display surface of an advertisement;
detecting persons appearing in the captured image and detecting, as a target person, a person whose face is facing a predetermined direction; and
causing the sound direction adjusting mechanism to direct the sound output direction of the superdirective speaker toward the target person, and causing the superdirective speaker to output sound.

A program executed by a computer connected to a superdirective speaker that outputs sound in a specific direction and to a sound direction adjusting mechanism that adjusts the sound output direction of the superdirective speaker, the program causing the computer to function as:
photographing means for photographing the area in front of the display surface of an advertisement;
target person detecting means for detecting persons appearing in the image captured by the photographing means and detecting, as a target person, a person whose face is facing a predetermined direction; and
audio output control means for causing the sound direction adjusting mechanism to direct the sound output direction of the superdirective speaker toward the target person detected by the target person detecting means, and for causing the superdirective speaker to output sound.