WO2015170368A1 - Directivity control apparatus, directivity control method, storage medium, and directivity control system - Google Patents


Info

Publication number
WO2015170368A1
Authority
WO
WIPO (PCT)
Prior art keywords
unit
directivity
sound
image
tracking
Prior art date
Application number
PCT/JP2014/002473
Other languages
French (fr)
Japanese (ja)
Inventor
信一 重永
昭年 泉
林 和典
徳田 肇道
裕隆 澤
Original Assignee
Panasonic Intellectual Property Management Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panasonic Intellectual Property Management Co., Ltd.
Priority to CN201480045464.2A (CN105474667B)
Priority to JP2015526795A (JP6218090B2)
Priority to PCT/JP2014/002473 (WO2015170368A1)
Publication of WO2015170368A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones

Definitions

  • The present invention relates to a directivity control device, a directivity control method, a storage medium, and a directivity control system that control the directivity of sound.
  • In a surveillance system installed at a predetermined position (for example, a ceiling surface) of a factory, a store (for example, a retail store or a bank), or a public place (for example, a library), one or more camera devices (for example, PTZ (Pan-Tilt-Zoom) camera devices or omnidirectional camera devices) are connected in order to widen the angle of view of the image data (including still images and moving images; the same applies hereinafter) of the video of the monitoring target range.
  • The pan/tilt head control device shown in Patent Document 1 displays, on a monitor TV, an image captured by a TV camera mounted on a pan/tilt head provided with pan and tilt driving means. When trajectory points from a movement start point to an end point are input on the monitor TV screen, the successive trajectory points are connected to obtain a continuous trajectory line, and the trajectory data from the start point to the end point of the trajectory line is read sequentially so that automatic shooting is executed with the read point positioned at the center of the shooting screen. Thus, the pan/tilt head control device of the TV camera can obtain pan and tilt drive trajectory data by a simple input operation of entering trajectory points on the monitor TV screen, and can perform accurate drive control.
  • However, Patent Document 1 does not disclose a configuration for picking up sound produced by a person shown on the monitor TV. Even if the configuration of Patent Document 1 is applied to the above-described monitoring system, there is a problem in that it is difficult to pick up, with high accuracy, the voice of a person on a trajectory point between the movement start point and the end point.
  • An object of the present invention is to provide a directivity control device, a directivity control method, a storage medium, and a directivity control system that suppress degradation of the efficiency of monitoring work.
  • The present invention is a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, including: a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; and an information acquisition unit that acquires information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target, wherein the directivity forming unit switches the directivity of the sound in a direction toward the monitoring target corresponding to the second designated position, using the information on the second designated position acquired by the information acquisition unit.
  • The present invention is also a directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the method including: a step of forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching the directivity of the sound in a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
  • The present invention is also a storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the processing including: a step of forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching the directivity of the sound in a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
  • The present invention is also a directivity control system including: an imaging unit that images a sound collection area; a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and a directivity control device that controls the directivity of the sound collected by the first sound collection unit, wherein the directivity control device forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit and, using information on a second designated position designated in accordance with the movement of the monitoring target, switches the directivity of the sound.
  • According to the present invention, even when the monitoring target on the image moves, the directivity of the sound can be appropriately formed so as to follow the monitoring target, and degradation of the efficiency of the supervisor's monitoring work can be suppressed.
  • An explanatory drawing showing an operation example of the manual tracking process
  • An explanatory drawing showing an operation example of changing a tracking point by the manual tracking process when the tracking point automatically designated in the automatic tracking process is incorrect
  • An explanatory drawing showing the slow playback process in the recording/playback mode and the slow playback mode
  • An explanatory drawing showing the enlarged display process in the enlarged display mode
  • An explanatory drawing showing the automatic scroll process after the enlarged display process in the enlarged display mode
  • (A) A flowchart explaining a first example of the overall flow of the manual tracking process in the directivity control system of the first embodiment, and (B) a flowchart explaining a second example of the overall flow of the manual tracking process in the directivity control system of the first embodiment
  • A flowchart explaining a second example of the automatic tracking process shown in FIG. 10
  • A flowchart explaining an example of the tracking correction process shown in (A)
  • A flowchart explaining a third example of the automatic tracking process shown in FIG. 10
  • A flowchart explaining an example of the tracking assist process
  • (A) A flowchart illustrating an example of the automatic scroll process necessity determination process, and (B) an explanatory diagram of the scroll necessity determination line in the automatic scroll process necessity determination process
  • (A) A flowchart explaining an example of the overall flow of the flow line display reproduction process using the tracking list in the directivity control system of the first embodiment, and (B) a flowchart explaining an example of the reproduction start time calculation process shown in (A)
  • A flowchart explaining an example of the flow line display process
  • (A) A flowchart explaining an example of the audio output process, and (B) a flowchart explaining an example of the image privacy protection process
  • (A) A diagram showing an example of the waveform of an audio signal corresponding to the pitch before the voice change process, (B) a diagram showing an example of the waveform of an audio signal corresponding to the pitch after the voice change process, and (C) an explanatory diagram of the process of blurring the outline of a detected person's face
  • A block diagram showing a system configuration example of the directivity control system of the second embodiment
  • An explanatory drawing showing the automatic switching process of the camera device used for capturing the image displayed on the display device
  • An explanatory drawing showing the automatic switching process of the omnidirectional microphone array device used for collecting the sound of the monitoring target
  • An explanatory drawing showing the manual switching process of the camera device used for capturing the image displayed on the display device
  • An explanatory drawing showing the manual switching process of the omnidirectional microphone array device used for collecting the sound of the monitoring target
  • An explanatory drawing showing the selection process of the optimal omnidirectional microphone array device used for collecting the sound of the monitoring target
  • (A) A flowchart explaining an example of the automatic switching process of the omnidirectional microphone array device in the directivity control system of the second embodiment, and (B) a flowchart showing an example of the microphone switching determination process shown in (A)
  • (A) A flowchart explaining an example of the manual switching process of the camera device in the directivity control system of the second embodiment, and (B) a flowchart explaining an example of the manual switching process of the omnidirectional microphone array device in the directivity control system of the second embodiment
  • (A) A flowchart explaining a first example of the optimal omnidirectional microphone array device selection process in the directivity control system of the second embodiment, and (B) a flowchart explaining a second example of the optimal omnidirectional microphone array device selection process in the directivity control system of the second embodiment
  • The directivity control system of each embodiment is used, for example, as a monitoring system (including manned and unmanned monitoring systems) installed in a factory, a public facility (for example, a library or an event venue), or a store (for example, a retail store or a bank).
  • The present invention can also be expressed as a program for causing a directivity control device, which is a computer, to execute the operations defined by the directivity control method, or as a computer-readable recording medium on which such a program is recorded.
  • FIG. 1 is an explanatory diagram illustrating an outline of operations of the directivity control systems 100 and 100A according to the first embodiment.
  • FIG. 2 is a block diagram illustrating a first system configuration example of the directivity control system 100 according to the first embodiment.
  • FIG. 3 is a block diagram illustrating a second system configuration example of the directivity control system 100A according to the first embodiment.
  • The camera device C1 images a monitoring target (for example, the person HM1) of the directivity control systems 100 and 100A used as a monitoring system, and transmits the image data obtained by the imaging via the network NW.
  • The person HM1 may be stationary or moving, but is described here as moving.
  • The person HM1 moves from the tracking position A1 (x1, y1, z0) at the tracking time t1 to the tracking position A2 (x2, y2, z0) by the tracking time t2.
  • The tracking point refers to the position at which the user designates the person HM1 on the tracking screen TRW (that is, a position on the tracking screen TRW) when an image of the moving person HM1 captured by the camera device C1 is displayed on the tracking screen TRW of the display device 35.
  • Data of the tracking position and the tracking time are associated with each tracking point (see, for example, FIG. 16B described later).
  • The tracking position is a three-dimensional coordinate indicating the position in real space corresponding to the position on the tracking screen TRW at which the person HM1 is designated.
  • The tracking screen TRW is, among the screens on which images captured by a camera device (for example, the camera device C1) are displayed on the display device 35 (hereinafter referred to as "camera screens"), a screen showing the monitoring target (for example, the person HM1) that is subject to the voice tracking process described later.
  • In other words, a screen on which no monitoring target such as the person HM1 is shown is referred to as a camera screen, and a screen on which the monitoring target is shown is referred to as a tracking screen.
  • The omnidirectional microphone array device M1 picks up the sound emitted by the person HM1 and transmits the collected sound data to the directivity control device 3 connected via the network NW.
  • The directivity control device 3 forms the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the tracking position A1. When the person HM1 moves from the tracking position A1 to the tracking position A2, the directivity control device 3 switches the directivity of the collected sound to the direction from the omnidirectional microphone array device M1 toward the tracking position A2.
  • In other words, as the person HM1, the monitoring target, moves from the tracking position A1 to the tracking position A2, the directivity control device 3 makes the directivity of the collected sound follow, from the direction from the omnidirectional microphone array device M1 toward the tracking position A1 to the direction from the omnidirectional microphone array device M1 toward the tracking position A2; that is, it performs the sound tracking process (sketched below).
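  • As an illustrative aside (not part of the patent text), the sound tracking process can be modeled in a few lines: each designation yields a tracking point carrying its screen position, its real-space coordinates, and its tracking time, and the beam is re-steered toward the newest point. The following Python sketch uses hypothetical names throughout.

      import math
      import time
      from dataclasses import dataclass, field

      @dataclass
      class TrackingPoint:
          """One designation on the tracking screen TRW (hypothetical model)."""
          screen_xy: tuple      # position on the tracking screen TRW (pixels)
          world_xyz: tuple      # corresponding real-space position, e.g. (x1, y1, z0)
          tracking_time: float  # time at which the position was designated

      @dataclass
      class TrackingList:
          """Simplified stand-in for the tracking list LST kept in the memory 33."""
          points: list = field(default_factory=list)

          def add(self, screen_xy, world_xyz):
              self.points.append(TrackingPoint(screen_xy, world_xyz, time.time()))

      def direction_to(mic_xyz, target_xyz):
          """Horizontal/vertical angles (degrees) from the array to the target."""
          dx, dy, dz = (t - m for t, m in zip(target_xyz, mic_xyz))
          theta_h = math.degrees(math.atan2(dy, dx))
          theta_v = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
          return theta_h, theta_v

      # Sound tracking: when a new tracking point is designated (A1 -> A2),
      # the directivity is switched toward the newest point.
      mic_pos = (0.0, 0.0, 3.0)            # e.g. ceiling-mounted array (assumed)
      lst = TrackingList()
      lst.add((120, 80), (2.0, 1.0, 0.0))  # tracking position A1 at time t1
      lst.add((160, 90), (3.5, 1.2, 0.0))  # tracking position A2 at time t2
      print("beam (theta_h, theta_v) =", direction_to(mic_pos, lst.points[-1].world_xyz))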
  • The directivity control system 100 shown in FIG. 2 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3, and the recorder device 4. Here, n and m are integers of 1 or more and may be equal or different. The same applies to the following embodiments.
  • the camera devices C1, ..., Cn, the omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3, and the recorder device 4 are connected to each other via a network NW.
  • the network NW may be a wired network (for example, an intranet or the Internet), or a wireless network (for example, a wireless LAN (Local Area Network), WiMAX (registered trademark), or a wireless WAN (Wide Area Network)).
  • In the following, in order to simplify the description, it is assumed that one camera device C1 and one omnidirectional microphone array device M1 are provided.
  • In the description below, the housing of the camera device C1 and the housing of the omnidirectional microphone array device M1 are separately attached at different positions, but they may be integrally attached at the same position.
  • The camera device C1, as an example of the imaging unit, is fixedly installed on, for example, the ceiling surface of an event venue, functions as a monitoring camera in the monitoring system, and, by remote operation from a monitoring control room (not shown) connected to the network NW, captures images within a predetermined angle of view in a predetermined sound collection area (for example, a predetermined area in the event venue).
  • The camera device C1 may be a camera having a PTZ function or a camera capable of capturing an omnidirectional image. When the camera device C1 is a camera capable of capturing an omnidirectional image, it transmits the image data showing the omnidirectional video of the sound collection area (that is, the omnidirectional image data), or planar image data generated by applying correction processing and panorama conversion to the omnidirectional image data, to the directivity control device 3 or the recorder device 4 via the network NW.
  • When a position in the image data is designated, the coordinate data of the designated position is passed to the camera device C1, and the camera device C1 transmits to the directivity control device 3 data on the distance and direction (including the horizontal angle and the vertical angle; the same applies hereinafter) from the camera device C1 to the sound position in the real space corresponding to the designated position (hereinafter simply abbreviated as "sound position").
  • The omnidirectional microphone array device M1, as an example of the sound collection unit, is fixedly installed on, for example, the ceiling surface of the event venue, and includes at least a microphone unit in which a plurality of microphone units 22 and 23 (see FIGS. 36A to 36E) are provided at even intervals, and a CPU (Central Processing Unit) that controls the operation of the microphone units 22 and 23 of the microphone unit.
  • When the power is turned on, the omnidirectional microphone array device M1 performs predetermined audio signal processing (for example, amplification, filtering, and addition) on the audio data of the sound collected by the microphone elements of the microphone unit, and transmits the resulting audio data to the directivity control device 3 or the recorder device 4 via the network NW.
  • FIGS. 36A to 36E are external views of the casing of the omnidirectional microphone array apparatus M1.
  • The omnidirectional microphone array devices M1C, M1A, M1B, M1, and M1D shown in FIGS. 36A to 36E differ in appearance and in the arrangement of the plurality of microphone units, but their functions as omnidirectional microphone array devices are the same.
  • FIG. 36A shows the omnidirectional microphone array device M1C having a disk-shaped casing 21.
  • In the casing 21, a plurality of microphone units 22 and 23 are arranged concentrically. Specifically, the plurality of microphone units 22 are arranged along the circumference of a circle concentric with the casing 21, and the plurality of microphone units 23 are arranged along the circumference of a smaller concentric circle inside the casing 21.
  • The microphone units 22 are spaced widely apart, have large diameters, and have characteristics suitable for the low frequency range, whereas the microphone units 23 are spaced narrowly, have small diameters, and have characteristics suitable for the high frequency range.
  • FIG. 36B shows the omnidirectional microphone array device M1A having a disk-shaped casing 21.
  • In the casing 21, a plurality of microphone units 22 are arranged in a cross shape at equal intervals along the vertical and horizontal directions, the two arrays crossing at the center of the casing 21.
  • In the omnidirectional microphone array device M1A, since the plurality of microphone units 22 are arranged linearly in two directions, the amount of calculation required to form the directivity of the audio data can be reduced.
  • The plurality of microphone units 22 may also be arranged in only one row, either vertical or horizontal.
  • The omnidirectional microphone array device M1B shown in FIG. 36C has a disk-shaped casing 21B with a smaller diameter than that of the omnidirectional microphone array device M1C shown in FIG. 36A.
  • In the casing 21B, a plurality of microphone units 22 are arranged at equal intervals along the circumference.
  • Since the distance between the microphone units 22 is short, the omnidirectional microphone array device M1B shown in FIG. 36C has characteristics suitable for the high frequency range.
  • FIG. 36D shows the omnidirectional microphone array device M1 having a donut-shaped or ring-shaped casing 21C, in which an opening 21a with a predetermined diameter is formed at the center.
  • In the embodiments, this omnidirectional microphone array device M1 shown in FIG. 36D is used.
  • In the casing 21C, a plurality of microphone units 22 are arranged concentrically at equal intervals along the circumferential direction of the casing 21C.
  • The omnidirectional microphone array device M1D shown in FIG. 36E has a rectangular casing 21D.
  • In the casing 21D, a plurality of microphone units 22 are arranged at equal intervals along the outer periphery.
  • Since the casing 21D is rectangular, the omnidirectional microphone array device M1D can be installed easily even at, for example, a corner or on a wall surface.
  • The microphone units 22 and 23 of the omnidirectional microphone array device M1 may be omnidirectional microphones, bidirectional microphones, unidirectional microphones, sharp-directional microphones, super-directional microphones (for example, shotgun microphones), or a combination of these.
  • The directivity control devices 3 and 3A may each be, for example, a stationary PC (Personal Computer) installed in a monitoring control room (not shown), or a data communication terminal that can be carried by the user, such as a mobile phone, a PDA (Personal Digital Assistant), a tablet terminal, or a smartphone.
  • the directivity control device 3 includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34, a display device 35, and a speaker device 36.
  • the signal processing unit 34 includes at least a directivity direction calculation unit 34a, an output control unit 34b, and a tracking processing unit 34c.
  • The communication unit 31 receives the image data transmitted from the camera device C1 or the audio data transmitted from the omnidirectional microphone array device M1, and outputs the received data to the signal processing unit 34.
  • The operation unit 32 is a user interface (UI) for notifying the signal processing unit 34 of the user's input operations, and is, for example, a pointing device such as a mouse and a keyboard.
  • The operation unit 32 may also be configured using a touch panel arranged over the display screen of the display device 35, capable of detecting input operations with the user's finger FG or a stylus pen.
  • The operation unit 32 outputs to the signal processing unit 34 the coordinate data of a position designated with the cursor CSR through the user's mouse operation or with the user's finger FG in the image data displayed on the display device 35 (that is, the image data captured by the camera device C1).
  • the memory 33 is configured by using, for example, a RAM (Random Access Memory), and functions as a work memory during operation of each unit of the directivity control device 3.
  • The memory 33, as an example of the image storage unit or the audio storage unit, is configured using, for example, a hard disk or a flash memory, and stores the image data or audio data held in the recorder device 4, that is, image data captured by the camera device C1 over a certain period or audio data collected by the omnidirectional microphone array device M1 over a certain period.
  • The memory 33, as an example of the designation list storage unit, stores the data of the tracking list LST (see, for example, FIG. 16B), a designation list containing the data of all designated positions and designation times (described later) on the tracking screen TRW of the image data displayed on the display device 35.
  • The signal processing unit 34 is configured using, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a DSP (Digital Signal Processor), and performs control processing that supervises the operation of each unit of the directivity control device 3, data input/output processing with the other units, data calculation processing, and data storage processing.
  • When the coordinate data of the position in the image data designated with the cursor CSR through the user's mouse operation or with the user's finger FG is obtained from the operation unit 32, the directivity direction calculation unit 34a has the coordinate data transmitted from the communication unit 31 to the camera device C1, and acquires from the communication unit 31 the data on the distance and direction from the installation position of the camera device C1 to the sound (sound source) position in the real space corresponding to the designated position of the image data.
  • Using the data on the distance and direction from the installation position of the camera device C1 to the sound position, the directivity direction calculation unit 34a calculates the directivity direction coordinates (θMAh, θMAv) from the installation position of the omnidirectional microphone array device M1 to the sound position.
  • When the housings of the camera device C1 and the omnidirectional microphone array device M1 are attached at different positions, the directivity direction calculation unit 34a calculates the directivity direction coordinates (θMAh, θMAv) from the omnidirectional microphone array device M1 to the sound position (sound source position), using predetermined calibration parameter data calculated in advance and the data on the direction (horizontal angle, vertical angle) from the camera device C1 to the sound position (sound source position).
  • The calibration is an operation for calculating or acquiring the predetermined calibration parameters necessary for the directivity direction calculation unit 34a of the directivity control device 3 to calculate the directivity direction coordinates (θMAh, θMAv).
  • Specific contents of the calibration method and calibration parameters are not particularly limited, and can be realized, for example, within the scope of known techniques.
  • When the omnidirectional microphone array device M1 is integrally attached so as to surround the camera device C1, the direction (horizontal angle, vertical angle) from the camera device C1 to the sound position (sound source position) can be used as the directivity direction coordinates (θMAh, θMAv) from the omnidirectional microphone array device M1 to the sound position.
  • Here, θMAh denotes the horizontal angle of the directivity direction from the installation position of the omnidirectional microphone array device M1 to the sound position, and θMAv denotes the vertical angle of the directivity direction from the installation position of the omnidirectional microphone array device M1 to the sound position.
  • It is assumed that the reference directions (0-degree directions) of the horizontal angles of the camera device C1 and the omnidirectional microphone array device M1 coincide (a numeric sketch of the conversion follows below).
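  • To make the preceding calculation concrete, here is a minimal numeric sketch, under the assumption (not stated explicitly above) that the calibration yields a rigid transform (rotation R, translation t) from the camera frame to the microphone-array frame; all names are illustrative.

      import numpy as np

      def angles_to_unit(theta_h, theta_v):
          """Unit vector for a (horizontal, vertical) angle pair in degrees."""
          h, v = np.radians(theta_h), np.radians(theta_v)
          return np.array([np.cos(v) * np.cos(h), np.cos(v) * np.sin(h), np.sin(v)])

      def unit_to_angles(u):
          """Inverse of angles_to_unit: returns (theta_h, theta_v) in degrees."""
          return (np.degrees(np.arctan2(u[1], u[0])),
                  np.degrees(np.arctan2(u[2], np.hypot(u[0], u[1]))))

      # Hypothetical calibration parameters mapping camera-frame coordinates
      # into the microphone-array frame.
      R = np.eye(3)                  # assume aligned axes for this sketch
      t = np.array([0.5, 0.0, 0.0])  # array mounted 0.5 m from the camera (assumed)

      def directivity_direction(cam_theta_h, cam_theta_v, distance_m):
          """(theta_MAh, theta_MAv): direction from the array to the sound position."""
          p_cam = distance_m * angles_to_unit(cam_theta_h, cam_theta_v)
          p_arr = R @ p_cam + t      # sound position in the array frame
          return unit_to_angles(p_arr)

      # When the array surrounds the camera (t ~ 0), the camera's own angles can
      # be used directly, as noted above.
      print(directivity_direction(30.0, -20.0, 5.0))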
  • the output control unit 34b controls the operations of the display device 35 and the speaker device 36.
  • The output control unit 34b, as an example of the display control unit, displays on the display device 35 the image data transmitted from the camera device C1, for example in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG.
  • When the output control unit 34b, as an example of the audio output control unit, acquires the audio data transmitted from the omnidirectional microphone array device M1, or acquires from the recorder device 4 audio data collected by the omnidirectional microphone array device M1 over a certain period, it outputs the audio data to the speaker device 36, for example in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG.
  • When the output control unit 34b, as an example of the image reproduction unit, acquires from the recorder device 4 image data captured by the camera device C1 over a certain period, it causes the display device 35 to reproduce the image data, for example in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG.
  • The output control unit 34b, as an example of the directivity forming unit, uses the audio data transmitted from the omnidirectional microphone array device M1 or the audio data acquired from the recorder device 4 to form the directivity (beam) of the collected sound in the directivity direction indicated by the directivity direction coordinates (θMAh, θMAv) calculated by the directivity direction calculation unit 34a.
  • Thereby, the directivity control device 3 can relatively increase the volume level of the sound emitted by the monitoring target (for example, the person HM1) present in the directivity direction in which the directivity is formed, and can relatively decrease the volume level of sound from directions in which no directivity is formed by suppressing it.
  • The tracking processing unit 34c, as an example of the information acquisition unit, acquires information related to the above-described voice tracking process. For example, when a new position is designated, in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG, on the tracking screen TRW of the display device 35 on which the image data captured by the camera device C1 is displayed, the tracking processing unit 34c acquires information on the newly designated position.
  • The information on the newly designated position includes the time of the new designation (designation time), the coordinate information of the sound position (sound source position) at which the monitoring target (for example, the person HM1) in the real space corresponding to the designated position on the image data exists, and the information on the distance from the omnidirectional microphone array device M1 to that sound position (sound source position).
  • The tracking processing unit 34c, as an example of the reproduction time calculation unit, uses the data of the tracking list LST stored in the memory 33 to calculate the reproduction time of the sound at a position on the flow line designated, for example, by an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG (described later).
  • The display device 35, as an example of the display unit, is configured using, for example, an LCD (Liquid Crystal Display) or an organic EL (Electroluminescence) display, and displays the image data captured by the camera device C1 under the control of the output control unit 34b.
  • The speaker device 36, as an example of the sound output unit, outputs the audio data of the sound collected by the omnidirectional microphone array device M1, or that audio data with directivity formed in the directivity direction indicated by the directivity direction coordinates (θMAh, θMAv). The display device 35 and the speaker device 36 may be configured separately from the directivity control device 3.
  • the recorder device 4 stores the image data picked up by the camera device C1 and the sound data of the sound collected by the omnidirectional microphone array device M1 in association with each other.
  • The directivity control system 100A shown in FIG. 3 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3A, and the recorder device 4.
  • In FIG. 3, the same components and operations as those in FIG. 2 are denoted by the same reference numerals; their description is simplified or omitted, and only the differences are described.
  • the directivity control device 3A includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34A, a display device 35, a speaker device 36, and an image processing unit 37.
  • The signal processing unit 34A includes at least a directivity direction calculation unit 34a, an output control unit 34b, a tracking processing unit 34c, and a sound source detection unit 34d.
  • The sound source detection unit 34d detects, from the image data displayed on the display device 35, the sound position (sound source position) in the real space corresponding to the sound uttered by the person HM1, the monitoring target. For example, the sound source detection unit 34d divides the sound collection area of the omnidirectional microphone array device M1 into a plurality of grid areas and measures the sound intensity or volume level with the directivity formed from the omnidirectional microphone array device M1 toward the center position of each grid area. The sound source detection unit 34d estimates that the sound source exists in the grid area with the highest sound intensity or volume level among all the grid areas (sketched below). The detection result of the sound source detection unit 34d includes, for example, the information on the distance from the omnidirectional microphone array device M1 to the center position of the grid area with the highest sound intensity or volume level.
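  • The grid search performed by the sound source detection unit 34d can be sketched as follows; beamform_power stands in for forming directivity toward a grid-area center and measuring the resulting volume level (a real implementation would use the delay-and-sum processing described for FIG. 37), and all names are hypothetical.

      import numpy as np

      def beamform_power(mic_signals, grid_center):
          """Placeholder: steer the array toward grid_center and return the power
          of the beamformed signal. Here the channels are simply summed so that
          the sketch runs; the steering itself is omitted."""
          return float(np.mean(np.sum(mic_signals, axis=0) ** 2))

      def detect_sound_source(mic_signals, grid_centers):
          """Estimate that the sound source lies in the grid area whose center
          yields the highest measured sound intensity or volume level."""
          powers = [beamform_power(mic_signals, c) for c in grid_centers]
          best = int(np.argmax(powers))
          return grid_centers[best], powers[best]

      signals = np.random.randn(4, 16000)  # 4 dummy microphone channels, 1 s
      grid = [(x, y) for x in (1.0, 2.0, 3.0) for y in (1.0, 2.0, 3.0)]
      center, level = detect_sound_source(signals, grid)
      print("estimated sound source near grid center", center)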
  • In response to an instruction from the signal processing unit 34A, the image processing unit 37 performs predetermined image processing (for example, VMD (Video Motion Detector) processing for detecting the motion of the person HM1, detection of a person's face and face orientation, or person detection) on the image data displayed on the display device 35, and outputs the image processing result to the signal processing unit 34A.
  • The image processing unit 37 detects the outline DTL of the face of the monitoring target (for example, the person HM1) displayed on the display device 35, for example in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG, and masks the face. Specifically, the image processing unit 37 calculates a rectangular region containing the detected face outline DTL and adds a predetermined blur to the rectangular region (see FIG. 22C).
  • FIG. 22C is an explanatory diagram of the process of blurring the outline DTL of a detected person's face.
  • The image processing unit 37 outputs the image data generated by the blurring process to the signal processing unit 34A (a sketch of the masking step follows below).
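  • The masking step can be sketched with OpenCV's Gaussian blur (the patent does not name a specific blur algorithm; the rectangle is assumed to come from the face outline detection, and the names are illustrative).

      import cv2
      import numpy as np

      def blur_face(image, rect, ksize=31):
          """Add a blur to the rectangular region containing the detected face
          outline DTL. rect = (x, y, w, h); ksize must be odd."""
          x, y, w, h = rect
          roi = image[y:y + h, x:x + w]
          image[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (ksize, ksize), 0)
          return image

      frame = np.zeros((480, 640, 3), dtype=np.uint8)  # dummy frame
      masked = blur_face(frame, (200, 100, 120, 150))  # hypothetical face rectangle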
  • FIG. 37 is a simplified explanatory diagram of the delay-and-sum method by which the omnidirectional microphone array device M1 forms the directivity of the audio data in the direction of the angle θ.
  • Suppose the microphone elements 221 to 22n are arranged on a straight line. In this case the directivity covers a two-dimensional region within a plane; to form directivity in three-dimensional space, the microphones may be arranged in a two-dimensional array and the same processing method applied.
  • The sound source 80 is, for example, a monitoring target (for example, the person HM1) present in the direction at a predetermined angle θ with respect to the surface of the casing 21 of the omnidirectional microphone array device M1. The distance d between the microphone elements 221, 222, 223, ..., 22(n−1), 22n is constant.
  • The sound wave emitted from the sound source 80 first reaches the microphone element 221 and is collected, then reaches the microphone element 222 and is collected, and so on, until it finally reaches the microphone element 22n and is collected.
  • When the sound source 80 is, for example, the monitoring target (for example, the person HM1), the direction from each microphone element 221, 222, 223, ..., 22(n−1), 22n of the omnidirectional microphone array device M1 toward the sound source 80 is the same as the direction from each microphone (microphone element) of the omnidirectional microphone array device M1 toward the sound position (sound source position) corresponding to the position designated by the user on the display device 35.
  • Here, τ1 is the difference between the time at which the sound wave reaches the microphone element 221 and the time at which it reaches the microphone element 22n, τ2 is the difference between the time at which the sound wave reaches the microphone element 222 and the time at which it reaches the microphone element 22n, and likewise τ(n−1) is the difference between the time at which the sound wave reaches the microphone element 22(n−1) and the time at which it reaches the microphone element 22n.
  • The omnidirectional microphone array device M1 includes A/D converters 241, 242, 243, ..., 24(n−1), 24n provided corresponding to the microphone elements 221, 222, 223, ..., 22(n−1), 22n, delay units 251, 252, 253, ..., 25(n−1), 25n, and an adder 26 (see FIG. 37).
  • The omnidirectional microphone array device M1 AD-converts the analog audio data collected by the microphone elements 221, 222, 223, ..., 22(n−1), 22n into digital audio data in the A/D converters 241, 242, 243, ..., 24(n−1), 24n.
  • Then, in the delay units 251, 252, 253, ..., 25(n−1), 25n, the omnidirectional microphone array device M1 gives each channel a delay time corresponding to the arrival time difference at the corresponding microphone element 221, 222, 223, ..., 22(n−1), 22n to align the phases of all the sound waves, and the adder 26 then adds the delayed audio data. Thereby, the omnidirectional microphone array device M1 can form the directivity of the audio data in the direction of the predetermined angle θ for the microphone elements 221, 222, 223, ..., 22(n−1), 22n.
  • The delay times D1, D2, D3, ..., D(n−1), Dn set in the delay units 251, 252, 253, ..., 25(n−1), 25n correspond to the arrival time differences τ1, τ2, τ3, ..., τ(n−1), respectively, and are expressed by Equation (1):

      D1 = L1 / Vs, D2 = L2 / Vs, D3 = L3 / Vs, ..., D(n−1) = L(n−1) / Vs, Dn = 0   (1)
  • Here, L1 is the difference in the sound-wave arrival distance between the microphone element 221 and the microphone element 22n, L2 is the difference in the sound-wave arrival distance between the microphone element 222 and the microphone element 22n, L3 is the difference in the sound-wave arrival distance between the microphone element 223 and the microphone element 22n, and L(n−1) is the difference in the sound-wave arrival distance between the microphone element 22(n−1) and the microphone element 22n.
  • Vs is the speed of the sound wave (sound speed), and L1, L2, L3, ..., L(n−1) and Vs are known values.
  • The delay time Dn set in the delay unit 25n is 0 (zero).
  • In this way, by changing the delay times D1, D2, D3, ..., D(n−1), Dn set in the delay units 251, 252, 253, ..., 25(n−1), 25n, the omnidirectional microphone array device M1 can easily form the directivity of the audio data collected by the microphone elements 221, 222, 223, ..., 22(n−1), 22n built into the microphone units 22 and 23.
  • The description of the directivity forming process shown in FIG. 37 assumes, for simplicity, that the omnidirectional microphone array device M1 performs the process; the same applies to the other omnidirectional microphone array devices (for example, the omnidirectional microphone array device Mm). However, when the output control unit 34b of the signal processing unit 34 or 34A of the directivity control device 3 or 3A includes the same number of A/D converters 241 to 24n and delay units 251 to 25n as the number of microphones of the omnidirectional microphone array device M1, together with the adder 26, the output control unit 34b may perform the directivity forming process shown in FIG. 37 on the audio collected by each microphone element of the omnidirectional microphone array device M1 (a runnable sketch follows below).
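  • The delay-and-sum processing of FIG. 37 and Equation (1) can be condensed into a short runnable sketch for a linear array. Under the geometry described above, Lk = (n − k) · d · cosθ for microphone spacing d, so channel k receives the delay Dk = Lk / Vs; integer-sample delays are used here for simplicity, and the names are illustrative.

      import numpy as np

      def delay_and_sum(channels, d, theta_deg, fs, vs=343.0):
          """Delay-and-sum beamformer for a linear microphone array.
          channels: (n, samples) array; channel 0 corresponds to microphone
          element 221 (reached first by the sound wave), channel n-1 to element
          22n (reached last, so its delay Dn is 0 per Equation (1))."""
          n, length = channels.shape
          cos_t = np.cos(np.radians(theta_deg))
          out = np.zeros(length)
          for k in range(n):
              delay_sec = (n - 1 - k) * d * cos_t / vs  # Dk = Lk / Vs
              shift = int(round(delay_sec * fs))        # integer-sample delay
              if shift:
                  out[shift:] += channels[k, :length - shift]
              else:
                  out += channels[k]
          return out / n

      fs = 16000
      mics = np.random.randn(8, fs)  # 8 dummy channels; 2 cm spacing assumed
      beam = delay_and_sum(mics, d=0.02, theta_deg=45.0, fs=fs)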
  • On the display device 35, image data of the video captured in real time by the camera device C1 is displayed to the user (for example, a supervisor; the same applies hereinafter).
  • The tracking mode is used when performing follow-up control (the voice tracking process) of the directivity of the sound collected by the omnidirectional microphone array device M1 as the monitoring target (for example, the person HM1) moves.
  • The tracking processing method is the method of setting the position of the monitoring target (for example, a designated position on the tracking screen TRW of the display device 35, or a position in real space) when performing follow-up control (the voice tracking process) of the directivity of the sound collected by the omnidirectional microphone array device M1 as the monitoring target (for example, the person HM1) moves, and is divided into the manual tracking process and the automatic tracking process. Details of each will be described later.
  • The number of tracking targets indicates how many monitoring targets are subject to the follow-up control (the voice tracking process) of the directivity of the sound collected by the omnidirectional microphone array device M1, for example one person, or two or more persons.
  • The manual designation method refers to the way the user designates a tracking point on the tracking screen TRW in the manual tracking process (described later); for example, a click or drag operation of the cursor CSR through a mouse operation, or a touch or touch-slide operation with the user's finger FG.
  • The slow playback mode presupposes that the recording/playback mode is on, and is used when the image data reproduced on the display device 35 is played back at a speed value smaller than the initial value (for example, the normal value).
  • the enlarged display mode is used when the monitored object (for example, the person HM1) displayed on the tracking screen TRW of the display device 35 is enlarged and displayed.
  • The voice privacy protection mode is used when voice processing (for example, voice change processing) is performed to make it difficult to identify whose voice is being output when the voice data collected by the omnidirectional microphone array device M1 is output from the speaker device 36.
  • The image privacy protection mode is used when image processing is performed to make it difficult to identify the monitoring target (for example, the person HM1) displayed on the tracking screen TRW of the display device 35 when the enlarged display mode is on.
  • The connection mode is used when connecting the designated positions (see, for example, the point marker MR1 described later) designated on the tracking screen TRW by manual or automatic designation during the movement of the monitoring target. If the connection mode is "every time", adjacent point markers are connected each time a position is designated during the movement of the monitoring target. If the connection mode is "batch", the point markers corresponding to all the designated positions obtained during the movement of the monitoring target are connected to their adjacent point markers at once.
  • The correction mode is used to switch from the automatic tracking process to the manual tracking process when a position automatically designated in the automatic tracking process deviates from the movement path of the monitoring target.
  • The multiple-camera switching method is used to switch the camera device used for capturing the image of the monitoring target among the plurality of camera devices C1 to Cn. Details of the multiple-camera switching method are described in the second embodiment.
  • The multiple-microphone switching method is used to switch the omnidirectional microphone array device used for collecting the sound emitted by the monitoring target among the plurality of omnidirectional microphone array devices M1 to Mm. Details of the multiple-microphone switching method are described in the second embodiment.
  • The tracking point upper limit setting mode is used to set an upper limit on the number of tracking points. For example, when the tracking point upper limit setting mode is on and the number of tracking points reaches the upper limit, the tracking processing unit 34c may reset (erase) all the tracking points, or may display on the tracking screen TRW that the number of tracking points has reached the upper limit. Furthermore, a plurality of voice tracking processes can be executed as long as the number of tracking points is within the upper limit.
  • A predetermined setting button or setting menu of the monitoring system application (not shown) for these settings is displayed on the tracking screen TRW, and the setting button or setting menu is operated by a click of the cursor CSR through the user's mouse operation or by a touch of the user's finger FG (a configuration sketch follows below).
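  • Purely as an illustration, the settings above could be grouped into a single configuration object confirmed from the setting menu; none of the names or defaults below appear in the patent.

      from dataclasses import dataclass
      from enum import Enum
      from typing import Optional

      class TrackingMethod(Enum):
          MANUAL = "manual"     # manual tracking process
          AUTO = "auto"         # automatic tracking process

      class ConnectionMode(Enum):
          EVERY_TIME = "every"  # connect adjacent point markers at each designation
          BATCH = "batch"       # connect all point markers at once afterwards

      @dataclass
      class TrackingSettings:
          tracking_mode: bool = False
          tracking_method: TrackingMethod = TrackingMethod.MANUAL
          num_targets: int = 1                    # number of monitoring targets
          recording_playback_mode: bool = False
          slow_playback_mode: bool = False        # requires recording/playback mode
          enlarged_display_mode: bool = False
          voice_privacy_protection: bool = False  # e.g. voice change processing
          image_privacy_protection: bool = False  # requires enlarged display mode
          connection_mode: ConnectionMode = ConnectionMode.EVERY_TIME
          correction_mode: bool = False
          tracking_point_upper_limit: Optional[int] = None  # None = no limit set

      settings = TrackingSettings(tracking_mode=True,
                                  recording_playback_mode=True,
                                  slow_playback_mode=True)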
  • FIG. 4 is an explanatory diagram illustrating an operation example of the manual tracking process.
  • In FIG. 4, the movement of the person HM1, the monitoring target, is shown on the tracking screen TRW displayed on the display device 35, and three tracking points b1, b2, and b3 are designated by click or drag operations of the cursor CSR through the user's mouse operation.
  • The tracking processing unit 34c acquires the tracking time t1 at which the cursor CSR designates the tracking point b1, the tracking time t2 at which it designates the tracking point b2, and the tracking time t3 at which it designates the tracking point b3. The tracking processing unit 34c stores in the memory 33 the coordinate information of the tracking point b1 on the tracking screen TRW, or the three-dimensional coordinates indicating the position in real space corresponding to that coordinate information, in association with the information of the tracking time t1; likewise, it stores the coordinates of the tracking point b2 in association with the information of the tracking time t2, and the coordinates of the tracking point b3 in association with the information of the tracking time t3.
  • The output control unit 34b displays the point marker MR1 at the tracking point b1 on the tracking screen TRW, the point marker MR2 at the tracking point b2, and the point marker MR3 at the tracking point b3.
  • Thereby, the output control unit 34b can explicitly show, as a trajectory on the tracking screen TRW, the tracking points through which the moving person HM1 has passed.
  • Further, the output control unit 34b displays the flow line LN1 by connecting the point markers MR1 and MR2, and displays the flow line LN2 by connecting the point markers MR2 and MR3.
  • FIG. 5 is an explanatory diagram illustrating an operation example of changing the tracking point by the manual tracking process when the tracking point automatically designated in the automatic tracking process is incorrect.
  • In FIG. 5, the tracking point automatically designated by the image processing unit 37 or the sound source detection unit 34d deviates from the movement path of the person HM1, so an incorrect flow line LNW connecting the point markers MR1 and MR2W is displayed.
  • In this case, the automatic tracking process is switched to the manual tracking process, and when the correct tracking point is designated, for example by a click operation with the cursor CSR, the output control unit 34b connects the point markers MR1 and MR2R and displays the correct flow line LNR on the tracking screen TRW.
  • FIG. 6 is an explanatory diagram showing the slow playback process in the recording playback mode and the slow playback mode.
  • In FIG. 6, when the recording/playback mode and the slow playback mode are on, the output control unit 34b plays back slowly, on the tracking screen TRW, the image data of the video showing the movement of the person HM1 at a speed value smaller than the initial playback speed (the normal value) (see the tracking screen TRW on the lower side of FIG. 6).
  • Since the output control unit 34b can slow down the movement of the person HM1 on the tracking screen TRW, the tracking point can be designated easily in the manual tracking process or the automatic tracking process.
  • The output control unit 34b may also perform the slow playback process without waiting for a touch operation of the user's finger FG when the moving speed of the person HM1 is equal to or higher than a predetermined value.
  • The playback speed during slow playback may be a fixed value, or may be changed as appropriate by an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG.
  • FIG. 7 is an explanatory diagram showing an enlarged display process in the enlarged display mode.
  • In FIG. 7, when a position on the tracking screen TRW is clicked in the enlarged display mode, the output control unit 34b enlarges and displays the tracking screen TRW at a predetermined magnification, centered on the clicked position (see the tracking screen TRW on the lower side of FIG. 7).
  • The output control unit 34b may instead enlarge and display the content of the tracking screen TRW on a separate pop-up screen (not shown) centered on the clicked position. This makes it easy for the user to compare the non-enlarged tracking screen TRW with the enlarged pop-up screen through a simple designation operation, so the user can easily designate the monitoring target (the person HM1).
  • The output control unit 34b may also enlarge and display the contents of the displayed camera screen with the center of the display device 35 as the reference. Thereby, when the monitoring target (the person HM1) appears near the center of the display device 35, the user can designate the monitoring target with a simple designation operation.
  • The output control unit 34b may also enlarge the display centered on the position corresponding to the geometric mean of a plurality of designated positions on the tracking screen TRW. Thereby, the output control unit 34b makes it easy for the user to select among the plurality of monitoring targets shown on the tracking screen TRW.
  • FIG. 8A is an explanatory diagram showing an automatic scroll process after the enlargement display process in the enlargement display mode.
  • After the enlargement display process, an image of the entire imaging area C1RN may no longer fit on the tracking screen TRW.
  • In this case, the output control unit 34b automatically scrolls the tracking screen TRW so that the designated position is displayed at the center of the tracking screen TRW; as the person HM1 shown on the enlarged tracking screen TRW moves, the screen is scrolled automatically so that the user's designated position always remains at the center.
  • Since the person HM1, the monitoring target, is displayed at the center of the tracking screen TRW during the automatic scroll process, the user can select it easily.
  • FIG. 9A is a flowchart illustrating a first example of the overall flow of manual tracking processing in the directivity control system 100 of the first embodiment.
  • FIG. 9B is a flowchart illustrating a second example of the overall flow of manual tracking processing in the directivity control system 100 of the first embodiment.
  • FIGS. 9A and 9B show the overall flow of the manual tracking process in the directivity control system 100 of the present embodiment. The detailed contents of the individual processes are described, where relevant, with reference to the drawings mentioned later.
  • In FIG. 9B, the same contents as those in FIG. 9A are denoted by the same step numbers; their description is simplified or omitted, and only the differences are described.
  • FIGS. 9A and 9B show the operation of the directivity control device 3.
  • As a precondition, the output control unit 34b has formed the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the position designated, by the cursor CSR through the user's mouse operation or by an input operation with the user's finger FG, on the tracking screen TRW of the display device 35 on which the image of the person HM1, the monitoring target captured by the camera device C1, is displayed.
  • In FIG. 9A, if the tracking mode is off (S1, NO), the manual tracking process shown in FIG. 9A ends; if the tracking mode is on (S1, YES), the tracking assist process is started (S2). Details of the tracking assist process will be described later.
  • After step S2, on the tracking screen TRW of the display device 35, a position along the movement path of the person HM1 is designated by a click operation of the cursor CSR through the user's mouse operation or by a touch operation of the user's finger FG (S3).
  • The tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the position designated in step S3 and the designation time, associated with each other as the tracking position and tracking time of the tracking point, and displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S4).
• Note that the point marker may be displayed by the tracking processing unit 34c; the same applies to the following embodiments.
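• As a minimal sketch, the tracking-point storage of step S4 can be modeled as an append to a time-ordered list; the class and function names below are illustrative assumptions, not from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TrackingPoint:
    position: Tuple[float, float, float]  # (x, y, z) tracking position in real space
    time: float                           # tracking time (designation time)

tracking_list: List[TrackingPoint] = []   # corresponds to the tracking list LST held in memory 33

def store_tracking_point(pos_3d: Tuple[float, float, float], t: float) -> None:
    """Store the tracking position and tracking time in association (cf. step S4)."""
    tracking_list.append(TrackingPoint(pos_3d, t))
```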
• The output control unit 34b forms the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the tracking point designated in step S3 (S5).
• Note that when the tracking processing unit 34c only needs to acquire the tracking position and tracking time data of the tracking points designated along the movement path of the person HM1 in response to input operations with the cursor CSR by the user's mouse operation or the user's finger FG, the operation of step S5 may be omitted.
• In that case, the output control unit 34b does not switch the directivity from the omnidirectional microphone array apparatus M1 to the direction toward the position (sound position, sound source position) of the person HM1 corresponding to the tracking point designated in step S3. The same applies to the following embodiments.
• After step S5, the output control unit 34b performs tracking connection processing (S6). Details of the tracking connection process will be described later with reference to FIG. 15A. After step S6, the output control unit 34b outputs the collected sound having the directivity formed in step S5 from the speaker device 36 (S7). Details of the audio output process will be described later with reference to FIG. 21A. After step S7, the operation of the directivity control device 3 returns to step S1, and the processes of steps S1 to S7 are repeated until the tracking mode is turned off.
• In FIG. 9B, after the tracking mode is turned on (S1, YES), the tracking assist process is started (S2). Details of the tracking assist process will be described later with reference to FIG. 13A.
• After step S2, it is assumed that designation of positions (tracking points) on the movement path of the person HM1 is started on the tracking screen TRW of the display device 35 by a drag operation of the cursor CSR by the user's mouse operation or by a touch-slide operation of the user's finger FG (S3A).
• After step S3A, if the predetermined time (for example, about several seconds) has not elapsed since the storage of the tracking position and tracking time data corresponding to the previous tracking point ended (S8, NO), the drag operation or touch-slide operation started in step S3A is considered not to have ended, and the operation of the directivity control device 3 proceeds to step S7.
• On the other hand, when the predetermined time (for example, about several seconds) has elapsed since the storage of the tracking position and tracking time data corresponding to the previous tracking point ended (S8, YES), the drag operation or touch-slide operation started in step S3A is considered to have ended, and a new tracking point is designated; this elapsed-time check is sketched below. That is, the tracking processing unit 34c stores, in the memory 33, the three-dimensional coordinates indicating the position in real space corresponding to the designated position at the end of the drag operation or touch-slide operation and the designation time, in association with each other, as the tracking position and tracking time of the new tracking point, and displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S4).
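• A minimal sketch of the elapsed-time check of step S8, with an assumed threshold value and an in-memory tracking list; none of the names below come from the patent.

```python
import time

MIN_INTERVAL_S = 3.0   # "about several seconds" (assumed value)
_tracking_list = []    # (position, time) records
_last_store = None     # monotonic time of the previous storage

def on_drag_position(pos_3d):
    """Register a new tracking point only when the predetermined time has
    elapsed since the previous tracking point was stored (cf. step S8)."""
    global _last_store
    now = time.monotonic()
    if _last_store is not None and now - _last_store < MIN_INTERVAL_S:
        return False                       # drag/touch-slide considered still in progress
    _tracking_list.append((pos_3d, now))   # step S4: store position and time in association
    _last_store = now
    return True
```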
• The operation after step S4 is the same as the operation after step S4 shown in FIG. 9A.
  • FIG. 10A is a flowchart for explaining a first example of the entire flow of the automatic tracking process in the directivity control system 100A of the first embodiment.
• FIG. 10B is a flowchart for explaining a first example of the automatic tracking process shown in FIG. 10A.
• FIG. 11A is a flowchart for explaining a second example of the automatic tracking process shown in FIG. 10A.
• FIG. 11B is a flowchart illustrating an example of the tracking correction process illustrated in FIG. 11A.
• FIG. 12 is a flowchart for explaining a third example of the automatic tracking process shown in FIG. 10A.
• As in FIGS. 9A and 9B, in order to avoid complicating the explanation, the overall flow of the automatic tracking process in the directivity control system 100A of the present embodiment will be described first with reference to FIG. 10A, and the detailed contents of each process will be described with reference to the drawings described later.
  • FIG. 10A also shows the operation of the directivity control device 3.
• It is assumed that the output control unit 34b has formed the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the position (speech position, sound source position) of the person HM1 corresponding to the position automatically designated using the detection processing result of the sound source detection unit 34d or the image processing unit 37, on the tracking screen TRW of the display device 35 on which an image of the person HM1 as the monitoring target captured by the camera device C1 is displayed.
• In FIG. 10A, if the tracking mode is on (S1, YES), the tracking assist process is started (S2). Details of the tracking assist process will be described later with reference to FIG. 13A. After step S2, automatic tracking processing is performed (S3B). Details of the automatic tracking process will be described later with reference to FIGS. 10B, 11A, and 12.
• After step S3B, the output control unit 34b forms the directivity of the collected sound in the direction from the omnidirectional microphone array apparatus M1 toward the position (speech position, sound source position) of the person HM1 corresponding to the tracking point automatically designated in step S3B (S5).
• The operation after step S5 is the same as the operation after step S4 shown in FIG. 9A.
• In FIG. 10B, the image processing unit 37 determines, by performing known image processing, whether or not the person HM1 as the monitoring target is detected on the tracking screen TRW of the display device 35; if it determines that the person HM1 has been detected, it outputs the determination result (including the detection position (for example, a known representative point) of the person HM1 and detection time data) to the tracking processing unit 34c of the signal processing unit 34 (S3B-1).
• Alternatively, the sound source detection unit 34d determines, by performing known sound source detection processing, whether or not the position of the sound (sound source) emitted by the person HM1 as the monitoring target is detected on the tracking screen TRW of the display device 35; if it determines that the position of the sound source has been detected, it outputs the determination result (including the sound source detection position and detection time data) to the tracking processing unit 34c (S3B-1). In order to simplify the description of step S3B-1, it is assumed that there is no monitoring target other than the person HM1 on the tracking screen TRW.
• The tracking processing unit 34c automatically sets the designated position of the person HM1 in the automatic tracking process, that is, the tracking point, using the determination result of the image processing unit 37 or the sound source detection unit 34d (S3B-1).
• The tracking processing unit 34c stores, in the memory 33, the three-dimensional coordinates indicating the position in real space corresponding to the detection position automatically designated in step S3B-1 and the detection time, in association with each other, as the tracking position and tracking time of the tracking point, and further displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-2).
• After step S3B-2, the automatic tracking process shown in FIG. 10B ends, and the process proceeds to step S5 shown in FIG. 10A.
• In FIG. 11A, when the first tracking point (initial position) has already been designated (S3B-3, YES), the operation of step S3B-4 is omitted. Otherwise, an input operation (for example, a click operation of the cursor CSR by the user's mouse operation or a touch operation of the user's finger FG) designates a position (tracking point) on the movement path of the person HM1 (S3B-4).
• When the first tracking point has already been designated, or after the first tracking point has been designated in step S3B-4, the tracking processing unit 34c automatically designates the next tracking point centered on the first tracking point, using the determination result of the image processing unit 37 or the sound source detection unit 34d (S3B-5). Since the user's designation of the first tracking point starts the detection processing of the information related to the position of the sound (sound source) emitted by the person HM1, or of the information related to the position of the person HM1, around the first tracking point (initial position) on the tracking screen TRW, each detection process can be performed at high speed.
• The tracking processing unit 34c stores, in the memory 33, the three-dimensional coordinates indicating the position in real space corresponding to the detection position automatically designated in step S3B-5 and the detection time, in association with each other, as the tracking position and tracking time of the tracking point, and further displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-2).
• If an operation for correcting the tracking point is not performed after step S3B-2 (S3B-6, NO), the automatic tracking process shown in FIG. 11A ends, and the process proceeds to step S5 shown in FIG. 10A.
• On the other hand, when an operation for correcting the tracking position corresponding to the tracking point is performed after step S3B-2, for example because the determination result of the image processing unit 37 or the sound source detection unit 34d is incorrect (S3B-6, YES), the tracking correction process shown in FIG. 11B is performed (S3B-7).
• In FIG. 11B, while the voice uttered by the person HM1 moving on the tracking screen TRW is being output, the output of the voice is temporarily stopped by an input operation with the cursor CSR by the user's mouse operation or by the user's finger FG (S3B-7-1).
• After step S3B-7-1, the correction mode is turned on by an input operation with the cursor CSR by the user's mouse operation or the user's finger FG, so that the automatic tracking process temporarily shifts to the manual tracking process, and it is assumed that the correct tracking point is designated (S3B-7-2).
• The output control unit 34b deletes the wrong point marker that was displayed on the tracking screen TRW immediately before the designation in step S3B-7-2, displays a point marker at the changed tracking point, that is, the tracking point designated in step S3B-7-2, and resumes the output of the voice that was temporarily stopped in step S3B-7-1 (S3B-7-3). Further, the tracking processing unit 34c overwrites and stores the position designated in step S3B-7-2 as the tracking point (S3B-7-3). After step S3B-7-3, the tracking correction process shown in FIG. 11B ends, and the process proceeds to step S5 shown in FIG. 10A.
• In FIG. 12, the image processing unit 37 determines, by performing known image processing, whether or not the person HM1 as the monitoring target is detected on the tracking screen TRW of the display device 35 (S3B-8). When it is determined that the person HM1 has been detected (S3B-9, YES), the image processing unit 37 calculates the detection position (for example, a known representative point) of the person HM1 and outputs each data of the detection time and the detection position as a determination result to the tracking processing unit 34c of the signal processing unit 34 (S3B-10).
• In parallel, the sound source detection unit 34d determines, by performing known sound source detection processing, whether or not the position of the sound (sound source) emitted by the person HM1 as the monitoring target is detected on the tracking screen TRW of the display device 35; if it determines that the position has been detected, it calculates the detection position of the sound source and outputs each data of the detection time and the detection position as a determination result to the tracking processing unit 34c (S3B-11).
• The tracking processing unit 34c stores, in the memory 33, the sound source detection position on the tracking screen TRW calculated in step S3B-11 and the detection time, in association with each other, as the tracking position and tracking time of the tracking point, and further displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-12).
• After step S3B-12, the tracking processing unit 34c determines whether or not the distance between the detection position of the person HM1 calculated in step S3B-10 and the detection position of the sound source calculated in step S3B-11 is within a predetermined value (S3B-13). If the distance is within the predetermined value (S3B-13, YES), the automatic tracking process shown in FIG. 12 ends, and the process proceeds to step S5 shown in FIG. 10A.
• On the other hand, if the distance is not within the predetermined value (S3B-13, NO), the tracking correction process shown in FIG. 11B is performed (S3B-7). Since the tracking correction process has already been described with reference to FIG. 11B, its description is omitted here. After step S3B-7, the automatic tracking process shown in FIG. 12 ends, and the process proceeds to step S5 shown in FIG. 10A.
• Thereby, when the distance between the sound source position detected by the sound source detection process and the position of the person HM1 detected by the image processing is equal to or greater than the predetermined value, the tracking processing unit 34c can easily correct and acquire, as the information related to the position of the person HM1, the information related to the position designated by the user's position changing operation in the tracking correction process; the consistency check is sketched below.
• When the distance is within the predetermined value, the tracking processing unit 34c can easily acquire the position of the sound source or the position of the person HM1 as information regarding the position of the person HM1 after the movement, without requiring a change operation by the user.
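• A minimal sketch of the consistency check of step S3B-13, with an assumed threshold; it simply compares the two detected positions and reports whether manual tracking correction is needed.

```python
import math

DIST_THRESHOLD = 1.0  # predetermined value (assumed, in metres)

def detections_consistent(person_pos, source_pos) -> bool:
    """Step S3B-13: accept the automatic tracking result only when the
    image-based person position and the detected sound-source position agree
    to within the predetermined value; otherwise the tracking correction
    process (S3B-7) is performed."""
    return math.dist(person_pos, source_pos) <= DIST_THRESHOLD
```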
• FIG. 13A is a flowchart for explaining an example of the tracking assist process shown in FIGS. 9A, 9B, and 10A.
• In FIG. 13A, when the enlarged display mode of the directivity control devices 3 and 3A is off (S2-1, NO), the operation of the directivity control devices 3 and 3A proceeds to step S2-5.
• On the other hand, when the enlarged display mode of the directivity control devices 3 and 3A is on (S2-1, YES), the directivity control devices 3 and 3A perform image privacy protection processing (S2-2) and then automatic scroll processing (S2-3). Details of the image privacy protection process will be described later with reference to FIG. 21B. Details of the automatic scroll process will be described later with reference to FIGS. 13B, 14A, and 14B.
• After step S2-3, the output control unit 34b enlarges and displays the contents of the tracking screen TRW at a predetermined magnification, centered on the tracking position corresponding to the nearest tracking point on the tracking screen TRW (S2-4).
• The output control unit 34b then plays back the image data of the video showing the movement process of the person HM1 on the tracking screen TRW at a speed value smaller than the initial value (normal value) of the playback speed (S2-6).
• FIG. 13B is a flowchart illustrating an example of the automatic scroll process shown in FIG. 13A.
• FIG. 14A is a flowchart illustrating an example of the automatic scroll process necessity determination process shown in FIG. 13B.
  • FIG. 14B is an explanatory diagram of a scroll necessity determination line in the automatic scroll processing necessity determination processing.
• In FIG. 13B, the tracking processing unit 34c performs an automatic scroll process necessity determination process (S2-3-1). Details of the automatic scroll process necessity determination process will be described later with reference to FIG. 14A.
• After step S2-3-1, if it is determined as a result of the automatic scroll process necessity determination process that the automatic scroll process is necessary (S2-3-2, YES), the output control unit 34b performs a predetermined automatic scroll process on the tracking screen TRW (S2-3-3).
• Specifically, the output control unit 34b automatically scrolls the tracking screen TRW so that the person HM1 is always displayed at the center of the tracking screen TRW along the movement path of the person HM1, in accordance with the input operation with the cursor CSR by the user's mouse operation or the user's finger FG.
• Thereby, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from moving out of the tracking screen TRW, and the user can easily continue to designate the moving person HM1 on the tracking screen TRW.
• Note that when the output control unit 34b displays the tracking screen TRW so that the person HM1 is always displayed at the center of the tracking screen TRW, the automatic scroll process necessity determination process shown in step S2-3-1 may be omitted.
• Alternatively, the output control unit 34b may automatically scroll by a predetermined amount in the moving direction of the person HM1 (for example, the direction beyond the scroll determination line JDL, which will be described later). Thereby, even when the tracking screen TRW is enlarged, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from deviating from the tracking screen TRW.
• As yet another alternative, the output control unit 34b may automatically scroll the tracking screen TRW so that the position designated next by the cursor CSR by the user's mouse operation or by the input operation of the user's finger FG (the next tracking point) becomes the center of the tracking screen TRW. Thereby, the output control unit 34b can likewise prevent the designated position of the person HM1 from moving out of the tracking screen TRW, and the user can easily continue to designate the moving person HM1 on the tracking screen TRW.
• After step S2-3-3, or when it is determined as a result of the automatic scroll process necessity determination process that the automatic scroll process is not necessary (S2-3-2, NO), the automatic scroll process shown in FIG. 13B ends, and the process proceeds to step S2-4 shown in FIG. 13A.
• In FIG. 14A, the tracking processing unit 34c determines whether or not the tracking position corresponding to the designated tracking point TP1 exceeds any one of the scroll determination lines JDL on the upper, lower, left, and right sides of the enlarged tracking screen XTRW (S2-3-1-1).
• If the tracking processing unit 34c determines that the tracking position does not exceed any of the scroll determination lines JDL (S2-3-1-1, NO), it determines that the automatic scroll process is unnecessary (S2-3-1-2). On the other hand, when the tracking processing unit 34c determines that the tracking position exceeds one of the scroll determination lines JDL (S2-3-1-1, YES), it determines that the automatic scroll process is necessary, and further stores in the memory 33 the type of the scroll determination line JDL that applies (for example, information indicating one of the four scroll determination lines JDL shown in FIG. 14B) (S2-3-1-3). After step S2-3-1-2 or S2-3-1-3, the automatic scroll process necessity determination process shown in FIG. 14A ends, and the process proceeds to step S2-3-2 shown in FIG. 13B. A sketch of this determination follows.
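• A minimal sketch of the scroll determination of step S2-3-1-1, assuming the four JDLs are inset from the screen edges by a fixed margin; the data layout is illustrative, not from the patent.

```python
def crossed_jdl(tracking_pos, screen_rect, jdl_margin):
    """Return which scroll determination line JDL (if any) the tracking
    position has crossed on the enlarged tracking screen.

    tracking_pos: (x, y) on-screen tracking position
    screen_rect:  (left, top, width, height) of the tracking screen
    jdl_margin:   inset of each JDL from the corresponding screen edge
    """
    x, y = tracking_pos
    left, top, w, h = screen_rect
    if x < left + jdl_margin:      return "left"
    if x > left + w - jdl_margin:  return "right"
    if y < top + jdl_margin:       return "top"
    if y > top + h - jdl_margin:   return "bottom"
    return None  # inside all JDLs: automatic scrolling unnecessary
```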
• FIG. 15A is a flowchart illustrating an example of the tracking connection process shown in FIGS. 9A, 9B, and 10A.
• FIG. 15B is a flowchart illustrating an example of the batch connection process shown in FIG. 15A.
• In FIG. 15A, when a tracking point has been designated (S6-1, YES), the tracking processing unit 34c determines whether or not the connection mode is the each-time connection mode (S6-2).
• If the connection mode is the each-time connection mode (S6-2, YES), the output control unit 34b connects and displays the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S6-3).
• Thereby, when the person HM1 displayed on the tracking screen TRW of the display device 35 moves, the output control unit 34b connects and displays at least the current designated position and the immediately preceding designated position among the plurality of designated positions designated by the user's designation operation, so that a part of the trajectory of the movement of the person HM1 can be shown explicitly.
• Note that the operation of step S6-3 is not limited to single designation, in which tracking points are designated one by one, but also covers the case where a plurality of tracking points are designated at the same time; the same applies to step S6-4-3.
• After step S6-3, or when a tracking point has not yet been designated (S6-1, NO), the tracking connection process shown in FIG. 15A ends, and the process proceeds to step S7 shown in FIG. 9A, FIG. 9B, or FIG. 10A.
• On the other hand, if the connection mode is not the each-time connection mode (S6-2, NO), a batch connection process is performed (S6-4). The batch connection process will be described with reference to FIG. 15B.
• In FIG. 15B, the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16B) stored in the memory 33 (S6-4-1). When it is determined that the read data is the start point of the tracking points (S6-4-2, YES), the tracking processing unit 34c again reads the data of the tracking list LST (S6-4-1).
• When the read data is not the start point (S6-4-2, NO), the output control unit 34b uses the read data of the tracking list to connect and display the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S6-4-3).
• After step S6-4-3, if the connection has been made up to the end point of the tracking points (S6-4-4, YES), the batch connection process shown in FIG. 15B ends, and the process proceeds to step S7 shown in FIG. 9A, FIG. 9B, or FIG. 10A.
• After step S6-4-3, if the end point of the tracking points has not yet been connected (S6-4-4, NO), the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16B) stored in the memory 33, and the operations from step S6-4-1 to step S6-4-4 are repeated until point markers corresponding to all tracking points in the tracking list LST have been connected and displayed. Thereby, when the person HM1 shown on the tracking screen TRW of the display device 35 moves, the output control unit 34b connects and displays, for all of the plurality of designated positions designated by the user's designation operation, each designated position with its one or two adjacent designated positions, so that the entire trajectory of the movement of the person HM1 can be shown explicitly.
  • FIG. 16A is an explanatory diagram of the collected sound reproduction start time PT corresponding to the user's designated position P0 on the flow line between the tracking points displayed for one movement of the person HM1.
  • FIG. 16B is a diagram illustrating a first example of a tracking list.
• TP1, TP2, TP3, and TP4 are tracking points designated during one movement of the person HM1, as also shown in the tracking list LST of FIG. 16B, and for each tracking point the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. For simplicity, the z coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
• When the designated position P0 is designated at an arbitrary position on the flow line between the tracking points shown in FIG. 16A in accordance with an input operation with the cursor CSR by the user's mouse operation or the user's finger FG, the tracking processing unit 34c extracts the two tracking points TP1 and TP2 before and after the designated position P0, and calculates the reproduction start time PT at the designated position P0 according to equation (2) using the coordinates indicating the tracking positions of the tracking points TP1 and TP2 and the tracking time data (a sketch of this interpolation is given below).
• When outputting (reproducing) the sound to the speaker device 36, the output control unit 34b forms directivity in the directivity direction corresponding to the tracking positions, in tracking-time order starting from the designated position P0 designated by the cursor CSR by the user's mouse operation or by the input operation of the user's finger FG, and then outputs (reproduces) the sound having that directivity.
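• Equation (2) itself is not reproduced in this excerpt; from the surrounding description it can be read as a linear interpolation of the tracking times of TP1 and TP2, weighted by how far P0 lies along the flow-line segment. A minimal sketch under that assumption:

```python
import math

def playback_start_time(p0, tp1, tp2):
    """Interpolate the reproduction start time PT at designated position p0.

    p0:       (x, y, z) coordinates of the designated position on the flow line
    tp1, tp2: ((x, y, z), tracking_time) records of the tracking points
              immediately before and after p0 (e.g. TP1 and TP2)
    """
    pos1, t1 = tp1
    pos2, t2 = tp2
    seg = math.dist(pos1, pos2)        # flow-line segment length TP1 -> TP2
    if seg == 0.0:
        return t1
    frac = math.dist(pos1, p0) / seg   # fraction of the segment covered at p0
    return t1 + (t2 - t1) * frac
```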
  • FIG. 17A is an explanatory diagram of the reproduction start time PT of the collected sound corresponding to the user's designated position P0 on the flow line between different tracking points based on a plurality of simultaneous designations.
  • FIG. 17B is a diagram showing a second example of the tracking list LST.
• (TP11, TP21), (TP12, TP22), (TP13, TP23), and (TP14, TP24) are, as shown for example in the tracking list LST of FIG. 17B, pairs of tracking points designated simultaneously during the movement of different persons as a plurality of monitoring objects, and for each tracking point the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. The tracking points (TP11, TP21) are start points, and the tracking points (TP14, TP24) are end points. For simplicity, the z coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
• When the designated position P0 is designated at an arbitrary position on one of the different flow lines between the tracking points shown in FIG. 17A in accordance with an input operation with the cursor CSR by the user's mouse operation or the user's finger FG, the tracking processing unit 34c extracts the two tracking points TP11 and TP12 before and after the designated position P0, and calculates the reproduction start time PT at the designated position P0 according to equation (3) using the coordinates indicating the tracking positions of the tracking points TP11 and TP12 and the tracking time data.
• When outputting (reproducing) the sound to the speaker device 36, the output control unit 34b forms directivity in the directivity direction corresponding to the tracking positions, in tracking-time order starting from the designated position P0 designated by the cursor CSR by the user's mouse operation or by the input operation of the user's finger FG, and then outputs (reproduces) the sound having that directivity.
• FIG. 18A is an explanatory diagram of the reproduction start times PT and PT' of the collected sound corresponding to the user's designated positions P0 and P0' on the flow lines between different tracking points based on designations made a plurality of times.
  • FIG. 18B is a diagram showing a third example of the tracking list LST.
• (TP11, TP12, TP13, TP14) are tracking points designated during the movement of a person as the first monitoring target, as shown for example in the tracking list LST of FIG. 18B, and (TP21, TP22, TP23) are tracking points designated during the movement of a person as the second monitoring target. The person as the second monitoring target may be the same person as, or a different person from, the person as the first monitoring target. For each of the tracking points TP11, TP12, TP13, TP14, TP21, TP22, and TP23, the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. The tracking points TP11 and TP21 are start points, and the tracking points TP14 and TP23 are end points. For simplicity, the z coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
• When the designated positions P0 and P0' are designated at arbitrary positions on the respective flow lines between the tracking points shown in FIG. 18A in accordance with input operations with the cursor CSR by the user's mouse operation or the user's finger FG, the tracking processing unit 34c extracts the two pairs of tracking points (TP11, TP12) and (TP21, TP22) before and after the designated positions P0 and P0', and calculates the reproduction start times PT and PT' using the coordinates indicating the tracking positions of the tracking points (TP11, TP12) and (TP21, TP22) and the tracking time data. Here, the coordinates of the designated position P0 are (x0, y0, z0), and the coordinates of the designated position P0' are (x0', y0', z0).
• When outputting (reproducing) the sound to the speaker device 36, the output control unit 34b forms directivity in the directivity direction corresponding to the tracking positions, in tracking-time order starting from the designated position P0 or the designated position P0' designated by the input operation with the cursor CSR by the user's mouse operation or the user's finger FG, and then outputs (reproduces) the sound having that directivity.
  • FIG. 19A is a flowchart for explaining an example of the entire flow of the flow line display reproduction process using the tracking list LST in the directivity control systems 100 and 100A of the first embodiment.
• In FIG. 19A, a flow line display process is first performed (S11). Details of the flow line display process will be described later with reference to FIG. 20.
• After step S11, when the designated position P0 is designated on the flow line between the tracking points displayed in step S11 in accordance with an input operation with the cursor CSR by the user's mouse operation or the user's finger FG (S12), a reproduction start time calculation process is performed (S13). Details of the reproduction start time calculation process will be described later with reference to FIG. 19B.
• The tracking processing unit 34c refers to the tracking list LST stored in the memory 33 and reads the coordinates of all (or even only one) tracking positions corresponding to the tracking time closest to the reproduction start time PT of the designated position P0 calculated in the reproduction start time calculation process of step S13 (S14).
• Using the tracking position coordinate data read by the tracking processing unit 34c, the output control unit 34b forms the directivity of the collected sound in the directions from the omnidirectional microphone array apparatus M1 toward all (or the one) tracking positions (S14).
• Thereby, in accordance with the position arbitrarily designated by the user on the flow line indicating the movement trajectory of the person HM1, the output control unit 34b can form the directivity of the voice in advance in the direction toward the tracking position designated next after the arbitrarily designated position; the lookup of step S14 is sketched below.
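• A minimal sketch of the step S14 lookup, assuming the tracking list is a list of records each holding a tracking position and a tracking time; the names are illustrative, not from the patent.

```python
def positions_at_nearest_time(tracking_list, pt):
    """Read the coordinates of all tracking positions whose tracking time is
    closest to the reproduction start time pt (cf. step S14).

    tracking_list: list of dicts like {"pos": (x, y, z), "time": t}
    """
    nearest_time = min(tracking_list, key=lambda rec: abs(rec["time"] - pt))["time"]
    return [rec["pos"] for rec in tracking_list if rec["time"] == nearest_time]
```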
• After step S14, the output control unit 34b starts reproduction of the collected voice data stored in the recorder device 4 or the memory 33 from the reproduction start time PT calculated in step S13 (S15).
• After step S15, when there is a next tracking time within a predetermined time from the reproduction start time PT (S16, YES), the output control unit 34b forms the directivity of the collected sound in the directions from the omnidirectional microphone array apparatus M1 toward all (or even only one) tracking positions corresponding to the next tracking time (S17).
• After step S17, or when there is no next tracking time within the predetermined time from the reproduction start time PT (S16, NO), an audio output process is performed (S7). Details of the audio output process will be described later with reference to FIG. 21A. After step S7, when the audio output process at the tracking time corresponding to the end point of the tracking points is completed (S18, YES), the flow line display reproduction process shown in FIG. 19A ends. Thereby, the output control unit 34b can clearly output the collected sound that the monitoring target emitted at the reproduction start time calculated according to the user's arbitrarily designated position, and when there is a next designated position within the predetermined time from the reproduction start time, the directivity of the voice at the next designated position can be formed in advance.
• After step S7, if the audio output process at the tracking time corresponding to the end point of the tracking points has not ended (S18, NO), the operations from step S16 to step S18 are repeated until the audio output process at the tracking time corresponding to the end point of the tracking points ends.
• FIG. 19B is a flowchart for explaining an example of the reproduction start time calculation process shown in FIG. 19A.
• In FIG. 19B, the tracking processing unit 34c reads the tracking list LST (see, for example, FIG. 16B) stored in the memory 33 (S13-1).
• The tracking processing unit 34c extracts the two tracking points TP1 and TP2 before and after the designated position P0 designated in step S12 from the data of the tracking list LST read in step S13-1 (S13-2).
• The tracking processing unit 34c calculates the reproduction start time PT at the designated position P0 using the coordinates indicating the tracking positions of the tracking points TP1 and TP2 and the tracking time data (S13-3; see, for example, equation (2)). After step S13-3, the reproduction start time calculation process shown in FIG. 19B ends, and the process proceeds to step S14 shown in FIG. 19A.
• FIG. 20 is a flowchart for explaining an example of the flow line display process shown in FIG. 19A.
• In FIG. 20, the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16B) stored in the memory 33 (S11-1).
• When all the data of the tracking list LST have been read (S11-2, YES), the flow line display process shown in FIG. 20 ends, and the process proceeds to step S12 shown in FIG. 19A. Otherwise, the tracking processing unit 34c continues to sequentially read the data of the tracking list LST (see, for example, FIG. 16B).
• The output control unit 34b displays a point marker on each monitoring object at the one or more tracking points read by the tracking processing unit 34c (S11-3).
• In accordance with an input operation (for example, a right-click or left-click operation of the user's mouse, a keyboard operation, or a touch operation of the user's finger FG), the output control unit 34b displays the point markers in a mode that distinguishes each monitoring object (for example, by a symbol, an identification number, a combination of symbol and identification number, or a frame of a predetermined shape). The frame of the predetermined shape is, for example, a rectangle, a circle, or a triangle, and the line type (for example, solid line or dotted line) may also be varied.
• After step S11-3, when it is determined that the tracking point data read in step S11-3 is the start point of the tracking points (S11-4, YES), the tracking processing unit 34c again reads the data of the tracking list LST (see, for example, FIG. 16B) (S11-3).
• Otherwise (S11-4, NO), the output control unit 34b uses the read tracking list data to connect and display the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S11-5).
• After step S11-5, if the end point of the tracking points of the tracking list LST read in step S11-1 has been connected (S11-6, YES), the operation proceeds to step S11-2.
• After step S11-5, if the end point of the tracking points of the tracking list LST read in step S11-1 has not been connected (S11-6, NO), the operations from step S11-3 to step S11-6 are repeated until the end point of the tracking points in the tracking list LST read in step S11-1 is connected.
• FIG. 21A is a flowchart illustrating an example of the audio output process shown in FIGS. 9A, 9B, and 10A.
• FIG. 21B is a flowchart for explaining an example of the image privacy protection process shown in FIG. 13A.
  • FIG. 22A is a diagram illustrating an example of a waveform of an audio signal corresponding to the pitch before the voice change process.
  • FIG. 22B is a diagram illustrating an example of a waveform of an audio signal corresponding to the pitch after the voice change process.
  • FIG. 22C is an explanatory diagram of processing for blurring the detected outline of a person's face.
• In FIG. 21A, the output control unit 34b determines whether or not the voice privacy protection mode is on (S7-1). If it determines that the voice privacy protection mode is on (S7-1, YES), the output control unit 34b performs voice change processing on the collected voice data to be output from the speaker device 36 (S7-2).
• After step S7-2, or when it is determined that the voice privacy protection mode is off (S7-1, NO), the output control unit 34b outputs the collected sound from the speaker device 36 (S7-3).
• After step S7-3, the audio output process shown in FIG. 21A ends, and the process returns to step S1 shown in FIG. 9A, FIG. 9B, or FIG. 10A.
• In the voice change processing, the output control unit 34b, for example, raises or lowers the pitch of the waveform of the voice data collected by the omnidirectional microphone array device M1, or of the voice data whose directivity has been formed by the output control unit 34b itself (see, for example, FIGS. 22A and 22B).
• Thereby, when the sound collected in real time by the omnidirectional microphone array apparatus M1 is output, the output control unit 34b applies voice change processing to the sound by a simple input operation of the user, so that it becomes difficult to tell whose voice the sound emitted by the person HM1 is, and the privacy of the voice of the person HM1 can be effectively protected.
• Likewise, when the sound collected by the omnidirectional microphone array apparatus M1 for a certain period of time is output, the output control unit 34b applies voice change processing to the sound by a simple input operation of the user, so that it becomes difficult to tell whose voice the sound emitted by the person HM1 is, and the privacy of the voice of the person HM1 can be effectively protected. A sketch of a simple voice change follows.
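• A minimal sketch of a pitch-raising/lowering voice change, implemented here as naive resampling of the waveform (which also changes duration); this is one plausible reading of FIGS. 22A and 22B, not the patent's exact algorithm.

```python
import numpy as np

def voice_change(samples: np.ndarray, pitch_factor: float) -> np.ndarray:
    """Shift the pitch of a mono waveform by resampling.

    pitch_factor > 1 raises the pitch (waveform period compressed, cf. FIG. 22B);
    pitch_factor < 1 lowers it. The duration changes accordingly.
    """
    n_out = int(len(samples) / pitch_factor)
    # Input positions to sample for each output sample
    src_idx = np.linspace(0.0, len(samples) - 1.0, n_out)
    return np.interp(src_idx, np.arange(len(samples)), samples)
```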
• In FIG. 21B, the tracking processing unit 34c determines whether or not the image privacy protection mode is on (S2-2-1).
• When the image privacy protection mode is on (S2-2-1, YES), the image processing unit 37 detects (extracts) the outline DTL of the face of the person HM1 displayed on the tracking screen TRW of the display device 35 (S2-2-2) and masks the face outline DTL (S2-2-3). Specifically, the image processing unit 37 calculates a rectangular region including the detected face outline DTL and performs a process of adding a predetermined blur to the rectangular region (see FIG. 22C). The image processing unit 37 outputs the image data generated by the blurring process to the output control unit 34b.
• After step S2-2-3, or when it is determined that the image privacy protection mode is off (S2-2-1, NO), the output control unit 34b displays the image data obtained from the image processing unit 37 on the display device 35 (S2-2-4).
• Thereby, by a simple input operation of the user, the image processing unit 37 masks a part (for example, the face) of the person HM1 as the monitoring target displayed on the tracking screen TRW of the display device 35, so that it becomes difficult to tell who the person HM1 is, and privacy can be effectively protected; a sketch of the masking step follows.
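• A minimal sketch of the rectangular-region blurring of step S2-2-3 using OpenCV; the face rectangle is assumed to come from whatever face/outline detector the image processing unit uses, and the kernel size is an illustrative choice.

```python
import cv2
import numpy as np

def mask_face_region(frame: np.ndarray, face_rect, ksize=(31, 31)) -> np.ndarray:
    """Blur the rectangular region that contains the detected face outline DTL.

    frame:     BGR image shown on the tracking screen
    face_rect: (x, y, w, h) rectangle enclosing the detected face outline
    """
    x, y, w, h = face_rect
    roi = frame[y:y + h, x:x + w]
    frame[y:y + h, x:x + w] = cv2.GaussianBlur(roi, ksize, 0)
    return frame
```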
• Note that the image privacy protection process shown in FIG. 21B may be performed whenever the image privacy protection mode of the directivity control devices 3 and 3A is turned on at the time the monitoring object (for example, the person HM1) appears on the camera screen, even if the enlarged display mode is not turned on.
• As described above, in the first embodiment, when the monitoring target (for example, the person HM1) shown in the image data on the tracking screen TRW of the display device 35 moves, the directivity control devices 3 and 3A use information related to the designated position (for example, a tracking point) with respect to the image data on the tracking screen TRW to form the directivity of the sound collected by the omnidirectional microphone array device M1 including a plurality of microphones in the direction toward the monitoring target corresponding to the designated position.
• Thereby, since the directivity control devices 3 and 3A re-form the directivity of the sound, formed in the direction toward the position before the movement of the monitoring target (for example, the person HM1), in the direction toward the position after the movement, the directivity of the voice can be formed appropriately as the monitoring target moves, and a decrease in the efficiency of the monitoring work of the supervisor can be suppressed.
• Further, the directivity control devices 3 and 3A can easily acquire accurate information regarding the position after the movement of the monitoring target (for example, the person HM1) by a simple manual operation of designating the moving monitoring target in the image data displayed on the tracking screen TRW of the display device 35.
• Further, since the directivity control device 3A easily detects, from the image data displayed on the tracking screen TRW of the display device 35, the sound source of the sound emitted from the monitoring target (for example, the person HM1) or the monitoring target itself, information regarding the position of the sound source or information regarding the position of the monitoring target can be easily obtained as information regarding the position after the movement of the monitoring target.
• In the second embodiment, when the monitoring target (for example, a person) moves beyond the imaging area of the camera device or the sound collection area of the omnidirectional microphone array device, the directivity control device 3B switches the camera device used to capture an image of the monitoring target to another camera device, or switches the omnidirectional microphone array device used to collect the sound emitted from the monitoring target to another omnidirectional microphone array device, in accordance with the movement state of the monitoring target.
• It is assumed that the camera device used for capturing an image of the monitoring target (for example, the person HM1) and the omnidirectional microphone array apparatus used for collecting the sound emitted by the person HM1 are associated with each other in advance, and that this association information is stored in advance in the memory 33 of the directivity control device 3B.
  • FIG. 23 is a block diagram illustrating a system configuration example of the directivity control system 100B of the second embodiment.
• The directivity control system 100B shown in FIG. 23 is configured to include one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3B, and the recorder device 4.
• In FIG. 23, the same reference numerals are given to the parts having the same configuration and operation as those of the directivity control systems 100 and 100A, and their description is simplified or omitted; only the different contents will be described.
• The directivity control device 3B may be, for example, a stationary PC installed in a monitoring control room (not shown), or a data communication terminal such as a mobile phone, PDA, tablet terminal, or smartphone that can be carried by the user.
• The directivity control device 3B is configured to include at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34A, a display device 35, a speaker device 36, an image processing unit 37, and an operation switching control unit 38.
• The signal processing unit 34A includes at least a pointing direction calculation unit 34a, an output control unit 34b, a tracking processing unit 34c, and a sound source detection unit 34d.
• Based on various information or data regarding the movement status of the monitoring target (for example, a person) acquired by the tracking processing unit 34c, the operation switching control unit 38 performs various operations for switching, among the plurality of camera devices C1 to Cn or the plurality of omnidirectional microphone array devices M1 to Mm, the camera device used for capturing an image of the monitoring target of the directivity control system 100B or the omnidirectional microphone array device used for collecting the sound emitted from the monitoring target.
  • FIG. 24 is an explanatory diagram showing an automatic switching process of a camera device used for capturing an image displayed on the display device 35.
• In FIG. 24, an example will be described in which the camera device used for capturing an image of the person HM1 is switched from the camera device C1 to the camera device C2 as the person HM1 as the monitoring target moves from the tracking position A1 to the tracking position A2.
• The tracking position A1 is within the range of the imaging area C1RN of the camera device C1 and within the range of the predetermined switching determination line JC1 of the camera device C1. The tracking position A2 is within the range of the imaging area C2RN of the camera device C2 and outside the range of the switching determination line JC1 of the camera device C1. Both tracking positions A1 and A2 are within the sound collection area of the omnidirectional microphone array apparatus M1.
• When the person HM1 is about to exceed the imaging area C1RN of the camera device C1, the operation switching control unit 38 notifies the camera device C2, via the communication unit 31 and the network NW, of information indicating that the camera device used for capturing an image of the person HM1 will be switched from the camera device C1 to the camera device C2.
• Further, the operation switching control unit 38 instructs the camera device C2 to prepare for capturing an image in the range within the angle of view of the camera device C2.
• Meanwhile, the image data of the video captured by the camera device C1 is displayed on the tracking screen TRW of the display device 35.
• When the person HM1 exceeds the switching determination line JC1, the operation switching control unit 38 notifies the camera device C2, via the communication unit 31 and the network NW, of information indicating that the camera device used for capturing an image of the person HM1 is switched from the camera device C1 to the camera device C2.
• The operation switching control unit 38 uses the distance information between the camera device C1 and the person HM1 measured by the camera device C1 to determine whether or not the person HM1 has exceeded the switching determination line JC1. More specifically, when the person HM1 is within the angle of view of the camera device C1 and the distance from the camera device C1 to the person HM1 becomes larger than the (known) distance from the camera device C1 to the switching determination line JC1, the operation switching control unit 38 determines that the person HM1 has exceeded the switching determination line JC1. It is assumed that the operation switching control unit 38 knows in advance the camera device (for example, the camera device C2) that can be switched to from the camera device C1, and likewise knows in advance the camera devices that can be switched to from the other camera devices.
• When the person HM1 exceeds the switching determination line JC1, the operation switching control unit 38 switches the camera device used for capturing an image of the person HM1 from the camera device C1 to the camera device C2. Thereafter, the image data of the video captured by the camera device C2 (for example, image data of the moving person HM1) is displayed on the tracking screen TRW of the display device 35.
• Thereby, the operation switching control unit 38 can adaptively switch to a camera device that can accurately display an image of the moving monitoring target (for example, the person HM1), and the user can easily designate the image of the monitoring target; the handover criterion is sketched below.
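• A minimal sketch of the crossing test and camera handover, assuming per-camera knowledge of the distance to its switching determination line and of its switchable neighbor; the table contents are illustrative, not from the patent.

```python
import math

# Hypothetical layout data: per camera, the known distance from the camera to
# its switching determination line JC1 and the camera it can hand over to.
SWITCH_TABLE = {
    "C1": {"line_dist": 8.0, "next_camera": "C2"},
    "C2": {"line_dist": 8.0, "next_camera": "C3"},
}

def select_camera(current: str, camera_pos, person_pos) -> str:
    """Hand over image capture when the camera-to-person distance exceeds the
    known camera-to-determination-line distance (cf. FIG. 24)."""
    entry = SWITCH_TABLE[current]
    if math.dist(camera_pos, person_pos) > entry["line_dist"]:
        return entry["next_camera"]
    return current
```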
  • FIG. 25 is an explanatory diagram showing an automatic switching process of the omnidirectional microphone array device used for collecting the sound of the monitoring target (for example, the person HM1).
• In FIG. 25, an example will be described in which the omnidirectional microphone array device used for collecting the sound emitted by the person HM1 is switched from the omnidirectional microphone array apparatus M1 to the omnidirectional microphone array apparatus M2 as the person HM1 as the monitoring target moves from the tracking position A1 to the tracking position A2.
• The tracking position A1 is within the range of the sound collection area M1RN of the omnidirectional microphone array apparatus M1 and within the range of the predetermined switching determination line JM1 of the omnidirectional microphone array apparatus M1. The tracking position A2 is within the range of the sound collection area M2RN of the omnidirectional microphone array apparatus M2 and outside the range of the switching determination line JM1 of the omnidirectional microphone array apparatus M1. Both tracking positions A1 and A2 are within the imaging area of the camera device C1.
• When the person HM1 is about to exceed the sound collection area M1RN of the omnidirectional microphone array apparatus M1, the operation switching control unit 38 notifies the omnidirectional microphone array apparatus M2, via the communication unit 31 and the network NW, of information indicating that the omnidirectional microphone array apparatus used for collecting the sound emitted from the person HM1 will be switched from the omnidirectional microphone array apparatus M1 to the omnidirectional microphone array apparatus M2.
• Further, the operation switching control unit 38 instructs the omnidirectional microphone array apparatus M2 to prepare to collect sound within the sound collection area of the omnidirectional microphone array apparatus M2.
• When the person HM1 exceeds the switching determination line JM1, the operation switching control unit 38 notifies the omnidirectional microphone array apparatus M2, via the communication unit 31 and the network NW, of information indicating that the omnidirectional microphone array apparatus used for collecting the sound emitted by the person HM1 is switched from the omnidirectional microphone array apparatus M1 to the omnidirectional microphone array apparatus M2.
• The operation switching control unit 38 uses the distance information between the omnidirectional microphone array apparatus M1 and the person HM1 to determine whether or not the person HM1 has exceeded the switching determination line JM1. More specifically, when the distance from the omnidirectional microphone array apparatus M1 to the person HM1 becomes larger than the (known) distance from the omnidirectional microphone array apparatus M1 to the switching determination line JM1, the operation switching control unit 38 determines that the person HM1 has exceeded the switching determination line JM1.
• It is assumed that the operation switching control unit 38 knows in advance the omnidirectional microphone array apparatus (for example, the omnidirectional microphone array apparatus M2) that can be switched to from the omnidirectional microphone array apparatus M1, and likewise knows in advance the omnidirectional microphone array apparatuses that can be switched to from the other omnidirectional microphone array apparatuses.
• When the person HM1 exceeds the switching determination line JM1, the operation switching control unit 38 switches the omnidirectional microphone array device used for collecting the sound emitted by the person HM1 from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2.
• Thereby, the operation switching control unit 38 can adaptively switch to the omnidirectional microphone array device capable of accurately collecting the sound emitted from the moving monitoring target (for example, the person HM1), so that the sound emitted by the monitoring target can be picked up with high accuracy.
  • FIG. 26 is an explanatory diagram illustrating manual switching processing of the camera device used for capturing an image displayed on the display device 35.
• In FIG. 26, the tracking screen TRW is switched to a multi-camera screen including the camera screen C1W of the camera device C1 and the camera screens of the camera devices (for example, eight camera devices) around the camera device C1.
• It is assumed that the camera devices that can be switched to from the camera device C1 currently in use are determined in advance, for example the camera devices C2, C3, and C4, and that the camera screens C2W, C3W, and C4W captured by the camera devices C2, C3, and C4 are displayed accordingly (see the hatching shown in FIG. 26). It is also assumed that the person HM1 is moving in the movement direction MV1.
• Suppose that, on the multi-camera screen shown in FIG. 26, the user performs a touch operation with the finger FG on one of the three camera screens C2W, C3W, and C4W (for example, the camera screen C3W), taking the movement direction MV1 of the person HM1 into consideration.
• In response to the touch operation, the operation switching control unit 38 switches the camera device used for capturing an image of the person HM1 from the currently used camera device C1 to the camera device C3 corresponding to the camera screen C3W that is the target of the touch operation.
• Thereby, the operation switching control unit 38 can adaptively switch, by a simple operation of the user, to a camera device that can accurately display an image of the moving monitoring target (for example, the person HM1), and the user can easily designate the image of the monitoring target.
  • FIG. 27 is an explanatory diagram illustrating a manual switching process of the omnidirectional microphone array device used for collecting the sound of the monitoring target (for example, the person HM1).
  • the person HM1 as the monitoring target is displayed in the center on the tracking screen TRW.
• The omnidirectional microphone array devices that can be switched to from the currently used omnidirectional microphone array device M1 are the three omnidirectional microphone array devices M2, M3, and M4 installed around the omnidirectional microphone array device M1.
• On the tracking screen TRW, in response to an input operation with the cursor CSR by the user's mouse operation or the user's finger FG, markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 that can be switched to from the currently used omnidirectional microphone array apparatus M1 are displayed (see (1) shown in FIG. 27).
• The user, taking into consideration the movement direction MV1 from the tracking position A1 corresponding to the tracking point of the person HM1 as the monitoring target, selects one of the three markers (for example, the marker M3R) by a touch operation with the user's finger FG (see (2) shown in FIG. 27).
• The operation switching control unit 38 switches from the omnidirectional microphone array apparatus M1 currently in use to the omnidirectional microphone array apparatus M3 corresponding to the marker M3R selected by the touch operation of the user's finger FG, and causes it to start sound collection via the communication unit 31 and the network NW (see (3) in FIG. 27).
• Further, the output control unit 34b switches the directivity so that it is formed from the omnidirectional microphone array apparatus M3 corresponding to the selected marker M3R toward the current tracking position of the person HM1 (see (4) shown in FIG. 27). Thereafter, the markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 displayed on the tracking screen TRW are deleted by the output control unit 34b.
• Thereby, by a simple user operation on the markers M2R, M3R, and M4R displayed on the tracking screen TRW, the operation switching control unit 38 can adaptively switch to the omnidirectional microphone array apparatus M3 capable of accurately collecting the sound emitted by the moving monitoring target (for example, the person HM1), and the sound emitted by the person HM1 can be picked up with high accuracy in accordance with the movement direction MV1 of the person HM1.
FIG. 28 is an explanatory diagram showing a selection process of the optimum omnidirectional microphone array device used for collecting the sound of the monitoring target. In response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, the camera screens of all the camera devices (for example, nine camera devices) managed by the directivity control system 100B are displayed as a list on the display device 35 (see the upper left side of FIG. 28).
It is assumed that, among the camera screens displayed as a list on the display device 35, the camera screen C1W in which the monitoring target (for example, the person HM1) that is the target of the sound tracking process is best shown is selected in accordance with an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG.
In accordance with the user's selection of the camera screen C1W, the operation switching control unit 38 selects and switches to the camera device C1 corresponding to the camera screen C1W as the camera device used for capturing an image of the person HM1. The output control unit 34b enlarges the image data captured by the camera device corresponding to the camera screen C1W and displays it on the tracking screen TRW1 of the display device 35 (see the lower left side of FIG. 28).
The output control unit 34b displays the markers M1R, M2R, M3R, and M4R indicating the approximate positions of all the omnidirectional microphone array devices associated with the camera device C1 selected by the operation switching control unit 38 at the four corners of the tracking screen TRW1. The display positions of the markers M1R, M2R, M3R, and M4R are not limited to the four corners of the tracking screen TRW1.
The output control unit 34b highlights the markers one by one (for example, with the blinking Br) and, for each marker, forms directivity in the direction from the omnidirectional microphone array device corresponding to that marker toward the position of the person HM1 and outputs the sound collected for a certain time. When the user selects the marker (for example, the marker M3R) corresponding to the sound that the user determines to be optimum, the operation switching control unit 38 selects and switches to the omnidirectional microphone array device M3 corresponding to the selected marker M3R as the omnidirectional microphone array device used for collecting the sound emitted by the person HM1.
Thereby, the operation switching control unit 38 can output, for each of the plurality of omnidirectional microphone array devices M1, M2, M3, and M4 associated with the selected camera device C1, the sound collected for a certain time with a different directivity, so that the sound emitted by the moving monitoring target (for example, the person HM1) can be accurately collected by the user's simple operation of selecting the collected sound that the user determines to be optimum. In other words, the optimum omnidirectional microphone array device M3 can be selected, and the sound emitted by the monitoring target (for example, the person HM1) can be collected with high accuracy.
FIG. 29(A) is a flowchart for explaining an example of the automatic switching process of the camera device in the directivity control system 100B of the second embodiment. The automatic switching process of the camera device shown in FIG. 29(A) explains in detail the automatic switching process of the camera device shown in FIG. 24, and is performed, for example, following step S3B-1 shown in FIG. 10(B).
In FIG. 29(A), the image processing unit 37 detects the position (that is, the tracking point) of the monitoring target (for example, the person HM1) by performing predetermined image processing on the image data displayed on the tracking screen TRW of the display device 35 (S21). After step S21, the camera switching determination process is performed (S22). Details of the camera switching determination process will be described later with reference to FIG. 29(B).
After step S22, when the camera switching mode is set to ON by the operation switching control unit 38 (S23, YES), the operation switching control unit 38 instructs, via the communication unit 31 and the network NW, all the switchable camera devices associated with the currently used camera device (for example, the camera device C1) to capture images (S24). All the camera devices that have received the instruction to capture images start capturing images. Note that the camera switching mode is a flag used for controlling whether to switch the camera device when the multiple-camera switching method is automatic.
The operation switching control unit 38 determines, using the distance information between the camera device C1 and the person HM1 measured by the currently used camera device C1, whether or not the person HM1 at the tracking position A1 in the real space detected in step S21 has moved beyond the imaging area C1RN of the camera device C1 (S25). When it is determined that the person HM1 has moved beyond the imaging area C1RN of the camera device C1 (S25, YES), the operation switching control unit 38 outputs to the image processing unit 37 the image data captured, in accordance with the instruction in step S24, by all the switchable camera devices associated with the currently used camera device C1.
The image processing unit 37 determines whether or not the person HM1 as the monitoring target is detected by performing predetermined image processing on all the image data output from the operation switching control unit 38 (S26), and outputs the image processing result to the operation switching control unit 38.
Using the image processing result of the image processing unit 37, the operation switching control unit 38 selects the one camera device (for example, the camera device C2) that can detect the person HM1 as the monitoring target and is closest to the tracking position A1 in the real space detected in step S21, and switches the camera device used for capturing an image of the person HM1 from the camera device C1 to the camera device C2 (S27). The output control unit 34b switches the tracking screen TRW displayed on the display device 35 to the camera screen of the camera device C2 selected by the operation switching control unit 38 and displays it (S27).
Thereby, the automatic switching process of the camera device shown in FIG. 29(A) ends, and the process proceeds to the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A).
FIG. 29(B) is a flowchart illustrating an example of the camera switching determination process shown in FIG. 29(A). The operation switching control unit 38 first sets the camera switching mode in the directivity control device 3B to OFF (S22-1).
The operation switching control unit 38 determines, using the distance information between the camera device C1 and the person HM1 measured by the currently used camera device C1, whether or not the tracking position A1 in the real space corresponding to the tracking point detected in step S21 has moved beyond the predetermined switching determination line JC1 of the currently used camera device C1 (S22-2).
When it is determined that the tracking position A1 in the real space corresponding to the tracking point detected in step S21 has moved beyond the predetermined switching determination line JC1 of the currently used camera device C1 (S22-2, YES), the operation switching control unit 38 sets the camera switching mode to ON (automatic) (S22-3).
After step S22-3, or when it is determined that the tracking position A1 has not moved beyond the predetermined switching determination line JC1 (S22-2, NO), the camera switching determination process shown in FIG. 29(B) ends, and the process proceeds to step S23 shown in FIG. 29(A).
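The switching logic above reduces to two checks per detection cycle: a determination line that arms the camera switching mode, and an imaging-area boundary that triggers the actual handover to the nearest camera that still detects the target. The following is a minimal sketch of that control flow, not an implementation from this disclosure; the class and field names (`Camera`, `switch_line_m`, `coverage_m`, `detected_by`) are illustrative assumptions.

```python
import math

class Camera:
    """Illustrative stand-in for one camera device (e.g., C1, C2, ...)."""
    def __init__(self, name, position, switch_line_m, coverage_m):
        self.name = name
        self.position = position            # (x, y) in real space
        self.switch_line_m = switch_line_m  # distance of determination line JC1
        self.coverage_m = coverage_m        # radius of imaging area C1RN

    def distance_to(self, point):
        return math.dist(self.position, point)

def camera_switch_determination(current, tracking_pos):
    """S22: arm the camera switching mode when the tracking position
    crosses the predetermined switching determination line (JC1)."""
    return current.distance_to(tracking_pos) > current.switch_line_m

def auto_switch_camera(current, candidates, tracking_pos, detected_by):
    """S23-S27: if the target left the imaging area, hand over to the
    nearest associated camera that still detects the target."""
    if not camera_switch_determination(current, tracking_pos):
        return current                      # switching mode stays OFF
    if current.distance_to(tracking_pos) <= current.coverage_m:
        return current                      # still inside the imaging area
    visible = [c for c in candidates if c in detected_by]
    if not visible:
        return current                      # no candidate sees the target
    return min(visible, key=lambda c: c.distance_to(tracking_pos))

# Usage: the target at (9, 0) has left C1's area; C2 is nearest and sees it.
c1 = Camera("C1", (0, 0), switch_line_m=6.0, coverage_m=8.0)
c2 = Camera("C2", (10, 0), switch_line_m=6.0, coverage_m=8.0)
c3 = Camera("C3", (20, 0), switch_line_m=6.0, coverage_m=8.0)
print(auto_switch_camera(c1, [c2, c3], (9, 0), {c2}).name)  # -> C2
```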
FIG. 30(A) is a flowchart for explaining an example of the automatic switching process of the omnidirectional microphone array device in the directivity control system 100B of the second embodiment. The automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) explains in detail the automatic switching process of the omnidirectional microphone array device shown in FIG. 25, and is performed, for example, following step S27 shown in FIG. 29(A).
In FIG. 30(A), the sound source detection unit 34d calculates the position (sound source position) of the monitoring target (for example, the person HM1) in the real space by performing a predetermined sound source detection process, or calculates the coordinates indicating the position on the image data corresponding to the sound source position (that is, the coordinates of the tracking position A1 corresponding to the tracking point) (S31). After step S31, the microphone switching determination process is performed (S32). Details of the microphone switching determination process will be described later with reference to FIG. 30(B).
After step S32, when the microphone switching mode is set to ON by the operation switching control unit 38 (S33, YES), the operation switching control unit 38 instructs, via the communication unit 31 and the network NW, all the switchable omnidirectional microphone array devices associated with the currently used omnidirectional microphone array device (for example, the omnidirectional microphone array device M1) to collect the sound emitted by the person HM1 (S34). All the omnidirectional microphone array devices that have received the instruction to collect sound start collecting sound. Note that the microphone switching mode is a flag used for controlling whether to switch the omnidirectional microphone array device when the multiple-microphone switching method is automatic.
The operation switching control unit 38 determines, using the distance information between the currently used omnidirectional microphone array device M1 and the person HM1 calculated by the sound source detection unit 34d, whether or not the person HM1 has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35). When it is determined that the person HM1 has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35, YES), the sound source detection unit 34d calculates the position (sound source position) of the person HM1 as the monitoring target based on the strength or volume level of the sound collected, in accordance with the instruction in step S34, by all the switchable omnidirectional microphone array devices associated with the currently used omnidirectional microphone array device M1 (S36).
Using the sound source detection result of the sound source detection unit 34d, the operation switching control unit 38 selects, among all the switchable omnidirectional microphone array devices associated with the currently used omnidirectional microphone array device M1, the one omnidirectional microphone array device (for example, the omnidirectional microphone array device M2) that minimizes the distance between the position (sound source position) of the person HM1 as the monitoring target and the omnidirectional microphone array device, and switches the omnidirectional microphone array device used for collecting the sound emitted by the person HM1 from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2 (S37). The output control unit 34b switches the sound directivity so that it is formed from the omnidirectional microphone array device M2 after the switching toward the direction of the sound source position calculated in step S36 (S37).
Thereby, the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) ends, and the process proceeds to, for example, step S3B-2 shown in FIG. 10(B). Note that the automatic switching process of the camera device shown in FIG. 29(A) may be started after the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A).
FIG. 30(B) is a flowchart illustrating an example of the microphone switching determination process shown in FIG. 30(A). The operation switching control unit 38 first sets the microphone switching mode to OFF (S32-1).
The operation switching control unit 38 determines, using the distance information between the currently used omnidirectional microphone array device M1 and the person HM1, whether or not the tracking position A1 calculated in step S31 has moved beyond the predetermined switching determination line JM1 of the currently used omnidirectional microphone array device M1 (S32-2).
When it is determined that the tracking position A1 has moved beyond the predetermined switching determination line JM1 of the currently used omnidirectional microphone array device M1 (S32-2, YES), the operation switching control unit 38 sets the microphone switching mode to ON (S32-3).
After step S32-3, or when it is determined that the tracking position A1 has not moved beyond the predetermined switching determination line JM1 of the currently used omnidirectional microphone array device M1 (S32-2, NO), the microphone switching determination process shown in FIG. 30(B) ends, and the process proceeds to step S33 shown in FIG. 30(A).
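The microphone-side handover mirrors the camera logic, except that the new sound source position is estimated from the sound levels observed by the candidate arrays rather than from image processing. The disclosure only states that the position is derived from strength or volume level, so the level-weighted centroid below is one plausible reading, offered purely for illustration; the data layout is an assumption.

```python
import math

def estimate_source_position(arrays):
    """S36 (illustrative): approximate the sound source position from the
    levels observed by candidate arrays, here as a level-weighted centroid
    of the array positions. `arrays` maps name -> ((x, y), rms_level)."""
    total = sum(level for _, level in arrays.values())
    x = sum(pos[0] * level for pos, level in arrays.values()) / total
    y = sum(pos[1] * level for pos, level in arrays.values()) / total
    return (x, y)

def switch_to_nearest_array(arrays, source_pos):
    """S37: choose the candidate array with the minimum distance to the
    estimated sound source position."""
    return min(arrays, key=lambda name: math.dist(arrays[name][0], source_pos))

# Usage: M2 sits closest to where the observed levels say the target is.
candidates = {
    "M2": ((10.0, 0.0), 0.80),   # (array position, observed RMS level)
    "M3": ((10.0, 10.0), 0.35),
    "M4": ((0.0, 10.0), 0.20),
}
src = estimate_source_position(candidates)
print(switch_to_nearest_array(candidates, src))  # -> M2
```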
FIG. 31(A) is a flowchart illustrating an example of the manual switching process of the camera device in the directivity control system 100B of the second embodiment. The manual switching process of the camera device shown in FIG. 31(A) is performed following step S1 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
In FIG. 31(A), when an instruction for switching the camera device is input in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S41), the output control unit 34b switches the display on the display device 35 from the tracking screen TRW of the image captured by the camera device C1 currently used for capturing an image of the person HM1 to a multi-camera screen including the camera screen C1W of the camera device C1 and the camera screens of the camera devices (for example, eight camera devices) around the camera device C1 (S42).
On the multi-camera screen displayed on the display device 35 in step S42, the user selects one of the camera screens by a touch operation with the finger FG, considering the movement direction MV1 of the person HM1 as the monitoring target (see FIG. 26) (S43).
In response to the touch operation of the user's finger FG in step S43, the operation switching control unit 38 switches the camera device used for capturing an image of the person HM1 from the currently used camera device C1 to the camera device C3 corresponding to the selected camera screen C3W (S44). Thereby, the manual switching process of the camera device shown in FIG. 31(A) ends, and the process proceeds to one of steps S45, S51, S61, and S71 shown in FIG. 31(B), FIG. 32(A), FIG. 32(B), and FIG. 33.
FIG. 31(B) is a flowchart for explaining an example of the manual switching process of the omnidirectional microphone array device in the directivity control system 100B of the second embodiment.
In FIG. 31(B), when an instruction for switching the omnidirectional microphone array device is input in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S45), the output control unit 34b displays, on the tracking screen TRW, the markers (for example, the markers M2R, M3R, and M4R) indicating the approximate positions of the omnidirectional microphone array devices (for example, the omnidirectional microphone array devices M2, M3, and M4) that can be switched to from the currently used omnidirectional microphone array device M1 (S46).
Considering the movement direction MV1 from the tracking position A1 of the person HM1 as the monitoring target, the user selects one of the three markers (for example, the marker M3R) by a touch operation with the finger FG (S47, see FIG. 27). The operation switching control unit 38 instructs, via the communication unit 31 and the network NW, the omnidirectional microphone array device M3 corresponding to the marker M3R selected by the touch operation of the user's finger FG to start sound collection in place of the currently used omnidirectional microphone array device M1 (S47).
The output control unit 34b switches the directivity so that it is formed from the omnidirectional microphone array device M3 corresponding to the marker M3R selected in step S47 toward the current tracking position of the person HM1 (S48). Further, the output control unit 34b deletes the markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 displayed on the tracking screen TRW (S48).
After step S48, the manual switching process of the omnidirectional microphone array device shown in FIG. 31(B) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). Note that the manual switching process of the camera device shown in FIG. 31(A) may be performed after the manual switching process of the omnidirectional microphone array device shown in FIG. 31(B).
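The marker flow (S45 to S48) is a thin user-interface layer over the same switch: show one marker per candidate array, wait for a selection, start collection on the chosen array, and re-aim the directivity at the current tracking position. The sketch below assumes hypothetical helper names (`ArrayMarker`, `MicController`, `draw`, `erase`); none of them come from this disclosure.

```python
from dataclasses import dataclass

@dataclass
class ArrayMarker:
    """Illustrative pairing of an on-screen marker (e.g., M3R) with the
    omnidirectional microphone array device it stands for (e.g., M3)."""
    marker_id: str
    array_id: str
    screen_pos: tuple  # where the marker is drawn on the tracking screen

class MicController:
    """Minimal stub standing in for the signal processing side."""
    def start_collection(self, array_id): print("collecting on", array_id)
    def form_directivity(self, array_id, pos): print(array_id, "aimed at", pos)
    def current_tracking_position(self): return (4.2, 7.7)

def show_switchable_markers(candidates, draw):
    """S46: draw one marker per switchable array on the tracking screen."""
    for m in candidates:
        draw(m.marker_id, m.screen_pos)

def on_marker_touched(touched, candidates, mic_ctrl, erase):
    """S47-S48: start collection on the selected array, aim its directivity
    at the current tracking position, then remove all markers."""
    chosen = next(m for m in candidates if m.marker_id == touched)
    mic_ctrl.start_collection(chosen.array_id)                        # S47
    mic_ctrl.form_directivity(chosen.array_id,
                              mic_ctrl.current_tracking_position())   # S48
    for m in candidates:
        erase(m.marker_id)                                            # S48
    return chosen.array_id

# Usage with trivial draw/erase stubs.
markers = [ArrayMarker("M2R", "M2", (40, 40)),
           ArrayMarker("M3R", "M3", (600, 40)),
           ArrayMarker("M4R", "M4", (600, 420))]
show_switchable_markers(markers, draw=lambda mid, pos: print("draw", mid, pos))
on_marker_touched("M3R", markers, MicController(),
                  erase=lambda mid: print("erase", mid))
```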
FIG. 32(A) is a flowchart for explaining a first example of the selection process of the optimum omnidirectional microphone array device in the directivity control system 100B of the second embodiment. FIG. 32(B) is a flowchart illustrating a second example of the selection process of the optimum omnidirectional microphone array device in the directivity control system 100B of the second embodiment. FIG. 33 is a flowchart for explaining a third example of the selection process of the optimum omnidirectional microphone array device in the directivity control system 100B of the second embodiment.
In FIG. 32(A), when a position in the movement direction of the person HM1 as the monitoring target is designated on the tracking screen TRW (S51), the operation switching control unit 38 calculates each distance from each omnidirectional microphone array device to the position in the real space corresponding to the designated position designated in step S51, that is, each distance from each omnidirectional microphone array device to the person HM1 as the monitoring target (S53).
The operation switching control unit 38 selects the omnidirectional microphone array device for which the minimum distance is obtained among the distances calculated in step S53, and instructs the signal processing unit 34 to form directivity with respect to the sound data of the sound collected by the selected omnidirectional microphone array device (S54).
In response to the instruction, the output control unit 34b of the signal processing unit 34 forms the sound directivity from the omnidirectional microphone array device selected by the operation switching control unit 38 in step S54 toward the position of the person HM1 as the monitoring target, and outputs the sound having the directivity from the speaker device 36 (S55).
Thereby, the operation switching control unit 38 can select, by the user's simple operation of designating a position indicating the movement direction of the monitoring target (for example, the person HM1), the optimum omnidirectional microphone array device that can accurately collect the sound emitted by the moving monitoring target, and the sound emitted by the monitoring target (for example, the person HM1) can be collected with high accuracy.
After step S55, the selection process of the optimum omnidirectional microphone array device shown in FIG. 32(A) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). Note that the manual switching process of the camera device shown in FIG. 31(A) may be performed after the selection process of the optimum omnidirectional microphone array device shown in FIG. 32(A).
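The first selection example is a pure nearest-array rule: map the designated screen position to real-space coordinates, then take the minimum over array-to-target distances. A compact sketch follows; the coordinate conversion `to_real_space` is an assumed helper (the disclosure presumes such a mapping without spelling it out), and the array positions are illustrative.

```python
import math

ARRAY_POSITIONS = {"M1": (0, 0), "M2": (10, 0), "M3": (10, 10), "M4": (0, 10)}

def to_real_space(screen_pos):
    """Assumed helper: convert a designated position on the tracking screen
    into real-space coordinates; the scale factors here are placeholders."""
    sx, sy = screen_pos
    return (sx / 64.0, sy / 48.0)

def select_optimum_array(screen_pos):
    """S53-S54: compute the distance from every array to the designated
    real-space position; the array with the minimum distance wins."""
    target = to_real_space(screen_pos)
    return min(ARRAY_POSITIONS,
               key=lambda a: math.dist(ARRAY_POSITIONS[a], target))

print(select_optimum_array((600, 430)))  # -> M3: (9.4, 9.0) is nearest to M3
```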
In FIG. 32(B), when a position in the movement direction of the person HM1 as the monitoring target (a tracking position corresponding to the tracking point) is designated on the tracking screen TRW displayed on the display device 35 in accordance with an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S61), information (for example, coordinates) regarding the designated position is input to the operation switching control unit 38.
The image processing unit 37 detects the orientation of the face of the person HM1 as the monitoring target by performing predetermined image processing on the image data captured by the currently used camera device (for example, the camera device C1) (S62). The image processing unit 37 outputs the detection result of the face orientation of the person HM1 as the monitoring target to the operation switching control unit 38.
The operation switching control unit 38 calculates the relationship among the face orientation of the person HM1, the designated position, and each omnidirectional microphone array device, using the information regarding the designated position designated in step S61 (for example, the coordinates indicating the position on the image data) and the detection result of the face orientation of the person HM1 obtained from the image processing unit 37 in step S62 (S63). For example, the operation switching control unit 38 calculates the distance between the position of the monitoring target (for example, the person HM1) corresponding to the designated position on the image data designated in step S61 and each omnidirectional microphone array device.
The operation switching control unit 38 selects the omnidirectional microphone array device that lies in the direction along the face orientation of the monitoring target (for example, the person HM1) (for example, within 45 degrees in the horizontal direction) and for which the distance from the position of the monitoring target (for example, the person HM1) corresponding to the designated position on the image data designated in step S61 is minimum, and instructs the signal processing unit 34 to form directivity with respect to the sound data of the sound collected by the selected omnidirectional microphone array device (S64).
In response to the instruction, the output control unit 34b of the signal processing unit 34 forms the sound directivity from the omnidirectional microphone array device selected in step S64 toward the position of the person HM1 as the monitoring target, and outputs the sound having the directivity from the speaker device 36 (S65).
Thereby, the operation switching control unit 38 can select, based on the face orientation of the monitoring target (for example, the person HM1) on the image data and the distance between the monitoring target and each omnidirectional microphone array device, the optimum omnidirectional microphone array device that can accurately collect the sound emitted by the moving monitoring target, and the sound emitted by the monitoring target (for example, the person HM1) can be collected with high accuracy.
After step S65, the selection process of the optimum omnidirectional microphone array device shown in FIG. 32(B) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). Note that the manual switching process of the camera device shown in FIG. 31(A) may be performed after the selection process of the optimum omnidirectional microphone array device shown in FIG. 32(B).
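The second example adds one constraint to the nearest-array rule: only arrays lying within a tolerance (the text cites 45 degrees in the horizontal direction) of the detected face orientation are eligible, on the premise that speech radiates mostly forward. A sketch under that reading follows; the bearing helpers and the fallback to plain nearest-array when nothing is eligible are assumptions, not behavior stated in the disclosure.

```python
import math

def bearing_deg(src, dst):
    """Horizontal bearing from src to dst, in degrees."""
    return math.degrees(math.atan2(dst[1] - src[1], dst[0] - src[0]))

def angle_diff_deg(a, b):
    """Smallest absolute difference between two bearings."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def select_array_by_face(target_pos, face_bearing_deg, arrays, tol_deg=45.0):
    """S63-S64: keep only arrays within tol_deg of the face orientation,
    then pick the nearest of those; fall back to plain nearest if none fit
    (the fallback is an assumption)."""
    eligible = {name: pos for name, pos in arrays.items()
                if angle_diff_deg(bearing_deg(target_pos, pos),
                                  face_bearing_deg) <= tol_deg}
    pool = eligible or arrays
    return min(pool, key=lambda n: math.dist(arrays[n], target_pos))

arrays = {"M1": (0, 0), "M2": (10, 0), "M3": (10, 10), "M4": (0, 10)}
# A target at (4, 4) facing toward +x (0 degrees): M2 wins over the nearer
# M1, which sits behind the speaker and is therefore excluded.
print(select_array_by_face((4, 4), 0.0, arrays))  # -> M2
```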
In FIG. 33, the output control unit 34b displays, as a list on the display device 35, the camera screens of all the camera devices managed by the directivity control system 100B in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S71). It is assumed that, among the camera screens displayed as a list on the display device 35, the camera screen C1W in which the monitoring target (for example, the person HM1) that is the target of the sound tracking process is best shown is selected in accordance with an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S72).
In accordance with the user's selection of the camera screen in step S72, the operation switching control unit 38 selects and switches to the camera device corresponding to the selected camera screen as the camera device used for capturing an image of the person HM1. The output control unit 34b enlarges the image data captured by the camera device corresponding to the selected camera screen and displays it on the tracking screen TRW1 of the display device 35 (S73, see the lower left side of FIG. 28).
The output control unit 34b displays the markers (for example, the markers M1R, M2R, M3R, and M4R shown in FIG. 28) indicating the approximate positions of all the omnidirectional microphone array devices associated with the camera device selected by the operation switching control unit 38 at the four corners of the tracking screen TRW1 (S74).
The output control unit 34b highlights the markers one by one (for example, with the blinking Br) and, for each marker, forms directivity in the direction from the omnidirectional microphone array device corresponding to that marker toward the position of the person HM1 and outputs the sound collected for a certain time (S76).
When the user selects the marker (for example, the marker M3R) corresponding to the sound that the user determines to be optimum, the operation switching control unit 38 selects and switches to the omnidirectional microphone array device M3 corresponding to the selected marker M3R as the omnidirectional microphone array device used for collecting the sound emitted by the person HM1 (S77).
After step S77, the selection process of the optimum omnidirectional microphone array device shown in FIG. 33 ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). Note that the manual switching process of the camera device shown in FIG. 31(A) may be performed after the selection process of the optimum omnidirectional microphone array device shown in FIG. 33.
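The third example is an audition loop: each candidate array gets a turn to be heard with its beam aimed at the target, and the user's pick decides the switch. A minimal sketch of that loop under assumed interface names (`ui`, `mic_ctrl`, and their methods are stubs, not API from this disclosure):

```python
import time

def audition_arrays(markers, mic_ctrl, ui, target_pos, listen_s=3.0):
    """S74-S76 (illustrative): blink each marker in turn, beam the matching
    array at the target, and play what it collects for a fixed interval."""
    for marker_id, array_id in markers.items():
        ui.blink(marker_id)                      # highlight, e.g. blinking Br
        mic_ctrl.form_directivity(array_id, target_pos)
        mic_ctrl.play_collected_audio(array_id, seconds=listen_s)
        time.sleep(listen_s)                     # let the user listen

def pick_after_audition(markers, ui):
    """S77: the marker the user touches decides the array to switch to."""
    return markers[ui.wait_for_marker_touch()]

class _StubUI:
    def blink(self, m): print("blink", m)
    def wait_for_marker_touch(self): return "M3R"

class _StubMic:
    def form_directivity(self, a, p): print(a, "aimed at", p)
    def play_collected_audio(self, a, seconds): print("playing", a, "for", seconds, "s")

markers = {"M1R": "M1", "M2R": "M2", "M3R": "M3", "M4R": "M4"}
audition_arrays(markers, _StubMic(), _StubUI(), (5.0, 5.0), listen_s=0.0)
print("switch to", pick_after_audition(markers, _StubUI()))  # -> M3
```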
In a modification of the first embodiment (hereinafter referred to as “this modification”), an operation example of the directivity control system 100 will be described for the case where, in the first embodiment or the second embodiment, a plurality of monitoring targets (for example, a plurality of persons) appear on the tracking screen TRW and are designated at the same timing or at different timings. Since the system configuration of the directivity control system according to this modification is the same as that of the directivity control systems 100, 100A, and 100B according to the first or second embodiment, the description of the system configuration is simplified or omitted, and only the differing contents will be described. Hereinafter, in order to simplify the explanation, the description is given using the system configuration of the directivity control system 100; the same applies when the directivity control devices 3A and 3B are used.
FIG. 34 is a flowchart for explaining an example of the overall flow of the manual tracking process based on simultaneous designation of a plurality of monitoring targets in the directivity control system 100 according to the modification of the first embodiment. FIG. 35 is a flowchart illustrating an example of the automatic tracking process for a plurality of monitoring targets in the directivity control system 100 according to the modification of the first embodiment.
In FIG. 34, the tracking mode determination process in step S1, the tracking assist process in step S2, the tracking connection process in step S6, and the audio output process in step S7 are the same as, for example, the corresponding processes shown in FIG. 9(A), and therefore their descriptions are simplified or omitted.
After step S2, it is assumed that a plurality of tracking points corresponding to the tracking positions on the movement courses (movement paths) of the plurality of persons as the monitoring targets are simultaneously designated in accordance with an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S82).
For each person as the monitoring target designated in step S82, the tracking processing unit 34c stores in the memory 33 the positions in the real space corresponding to the plurality of designated positions on the tracking screen TRW and the designated times, distinguished for each person and associated with the tracking position and the tracking time of each tracking point (S83). Further, the tracking processing unit 34c displays, via the output control unit 34b, a point marker for each person as the monitoring target on the tracking screen TRW, distinguished for each tracking point (S83).
The output control unit 34b forms the directivity of the collected sound from the currently used omnidirectional microphone array device (for example, the omnidirectional microphone array device M1) in the directions toward the positions in the real space (sound positions, sound source positions) corresponding to the tracking positions of the respective persons as the plurality of monitoring targets simultaneously designated in step S82 (S84). After step S84, the tracking connection process is performed (S6).
After step S6, the output control unit 34b resumes, from the speaker device 36, the output (reproduction) of the sound that was paused in step S81 (S85). After step S85, the audio output process is performed (S7).
After step S7, the operations from step S81 to step S7 (steps S81, S2, S82, S83, S84, S6, S85, and S7) are repeated until the tracking mode of the directivity control device 3 is turned off.
In FIG. 35, the image processing unit 37 of the directivity control device 3A or 3B determines, by performing known image processing, whether or not persons as monitoring targets are detected on the tracking screen TRW of the display device 35. When it is determined that a plurality of persons have been detected, the image processing unit 37 outputs the determination result (including the detected position of each person (for example, a known representative point) and the detection time data) to the tracking processing unit 34c of the signal processing unit 34 as an automatic designation result (S91).
Alternatively, the sound source detection unit 34d determines, by performing a known sound source detection process, whether or not the positions of the sounds (sound sources) emitted by the persons as the monitoring targets are detected on the tracking screen TRW of the display device 35. When it is determined that the positions of a plurality of sound sources have been detected, the sound source detection unit 34d outputs the determination result (including the sound source detection positions and the detection time data) to the tracking processing unit 34c as an automatic designation result (S91).
The tracking processing unit 34c calculates the movement vector of each person as the plurality of monitoring targets using the transition of one or more immediately preceding automatic designation results in step S91, and estimates the movement direction of each person (S91).
Using the estimation result of the movement direction of each person as the plurality of monitoring targets in step S91, the tracking processing unit 34c associates the tracking positions corresponding to the plurality of automatically designated tracking points with the immediately preceding automatic designation results, and stores them in the memory 33 as pairs of tracking positions (S92).
For each person as the monitoring target, the tracking processing unit 34c stores the designated position and the designated time of each person on the tracking screen TRW in the memory 33, distinguished for each person and associated with the tracking position and the tracking time of the tracking point (S92). Further, the tracking processing unit 34c displays, via the output control unit 34b, a point marker for each person as the monitoring target on the tracking screen TRW, distinguished for each tracking position (S92).
As described above, even when the plurality of monitoring targets (for example, persons) displayed on the image data on the tracking screen TRW of the display device 35 move, the directivity control devices 3, 3A, and 3B form the directivity of the sound, which had been formed in the direction toward the position of each person before the movement, in the direction toward the position of each person after the movement. Accordingly, the directivity of the sound can be appropriately formed so as to follow the movement of each person, and degradation of the efficiency of the monitoring work of the observer can be suppressed.
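Tracking several targets at once hinges on the association step in S91 and S92: each new automatic designation result must be paired with the right person's history, which is why the movement vector is estimated from the preceding results. A minimal sketch of constant-velocity prediction plus greedy nearest-neighbor association follows; the class and field names are illustrative, and the greedy pairing is only one simple policy consistent with the text.

```python
import math

class Track:
    """Illustrative per-person track: stores (position, time) pairs, as the
    tracking positions and times are stored per target in the memory 33."""
    def __init__(self, track_id, pos, t):
        self.track_id = track_id
        self.history = [(pos, t)]

    def predicted(self, t):
        """Extrapolate with the movement vector of the last two samples."""
        if len(self.history) < 2:
            return self.history[-1][0]
        (p1, t1), (p2, t2) = self.history[-2], self.history[-1]
        dt = (t2 - t1) or 1e-9
        vx, vy = (p2[0] - p1[0]) / dt, (p2[1] - p1[1]) / dt
        return (p2[0] + vx * (t - t2), p2[1] + vy * (t - t2))

def associate(tracks, detections, t):
    """S91-S92 (sketch): greedily pair each track with the detection nearest
    to its predicted position, then append the pair to that track's history."""
    free = list(detections)
    for tr in tracks:
        if not free:
            break
        best = min(free, key=lambda d: math.dist(tr.predicted(t), d))
        tr.history.append((best, t))
        free.remove(best)

# Usage: two people crossing; the movement vectors keep identities apart.
a, b = Track("HM1", (0, 0), 0.0), Track("HM2", (10, 0), 0.0)
associate([a, b], [(1, 1), (9, 1)], 1.0)
associate([a, b], [(8, 2), (2, 2)], 2.0)
print(a.history[-1], b.history[-1])  # HM1 -> (2, 2), HM2 -> (8, 2)
```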
One embodiment of the present invention is a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the directivity control device including: a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; and an information acquisition unit that acquires information regarding a second designated position on the image of the display unit designated in accordance with the movement of the monitoring target, wherein the directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information regarding the second designated position acquired by the information acquisition unit.
With this configuration, the directivity control device forms the directivity of the sound in the direction from the first sound collection unit including the plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and acquires the information regarding the second designated position that designates the moving monitoring target. The directivity control device then switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position, using the information regarding the second designated position on the image of the display unit.
Thereby, even if the monitoring target projected on the image of the display unit moves, the directivity control device re-forms the directivity of the sound, which had been formed in the direction toward the position of the monitoring target before the movement, in the direction toward the position after the movement. Accordingly, the directivity of the sound can be appropriately formed so as to follow the movement of the monitoring target, and degradation of the efficiency of the monitoring work of the observer can be suppressed.
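Forming directivity "in a direction toward" a position with a microphone array is classically done by delay-and-sum beamforming: each channel is delayed so that sound arriving from the target aligns across microphones, and the channels are then summed so the target adds coherently while off-axis sound does not. The disclosure does not commit to a specific beamformer, so the following is a generic delay-and-sum sketch under stated assumptions (sample-accurate delays only, speed of sound 343 m/s):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def delay_and_sum(signals, mic_positions, target, fs):
    """Steer an array toward `target` by delaying each channel so sound
    from the target aligns across microphones, then averaging.
    `signals[i]` is the sample list from the mic at mic_positions[i]."""
    delays_s = [math.dist(m, target) / SPEED_OF_SOUND for m in mic_positions]
    base = min(delays_s)                                 # nearest mic = 0 shift
    shifts = [round((d - base) * fs) for d in delays_s]  # samples to advance
    n = min(len(s) - k for s, k in zip(signals, shifts))
    out = []
    for i in range(n):
        out.append(sum(s[i + k] for s, k in zip(signals, shifts)) / len(signals))
    return out

# Usage: two mics 0.343 m apart, 8 kHz sampling; a source in line with the
# array reaches mic 0 exactly 8 samples before mic 1.
fs = 8000
mic0 = [0.0] * 32
mic0[10] = 1.0                    # nearer mic hears the pulse at sample 10
mic1 = [0.0] * 32
mic1[18] = 1.0                    # farther mic hears it 8 samples later
mics = [(0.0, 0.0), (0.343, 0.0)]
target = (-5.0, 0.0)              # on the far side of mic 0
out = delay_and_sum([mic0, mic1], mics, target, fs)
print(max(out), out.index(max(out)))  # -> 1.0 at sample 10: coherent sum
```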
Further, one embodiment of the present invention is the directivity control device, wherein the information acquisition unit acquires the information regarding the second designated position in response to a designation operation on the monitoring target that moves on the image of the display unit.
Thereby, the directivity control device can easily acquire accurate information regarding the position after the movement of the monitoring target by a simple operation of designating the monitoring target that moves on the image displayed on the display unit.
Further, one embodiment of the present invention is the directivity control device, further including: a sound source detection unit that detects, from the image of the display unit, a sound source position corresponding to the monitoring target; and an image processing unit that detects the monitoring target from the image of the display unit, wherein the information acquisition unit acquires the information regarding the sound source position detected by the sound source detection unit or the information regarding the position of the monitoring target detected by the image processing unit as the information regarding the second designated position.
Thereby, the directivity control device can easily detect, from the image displayed on the display unit, the sound source of the sound emitted by the monitoring target or the monitoring target itself, and can easily acquire the information regarding the position of the monitoring target as the information regarding the position after the movement of the monitoring target.
Further, one embodiment of the present invention is the directivity control device, wherein the sound source detection unit starts the detection process of the sound source position corresponding to the monitoring target centering on an initial position designated on the image of the display unit, and the image processing unit starts the detection process of the monitoring target centering on the initial position.
Thereby, since the directivity control device starts the detection process of the information regarding the sound source position or the information regarding the position of the monitoring target around the initial position (for example, the position of the monitoring target) designated on the image displayed on the display unit by, for example, the user's designation operation, the detection process of the sound source position or the detection process of the position of the monitoring target can be performed at high speed.
Further, one embodiment of the present invention is the directivity control device, wherein, in response to a change operation on the information regarding the sound source position detected by the sound source detection unit or the information regarding the position of the monitoring target detected by the image processing unit, the information acquisition unit acquires the information regarding the position on the image of the display unit designated by the change operation as the information regarding the second designated position.
Thereby, when the information regarding the sound source position or the information regarding the position of the monitoring target needs correction, the directivity control device can easily correct it by, for example, the user's position change operation, and can acquire the information regarding the position designated on the image by the change operation as the information regarding the position after the movement of the monitoring target.
Further, one embodiment of the present invention is the directivity control device, wherein, when the distance between the sound source position detected by the sound source detection unit and the position of the monitoring target detected by the image processing unit is equal to or larger than a predetermined value, the information acquisition unit acquires, in response to a change operation on the information regarding the sound source position or the information regarding the position of the monitoring target, the information regarding the position on the image of the display unit designated by the change operation as the information regarding the second designated position.
Thereby, when the distance between the sound source position detected by the sound source detection process and the position of the monitoring target detected by the monitoring target detection process is equal to or larger than the predetermined value, the directivity control device can easily correct the information regarding the designated position on the image by, for example, the user's position change operation, and can acquire it as the information regarding the position after the movement of the monitoring target. Conversely, when the distance is not equal to or larger than the predetermined value, the directivity control device can easily acquire the sound source position or the position of the monitoring target as the information regarding the position after the movement of the monitoring target without requiring the position change operation.
Further, one embodiment of the present invention is the directivity control device, further including: an image storage unit that stores images captured over a certain period; and an image reproduction unit that reproduces the images stored in the image storage unit on the display unit, wherein the image reproduction unit reproduces the images at a speed value smaller than an initial value of the reproduction speed in response to a predetermined input operation.
Thereby, the directivity control device can perform slow reproduction at a speed value smaller than the initial value of the reproduction speed (for example, a normal value used at the time of video reproduction) in response to the user's predetermined input operation (for example, a slow reproduction instruction operation).
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a captured image on the display unit, wherein, in response to designation of a designated position on the image of the display unit, the display control unit enlarges and displays the image on the same screen at a predetermined magnification centering on the designated position.
Thereby, since the directivity control device enlarges and displays the image on the same screen at the predetermined magnification centering on the designated position on the image displayed on the display unit by, for example, the user's simple designation operation, the user's operation of designating the monitoring target on the same screen can be simplified.
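An enlargement "at a predetermined magnification centering on the designated position" amounts to cropping a window of 1/magnification of the frame around that point, clamped so the window stays inside the frame, and scaling the crop back to full size. A sketch of the crop-rectangle arithmetic; the frame size and magnification in the usage line are illustrative values only.

```python
def zoom_rect(frame_w, frame_h, center, magnification):
    """Crop window realizing an enlargement of `magnification` centered on
    `center`, clamped so the window stays inside the frame."""
    win_w, win_h = frame_w / magnification, frame_h / magnification
    x = min(max(center[0] - win_w / 2, 0), frame_w - win_w)
    y = min(max(center[1] - win_h / 2, 0), frame_h - win_h)
    return (x, y, win_w, win_h)   # scale this region back up to frame size

# Usage: 2x zoom on a designated position near the right edge of 640x480;
# the window is clamped so it does not leave the frame.
print(zoom_rect(640, 480, (600, 240), 2.0))  # -> (320.0, 120.0, 320.0, 240.0)
```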
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a captured image on the display unit, wherein, in response to designation of a designated position on the image of the display unit, the display control unit enlarges and displays the image on another screen at a predetermined magnification centering on the designated position.
Thereby, since the directivity control device enlarges and displays the image on a different screen at the predetermined magnification centering on the designated position on the image displayed on the display unit by, for example, the user's simple designation operation, the user can easily designate the monitoring target by comparing the screen that is not enlarged with the screen that is enlarged.
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a captured image on the display unit, wherein the display control unit enlarges and displays the image at a predetermined magnification with the center of the display unit as a reference in response to a predetermined input operation.
Thereby, since the directivity control device enlarges and displays the image at the predetermined magnification with the center of the display unit as a reference by, for example, the user's simple input operation, the user can easily designate the monitoring target when, for example, the monitoring target is near the center of the display unit.
Further, one embodiment of the present invention is the directivity control device, wherein, when the designated position moves beyond a predetermined scroll determination line on the screen on which the image is enlarged and displayed in accordance with the movement of the monitoring target, the display control unit scrolls the screen by a predetermined amount in the direction beyond the scroll determination line.
Thereby, since the directivity control device automatically scrolls the screen by the predetermined amount in the direction beyond the scroll determination line when the position designated by the user moves beyond the scroll determination line due to the movement of the monitoring target displayed on the enlarged screen, the designated position of the user's monitoring target can be prevented from moving off the screen even when the screen is enlarged and displayed.
Further, one embodiment of the present invention is the directivity control device, wherein, when the designated position moves beyond a predetermined scroll determination line on the screen on which the image is enlarged and displayed in accordance with the movement of the monitoring target, the display control unit scrolls the screen so that the designated position comes to the center of the screen.
Thereby, since the directivity control device automatically scrolls the screen so that the position designated by the user comes to the center of the screen when the designated position moves beyond the scroll determination line due to the movement of the monitoring target displayed on the enlarged screen, the designated position of the user's monitoring target can be prevented from moving off the screen even when the screen is enlarged and displayed, and the moving monitoring target can be easily designated.
Further, one embodiment of the present invention is the directivity control device, wherein, on the screen on which the image is enlarged and displayed, the display control unit scrolls the screen so that the designated position is always at the center of the screen.
Thereby, since the directivity control device automatically scrolls the screen so that the position designated by the user is always at the center of the screen when the monitoring target displayed on the enlarged screen moves, the designated position of the user's monitoring target can be prevented from moving off the screen even when the screen is enlarged and displayed, and the monitoring target that continues to move can be easily designated.
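The three scrolling behaviors above differ only in the trigger and the scroll target: a determination line plus a fixed step, a determination line plus recentering, or unconditional recentering. One sketch can cover all three; the margin and step sizes are illustrative assumptions, as the disclosure leaves the "predetermined amount" and the placement of the determination line unspecified.

```python
def scroll(viewport, designated, frame, mode, margin=50, step=100):
    """Return a new viewport origin (vx, vy) for an enlarged screen.
    viewport = (vx, vy, vw, vh); `designated` is in frame coordinates.
    mode: "step"     - fixed scroll when the scroll determination line
                       (viewport edge inset by `margin`) is crossed;
          "recenter" - recenter only when that line is crossed;
          "always"   - keep the designated position centered at all times."""
    vx, vy, vw, vh = viewport
    fx, fy = designated
    def clamp(v, lo, hi): return min(max(v, lo), hi)
    crossed = not (vx + margin <= fx <= vx + vw - margin
                   and vy + margin <= fy <= vy + vh - margin)
    if mode == "always" or (mode == "recenter" and crossed):
        vx, vy = fx - vw / 2, fy - vh / 2
    elif mode == "step" and crossed:
        if fx > vx + vw - margin: vx += step
        elif fx < vx + margin:    vx -= step
        if fy > vy + vh - margin: vy += step
        elif fy < vy + margin:    vy -= step
    return (clamp(vx, 0, frame[0] - vw), clamp(vy, 0, frame[1] - vh))

# Usage: the target crosses the right-hand determination line of a 320x240
# view into a 1280x960 enlarged image.
print(scroll((0, 0, 320, 240), (300, 120), (1280, 960), "step"))      # (100, 0)
print(scroll((0, 0, 320, 240), (300, 120), (1280, 960), "recenter"))  # (140, 0)
```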
Further, one embodiment of the present invention is the directivity control device, wherein the image processing unit performs masking processing on a part of the monitoring target on the image of the display unit in response to a predetermined input operation.
Thereby, since the directivity control device masks a part (for example, the face) of the monitoring target (for example, a person) displayed on the screen of the display unit by, for example, the user's simple input operation, privacy can be effectively protected by making it difficult to identify who the person as the monitoring target is.
Further, one embodiment of the present invention is the directivity control device, further including a sound output control unit that causes a sound output unit to output the sound collected by the first sound collection unit, wherein the sound output control unit performs voice change processing on the sound collected by the first sound collection unit and causes the sound output unit to output the processed sound in response to a predetermined input operation.
Thereby, since the directivity control device performs the voice change processing on the sound collected in real time by the first sound collection unit and outputs it by, for example, the user's simple input operation, privacy can be effectively protected by making it difficult to identify the voice of the monitoring target (for example, a person).
Further, one embodiment of the present invention is the directivity control device, further including: a sound storage unit that stores the sound collected by the first sound collection unit over a certain period; and a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit, wherein the sound output control unit performs voice change processing on the sound collected by the first sound collection unit and causes the sound output unit to output the processed sound in response to a predetermined input operation.
Thereby, since the directivity control device performs the voice change processing on the sound collected over the certain period by the first sound collection unit when outputting it by, for example, the user's simple input operation, privacy regarding the voice of the monitoring target can be effectively protected by making it difficult to identify the voice of the monitoring target (for example, a person).
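A common way to realize such a voice change is a crude pitch shift by resampling: the samples are read back at a different rate so the voice no longer sounds like the original speaker. The disclosure does not specify the transform, so the following is only one plausible sketch, with the ratio chosen arbitrarily for illustration:

```python
import math

def voice_change(samples, ratio=1.3):
    """Crude voice changer: resample by `ratio` using linear interpolation,
    raising the pitch (ratio > 1) so the speaker is harder to identify.
    Note this also shortens the clip; a production system would use a
    time-scale-preserving pitch shifter instead."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        pos += ratio
    return out

# Usage: a 1 kHz tone sampled at 8 kHz comes out near 1.3 kHz.
tone = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(8000)]
shifted = voice_change(tone, ratio=1.3)
print(len(tone), len(shifted))  # 8000 -> about 6154 samples
```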
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a predetermined marker at each designated position on the image of the display unit designated one or more times in accordance with the movement of the monitoring target.
Thereby, when the user performs a designation operation for designating the monitoring target displayed on the display unit, the directivity control device displays the predetermined marker at each position designated on the screen of the display unit, so that the positions through which the moving monitoring target passes can be explicitly shown as a trajectory.
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that connects and displays at least the current designated position and the immediately preceding designated position among two or more designated positions on the image of the display unit designated in accordance with the movement of the monitoring target.
Thereby, since the directivity control device connects and displays at least the current designated position and the immediately preceding designated position among the plurality of positions designated by the user's designation operation while the monitoring target projected on the screen of the display unit moves, a partial trajectory of the movement of the monitoring target can be explicitly shown.
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a flow line connecting, for all the designated positions on the image of the display unit designated in accordance with the movement of the monitoring target, the one or two designated positions adjacent to each designated position.
Thereby, since the directivity control device connects and displays, for all of the plurality of positions designated by the user's designation operation while the monitoring target projected on the screen of the display unit moves, the one or two designated positions adjacent to each designated position, the entire trajectory of the movement of the monitoring target can be explicitly shown.
Further, one embodiment of the present invention is the directivity control device, further including: a designation list storage unit that stores data of all the designated positions and designated times on the image of the display unit; and a reproduction time calculation unit that calculates the reproduction start time of the sound at a designated position on the flow line, using the designation list stored in the designation list storage unit, in accordance with designation of an arbitrary position on the flow line connecting all the designated positions displayed by the display control unit, wherein the directivity forming unit forms the directivity of the sound using the data of the designated position corresponding to the designated time closest to the reproduction start time of the sound calculated by the reproduction time calculation unit.
Thereby, when all the positions designated while the monitoring target moves are displayed, the directivity control device calculates, in accordance with the user's designation of an arbitrary position on the flow line, the reproduction start time of the collected sound at that position, and forms the directivity of the sound corresponding to the designated time, among the times designated during the movement of the monitoring target, that is closest to the reproduction start time. Accordingly, in accordance with the position arbitrarily designated by the user on the flow line indicating the trajectory of the movement of the monitoring target, the directivity of the sound can be formed in advance in the direction toward the designated position (tracking position) designated next after the arbitrarily designated position.
Further, one embodiment of the present invention is the directivity control device, further including: a sound storage unit that stores the sound collected by the first sound collection unit over a certain period; and a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit from the reproduction start time of the sound calculated by the reproduction time calculation unit, wherein, when there is a next designated time within a predetermined time from the reproduction start time of the sound, the directivity forming unit forms the directivity of the sound using the data of the designated position corresponding to the next designated time.
Thereby, the directivity control device reproduces the sound from the reproduction start time of the sound at the position designated in accordance with the user's arbitrary designation on the flow line, and, when there is a next designated time within the predetermined time from the reproduction start time, forms the directivity of the sound using the data of the designated position corresponding to that next designated time. Accordingly, the directivity control device can clearly output the collected sound emitted by the monitoring target from the reproduction start time calculated in accordance with the user's arbitrarily designated position, and can form in advance the directivity of the sound at the next designated position within the predetermined time from the reproduction start time.
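With the designation list in hand, the computation is mechanical: find the flow-line segment nearest the clicked point, interpolate a start time along that segment, aim the beam using the designated position whose time is closest, and pre-steer to the next one if it falls within a look-ahead window. A sketch over an assumed list of ((x, y), time) pairs; the linear time interpolation and the look-ahead value are illustrative choices, not details fixed by the disclosure.

```python
import math

def reproduction_start_time(designations, click):
    """Interpolate a playback start time from the designation list by
    projecting the clicked point onto the nearest flow-line segment."""
    best = (float("inf"), designations[0][1])
    for (p1, t1), (p2, t2) in zip(designations, designations[1:]):
        seg = (p2[0] - p1[0], p2[1] - p1[1])
        seg_len2 = seg[0] ** 2 + seg[1] ** 2 or 1e-9
        u = ((click[0] - p1[0]) * seg[0] + (click[1] - p1[1]) * seg[1]) / seg_len2
        u = min(max(u, 0.0), 1.0)
        proj = (p1[0] + u * seg[0], p1[1] + u * seg[1])
        d = math.dist(click, proj)
        if d < best[0]:
            best = (d, t1 + u * (t2 - t1))
    return best[1]

def directivity_plan(designations, start_time, lookahead_s=2.0):
    """Aim at the designated position whose time is closest to start_time;
    also return the next position if it arrives within the look-ahead."""
    aim = min(designations, key=lambda d: abs(d[1] - start_time))[0]
    upcoming = [d for d in designations
                if start_time < d[1] <= start_time + lookahead_s]
    return aim, (upcoming[0][0] if upcoming else None)

# Usage: a four-point flow line; clicking midway along the second segment.
flow = [((0, 0), 0.0), ((4, 0), 4.0), ((4, 4), 8.0), ((8, 4), 12.0)]
t0 = reproduction_start_time(flow, (4.2, 1.9))
print(round(t0, 2), directivity_plan(flow, t0, lookahead_s=3.0))
# -> 5.9, aim (4, 0), next (4, 4)
```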
Further, one embodiment of the present invention is the directivity control device, further including an operation switching control unit that switches the imaging unit used for displaying the image from a first imaging unit to a second imaging unit when the moving monitoring target moves beyond a predetermined switching range corresponding to the first imaging unit used for displaying the image on the display unit.
Thereby, the directivity control device can adaptively switch to an imaging unit that can accurately display the image of the moving monitoring target, and the user can easily designate the image of the monitoring target.
Further, one embodiment of the present invention is the directivity control device, further including an operation switching control unit that switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to a second sound collection unit when the monitoring target moves beyond a predetermined switching range corresponding to the first sound collection unit.
Thereby, when the moving monitoring target moves beyond the predetermined switching range corresponding to the first sound collection unit used for collecting the sound of the monitoring target, the directivity control device switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to the second sound collection unit. Accordingly, the directivity control device can adaptively switch to a sound collection unit that can accurately collect the sound emitted by the moving monitoring target, and the sound emitted by the monitoring target can be collected with high accuracy.
Further, one embodiment of the present invention is the directivity control device, further including: a display control unit that displays, on different screens, a list of images captured by a plurality of imaging units in response to a predetermined input operation; and an operation switching control unit that selects the imaging unit used for displaying the image of the monitoring target on the display unit in response to a selection operation on one predetermined selectable screen among the screens displayed as a list on the display unit.
Thereby, the directivity control device switches the imaging unit used for displaying the image on the display unit to the imaging unit corresponding to the screen designated by the user, in accordance with the movement direction of the monitoring target, from among the plurality of different screens displayed as a list on the display unit. Accordingly, the directivity control device can adaptively switch, by the user's simple operation, to an imaging unit that can accurately display the image of the moving monitoring target, and the user can easily designate the image of the monitoring target.
Further, one embodiment of the present invention is the directivity control device, further including: a display control unit that displays, on the display unit, markers indicating the approximate positions of a plurality of surrounding sound collection units that can be switched to from the first sound collection unit in response to a predetermined input operation; and an operation switching control unit that switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to the other sound collection unit corresponding to a selected marker.
Thereby, the directivity control device displays on the display unit, by, for example, the user's input operation, the markers indicating the approximate positions of the plurality of surrounding sound collection units that can be switched to from the first sound collection unit, and, in accordance with the one marker selected by the user, switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to the other sound collection unit corresponding to the selected marker. Accordingly, the directivity control device can adaptively switch, by the user's simple operation, to a sound collection unit that can accurately collect the sound emitted by the moving monitoring target, and the sound emitted by the monitoring target can be collected with high accuracy.
  • In one embodiment, in response to designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the operation switching control unit selects, from among a plurality of sound collection units including the first sound collection unit, the sound collection unit having the shortest distance to the monitoring target as the sound collection unit used for collecting the sound of the monitoring target.
  • In accordance with the position designated on the image of the monitoring target captured by the selected imaging unit, the directivity control device selects, from among the plurality of sound collection units including the first sound collection unit, the unit having the shortest distance to the monitoring target as the sound collection unit used for collecting the sound of the monitoring target.
  • Thereby, simply by designating a position indicating the moving direction of the monitoring target, the user can select the optimum sound collection unit capable of accurately picking up the sound of the moving monitoring target, and the sound emitted by the monitoring target can be collected with high accuracy, as in the sketch below.
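A minimal sketch of this shortest-distance selection, assuming the real-space installation position of each sound collection unit is known in advance (the names are hypothetical):

    # Sketch: choose the sound collection unit closest to the monitoring target.
    import math

    def nearest_sound_collector(mic_arrays, target_pos):
        """mic_arrays: {id: (x, y, z) installation position};
        target_pos: (x, y, z) position of the monitoring target."""
        return min(mic_arrays, key=lambda i: math.dist(mic_arrays[i], target_pos))

    mics = {"M1": (0.0, 0.0, 3.0), "M2": (8.0, 0.0, 3.0)}
    print(nearest_sound_collector(mics, (6.0, 1.0, 0.0)))  # -> M2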
  • One embodiment of the present invention further includes an image processing unit that detects the face direction of the monitoring target from the image on the display unit; in response to designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the operation switching control unit selects, from among a plurality of sound collection units including the first sound collection unit that lie in the direction corresponding to the face direction detected by the image processing unit, the sound collection unit having the shortest distance to the monitoring target as the sound collection unit used for collecting the sound of the monitoring target.
  • In accordance with the position designated on the image of the monitoring target captured by the selected imaging unit, the directivity control device selects, from among the plurality of sound collection units including the first sound collection unit that exist in the direction indicated by the orientation of the face of the monitoring target on the image, the unit having the shortest distance to the monitoring target as the sound collection unit used for collecting the sound of the monitoring target.
  • Thereby, the directivity control device can select, in accordance with the orientation of the face of the monitoring target on the image and the distance between the monitoring target and each sound collection unit, the optimum sound collection unit capable of accurately collecting the sound emitted by the moving monitoring target, so that the sound can be collected with high accuracy; one possible combination of the two criteria is sketched below.
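One way to combine the two criteria is sketched here: the candidate sound collection units are first restricted to those lying roughly in the direction the detected face points, and the nearest of those is chosen. The angular tolerance and the fallback to all units are illustrative assumptions, not the embodiment's actual rule.

    # Sketch: filter candidate units by the detected face direction, then pick
    # the nearest one. The 45-degree tolerance is an illustrative assumption.
    import math

    def select_by_face_direction(mic_arrays, target_pos, face_dir_deg, tol_deg=45.0):
        """mic_arrays: {id: (x, y)} floor-plan positions; target_pos: (x, y);
        face_dir_deg: detected face direction of the target in degrees."""
        def bearing(frm, to):
            return math.degrees(math.atan2(to[1] - frm[1], to[0] - frm[0]))
        candidates = {
            mic_id: pos for mic_id, pos in mic_arrays.items()
            if abs((bearing(target_pos, pos) - face_dir_deg + 180.0) % 360.0
                   - 180.0) <= tol_deg  # unit lies roughly where the face points
        }
        pool = candidates or mic_arrays  # fall back to all units if none match
        return min(pool, key=lambda i: math.dist(pool[i], target_pos))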
  • One embodiment of the present invention further includes an audio output control unit that causes an audio output unit to output the sound collected by the first sound collection unit. The display control unit displays, on the display unit, markers indicating the approximate positions of a plurality of sound collection units, including the first sound collection unit, associated with the imaging unit selected by the operation switching control unit. In response to designation of a position on the image of the monitoring target captured by the selected imaging unit, the audio output control unit sequentially outputs, each for a predetermined time, the sound whose directivity is formed in the direction from the sound collection unit corresponding to each displayed marker toward the monitoring target, and the operation switching control unit selects the sound collection unit corresponding to the marker chosen on the basis of the output sound as the sound collection unit used for collecting the sound of the monitoring target.
  • The directivity control device displays on the display unit the markers indicating the approximate positions of the plurality of sound collection units, including the first sound collection unit, associated with the selected imaging unit; in accordance with the position designated on the image of the moving monitoring target, it sequentially outputs, each for a predetermined time, the sound whose directivity is formed in the direction from the sound collection unit corresponding to each marker toward the monitoring target; and it selects the sound collection unit corresponding to the marker subsequently chosen as the sound collection unit used for collecting the sound of the monitoring target.
  • Thereby, the directivity control device can output, each for a certain period, collected sounds with different directivities formed by the plurality of sound collection units associated with the selected imaging unit, so that the user can compare them and select the optimum sound collection unit; this audition procedure is sketched below.
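The audition procedure might look like the following sketch, in which the sound of each candidate unit, with directivity formed toward the target, is played for a fixed time before the user chooses a marker. form_directivity() and play() are placeholders for the embodiment's actual signal processing and output, and all names are assumptions.

    # Sketch: let the user compare candidates by playing each unit's sound,
    # beamformed toward the target, for a fixed time each.
    AUDITION_SECONDS = 3.0

    def audition_candidates(markers, target_pos, form_directivity, play):
        """markers: {marker_id: sound collection unit}; plays each in turn."""
        for marker_id, mic in markers.items():
            beamformed = form_directivity(mic, target_pos)  # steer toward target
            play(beamformed, duration=AUDITION_SECONDS)

    def select_marker(markers, chosen_marker_id):
        """The unit behind the marker the user chose becomes the active one."""
        return markers[chosen_marker_id]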
  • One embodiment of the present invention is a directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones.
  • The directivity control device forms the directivity of the sound in the direction from the first sound collection unit including the plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and acquires information on the second designated position designating the moving monitoring target.
  • Using the information on the second designated position on the image of the display unit, the directivity control device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • Thereby, even if the monitoring target shown on the image of the display unit moves, the directivity of the sound formed in the direction toward the position before the movement is re-formed in the direction toward the position after the movement, so that the directivity of the sound can be properly formed following the movement of the monitoring target, and deterioration in the efficiency of the observer's monitoring work can be suppressed.
  • One embodiment of the present invention is a storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the program executing: a step of forming the directivity of the sound in the direction from the first sound collection unit toward the monitoring target corresponding to a first designated position on the image of the display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching, using the acquired information on the second designated position, the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • The directivity control device that executes the program stored in the storage medium forms the directivity of the sound in the direction from the first sound collection unit including a plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and acquires information on the second designated position designating the moving monitoring target.
  • Using the information on the second designated position on the image of the display unit, the directivity control device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • Thereby, even if the monitoring target shown on the image of the display unit moves, the directivity of the sound formed in the direction toward the position before the movement is re-formed in the direction toward the position after the movement, so that the directivity of the sound can be properly formed following the movement of the monitoring target, and deterioration in the efficiency of the observer's monitoring work can be suppressed.
  • One embodiment of the present invention is a directivity control system including: an imaging unit that images a sound collection area; a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and a directivity control device that controls the directivity of the sound collected by the first sound collection unit.
  • The directivity control device includes a directivity forming unit that forms the directivity of the sound in the direction from the first sound collection unit toward the monitoring target corresponding to a first designated position on the image of the display unit, and an information acquisition unit that acquires information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; the directivity forming unit switches, using the information on the second designated position acquired by the information acquisition unit, the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • The directivity control device forms the directivity of the sound in the direction from the first sound collection unit including the plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and acquires information on the second designated position designating the moving monitoring target.
  • Using the information on the second designated position on the image of the display unit, the directivity control device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • Thereby, even if the monitoring target shown on the image of the display unit moves, the directivity of the sound formed in the direction toward the position before the movement is re-formed in the direction toward the position after the movement, so that the directivity of the sound can be properly formed following the movement of the monitoring target, and deterioration in the efficiency of the observer's monitoring work can be suppressed.
  • The present invention is useful as a directivity control device, a directivity control method, a storage medium, and a directivity control system that properly form the directivity of sound toward a monitoring target even when the monitoring target on the image moves, thereby suppressing deterioration in the efficiency of the observer's monitoring work.
  • Reference signs:
    3A, 3B  Directivity control device
    4  Recorder device
    31  Communication unit
    32  Operation unit
    33  Memory
    34, 34A  Signal processing unit
    34a  Directivity direction calculation unit
    34b  Output control unit
    34c  Tracking processing unit
    34d  Sound source detection unit
    35  Display device
    36  Speaker device
    37  Image processing unit
    38  Operation switching control unit
    100, 100A, 100B  Directivity control system
    C1, Cn  Camera device
    C1RN, C2RN  Imaging area
    JC1, JM1  Switching determination line
    JDL  Scroll determination line
    LN1, LN2, LNR, LNW  Tracking line
    LST  Tracking list
    NW  Network
    M1, Mm  Omnidirectional microphone array device
    MR1, MR2, MR2W, MR2R, MR3  Point marker
    TP1, TP2  Tracking point
    TRW  Tracking screen

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Studio Devices (AREA)

Abstract

A directivity control apparatus controls the directivity of sounds collected by a first sound collection unit including a plurality of microphones. A directivity formation unit forms the directivity of the sounds in a direction extending from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image displayed on a display unit. An information acquisition unit acquires information related to a second designated position on the image displayed on the display unit, said second designated position being designated in accordance with a movement of the monitoring target. The directivity formation unit changes, by use of the acquired information related to the second designated position, the directivity of the sounds to a direction extending toward the monitoring target corresponding to the second designated position.

Description

Directivity control device, directivity control method, storage medium, and directivity control system

The present invention relates to a directivity control device, a directivity control method, a storage medium, and a directivity control system that control the directivity of sound.
Conventionally, in a surveillance system installed at a predetermined position (for example, a ceiling surface) in a factory, a store (for example, a retail store or a bank), or a public place (for example, a library), one or more camera devices (for example, PTZ (Pan Tilt Zoom) camera devices or omnidirectional camera devices) are connected via a network so as to widen the angle of view of the image data (including still images and moving images; the same applies hereinafter) of the video of the monitoring target range.

Since the amount of information obtained by monitoring using video alone is limited, there is a strong demand for a monitoring system that, in addition to one or more camera devices, uses a microphone array device housing a plurality of microphones to obtain the voice data emitted by a specific monitoring target (for example, a person) present within the angle of view of a camera device. In such a monitoring system, it is also considered necessary to take into account that the person may move while the microphone array device is collecting sound.

Here, as a prior art that simplifies the user's input operation by drawing trajectory points designated from the start point to the end point of a movement on a monitor television screen displaying an image captured by a television camera, for example, the camera platform control device for a television camera disclosed in Patent Document 1 has been proposed.

The camera platform control device for a television camera disclosed in Patent Document 1 displays, on a monitor television, an image captured by a television camera mounted on a camera platform provided with pan and tilt driving means; trajectory points from the movement start point to the end point for automatic shooting are input on the screen of the monitor television; the sequentially input trajectory points are connected one after another to obtain a continuous trajectory line; and the trajectory data from the movement start point to the end point of the trajectory line is read out sequentially so that automatic shooting is executed with the data read-out point positioned at the center of the shooting screen. Thereby, the camera platform control device for a television camera can obtain pan and tilt drive trajectory data with the simple input operation of entering trajectory points on the screen of the monitor television, and can perform accurate drive control.
Patent Document 1: Japanese Unexamined Patent Publication No. 06-133189
However, Patent Document 1 does not disclose a configuration for picking up the voice emitted by a person shown on the monitor television; even if the configuration of Patent Document 1 is applied to the above-described monitoring system, there is the problem that it is difficult to pick up with high accuracy the voice of a person on the trajectory from the movement start point to the end point.

In order to solve the above-described conventional problems, an object of the present invention is to provide a directivity control device, a directivity control method, a storage medium, and a directivity control system that, even if the monitoring target on an image moves, properly form the directivity of sound toward the monitoring target so as to follow the movement, thereby suppressing deterioration in the efficiency of the observer's monitoring work.
The present invention is a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the device including: a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; and an information acquisition unit that acquires information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target, wherein the directivity forming unit switches, using the information on the second designated position acquired by the information acquisition unit, the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position.

The present invention is also a directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the method including: a step of forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching, using the acquired information on the second designated position, the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position.

The present invention is also a storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the program executing: a step of forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching, using the acquired information on the second designated position, the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position.

Furthermore, the present invention is a directivity control system including: an imaging unit that images a sound collection area; a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and a directivity control device that controls the directivity of the sound collected by the first sound collection unit, wherein the directivity control device includes a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit, and an information acquisition unit that acquires information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target, and the directivity forming unit switches, using the information on the second designated position acquired by the information acquisition unit, the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position.

According to the present invention, even if the monitoring target on the image moves, the directivity of the sound toward the monitoring target can be properly formed so as to follow the movement, and deterioration in the efficiency of the observer's monitoring work can be suppressed.
FIG. 1 is an explanatory diagram showing an outline of the operation of the directivity control system of the first embodiment.
FIG. 2 is a block diagram showing a first system configuration example of the directivity control system of the first embodiment.
FIG. 3 is a block diagram showing a second system configuration example of the directivity control system of the first embodiment.
FIG. 4 is an explanatory diagram showing an operation example of manual tracking processing.
FIG. 5 is an explanatory diagram showing an operation example of changing a tracking point by manual tracking processing when a tracking point automatically designated in automatic tracking processing is wrong.
FIG. 6 is an explanatory diagram showing slow playback processing in the recording playback mode and the slow playback mode.
FIG. 7 is an explanatory diagram showing enlarged display processing in the enlarged display mode.
FIG. 8(A) is an explanatory diagram showing automatic scroll processing after enlarged display processing in the enlarged display mode, FIG. 8(B) shows the tracking screen at time t = t1, and FIG. 8(C) shows the tracking screen at time t = t2.
FIG. 9(A) is a flowchart explaining a first example of the overall flow of manual tracking processing in the directivity control system of the first embodiment, and FIG. 9(B) is a flowchart explaining a second example of that overall flow.
FIG. 10(A) is a flowchart explaining a first example of the overall flow of automatic tracking processing in the directivity control system of the first embodiment, and FIG. 10(B) is a flowchart explaining a first example of the automatic tracking processing shown in FIG. 10(A).
FIG. 11(A) is a flowchart explaining a second example of the automatic tracking processing shown in FIG. 10(A), and FIG. 11(B) is a flowchart explaining an example of the tracking correction processing shown in FIG. 11(A).
FIG. 12 is a flowchart explaining a third example of the automatic tracking processing shown in FIG. 10(A).
FIG. 13(A) is a flowchart explaining an example of the tracking auxiliary processing shown in FIG. 9(A), and FIG. 13(B) is a flowchart explaining an example of the automatic scroll processing shown in FIG. 13(A).
FIG. 14(A) is a flowchart showing an example of the automatic scroll necessity determination processing shown in FIG. 13(B), and FIG. 14(B) is an explanatory diagram of the scroll determination line used in that processing.
FIG. 15(A) is a flowchart explaining an example of the tracking connection processing shown in FIG. 9(A), and FIG. 15(B) is a flowchart explaining an example of the batch connection processing shown in FIG. 15(A).
FIG. 16(A) is an explanatory diagram of the reproduction start time PT of collected sound corresponding to the user's designated position on the flow line between tracking points displayed for one movement of a person, and FIG. 16(B) shows a first example of the tracking list.
FIG. 17(A) is an explanatory diagram of the reproduction start time PT of collected sound corresponding to the user's designated position on the flow line between different tracking points based on multiple simultaneous designation, and FIG. 17(B) shows a second example of the tracking list.
FIG. 18(A) is an explanatory diagram of the reproduction start times PT and PT' of collected sound corresponding to the user's designated positions on the flow line between different tracking points based on multiple designations, and FIG. 18(B) shows a third example of the tracking list.
FIG. 19(A) is a flowchart explaining an example of the overall flow of flow line display reproduction processing using the tracking list in the directivity control system of the first embodiment, and FIG. 19(B) is a flowchart explaining an example of the reproduction start time calculation processing shown in FIG. 19(A).
FIG. 20 is a flowchart explaining an example of the flow line display processing shown in FIG. 19(A).
FIG. 21(A) is a flowchart explaining an example of the audio output processing shown in FIG. 9(A), and FIG. 21(B) is a flowchart explaining an example of the image privacy protection processing shown in FIG. 13(A).
FIG. 22(A) shows an example of the waveform of an audio signal corresponding to the pitch before voice change processing, FIG. 22(B) shows an example of the waveform of an audio signal corresponding to the pitch after voice change processing, and FIG. 22(C) is an explanatory diagram of processing that blurs the inside of the detected outline of a person's face.
FIG. 23 is a block diagram showing a system configuration example of the directivity control system of the second embodiment.
FIG. 24 is an explanatory diagram showing automatic switching processing of the camera device used for capturing the image displayed on the display device.
FIG. 25 is an explanatory diagram showing automatic switching processing of the omnidirectional microphone array device used for collecting the sound of the monitoring target.
FIG. 26 is an explanatory diagram showing manual switching processing of the camera device used for capturing the image displayed on the display device.
FIG. 27 is an explanatory diagram showing manual switching processing of the omnidirectional microphone array device used for collecting the sound of the monitoring target.
FIG. 28 is an explanatory diagram showing selection processing of the optimum omnidirectional microphone array device used for collecting the sound of the monitoring target.
FIG. 29(A) is a flowchart explaining an example of the automatic switching processing of the camera device in the directivity control system of the second embodiment, and FIG. 29(B) is a flowchart showing an example of the camera switching determination processing shown in FIG. 29(A).
FIG. 30(A) is a flowchart explaining an example of the automatic switching processing of the omnidirectional microphone array device in the directivity control system of the second embodiment, and FIG. 30(B) is a flowchart showing an example of the microphone switching determination processing shown in FIG. 30(A).
FIG. 31(A) is a flowchart explaining an example of the manual switching processing of the camera device in the directivity control system of the second embodiment, and FIG. 31(B) is a flowchart explaining an example of the manual switching processing of the omnidirectional microphone array device in the directivity control system of the second embodiment.
FIG. 32(A) is a flowchart explaining a first example of the selection processing of the optimum omnidirectional microphone array device in the directivity control system of the second embodiment, and FIG. 32(B) is a flowchart explaining a second example of that selection processing.
FIG. 33 is a flowchart explaining a third example of the selection processing of the optimum omnidirectional microphone array device in the directivity control system of the second embodiment.
FIG. 34 is a flowchart explaining an example of the overall flow of manual tracking processing based on multiple simultaneous designation in the directivity control system of a modification of the first embodiment.
FIG. 35 is a flowchart explaining an example of automatic tracking processing of a plurality of monitoring targets in the directivity control system of a modification of the first embodiment.
FIGS. 36(A) to (E) are external views of housings of omnidirectional microphone array devices.
FIG. 37 is a simple explanatory diagram of the delay-and-sum method by which the omnidirectional microphone array device forms the directivity of audio data in the direction of an angle θ.
Hereinafter, embodiments of the directivity control device, directivity control method, storage medium, and directivity control system according to the present invention will be described with reference to the drawings. The directivity control system of each embodiment is used, for example, as a monitoring system (including manned monitoring systems and unmanned monitoring systems) installed in a factory, a public facility (for example, a library or an event venue), or a store (for example, a retail store or a bank).

The present invention can also be expressed as a program for causing a directivity control device, which is a computer, to execute the operations defined by the directivity control method, or as a computer-readable recording medium on which a program for causing a computer to execute the operations defined by the directivity control method is recorded.
(First Embodiment)
FIG. 1 is an explanatory diagram showing an outline of the operation of the directivity control systems 100 and 100A of the first embodiment. FIG. 2 is a block diagram showing a first system configuration example of the directivity control system 100 of the first embodiment. FIG. 3 is a block diagram showing a second system configuration example of the directivity control system 100A of the first embodiment.
The specific configurations of the directivity control systems 100 and 100A will be described later; first, an outline of the operation of the directivity control systems 100 and 100A will be briefly described with reference to FIG. 1.

In FIG. 1, the camera device C1 images the monitoring target (for example, a person HM1) of the directivity control system 100 or 100A used, for example, as a monitoring system, and transmits the image data obtained by the imaging to the directivity control device 3 connected via the network NW.

In each embodiment including the present one, the person HM1 may be either stationary or moving, but is described here as moving. The person HM1 moves, for example, from the tracking position A1 (x1, y1, z0) at the tracking time t1 to the tracking position A2 (x2, y2, z0) by the tracking time t2.
Here, a tracking point is the position at which the user designates the person HM1 on the tracking screen TRW (that is, a position on the tracking screen TRW) when an image of the moving person HM1 captured by the camera device C1 is displayed on the tracking screen TRW of the display device 35. Tracking position and tracking time data are associated with each tracking point (see, for example, FIG. 16(B) described later). The tracking position is a three-dimensional coordinate indicating the position in real space corresponding to the position on the tracking screen TRW at which the person HM1 was designated.
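A tracking point can thus be thought of as a small record pairing a screen position with the corresponding real-space tracking position and tracking time, collected into a tracking list. The following sketch shows one possible shape of such records; the field names are assumptions, since the patent specifies only their contents.

    # Sketch: one possible record for a tracking point and the tracking list
    # LST built from such records. Field names are illustrative.
    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class TrackingPoint:
        screen_pos: Tuple[int, int]                # designated position on TRW
        tracking_pos: Tuple[float, float, float]   # real-space 3D coordinates
        tracking_time: float                       # time of the designation

    @dataclass
    class TrackingList:
        points: List[TrackingPoint] = field(default_factory=list)

        def add(self, point: TrackingPoint) -> None:
            self.points.append(point)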
The tracking screen TRW is, among the screens on which an image captured by a camera device (for example, the camera device C1) is displayed on the display device 35 (hereinafter referred to as "camera screens"), a screen on which, for example, the person HM1 is shown as a monitoring target subject to the voice tracking processing described later. In each of the following embodiments, a screen on which no person HM1 or the like is shown as a monitoring target is referred to as a camera screen, and a screen on which a monitoring target is shown is referred to as a tracking screen; unless otherwise stated, the camera screen and the tracking screen are distinguished in the description.

In FIG. 1, for simplicity of description, it is assumed that the same person HM1 moves, so the z coordinates of the tracking positions at the tracking points TP1 and TP2 are the same. Even after the person HM1 moves from the tracking position A1 to the tracking position A2, the person is imaged by the camera device C1; the camera device C1 may continue imaging the person HM1 following the movement, or may stop imaging.

The omnidirectional microphone array device M1 picks up the sound emitted by the person HM1 and transmits the collected sound data to the directivity control device 3 connected via the network NW.

When the person HM1 as the monitoring target is stationary at the tracking position A1, the directivity control device 3 forms the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the tracking position A1. When the person HM1 moves from the tracking position A1 to the tracking position A2, the directivity control device 3 switches the directivity of the collected sound to the direction from the omnidirectional microphone array device M1 toward the tracking position A2.
In other words, as the person HM1 as the monitoring target moves from the tracking position A1 to the tracking position A2, the directivity control device 3 controls the directivity of the collected sound so that it follows the movement, from the direction from the omnidirectional microphone array device M1 toward the tracking position A1 to the direction from the omnidirectional microphone array device M1 toward the tracking position A2; that is, it performs voice tracking processing.
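Conceptually, this voice tracking processing reduces to re-forming the directivity whenever a newly designated position arrives, as in the following sketch; steer() stands in for the directivity forming processing described later, and all names are placeholders.

    # Sketch of the voice tracking loop: whenever the designated position of
    # the monitoring target changes, switch the directivity toward it.
    def voice_tracking(position_updates, steer):
        """position_updates: iterable of (time, (x, y, z)) designated positions."""
        current = None
        for _t, pos in position_updates:
            if pos != current:   # the target moved, e.g. from A1 to A2
                steer(pos)       # re-form directivity toward the new position
                current = pos

    # Example: follow a person moving from A1 to A2.
    updates = [(1.0, (1.0, 2.0, 0.0)), (2.0, (3.0, 2.0, 0.0))]
    voice_tracking(updates, steer=lambda p: print("steering toward", p))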
The directivity control system 100 shown in FIG. 2 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3, and the recorder device 4. n and m are integers of 1 or more and may be equal or different; the same applies to the following embodiments.

The camera devices C1, ..., Cn, the omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3, and the recorder device 4 are connected to one another via the network NW. The network NW may be a wired network (for example, an intranet or the Internet) or a wireless network (for example, a wireless LAN (Local Area Network), WiMAX (registered trademark), or a wireless WAN (Wide Area Network)). In the following description of this embodiment, for simplicity, the configuration is described as having one camera device C1 and one omnidirectional microphone array device M1.

Hereinafter, each device constituting the directivity control system 100 will be described. In each embodiment including the present one, the housing of the camera device C1 and the housing of the omnidirectional microphone array device M1 are attached separately at different positions, but they may be attached integrally at the same position.
The camera device C1, as an example of an imaging unit, is fixedly installed, for example, on the ceiling surface of an event venue, functions as a surveillance camera in the monitoring system, and captures video within a predetermined angle of view of the camera device C1 in a predetermined sound collection area (for example, a predetermined region in the event venue) by remote operation from a monitoring control room (not shown) connected to the network NW. The camera device C1 may be a camera having a PTZ function or a camera capable of imaging in all directions. When the camera device C1 is a camera capable of omnidirectional imaging, it transmits image data representing omnidirectional video of the sound collection area (that is, omnidirectional image data), or planar image data generated by applying predetermined distortion correction to the omnidirectional image data and performing panorama conversion, to the directivity control device 3 or the recorder device 4 via the network NW.

When an arbitrary position is designated with the cursor CSR or the user's finger FG in the image data displayed on the display device 35, the camera device C1 receives the coordinate data of the designated position in the image data from the directivity control device 3, calculates the distance and direction (including horizontal and vertical angles; the same applies hereinafter) from the camera device C1 to the sound position in real space corresponding to the designated position (hereinafter simply abbreviated as "sound position"), and transmits the data to the directivity control device 3. Since the distance and direction data calculation processing in the camera device C1 is a known technique, its description is omitted.
The omnidirectional microphone array device M1, as an example of a sound collection unit, is fixedly installed, for example, on the ceiling surface of an event venue, and includes at least a microphone section in which a plurality of microphone units 22, 23 (see FIGS. 36(A) to (E)) are provided at even intervals, and a CPU (Central Processing Unit) that controls the operation of each microphone unit 22, 23 of the microphone section.

When the power is turned on, the omnidirectional microphone array device M1 applies predetermined audio signal processing (for example, amplification, filtering, and addition) to the audio data of the sound collected by the microphone elements in the microphone units, and transmits the audio data obtained by this predetermined audio signal processing to the directivity control device 3 or the recorder device 4 via the network NW.
Here, the appearance of the housing of the omnidirectional microphone array device M1 will be described with reference to FIGS. 36(A) to (E). FIGS. 36(A) to (E) are external views of housings of omnidirectional microphone array devices. The omnidirectional microphone array devices M1C, M1A, M1B, M1, and M1D shown in FIGS. 36(A) to (E) differ in appearance and in the arrangement positions of the plurality of microphone units, but their functions are equivalent.

The omnidirectional microphone array device M1C shown in FIG. 36(A) has a disk-shaped housing 21. In the housing 21, a plurality of microphone units 22 and 23 are arranged concentrically. Specifically, the plurality of microphone units 22 are arranged concentrically with the same center as the housing 21 along its circumference, and the plurality of microphone units 23 are arranged concentrically with the same center as the housing 21 on its inner side. Each microphone unit 22 has a wide spacing from its neighbors and a large diameter, giving characteristics suited to a low sound range; each microphone unit 23 has a narrow spacing and a small diameter, giving characteristics suited to a high sound range.

The omnidirectional microphone array device M1A shown in FIG. 36(B) has a disk-shaped housing 21. In the housing 21, a plurality of microphone units 22 are arranged in a cross shape at even intervals along two directions, vertical and horizontal, with the vertical and horizontal arrays intersecting at the center of the housing 21. Since the plurality of microphone units 22 are arranged linearly in these two directions, the omnidirectional microphone array device M1A can reduce the amount of computation required to form the directivity of the audio data. In the omnidirectional microphone array device M1A shown in FIG. 36(B), the plurality of microphone units 22 may be arranged in only one row, vertical or horizontal.

The omnidirectional microphone array device M1B shown in FIG. 36(C) has a disk-shaped housing 21B with a smaller diameter than the omnidirectional microphone array device M1C shown in FIG. 36(A). In the housing 21B, a plurality of microphone units 22 are arranged at even intervals along the circumference of the housing 21B. Because the spacing between the microphone units 22 is short, the omnidirectional microphone array device M1B shown in FIG. 36(C) has characteristics suited to a high sound range.

The omnidirectional microphone array device M1 shown in FIG. 36(D) has a donut-shaped or ring-shaped housing 21C in which an opening 21a having a predetermined diameter is formed at the center of the housing 21C. In the directivity control systems 100 and 100A of this embodiment, for example, the omnidirectional microphone array device M1 shown in FIG. 36(D) is used. In the housing 21C, a plurality of microphone units 22 are arranged concentrically at even intervals along the circumferential direction of the housing 21C.

The omnidirectional microphone array device M1D shown in FIG. 36(E) has a rectangular housing 21D. In the housing 21D, a plurality of microphone units 22 are arranged at even intervals along the outer periphery of the housing 21D. In the omnidirectional microphone array device M1D shown in FIG. 36(E), since the housing 21D is rectangular, installation of the omnidirectional microphone array device M1D can be simplified even at, for example, a corner or on a wall surface.

Each of the microphone units 22 and 23 of the omnidirectional microphone array device M1 may be an omnidirectional microphone, a bidirectional microphone, a unidirectional microphone, a sharp directional microphone, a super-directional microphone (for example, a shotgun microphone), or a combination of these.
The directivity control devices 3 and 3A may each be, for example, a stationary PC (Personal Computer) installed in a monitoring control room (not shown), or a data communication terminal that the user can carry, such as a mobile phone, a PDA (Personal Digital Assistant), a tablet terminal, or a smartphone.

The directivity control device 3 includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34, a display device 35, and a speaker device 36. The signal processing unit 34 includes at least a directivity direction calculation unit 34a, an output control unit 34b, and a tracking processing unit 34c.

The communication unit 31 receives the image data transmitted from the camera device C1 or the audio data transmitted from the omnidirectional microphone array device M1 and outputs it to the signal processing unit 34.

The operation unit 32 is a user interface (UI) for notifying the signal processing unit 34 of the user's input operations, and is, for example, a pointing device such as a mouse, or a keyboard. The operation unit 32 may also be configured using a touch panel arranged, for example, in correspondence with the display screen of the display device 35 and capable of detecting an input operation with the user's finger FG or a stylus pen.

The operation unit 32 outputs, to the signal processing unit 34, the coordinate data of the position designated with the cursor CSR by the user's mouse operation or with the user's finger FG in the image data displayed on the display device 35 (that is, the image data captured by the camera device C1).
The memory 33 is configured using, for example, a RAM (Random Access Memory) and functions as a work memory during the operation of each unit of the directivity control device 3. The memory 33, as an example of an image storage unit or an audio storage unit, is configured using, for example, a hard disk or a flash memory, and stores the image data or audio data stored in the recorder device 4, that is, the image data captured by the camera device C1 or the audio data collected by the omnidirectional microphone array device M1 over a certain period.

The memory 33, as an example of a designation list storage unit, also stores the data of the tracking list LST (see, for example, FIG. 16(B)) as an example of a designation list including the data of all designated positions and designated times (described later) on the tracking screen TRW of the image data displayed on the display device 35.

The signal processing unit 34 is configured using, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a DSP (Digital Signal Processor), and performs control processing for the overall supervision of the operation of each unit of the directivity control device 3, data input/output processing with the other units, data computation (calculation) processing, and data storage processing.
When calculating the directivity direction coordinates (θMAh, θMAv), the directivity direction calculation unit 34a acquires from the operation unit 32 the coordinate data of the position designated in the image data with the cursor CSR by the user's mouse operation or with the user's finger FG, and causes the communication unit 31 to transmit the coordinate data to the camera device C1. The directivity direction calculation unit 34a then acquires, from the communication unit 31, the data of the distance and direction from the installation position of the camera device C1 to the sound (sound source) position in real space corresponding to the designated position in the image data.

Using the data of the distance and direction from the installation position of the camera device C1 to the sound position, the directivity direction calculation unit 34a calculates the directivity direction coordinates (θMAh, θMAv) from the installation position of the omnidirectional microphone array device M1 toward the sound position.
 また、本実施形態のように、カメラ装置C1の筐体と全方位マイクアレイ装置M1の筐体とが離れて別体として取り付けられている場合には、指向方向算出部34aは、事前に算出された所定のキャリブレーションパラメータのデータと、カメラ装置C1から音声位置(音源位置)までの方向(水平角,垂直角)のデータとを用いて、全方位マイクアレイ装置M1から音声位置(音源位置)までの指向方向座標(θMAh,θMAv)を算出する。なお、キャリブレーションとは、指向性制御装置3の指向方向算出部34aが指向方向座標(θMAh,θMAv)を算出するために必要となる所定のキャリブレーションパラメータを算出又は取得する動作であり、具体的なキャリブレーション方法及びキャリブレーションパラメータの内容は特に限定されず、例えば公知技術の範囲で実現可能である。 In addition, as in the present embodiment, when the housing of the camera device C1 and the housing of the omnidirectional microphone array device M1 are separated and attached separately, the directivity calculation unit 34a calculates in advance. The sound position (sound source position) from the omnidirectional microphone array apparatus M1 is obtained using the predetermined calibration parameter data and the data in the direction (horizontal angle, vertical angle) from the camera device C1 to the sound position (sound source position). ) Directivity direction coordinates (θ MAh , θ MAv ) are calculated. The calibration is an operation for calculating or acquiring a predetermined calibration parameter necessary for the directivity direction calculation unit 34a of the directivity control device 3 to calculate the directivity direction coordinates (θ MAh , θ MAv ). Specific contents of the calibration method and calibration parameters are not particularly limited, and can be realized, for example, within the scope of known techniques.
 また、カメラ装置C1の筐体を囲むように全方位マイクアレイ装置M1の筐体が一体的に取り付けられている場合には、カメラ装置C1から音声位置(音源位置)までの方向(水平角,垂直角)を、全方位マイクアレイ装置2から音声位置までの指向方向座標(θMAh,θMAv)として用いることができる。 Further, when the omnidirectional microphone array device M1 is integrally attached so as to surround the camera device C1, the direction from the camera device C1 to the sound position (sound source position) (horizontal angle, (Vertical angle) can be used as the directivity direction coordinates (θ MAh , θ MAv ) from the omnidirectional microphone array device 2 to the sound position.
 ここで、指向方向座標(θMAh,θMAv)のうち、θMAhは全方位マイクアレイ装置2の設置位置から音声位置に向かう指向方向の水平角を示し、θMAvは全方位マイクアレイ装置2の設置位置から音声位置に向かう指向方向の垂直角を示す。以下の説明では、説明を簡単にするために、カメラ装置C1及び全方位マイクアレイ装置M1の各水平角の基準方向(0度方向)が一致するとする。 Here, of the directivity direction coordinates (θ MAh , θ MAv ), θ MAh indicates a horizontal angle in the directivity direction from the installation position of the omnidirectional microphone array device 2 to the voice position, and θ MAv is the omnidirectional microphone array device 2. The vertical angle of the pointing direction from the installation position to the voice position is shown. In the following description, to simplify the description, it is assumed that the reference directions (0 degree directions) of the horizontal angles of the camera device C1 and the omnidirectional microphone array device M1 coincide.
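As a rough illustration of this geometry, the following sketch converts the camera-reported distance and direction into a sound position in a shared world coordinate frame and then derives (θMAh, θMAv) as seen from the array. The function names, the coordinate convention, and the use of a common world frame are assumptions made for illustration; the patent does not prescribe them.

    import math

    def sound_pos_from_camera(cam_pos, horiz_deg, vert_deg, dist):
        # Sound-source position in world coordinates, derived from the
        # camera's installation position plus the distance/direction data
        # the camera device reports for the designated image position.
        h, v = math.radians(horiz_deg), math.radians(vert_deg)
        return (cam_pos[0] + dist * math.cos(v) * math.cos(h),
                cam_pos[1] + dist * math.cos(v) * math.sin(h),
                cam_pos[2] + dist * math.sin(v))

    def directivity_angles(mic_pos, sound_pos):
        # Horizontal angle (theta_MAh) and vertical angle (theta_MAv), in
        # degrees, from the array installation position toward the sound position.
        dx = sound_pos[0] - mic_pos[0]
        dy = sound_pos[1] - mic_pos[1]
        dz = sound_pos[2] - mic_pos[2]
        theta_mah = math.degrees(math.atan2(dy, dx))
        theta_mav = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
        return theta_mah, theta_mav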
The output control unit 34b controls the operations of the display device 35 and the speaker device 36. For example, the output control unit 34b, as an example of a display control unit, causes the display device 35 to display the image data transmitted from the camera device C1 in response to an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG. The output control unit 34b, as an example of a sound output control unit, causes the speaker device 36 to output sound data in response to such an input operation when it has acquired the sound data transmitted from the omnidirectional microphone array device M1, or the sound data picked up by the omnidirectional microphone array device M1 over a certain period, from the recorder device 4.
The output control unit 34b, as an example of an image reproduction unit, causes the display device 35 to reproduce image data in response to an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG when it has acquired, from the recorder device 4, the image data captured by the camera device C1 over a certain period.
The output control unit 34b, as an example of a directivity forming unit, uses the sound data transmitted from the omnidirectional microphone array device M1 or the sound data acquired from the recorder device 4 to form the directivity (beam) of the sound picked up by the omnidirectional microphone array device M1 in the directivity direction indicated by the directivity direction coordinates (θMAh, θMAv) calculated by the directivity direction calculation unit 34a.
The directivity control device 3 can thereby relatively increase the volume level of the sound emitted by a monitoring target (for example, the person HM1) present in the directivity direction in which the directivity is formed, and can suppress sound from directions in which no directivity is formed, relatively reducing its volume level.
The tracking processing unit 34c, as an example of an information acquisition unit, acquires information on the above-described sound tracking processing. For example, when a new position is designated by an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG on the tracking screen TRW of the display device 35 on which the image data captured by the camera device C1 is displayed, the tracking processing unit 34c acquires information on the newly designated position.
Here, the information on the newly designated position includes, in addition to the coordinate information indicating the position on the image data designated on the tracking screen TRW, the time of the new designation (designated time) and either the coordinate information of the sound position (sound source position) where the monitoring target (for example, the person HM1) exists in real space corresponding to the position on the image data designated at the designated time, or the distance information from the omnidirectional microphone array device M1 to that sound position (sound source position).
The tracking processing unit 34c, as an example of a reproduction time calculation unit, uses the data of the tracking list LST stored in the memory 33 to calculate the reproduction time of the sound at a designated position on a movement line in response to an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG (described later).
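As one way of picturing the tracking list LST and the reproduction time calculation, the sketch below pairs each designated position with its designated time and linearly interpolates the reproduction time of a point lying on the movement line between two adjacent tracking points. The data layout and helper names are hypothetical; the patent does not fix them.

    from dataclasses import dataclass

    @dataclass
    class TrackingPoint:
        x: float      # designated position on the tracking screen TRW
        y: float
        time: float   # designated (tracking) time, in seconds

    def playback_time(lst, p, q, pos):
        # Reproduction time of a point `pos` on the movement line between
        # tracking points lst[p] and lst[q], by linear interpolation.
        a, b = lst[p], lst[q]
        seg = ((b.x - a.x) ** 2 + (b.y - a.y) ** 2) ** 0.5
        part = ((pos[0] - a.x) ** 2 + (pos[1] - a.y) ** 2) ** 0.5
        ratio = part / seg if seg else 0.0
        return a.time + ratio * (b.time - a.time)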
The display device 35, as an example of a display unit, is configured using, for example, an LCD (Liquid Crystal Display) or an organic EL (Electroluminescence) display, and displays the image data captured by the camera device C1 under the control of the output control unit 34b.
The speaker device 36, as an example of a sound output unit, outputs the sound data of the sound picked up by the omnidirectional microphone array device M1, or sound data in which directivity has been formed in the directivity direction indicated by the directivity direction coordinates (θMAh, θMAv). Note that the display device 35 and the speaker device 36 may be configured separately from the directivity control device 3.
The recorder device 4 stores the image data captured by the camera device C1 and the sound data of the sound picked up by the omnidirectional microphone array device M1 in association with each other.
The directivity control system 100A shown in FIG. 3 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, a directivity control device 3A, and the recorder device 4. In FIG. 3, components having the same configuration and operation as those in FIG. 2 are given the same reference numerals, their description is simplified or omitted, and only the differences are described.
The directivity control device 3A includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34A, a display device 35, a speaker device 36, and an image processing unit 37. The signal processing unit 34A includes at least a directivity direction calculation unit 34a, an output control unit 34b, a tracking processing unit 34c, and a sound source detection unit 34d.
The sound source detection unit 34d detects, from the image data displayed on the display device 35, the sound position (sound source position) in real space corresponding to the sound uttered by the person HM1 as the monitoring target. For example, the sound source detection unit 34d divides the sound pickup area of the omnidirectional microphone array device M1 into a plurality of grid-like areas and measures the strength or volume level of the sound obtained when directivity is formed from the omnidirectional microphone array device M1 toward the center position of each grid-like area. The sound source detection unit 34d estimates that the sound source exists in the grid-like area with the highest sound strength or volume level among all the grid-like areas. The detection result of the sound source detection unit 34d includes, for example, distance information from the omnidirectional microphone array device M1 to the center position of the grid-like area with the highest sound strength or volume level.
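A minimal sketch of this grid scan follows. It assumes a helper beam_level(center) that returns the strength or volume level of the sound when directivity is formed toward a given grid-area center; that helper and the names are illustrative, not part of the patent.

    def locate_sound_source(grid_centers, beam_level):
        # Steer the array at each grid-area center and return the center
        # whose beamformed output is strongest, i.e., the estimated source.
        best_center, best_level = None, float("-inf")
        for center in grid_centers:      # e.g., (x, y, z) of each grid-like area
            level = beam_level(center)   # level with directivity formed there
            if level > best_level:
                best_center, best_level = center, level
        return best_center, best_level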
In response to an instruction from the signal processing unit 34A, the image processing unit 37 performs predetermined image processing on the image data displayed on the display device 35 (for example, VMD (Video Motion Detector) processing for detecting the motion of the person HM1, detection processing of a person's face and face orientation, and person detection processing), and outputs the image processing result to the signal processing unit 34A.
The image processing unit 37 also detects the outline DTL of the face of the monitoring target (for example, the person HM1) displayed on the display device 35 in response to an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG, and applies masking processing to the face. Specifically, the image processing unit 37 calculates a rectangular region encompassing the detected face outline DTL and performs processing that applies a predetermined blur within the rectangular region (see FIG. 22(C)). FIG. 22(C) is an explanatory diagram of the processing that blurs the area within the detected outline DTL of the person's face. The image processing unit 37 outputs the image data generated by the blurring processing to the signal processing unit 34A.
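As a rough illustration of such face masking, the sketch below uses OpenCV's stock frontal-face cascade to find a rectangular face region and blurs it. OpenCV is merely one possible implementation chosen for illustration; the patent does not mandate any particular detector or blur.

    import cv2

    def mask_faces(frame):
        # Blur a rectangular region around each detected face, roughly
        # corresponding to the region encompassing the outline DTL.
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
            roi = frame[y:y + h, x:x + w]
            frame[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (51, 51), 0)
        return frame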
FIG. 37 is a simple explanatory diagram of the delay-and-sum method in which the omnidirectional microphone array device M1 forms the directivity of sound data in the direction of an angle θ. For ease of explanation, the microphone elements 221 to 22n are assumed to be arranged on a straight line. In this case, the directivity is a two-dimensional region in a plane; to form directivity in three-dimensional space, the microphones may be arranged in a two-dimensional array and the same processing method applied.
A sound wave emitted from a sound source 80 is incident on each of the microphone elements 221, 222, 223, ..., 22(n-1), 22n built into the microphone units 22 and 23 of the omnidirectional microphone array device M1 at a certain angle (incident angle = (90 - θ) [degrees]).
The sound source 80 is, for example, a monitoring target (for example, the person HM1) present in the directivity direction of the omnidirectional microphone array device M1, and lies in the direction of a predetermined angle θ with respect to the surface of the housing 21 of the omnidirectional microphone array device M1. The spacing d between the microphone elements 221, 222, 223, ..., 22(n-1), 22n is constant.
The sound wave emitted from the sound source 80 first reaches and is picked up by the microphone element 221, then reaches and is picked up by the microphone element 222, is picked up successively in the same way, and finally reaches and is picked up by the microphone element 22n.
Note that when the sound source 80 is, for example, the voice emitted by a monitoring target (for example, the person HM1), the direction from the position of each microphone element 221, 222, 223, ..., 22(n-1), 22n of the omnidirectional microphone array device M1 toward the sound source 80 is the same as the direction from each microphone (microphone element) of the omnidirectional microphone array device M1 toward the sound position (sound source position) corresponding to the position designated by the user on the display device 35.
Here, arrival time differences τ1, τ2, τ3, ..., τ(n-1) arise between the times at which the sound wave reaches the microphone elements 221, 222, 223, ..., 22(n-1) and the time at which it reaches the last microphone element 22n. If the sound data picked up by the individual microphone elements 221, 222, 223, ..., 22(n-1), 22n were simply added as-is, they would be added out of phase, and the overall volume level of the sound wave would be weakened.
Note that τ1 is the time difference between the time at which the sound wave reaches the microphone element 221 and the time at which it reaches the microphone element 22n, τ2 is the time difference between the time at which the sound wave reaches the microphone element 222 and the time at which it reaches the microphone element 22n, and likewise τ(n-1) is the time difference between the time at which the sound wave reaches the microphone element 22(n-1) and the time at which it reaches the microphone element 22n.
In the present embodiment, the omnidirectional microphone array device M1 includes A/D converters 241, 242, 243, ..., 24(n-1), 24n and delay units 251, 252, 253, ..., 25(n-1), 25n provided for the respective microphone elements 221, 222, 223, ..., 22(n-1), 22n, and an adder 26 (see FIG. 37).
That is, the omnidirectional microphone array device M1 A/D-converts the analog sound data picked up by the microphone elements 221, 222, 223, ..., 22(n-1), 22n into digital sound data in the A/D converters 241, 242, 243, ..., 24(n-1), 24n.
Further, in the delay units 251, 252, 253, ..., 25(n-1), 25n, the omnidirectional microphone array device M1 applies delay times corresponding to the arrival time differences at the respective microphone elements 221, 222, 223, ..., 22(n-1), 22n to align the phases of all the sound waves, and then adds the delayed sound data in the adder 26. The omnidirectional microphone array device M1 can thereby form the directivity of the sound data in the direction of the predetermined angle θ with respect to the microphone elements 221, 222, 223, ..., 22(n-1), 22n.
For example, in FIG. 37, the delay times D1, D2, D3, ..., D(n-1), Dn set in the delay units 251, 252, 253, ..., 25(n-1), 25n correspond to the arrival time differences τ1, τ2, τ3, ..., τ(n-1), respectively, and are given by Equation (1).
D1 = L1/Vs,  D2 = L2/Vs,  D3 = L3/Vs,  ...,  D(n-1) = L(n-1)/Vs,  Dn = 0    (1)
Here, L1 is the difference in the sound wave travel distance between the microphone element 221 and the microphone element 22n. L2 is the difference in the sound wave travel distance between the microphone element 222 and the microphone element 22n. L3 is the difference in the sound wave travel distance between the microphone element 223 and the microphone element 22n, and likewise L(n-1) is the difference in the sound wave travel distance between the microphone element 22(n-1) and the microphone element 22n. Vs is the speed of the sound wave (speed of sound). L1, L2, L3, ..., L(n-1), and Vs are known values. In FIG. 37, the delay time Dn set in the delay unit 25n is 0 (zero).
In this way, by changing the delay times D1, D2, D3, ..., D(n-1), Dn set in the delay units 251, 252, 253, ..., 25(n-1), 25n, the omnidirectional microphone array device M1 can easily form the directivity of the sound data of the sound picked up by the respective microphone elements 221, 222, 223, ..., 22(n-1), 22n built into the microphone units 22 and 23.
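A minimal sketch of this delay-and-sum processing for a linear array follows, with element spacing d, steering angle θ, and sampling rate fs as assumed parameters. Delays are rounded to whole samples, and sample wrap-around from the shift is ignored for simplicity; this is an illustration of the technique, not the device's implementation.

    import numpy as np

    SOUND_SPEED = 343.0  # Vs [m/s], approximate speed of sound in air

    def delay_and_sum(signals, d, theta_deg, fs):
        # signals: (n_mics, n_samples) array; row k holds the digital sound
        # data picked up by microphone element k (k = 0 is reached first).
        # Dk = Lk / Vs, with Lk = (n - 1 - k) * d * cos(theta) being the
        # travel-distance difference between element k and the last element.
        n_mics, n_samples = signals.shape
        cos_t = np.cos(np.radians(theta_deg))
        out = np.zeros(n_samples)
        for k in range(n_mics):
            L_k = (n_mics - 1 - k) * d * cos_t
            delay = int(round(L_k / SOUND_SPEED * fs))  # Dk in samples; Dn = 0
            out += np.roll(signals[k], delay)           # shift row k by Dk, then add
        return out / n_mics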
Note that the description of the directivity forming processing shown in FIG. 37 assumes, for simplicity, that the omnidirectional microphone array device M1 performs it, but the same processing is equally applicable to the other omnidirectional microphone array devices (for example, the omnidirectional microphone array device Mm). However, if the output control unit 34b of the signal processing unit 34 or 34A of the directivity control device 3 or 3A is configured with the same number of A/D converters 241 to 24n and delay units 251 to 25n as the number of microphones of the omnidirectional microphone array device M1 and one adder 26, the output control unit 34b of the signal processing unit 34 or 34A of the directivity control device 3 or 3A may perform the directivity forming processing shown in FIG. 37 using the sound data of the sound picked up by each microphone element of the omnidirectional microphone array device M1.
(Description of various modes and methods)
Here, the various modes and methods common to each of the embodiments, including the present embodiment, will be described in detail.
In each embodiment, including the present embodiment, the following modes and methods exist; each is briefly described below.
(1) Recording/playback mode: on/off
(2) Tracking mode: on/off
(3) Tracking processing method: manual/automatic
(4) Number of tracking targets: single/multiple
(5) Manual designation method: click operation/drag operation
(6) Slow playback mode: on/off
(7) Enlarged display mode: on/off
(8) Sound privacy protection mode: on/off
(9) Image privacy protection mode: on/off
(10) Connection mode: each time/batch
(11) Correction mode: on/off
(12) Multiple-camera switching method: automatic/manual
(13) Multiple-microphone switching method: automatic/manual
(14) Tracking point upper-limit setting mode: on/off
(1) The recording/playback mode is used, for example, when a user (for example, an observer; the same applies hereinafter) plays back, at some point after imaging, the image data of video captured by the camera device C1 over a certain period in order to check its contents. When the recording/playback mode is off, the image data of the video being captured in real time by the camera device C1 is displayed on the display device 35.
(2) The tracking mode is used when performing follow-up control of the directivity of the sound picked up by the omnidirectional microphone array device M1 (sound tracking processing) as the monitoring target (for example, the person HM1) moves.
(3) The tracking processing method is the method of setting the position of the monitoring target (for example, a designated position on the tracking screen TRW of the display device 35, or a position in real space) when performing follow-up control of the directivity of the sound picked up by the omnidirectional microphone array device M1 (sound tracking processing) as the monitoring target (for example, the person HM1) moves, and is divided into manual tracking processing and automatic tracking processing. The details of each are described later.
(4) The number of tracking targets indicates the number of monitoring targets subject to follow-up control of the directivity of the sound picked up by the omnidirectional microphone array device M1 (sound tracking processing); for persons, for example, it is one person or multiple persons.
(5) The manual designation method is the method by which the user designates a tracking point on the tracking screen TRW in manual tracking processing (described later); for example, a click operation or drag operation of the cursor CSR by mouse operation, or a touch operation or touch-slide operation with the user's finger FG.
(6) The slow playback mode is used when, on the premise that the recording/playback mode is on, the image data reproduced on the display device 35 is played back at a speed value smaller than the initial value (for example, the normal value).
(7) The enlarged display mode is used when the monitoring target (for example, the person HM1) displayed on the tracking screen TRW of the display device 35 is displayed enlarged.
(8) The sound privacy protection mode is used when, as the sound data picked up by the omnidirectional microphone array device M1 is output from the speaker device 36, sound processing (for example, voice-change processing) is applied to make it difficult to identify whose voice is being output.
(9) The image privacy protection mode is used when, with the enlarged display mode on, image processing is applied to make it difficult to identify who the monitoring target (for example, the person HM1) displayed on the tracking screen TRW of the display device 35 is.
(10) The connection mode is used when connecting designated positions (see, for example, the point marker MR1 described later) designated on the tracking screen TRW by manual or automatic designation during the movement of the monitoring target. If the connection mode is "each time", adjacent point markers are connected each time a designated position is designated during the movement of the monitoring target. If the connection mode is "batch", the point markers corresponding to all the designated positions obtained during the movement of the monitoring target are connected to their adjacent point markers at once.
(11) The correction mode is used when switching from automatic tracking processing to manual tracking processing, for example when a designated position automatically designated in automatic tracking processing deviates from the movement path of the monitoring target.
(12) The multiple-camera switching method is used when switching, among the multiple camera devices C1 to Cn, the camera device used to capture the image of the monitoring target. Details of the multiple-camera switching method are described in the second embodiment.
(13) The multiple-microphone switching method is used when switching, among the multiple omnidirectional microphone array devices M1 to Mm, the omnidirectional microphone array device used to pick up the sound emitted by the monitoring target. Details of the multiple-microphone switching method are described in the second embodiment.
(14) The tracking point upper-limit setting mode is used when an upper limit on the number of tracking points is set. For example, when the tracking point upper-limit setting mode is on and the number of tracking points reaches the upper limit, the tracking processing unit 34c may reset (erase) all tracking points, or may display on the tracking screen TRW that the number of tracking points has reached the upper limit. Multiple rounds of sound tracking processing can also be executed as long as the number of tracking points has not reached the upper limit.
Note that the various modes and methods (1) to (14) described above are specified, for example, by a click operation of the cursor CSR via the user's mouse operation or a touch operation with the user's finger FG on a predetermined setting button or setting menu in an application for the monitoring system (not shown), or on a setting button or setting menu displayed on the tracking screen TRW.
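For illustration only, the modes and methods (1) to (14) could be held together in a single settings object such as the following; the field names and default values are assumptions, since the patent does not define a data structure.

    from dataclasses import dataclass

    @dataclass
    class TrackingSettings:
        # On/off modes: (1), (2), (6)-(9), (11), (14)
        record_playback: bool = False
        tracking: bool = False
        slow_playback: bool = False
        enlarged_display: bool = False
        sound_privacy: bool = False
        image_privacy: bool = False
        correction: bool = False
        tracking_point_limit: bool = False
        # Choice-valued settings: (3)-(5), (10), (12), (13)
        tracking_method: str = "manual"      # "manual" / "automatic"
        target_count: str = "single"         # "single" / "multiple"
        manual_designation: str = "click"    # "click" / "drag"
        connection_mode: str = "each_time"   # "each_time" / "batch"
        camera_switching: str = "automatic"  # "automatic" / "manual"
        mic_switching: str = "automatic"     # "automatic" / "manual"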
Next, an operation example of manual tracking processing in the directivity control devices 3 and 3A will be described with reference to FIG. 4. FIG. 4 is an explanatory diagram showing an operation example of manual tracking processing.
In FIG. 4, the movement of the person HM1 as the monitoring target is shown on the tracking screen TRW displayed on the display device 35, and three tracking points b1, b2, and b3 have been designated, for example by click operations or drag operations of the cursor CSR via the user's mouse operation.
The tracking processing unit 34c acquires the information of tracking time t1 at which the cursor CSR designated the tracking point b1, tracking time t2 at which it designated the tracking point b2, and tracking time t3 at which it designated the tracking point b3. The tracking processing unit 34c also stores in the memory 33 the coordinate information of the tracking point b1 on the tracking screen TRW, or the three-dimensional coordinates indicating the position in real space corresponding to that coordinate information, in association with the information of tracking time t1. Likewise, it stores in the memory 33 the coordinate information of the tracking point b2 on the tracking screen TRW, or the corresponding three-dimensional coordinates in real space, in association with the information of tracking time t2, and the coordinate information of the tracking point b3 on the tracking screen TRW, or the corresponding three-dimensional coordinates in real space, in association with the information of tracking time t3.
The output control unit 34b displays the point marker MR1 at the tracking point b1 on the tracking screen TRW, the point marker MR2 at the tracking point b2, and the point marker MR3 at the tracking point b3. The output control unit 34b can thereby explicitly show, on the tracking screen TRW, the tracking points that the moving person HM1 has passed through as a trajectory.
The output control unit 34b also connects the point markers MR1 and MR2 to display the movement line LN1, and further connects the point markers MR2 and MR3 to display the movement line LN2.
Next, an operation example of the correction mode in the directivity control devices 3 and 3A will be described with reference to FIG. 5. FIG. 5 is an explanatory diagram showing an operation example of changing a tracking point by manual tracking processing when a tracking point automatically designated in automatic tracking processing is wrong.
On the tracking screen TRW on the left side of FIG. 5, a tracking point automatically designated by the image processing unit 37 or the sound source detection unit 34d differs from the actual point on the movement path of the person HM1, and the wrong movement line LNW is displayed by the connection between the point markers MR1 and MR2W.
When the correction mode is on, automatic tracking processing is switched to manual tracking processing, as shown on the tracking screen TRW on the right side of FIG. 5. When the correct tracking point is then designated, for example by a click operation with the cursor CSR, the output control unit 34b connects the point markers MR1 and MR2R and displays the correct movement line LNR on the tracking screen TRW.
Next, slow playback processing in the recording/playback mode and slow playback mode in the directivity control devices 3 and 3A will be described with reference to FIG. 6. FIG. 6 is an explanatory diagram showing slow playback processing in the recording/playback mode and slow playback mode.
On the upper tracking screen TRW in FIG. 6, it is assumed that the person HM1 moves quickly, making it difficult to designate the person HM1 in either manual tracking processing or automatic tracking processing. When the recording/playback mode and the slow playback mode are on and, for example, the slow playback button displayed on the display device 35 is touched with the user's finger FG, the output control unit 34b plays back the image data of the video showing the movement of the person HM1 on the tracking screen TRW at a speed value smaller than the initial value (normal value) of the playback speed (see the lower tracking screen TRW in FIG. 6).
The output control unit 34b can thereby slow down the movement of the person HM1 on the tracking screen TRW, so that tracking points can be designated easily in manual or automatic tracking processing. Note that the output control unit 34b may perform slow playback processing without waiting for a touch operation of the user's finger FG when the moving speed of the person HM1 is equal to or higher than a predetermined value. The playback speed during slow playback may be a fixed value, or may be changed as appropriate in response to an input operation with the cursor CSR via the user's mouse operation or the user's finger FG.
Next, enlarged display processing in the enlarged display mode in the directivity control devices 3 and 3A will be described with reference to FIG. 7. FIG. 7 is an explanatory diagram showing enlarged display processing in the enlarged display mode.
On the upper tracking screen TRW in FIG. 7, it is assumed that the person HM1 appears small, making it difficult to designate the person HM1 in manual or automatic tracking processing. When, for example, after the enlarged display mode is turned on by a click operation of the cursor CSR via the user's mouse operation, a click operation is performed at the position (display position) of the person HM1, the output control unit 34b displays the tracking screen TRW enlarged at a predetermined magnification, centered on the clicked position (see the lower tracking screen TRW in FIG. 7). The output control unit 34b can thereby display the person HM1 on the tracking screen TRW enlarged, so that tracking points can be designated easily in manual or automatic tracking processing.
The output control unit 34b may also display the contents of the tracking screen TRW enlarged in a separate pop-up screen (not shown), centered on the clicked position. This allows the output control unit 34b to let the user easily designate the monitoring target (the person HM1) via a simple designation operation, by comparing the unenlarged tracking screen TRW with the enlarged pop-up screen.
Further, when no tracking point has been designated yet, for example, the output control unit 34b may display the contents of the displayed camera screen enlarged with reference to the center of the display device 35. This allows the output control unit 34b to let the user easily designate the monitoring target via a simple designation operation when, for example, the monitoring target (the person HM1) appears near the center of the display device 35.
Further, when multiple monitoring targets are designated, the output control unit 34b may display the screen enlarged around the position corresponding to the geometric mean of the multiple designated positions on the tracking screen TRW. This allows the output control unit 34b to let the user easily select among the multiple monitoring targets shown on the tracking screen TRW.
Next, automatic scroll processing after enlarged display processing in the enlarged display mode in the directivity control devices 3 and 3A will be described with reference to FIGS. 8(A), (B), and (C). FIG. 8(A) is an explanatory diagram showing automatic scroll processing after enlarged display processing in the enlarged display mode. FIG. 8(B) is a diagram showing the tracking screen TRW at time t = t1. FIG. 8(C) is a diagram showing the tracking screen TRW at time t = t2.
FIG. 8(A) shows the movement path of the person HM1 as the monitoring target from the position at time t = t1 to the position at time t = t2 within the imaging area C1RN of the camera device C1. For example, as a result of the tracking screen TRW being displayed enlarged, the image of the entire imaging area C1RN may no longer fit on the tracking screen TRW.
In response to an input operation with the cursor CSR via the user's mouse operation or the user's finger FG, the output control unit 34b automatically scrolls the tracking screen TRW so that the person HM1 is always displayed at the center of the tracking screen TRW, for example along the movement path of the person HM1 from time t = t1 to time t = t2. Because the output control unit 34b automatically scrolls the tracking screen TRW so that the user's designated position always remains at the center as the person HM1 shown on the enlarged tracking screen TRW moves, the designated position of the person HM1 can be prevented from going off the tracking screen TRW even when the screen is displayed enlarged, and the continuously moving person HM1 on the tracking screen TRW can be designated easily.
FIG. 8(B) shows the tracking screen TRW at time t = t1, with the person HM1 displayed at the center. TP1 in the figure indicates the tracking point designated for the person HM1 at time t = t1 by an input operation with the cursor CSR via the user's mouse operation or the user's finger FG.
Similarly, FIG. 8(C) shows the tracking screen TRW at time t = t2, with the person HM1 displayed at the center. TP2 in the figure indicates the tracking point designated for the person HM1 at time t = t2 by an input operation with the cursor CSR via the user's mouse operation or the user's finger FG. In both FIG. 8(B) and FIG. 8(C), the person HM1 as the monitoring target is displayed at the center of the tracking screen TRW during automatic scroll processing, making the user's selection easy.
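A minimal sketch of the scroll computation: assuming the enlarged view is a window over the full camera image, the view origin is chosen so that the designated position maps to the window center, clamped so the view stays within the image. The names are illustrative, not taken from the patent.

    def scroll_origin(target, view_size, image_size):
        # Top-left corner of the enlarged view so that `target` (x, y) sits
        # at the window center, clamped to keep the view inside the image.
        ox = min(max(target[0] - view_size[0] / 2, 0), image_size[0] - view_size[0])
        oy = min(max(target[1] - view_size[1] / 2, 0), image_size[1] - view_size[1])
        return ox, oy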
Next, the overall flow of manual tracking processing in the directivity control system 100 of the present embodiment will be described with reference to FIGS. 9(A) and (B). FIG. 9(A) is a flowchart explaining a first example of the overall flow of manual tracking processing in the directivity control system 100 of the first embodiment. FIG. 9(B) is a flowchart explaining a second example of the overall flow of manual tracking processing in the directivity control system 100 of the first embodiment.
To avoid complicating the description, the overall flow of manual tracking processing in the directivity control system 100 of the present embodiment is described first with reference to FIGS. 9(A) and 9(B), and the detailed contents of the individual processes are described as they arise with reference to the drawings described later. Among the operations shown in FIG. 9(B), those identical to the operations shown in FIG. 9(A) are given the same step numbers and their description is simplified or omitted; only the differences are described. FIGS. 9(A) and (B) show the operation of the directivity control device 3.
As a premise of the description of FIG. 9(A), it is assumed that, on the tracking screen TRW of the display device 35 on which the image of the person HM1 as the monitoring target captured by the camera device C1 is shown, the output control unit 34b has formed the directivity of the picked-up sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the position designated by an input operation with the cursor CSR via the user's mouse operation or the user's finger FG. The same premise applies to the description of FIG. 9(B).
In FIG. 9(A), if the tracking mode is off (S1, NO), the manual tracking processing shown in FIG. 9(A) ends; if the tracking mode is on (S1, YES), tracking assistance processing is started (S2). Details of the tracking assistance processing are described later with reference to FIG. 13(A).
After step S2, on the tracking screen TRW of the display device 35, the tracking position on the movement path of the person HM1, that is, a tracking point, is designated by a click operation of the cursor CSR via the user's mouse operation or a touch operation with the user's finger FG (S3).
The tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the position designated on the tracking screen TRW in step S3 and the designated time, in association with each other as the tracking position and tracking time of the tracking point, and further displays, via the output control unit 34b, a point marker at the tracking point on the tracking screen TRW (S4). Note that the point marker may also be displayed by the tracking processing unit 34c; the same applies to the following embodiments.
The output control unit 34b forms the directivity of the picked-up sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the tracking point designated in step S3 (S5). Note that if it suffices for the tracking processing unit 34c to acquire the tracking position and tracking time data of the tracking point through the designation of the movement path of the person HM1 in response to an input operation with the cursor CSR via the user's mouse operation or the user's finger FG, the operation of step S5 may be omitted. In other words, the output control unit 34b need not switch the directivity to the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the tracking point designated in step S3; the same applies to the following embodiments.
After step S5, the output control unit 34b performs tracking connection processing (S6). Details of the tracking connection processing are described later with reference to FIG. 15(A). After step S6, the output control unit 34b outputs, from the speaker device 36, the picked-up sound in which the directivity was formed in step S5 (S7). Details of the sound output processing are described later with reference to FIG. 21(A). After step S7, the operation of the directivity control device 3 returns to step S1, and the processing of steps S1 to S7 is repeated until the tracking mode is turned off.
In FIG. 9(B), after step S1, tracking assistance processing is started (S2). Details of the tracking assistance processing are described later with reference to FIG. 13(A). After step S2, it is assumed that, on the tracking screen TRW of the display device 35, designation of a position (that is, a tracking point) on the movement path of the person HM1 has been started by a drag operation of the cursor CSR via the user's mouse operation or a touch-slide operation with the user's finger FG (S3A).
After step S3A, if the predetermined time (for example, about several seconds) has not yet elapsed since the tracking position and tracking time data corresponding to the previous tracking point were stored (S8, NO), the drag operation or touch-slide operation started in step S3A is considered not to have ended, and the operation of the directivity control device 3 proceeds to step S7.
On the other hand, if, after step S3A, the predetermined time (for example, about several seconds) has elapsed since the tracking position and tracking time data corresponding to the previous tracking point were stored (S8, YES), the drag operation or touch-slide operation started in step S3A is considered to have ended, and a new tracking point has been designated. That is, the tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the position designated at the end of the drag operation or touch-slide operation and the designated time, in association with each other as the tracking position and tracking time of the new tracking point, and further displays, via the output control unit 34b, a point marker at the tracking point on the tracking screen TRW (S4). The operations from step S4 onward are the same as the operations from step S4 onward shown in FIG. 9(A), and their description is omitted.
Next, the overall flow of automatic tracking processing in the directivity control system 100A of the present embodiment will be described with reference to FIGS. 10(A) and (B), FIGS. 11(A) and (B), and FIG. 12. FIG. 10(A) is a flowchart explaining a first example of the overall flow of automatic tracking processing in the directivity control system 100A of the first embodiment. FIG. 10(B) is a flowchart explaining a first example of the automatic tracking processing shown in FIG. 10(A). FIG. 11(A) is a flowchart explaining a second example of the automatic tracking processing shown in FIG. 10(A). FIG. 11(B) is a flowchart explaining an example of the tracking correction processing shown in FIG. 11(A). FIG. 12 is a flowchart explaining a third example of the automatic tracking processing shown in FIG. 10(A).
Also for FIG. 10(A), as with FIGS. 9(A) and (B), to avoid complicating the description, the overall flow of automatic tracking processing in the directivity control system 100A of the present embodiment is described first with reference to FIG. 10(A), and the detailed contents of the individual processes are described as they arise with reference to the drawings described later.
Among the operations shown in FIG. 10(A), those identical to the operations shown in FIG. 9(A) or (B) are given the same step numbers and their description is simplified or omitted; only the differences are described. FIG. 10(A) also shows the operation of the directivity control device 3A.
As a premise of the description of FIG. 10(A), it is assumed that, on the tracking screen TRW of the display device 35 on which the image of the person HM1 as the monitoring target captured by the camera device C1 is shown, the output control unit 34b has formed the directivity of the picked-up sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the position automatically designated using the detection processing result of the sound source detection unit 34d or the image processing unit 37.
 In FIG. 10(A), after step S1, the tracking assist process is started (S2). Details of the tracking assist process are described later with reference to FIG. 13(A). After step S2, the automatic tracking process is performed (S3B). Details of the automatic tracking process are described later with reference to FIG. 10(B), FIG. 11(A), and FIG. 12. After step S3B, the output control unit 34b forms the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the position (speech position, sound source position) of the person HM1 corresponding to the tracking point automatically designated in step S3B (S5). Since the operations from step S5 onward are the same as those from step S4 shown in FIG. 9(A), their description is omitted.
 In FIG. 10(B), the image processing unit 37 performs known image processing to determine whether the person HM1 as the monitoring target has been detected on the tracking screen TRW of the display device 35, and when it determines that the person HM1 has been detected, outputs the determination result (including data on the detection position of the person HM1 (for example, a known representative point) and the detection time) to the tracking processing unit 34c of the signal processing unit 34 (S3B-1).
 Alternatively, the sound source detection unit 34d performs known sound source detection processing to determine whether the position of the sound (sound source) emitted by the person HM1 as the monitoring target has been detected on the tracking screen TRW of the display device 35, and when it determines that the position of the sound source has been detected, outputs the determination result (including data on the detection position and detection time of the sound source) to the tracking processing unit 34c (S3B-1). To simplify the description of step S3B-1, it is assumed that no monitoring target other than the person HM1 exists on the tracking screen TRW.
 The tracking processing unit 34c automatically sets the designated position of the person HM1 in the automatic tracking process, that is, the tracking point, using the determination result of the image processing unit 37 or the sound source detection unit 34d (S3B-1). The tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the detection position on the tracking screen TRW automatically designated in step S3B-1 and the detection time, associated with each other as the tracking position and tracking time of the tracking point, and further causes a point marker to be displayed at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-2). After step S3B-2, the automatic tracking process shown in FIG. 10(B) ends, and the process proceeds to step S5 shown in FIG. 10(A).
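 The bookkeeping in steps S3B-1 and S3B-2 amounts to appending a (position, time) record to a list and redrawing a marker. A minimal sketch in Python, assuming real-space coordinates and a simple marker callback (the names TrackingPoint, tracking_list, and draw_point_marker are illustrative, not from the patent):

    from dataclasses import dataclass

    @dataclass
    class TrackingPoint:
        x: float  # real-space coordinates corresponding to the
        y: float  # detection position on the tracking screen TRW
        z: float
        time: float  # detection time, stored as the tracking time

    tracking_list = []  # stands in for the tracking list LST held in memory 33

    def save_tracking_point(x, y, z, detection_time, draw_point_marker):
        """Store one tracking position/time pair and show its marker (S3B-2)."""
        point = TrackingPoint(x, y, z, detection_time)
        tracking_list.append(point)
        draw_point_marker(point)  # marker display delegated to the output control unit
        return point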
 In FIG. 11(A), when the first tracking point (initial position) has already been designated (S3B-3, YES), the operation of step S3B-4 is omitted. On the other hand, when the first tracking point has not been designated (S3B-3, NO), the position (that is, the tracking point) on the movement course (movement path) of the person HM1 is designated on the tracking screen TRW of the display device 35 by an input operation (for example, a click operation or a touch operation) with the cursor CSR via the user's mouse operation or with the user's finger FG (S3B-4).
 When the first tracking point has already been designated, or after the first tracking point has been designated in step S3B-4, the tracking processing unit 34c automatically designates the next tracking point using the determination result of the image processing unit 37 or the sound source detection unit 34d centered on the first tracking point (S3B-5). Thus, when the user designates the first tracking point, for example, the tracking processing unit 34c starts the detection processing of the information on the position of the sound (sound source) emitted by the person HM1 or the information on the position of the person HM1 centered on the first tracking point (initial position) on the tracking screen TRW, so that each detection process can be performed at high speed.
 The tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the detection position on the tracking screen TRW automatically designated in step S3B-5 and the detection time, associated with each other as the tracking position and tracking time of the tracking point, and further causes a point marker to be displayed at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-2).
 If an operation for correcting the tracking point is not performed after step S3B-2 (S3B-6, NO), the automatic tracking process shown in FIG. 11(A) ends, and the process proceeds to step S5 shown in FIG. 10(A).
 On the other hand, when an operation for correcting the tracking position corresponding to the tracking point is performed after step S3B-2, for example because the determination result of the image processing unit 37 or the sound source detection unit 34d was wrong (S3B-6, YES), the tracking correction process shown in FIG. 11(B) is performed (S3B-7).
 In FIG. 11(B), when the sound emitted by the person HM1 moving on the tracking screen TRW is being output, the output of the sound is temporarily suspended by an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG (S3B-7-1). After step S3B-7-1, the correction mode is turned on by an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, whereby the process temporarily shifts from the automatic tracking process to the manual tracking process, and it is further assumed that the correct tracking point is then designated (S3B-7-2).
 The output control unit 34b erases the wrong point marker that was displayed on the tracking screen TRW immediately before the designation in step S3B-7-2 (S3B-7-3), causes a point marker to be displayed at the changed tracking point, that is, the tracking point designated in step S3B-7-2, and resumes the output of the sound that was temporarily suspended in step S3B-7-1 (S3B-7-3). Further, the tracking processing unit 34c overwrites and saves the position designated in step S3B-7-2 as the tracking point (S3B-7-3). After step S3B-7-3, the tracking correction process shown in FIG. 11(B) ends, and the process proceeds to step S5 shown in FIG. 10(A).
 In FIG. 12, the image processing unit 37 performs known image processing to determine whether the person HM1 as the monitoring target has been detected on the tracking screen TRW of the display device 35 (S3B-8). When the image processing unit 37 determines that the person HM1 has been detected (S3B-9, YES), it calculates the detection position of the person HM1 (for example, a known representative point) and outputs the detection time and detection position data to the tracking processing unit 34c of the signal processing unit 34 as the determination result (S3B-10).
 The sound source detection unit 34d performs known sound source detection processing to determine whether the position of the sound (sound source) emitted by the person HM1 as the monitoring target has been detected on the tracking screen TRW of the display device 35, and when it determines that the position of the sound source has been detected, it calculates the detection position and outputs the detection time and detection position data to the tracking processing unit 34c as the determination result (S3B-11).
 The tracking processing unit 34c stores in the memory 33 the detection position and detection time of the sound source on the tracking screen TRW calculated in step S3B-11, associated with each other as the tracking position and tracking time of the tracking point, and further causes a point marker to be displayed at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-12).
 After step S3B-12, the tracking processing unit 34c determines whether the distance between the detection position of the person HM1 calculated in step S3B-10 and the detection position of the sound source calculated in step S3B-11 is within a predetermined value (S3B-13). When the distance between the detection position of the person HM1 and the detection position of the sound source is within the predetermined value (S3B-13, YES), the automatic tracking process shown in FIG. 12 ends, and the process proceeds to step S5 shown in FIG. 10(A).
 On the other hand, when the distance between the detection position of the person HM1 and the detection position of the sound source is not within the predetermined value (S3B-13, NO), the tracking correction process shown in FIG. 11(B) is performed (S3B-7). Since the tracking correction process has already been described with reference to FIG. 11(B), its description is omitted here. After step S3B-7, the automatic tracking process shown in FIG. 12 ends, and the process proceeds to step S5 shown in FIG. 10(A).
 Thus, when the distance between the position of the sound source detected by the sound source position detection process and the position of the person HM1 detected by the person position detection process is equal to or greater than the predetermined value, the tracking processing unit 34c can easily correct and acquire, as the information on the position of the person HM1, the information on the position designated by the user's position changing operation in, for example, the tracking correction process (see FIG. 11(B)). Further, when that distance is not equal to or greater than the predetermined value, the tracking processing unit 34c can easily acquire the position of the sound source or the position of the person HM1 as the information on the position of the person HM1 after movement, without requiring, for example, a position changing operation by the user.
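 The consistency check of step S3B-13 reduces to a Euclidean distance comparison between the two independently detected positions. A minimal sketch, assuming both detectors report real-space coordinates; the threshold value is an assumption:

    import math

    DISTANCE_THRESHOLD = 0.5  # the "predetermined value"; units and value are assumed

    def detections_consistent(person_pos, source_pos):
        """Return True when the image-based and sound-based detections agree (S3B-13)."""
        return math.dist(person_pos, source_pos) <= DISTANCE_THRESHOLD

 When this returns False, the flow falls back to the tracking correction process of FIG. 11(B) (S3B-7), in which the user designates the correct position manually.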
 Next, details of the tracking assist process in the directivity control devices 3 and 3A are described with reference to FIG. 13(A). FIG. 13(A) is a flowchart explaining an example of the tracking assist process shown in FIG. 9(A).
 In FIG. 13(A), when the enlarged display mode of the directivity control devices 3 and 3A is off (S2-1, NO), the operation of the directivity control devices 3 and 3A proceeds to step S2-5. On the other hand, when the enlarged display mode of the directivity control devices 3 and 3A is on (S2-1, YES), the directivity control devices 3 and 3A perform the image privacy protection process (S2-2) and further perform the automatic scroll process (S2-3). Details of the image privacy protection process are described later with reference to FIG. 21(B). Details of the automatic scroll process are described later with reference to FIGS. 13(B), 14(A), and (B).
 After step S2-3, the output control unit 34b enlarges and displays the contents of the tracking screen TRW at a predetermined magnification centered on the tracking position corresponding to the most recent tracking point on the tracking screen TRW (S2-4). After step S2-4, when both the recording/playback mode and the slow playback mode of the directivity control devices 3 and 3A are on (S2-5, YES), the output control unit 34b plays back the image data of the video showing the movement course of the person HM1 on the tracking screen TRW in slow motion, at a speed value smaller than the initial value (normal value) of the playback speed (S2-6).
 After step S2-6, or when the recording/playback mode and the slow playback mode of the directivity control devices 3 and 3A are not both on (S2-5, NO), the tracking assist process shown in FIG. 13(A) ends, and the process proceeds to step S3 shown in FIG. 9(A), step S3A shown in FIG. 9(B), or step S3B shown in FIG. 10(A).
 Next, details of the automatic scroll process in the directivity control devices 3 and 3A are described with reference to FIGS. 13(B), 14(A), and (B). FIG. 13(B) is a flowchart explaining an example of the automatic scroll process shown in FIG. 13(A). FIG. 14(A) is a flowchart showing an example of the automatic scroll process necessity determination process shown in FIG. 13(B). FIG. 14(B) is an explanatory diagram of the scroll necessity determination line in the automatic scroll process necessity determination process.
 In FIG. 13(B), the tracking processing unit 34c performs the automatic scroll process necessity determination process (S2-3-1). Details of the automatic scroll process necessity determination process are described later with reference to FIG. 14(A).
 After step S2-3-1, when it is determined as the result of the automatic scroll process necessity determination process that the automatic scroll process is necessary (S2-3-2, YES), the output control unit 34b performs a predetermined automatic scroll process on the tracking screen TRW (S2-3-3). For example, the output control unit 34b automatically scrolls the tracking screen TRW along the movement path of the person HM1 on the tracking screen TRW, in accordance with an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, so that the person HM1 is always displayed at the center of the tracking screen TRW. Thus, even when the tracking screen TRW is enlarged, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from going off the tracking screen TRW, and further allows the user to easily designate the person HM1 who keeps moving on the tracking screen TRW.
 If no tracking point has yet been designated at the time of step S2-3-1-1, the output control unit 34b automatically scrolls the tracking screen TRW so that the person HM1 is always displayed at the center of the tracking screen TRW; in this case, the automatic scroll process necessity determination process shown in step S2-3-1 may be omitted.
 Also, when the person HM1 moves beyond a scroll determination line JDL described later, the output control unit 34b performs the automatic scroll process by a predetermined amount in the movement direction of the person HM1 (for example, the direction crossing the scroll determination line JDL described later). Thus, even when the tracking screen TRW is enlarged, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from going off the tracking screen TRW.
 Alternatively, when the person HM1 moves beyond a scroll determination line JDL described later, the output control unit 34b automatically scrolls the tracking screen TRW so that the position designated by the input operation with the cursor CSR via the user's mouse operation or with the user's finger FG (for example, the next tracking point) comes to the center of the tracking screen TRW. Thus, even when the tracking screen TRW is enlarged, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from going off the tracking screen TRW, and further allows the user to easily designate the person HM1 who keeps moving on the tracking screen TRW.
 After step S2-3-3, or when it is determined as the result of the automatic scroll process necessity determination process that the automatic scroll process is not necessary (S2-3-2, NO), the automatic scroll process shown in FIG. 13(B) ends, and the process proceeds to step S2-4 shown in FIG. 13(A).
 In FIG. 14(A), the tracking processing unit 34c determines whether the tracking position corresponding to the designated tracking point TP1 exceeds any of the upper, lower, left, or right scroll determination lines JDL of the enlarged tracking screen XTRW (S2-3-1-1).
 When the tracking processing unit 34c determines that the tracking position does not exceed any scroll determination line JDL (S2-3-1-1, NO), it determines that the automatic scroll process is unnecessary (S2-3-1-2). On the other hand, when the tracking processing unit 34c determines that the tracking position exceeds one of the scroll determination lines JDL (S2-3-1-1, YES), it determines that the automatic scroll process is necessary, and further stores the type of the relevant scroll determination line JDL (for example, information indicating one of the four scroll determination lines JDL shown in FIG. 14(B)) in the memory 33 (S2-3-1-3). After step S2-3-1-2 or S2-3-1-3, the automatic scroll process necessity determination process shown in FIG. 14(A) ends, and the process proceeds to step S2-3-2 shown in FIG. 13(B).
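 The determination of FIG. 14(A) can be pictured as comparing the tracking position against four boundary lines inset from the edges of the enlarged screen. A minimal sketch in screen coordinates; the screen size and line inset are assumed values:

    SCREEN_W, SCREEN_H = 1280, 720  # size of the enlarged tracking screen XTRW (assumed)
    JDL_INSET = 100                 # distance of each determination line JDL from the edge (assumed)

    def crossed_judgment_line(x, y):
        """Return which scroll determination line JDL was crossed, or None
        when automatic scrolling is unnecessary (S2-3-1-1, S2-3-1-2)."""
        if x < JDL_INSET:
            return "left"
        if x > SCREEN_W - JDL_INSET:
            return "right"
        if y < JDL_INSET:
            return "top"
        if y > SCREEN_H - JDL_INSET:
            return "bottom"
        return None

 The returned line type corresponds to the information saved to the memory 33 in step S2-3-1-3.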
 Next, details of the tracking connection process in the directivity control devices 3 and 3A are described with reference to FIGS. 15(A) and (B). FIG. 15(A) is a flowchart explaining an example of the tracking connection process shown in FIG. 9(A). FIG. 15(B) is a flowchart explaining an example of the batch connection process shown in FIG. 15(A).
 In FIG. 15(A), when a tracking point has already been designated (S6-1, YES), the tracking processing unit 34c determines whether the connection mode is the per-designation ("each time") mode (S6-2). When it is determined that the connection mode is the per-designation mode (S6-2, YES), the output control unit 34b connects and displays the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S6-3). Thus, when the person HM1 displayed on the tracking screen TRW of the display device 35 moves, the output control unit 34b connects and displays at least the current designated position and the immediately preceding designated position among the plurality of designated positions designated by the user's designation operation, so that part of the trajectory of the movement of the person HM1 can be shown explicitly.
 Note that step S6-3 is not limited to the operation in the case of single designation in which tracking points are designated one at a time, and also includes the operation in the case where a plurality of tracking points are designated simultaneously; the same applies to step S6-4-3 described later.
 After step S6-3, or when no tracking point has yet been designated (S6-1, NO), the tracking connection process shown in FIG. 15(A) ends, and the process proceeds to step S7 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
 When it is determined that the connection mode is not the per-designation mode (S6-2, NO), the batch connection process is performed (S6-4). The batch connection process is described with reference to FIG. 15(B).
 In FIG. 15(B), the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16(B)) stored in the memory 33 (S6-4-1). When the read data is determined to be the start point of the tracking points (S6-4-2, YES), the tracking processing unit 34c reads the data of the tracking list LST (see, for example, FIG. 16(B)) again (S6-4-1).
 On the other hand, when the read data is determined not to be the start point of the tracking points (S6-4-2, NO), the output control unit 34b uses the read tracking list data to connect and display the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S6-4-3).
 After step S6-4-3, when the connection has been made up to the end point of the tracking points (S6-4-4, YES), the batch connection process shown in FIG. 15(B) ends, and the process proceeds to step S7 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
 On the other hand, when the connection has not been made up to the end point of the tracking points after step S6-4-3 (S6-4-4, NO), the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16(B)) stored in the memory 33, and the operations from step S6-4-1 to step S6-4-4 are repeated until the point markers corresponding to all the tracking points in the tracking list LST are connected and displayed. Thus, when the person HM1 displayed on the tracking screen TRW of the display device 35 moves, the output control unit 34b connects and displays, for every one of the plurality of designated positions designated by the user's designation operation, the one or two designated positions adjacent to it, so that the entire trajectory of the movement of the person HM1 can be shown explicitly.
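 The batch connection of FIG. 15(B) is in effect a single traversal of the stored tracking list that draws a segment between each consecutive pair of markers and restarts at every start-point entry. A minimal sketch, assuming each list entry carries an is_start flag and a rendering callback is supplied (illustrative names):

    def batch_connect(tracking_list, draw_segment):
        """Connect consecutive point markers into flow lines (S6-4-1 to S6-4-4)."""
        previous = None
        for point in tracking_list:
            if point.is_start or previous is None:  # S6-4-2 YES: begin a new flow line
                previous = point
                continue
            draw_segment(previous, point)  # S6-4-3: connect adjacent markers
            previous = point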
 FIG. 16(A) is an explanatory diagram of the playback start time PT of the collected sound corresponding to the user's designated position P0 on the flow line between the tracking points displayed for one movement of the person HM1. FIG. 16(B) is a diagram showing a first example of the tracking list. In FIG. 16(A), TP1, TP2, TP3, and TP4 are tracking points designated during one movement of the person HM1, as also shown in the tracking list LST shown in FIG. 16(B).
 In FIG. 16(B), for each of the tracking points TP1 (start point), TP2, TP3, and TP4 (end point), the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. To simplify the description, the z-coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
 When the designated position P0 is designated on the flow line between the tracking points shown in FIG. 16(A) in accordance with an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, the tracking processing unit 34c extracts the two tracking points TP1 and TP2 before and after the designated position P0, and calculates the playback start time PT at the designated position P0 according to equation (2), using the coordinates indicating the tracking positions of the tracking points TP1 and TP2 and the tracking time data.
 Equation (2) [rendered in the source only as an image; reconstructed here from the surrounding description as a linear interpolation, with T1 and T2 denoting the tracking times of TP1 and TP2 and |A - B| the distance between tracking positions]:

    PT = T1 + (T2 - T1) * |P0 - TP1| / |TP2 - TP1|    ...(2)
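 Read this way, equation (2) has a direct implementation: the fraction of the TP1-TP2 distance covered by P0 scales the time difference between the two tracking times. A short sketch under that linear-interpolation reading (positions as 3D tuples, times in seconds; all names illustrative):

    import math

    def playback_start_time(tp1_pos, tp1_time, tp2_pos, tp2_time, p0):
        """Interpolate the playback start time PT for a designated position P0
        on the flow line between tracking points TP1 and TP2 (equation (2))."""
        ratio = math.dist(tp1_pos, p0) / math.dist(tp1_pos, tp2_pos)
        return tp1_time + (tp2_time - tp1_time) * ratio

    # Example: P0 halfway between TP1 (t = 10 s) and TP2 (t = 14 s) yields PT = 12 s.
    pt = playback_start_time((0, 0, 0), 10.0, (4, 0, 0), 14.0, (2, 0, 0))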
 When outputting (playing back) the sound to the speaker device 36, the output control unit 34b forms the directivity in the pointing direction corresponding to each relevant tracking position, in the order of the tracking times including the designated position P0 designated by the input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, and then outputs (plays back) the sound with the directivity formed.
 FIG. 17(A) is an explanatory diagram of the playback start time PT of the collected sound corresponding to the user's designated position P0 on the flow lines between different tracking points based on simultaneous designation of a plurality of points. FIG. 17(B) is a diagram showing a second example of the tracking list LST. In FIG. 17(A), (TP11, TP21), (TP12, TP22), (TP13, TP23), and (TP14, TP24) are tracking points designated simultaneously during the movement of, for example, different persons as a plurality of monitoring targets, as also shown in the tracking list LST shown in FIG. 17(B).
 In FIG. 17(B), for each of the tracking points (TP11, TP21), (TP12, TP22), (TP13, TP23), and (TP14, TP24), the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. The tracking points (TP11, TP21) are start points, and the tracking points (TP14, TP24) are end points. To simplify the description, the z-coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
 When the designated position P0 is designated at any position on the different flow lines between the tracking points shown in FIG. 17(A) in accordance with an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, the tracking processing unit 34c extracts the two tracking points TP11 and TP12 before and after the designated position P0, and calculates the playback start time PT at the designated position P0 according to equation (3), using the coordinates indicating the tracking positions of the tracking points TP11 and TP12 and the tracking time data.
 Equation (3) [likewise rendered in the source only as an image and reconstructed by analogy with equation (2), with T11 and T12 denoting the tracking times of TP11 and TP12]:

    PT = T11 + (T12 - T11) * |P0 - TP11| / |TP12 - TP11|    ...(3)
 When outputting (playing back) the sound to the speaker device 36, the output control unit 34b likewise forms the directivity in the pointing direction corresponding to each relevant tracking position, in the order of the tracking times including the designated position P0 designated by the input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, and then outputs (plays back) the sound with the directivity formed.
 FIG. 18(A) is an explanatory diagram of the playback start times PT and PT' of the collected sound corresponding to the user's designated positions P0 and P0' on the flow lines between different tracking points based on designation performed a plurality of times. FIG. 18(B) is a diagram showing a third example of the tracking list LST. In FIG. 18(A), (TP11, TP12, TP13, TP14) are tracking points designated, for example, during the movement of a person as the monitoring target in the first round, as also shown in the tracking list LST shown in FIG. 18(B). Similarly, in FIG. 18(A), (TP21, TP22, TP23) are tracking points designated, for example, during the movement of a person as the monitoring target in the second round. The person serving as the monitoring target in the second round may be the same person as, or a different person from, the person in the first round.
 In FIG. 18(B), for each of the tracking points TP11, TP12, TP13, TP14, TP21, TP22, and TP23, the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. The tracking points TP11 and TP21 are start points, and the tracking points TP14 and TP23 are end points. To simplify the description, the z-coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
 When the designated positions P0 and P0' are designated at any positions on the respective flow lines between the tracking points shown in FIG. 18(A) in accordance with input operations with the cursor CSR via the user's mouse operation or with the user's finger FG, the tracking processing unit 34c extracts the two tracking points (TP11, TP12) and (TP21, TP22) before and after the designated positions P0 and P0', and calculates the playback start times PT and PT' at the designated positions P0 and P0' according to equations (4) and (5), respectively, using the coordinates indicating the tracking positions of the tracking points (TP11, TP12) and (TP21, TP22) and the tracking time data. In equations (4) and (5), the coordinates of the designated position P0 are (x0, y0, z0), and the coordinates of the designated position P0' are (x0', y0', z0).
 Equations (4) and (5) [likewise rendered in the source only as images and reconstructed by analogy with equation (2), with T11, T12, T21, and T22 denoting the tracking times of TP11, TP12, TP21, and TP22]:

    PT = T11 + (T12 - T11) * |P0 - TP11| / |TP12 - TP11|    ...(4)

    PT' = T21 + (T22 - T21) * |P0' - TP21| / |TP22 - TP21|    ...(5)
 In FIG. 18(A), the number of tracking points and the tracking times designated during the movement of each person in the first and second rounds need not match. When outputting (playing back) the sound to the speaker device 36, the output control unit 34b forms the directivity in the pointing direction corresponding to each relevant tracking position, in the order of the tracking times including the designated position P0 or P0' designated by the input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, and then outputs (plays back) the sound with the directivity formed.
 Next, the overall flow of the flow line display playback process in the directivity control devices 3 and 3A, performed mainly while the recording/playback mode is on, is described with reference to FIG. 19(A). FIG. 19(A) is a flowchart explaining an example of the overall flow of the flow line display playback process using the tracking list LST in the directivity control systems 100 and 100A of the first embodiment.
 In FIG. 19(A), the flow line display process is performed first (S11). Details of the flow line display process are described later with reference to FIG. 20. After step S11, when the designated position P0 is designated on the flow line between the tracking points displayed in step S11 in accordance with an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG (S12), the playback start time calculation process is performed (S13). Details of the playback start time calculation process are described later with reference to FIG. 19(B).
 The tracking processing unit 34c refers to the tracking list LST stored in the memory 33 and reads the coordinates of all the tracking positions (even a single one) corresponding to the tracking time closest to the playback start time PT of the designated position P0 calculated in the playback start time calculation process of step S13 (S14). The output control unit 34b then uses the coordinate data of the tracking positions read by the tracking processing unit 34c to form the directivity of the collected sound in the directions from the omnidirectional microphone array device M1 toward all (even a single one) of those tracking positions (S14). Thus, in accordance with the position arbitrarily designated by the user (arbitrary designated position) on the flow line indicating the movement trajectory of the person HM1, the output control unit 34b can form the directivity of the sound in advance in the direction toward the tracking position that was designated next after the arbitrary designated position.
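 Step S14 is a nearest-time lookup over the tracking list. A minimal sketch, assuming each entry exposes time and coordinate attributes (illustrative names):

    def positions_at_nearest_time(tracking_list, pt):
        """Return the coordinates of every entry whose tracking time is
        closest to the playback start time PT (S14)."""
        best = min(abs(entry.time - pt) for entry in tracking_list)
        return [(e.x, e.y, e.z) for e in tracking_list
                if abs(e.time - pt) == best]

 The directivity of the collected sound is then formed toward each returned position before playback begins at PT.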
 After step S14, the output control unit 34b starts the playback of the collected sound data stored in the recorder device 4 or the memory 33 from the playback start time PT calculated in step S13 (S15).
 After step S15, when there is a next tracking time within a predetermined time from the playback start time PT (S16, YES), the output control unit 34b uses the coordinate data of all the tracking positions (even a single one) corresponding to the next tracking time to form the directivity of the collected sound in the directions from the omnidirectional microphone array device M1 toward all (even a single one) of those tracking positions (S17).
 After step S17, or when there is no next tracking time within the predetermined time from the playback start time PT (S16, NO), the sound output process is performed (S7). Details of the sound output process are described later with reference to FIG. 21(A). After step S7, when the sound output process for the tracking time corresponding to the end point of the tracking points has finished (S18, YES), the flow line display playback process shown in FIG. 19(A) ends. Thus, the output control unit 34b can clearly output the collected sound emitted by the monitoring target at the playback start time calculated in accordance with the user's arbitrarily designated position, and when there is a next designated position within the predetermined time from the playback start time, the directivity of the sound at the next designated position can be formed in advance.
 On the other hand, when the sound output process for the tracking time corresponding to the end point of the tracking points has not finished after step S7 (S18, NO), the operations from step S16 to step S18 are repeated until the sound output process for the tracking time corresponding to the end point of the tracking points finishes.
 Next, details of the playback start time calculation process in the directivity control devices 3 and 3A are described with reference to FIG. 19(B). FIG. 19(B) is a flowchart explaining an example of the playback start time calculation process shown in FIG. 19(A).
 In FIG. 19(B), the tracking processing unit 34c reads the tracking list LST (see, for example, FIG. 16(B)) stored in the memory 33 (S13-1). The tracking processing unit 34c extracts, from the data of the tracking list LST read in step S13-1, the two tracking points TP1 and TP2 before and after the designated position P0 designated in step S12 (S13-2). The tracking processing unit 34c then calculates the playback start time PT at the designated position P0 using the coordinates indicating the tracking positions of the tracking points TP1 and TP2 and the tracking time data (S13-3; see, for example, equation (2)). After step S13-3, the playback start time calculation process shown in FIG. 19(B) ends, and the process proceeds to step S14 shown in FIG. 19(A).
 Next, details of the flow line display process in the directivity control devices 3 and 3A are described with reference to FIG. 20. FIG. 20 is a flowchart explaining an example of the flow line display process shown in FIG. 19(A).
 In FIG. 20, the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16(B)) stored in the memory 33 (S11-1). When the connection between the point markers has been completed for all the tracking points read in step S11-1 (S11-2, YES), the flow line display process shown in FIG. 20 ends, and the process proceeds to step S12 shown in FIG. 19(A).
 On the other hand, when the connection between the point markers has not been completed for all the tracking points read in step S11-1 (S11-2, NO), the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16(B)). The output control unit 34b displays a point marker at each of the one or more tracking points read by the tracking processing unit 34c, distinguished per monitoring target (S11-3).
 In step S11-3, although not specifically illustrated, the output control unit 34b displays the point markers distinguished per monitoring target in a manner in which the same monitoring target can be identified (for example, by the same symbol, the same identification number, a combination of symbol and identification number, a frame of a predetermined shape, or the like), in accordance with, for example, an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG (for example, right-click and left-click mouse operations, simultaneous pressing of a plurality of keyboard keys, a mouse click operation combined with pressing a numeric keyboard key, simultaneous designation on a touch panel, or the like). The frame of a predetermined shape here is, for example, a rectangle, a circle, or a triangle. Besides identification by the shape of the frame, the markers may be displayed so as to be identifiable by the line type of the frame (for example, solid line or dotted line), the color of the frame, a number appended above the frame, or the like.
 After step S11-3, when the tracking point data read in step S11-3 is determined to be the start point of the tracking points (S11-4, YES), the tracking processing unit 34c reads the data of the tracking list LST (see, for example, FIG. 16(B)) again (S11-3).
 On the other hand, when the data read in step S11-3 is determined not to be the start point of the tracking points (S11-4, NO), the output control unit 34b uses the read tracking list data to connect and display the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S11-5).
 After step S11-5, when the connection has been made up to the end point of the tracking points in the tracking list LST read in step S11-1 (S11-6, YES), the process proceeds to the operation of step S11-2.
 On the other hand, when the connection has not been made up to the end point of the tracking points in the tracking list LST read in step S11-1 after step S11-5 (S11-6, NO), the operations from step S11-3 to step S11-6 are repeated until the connection is made up to the end point of the tracking points in the tracking list LST read in step S11-1.
 Next, the sound output process and the image privacy protection process in the directivity control devices 3 and 3A are described with reference to FIGS. 21(A) and (B) and FIGS. 22(A) to (C), respectively. FIG. 21(A) is a flowchart explaining an example of the sound output process shown in FIG. 9(A). FIG. 21(B) is a flowchart explaining an example of the image privacy protection process shown in FIG. 13(A). FIG. 22(A) is a diagram showing an example of the waveform of an audio signal corresponding to the pitch before the voice change process. FIG. 22(B) is a diagram showing an example of the waveform of an audio signal corresponding to the pitch after the voice change process. FIG. 22(C) is an explanatory diagram of the process of blurring the inside of the contour of a detected person's face.
 In FIG. 21(A), the output control unit 34b determines whether the voice privacy protection mode is on (S7-1). When the output control unit 34b determines that the voice privacy protection mode is on (S7-1, YES), it applies the voice change process to the collected sound data to be output from the speaker device 36 (S7-2).
 After step S7-2, or when it is determined that the voice privacy protection mode is off (S7-1, NO), the output control unit 34b causes the speaker device 36 to output the collected sound as it stands (S7-3). After step S7-3, the sound output process shown in FIG. 21(A) ends, and the process returns to step S1 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
 As an example of the voice change process, the output control unit 34b increases or decreases the pitch of the waveform of, for example, the sound data collected by the omnidirectional microphone array device M1 or the sound data for which the output control unit 34b itself has formed directivity (see, for example, FIGS. 22(A) and (B)). Thus, in response to, for example, a simple input operation by the user, the output control unit 34b applies the voice change process to the sound collected in real time by the omnidirectional microphone array device M1 before outputting it, so that by making it hard to tell whose voice the sound emitted by the person HM1 is, the acoustic privacy of the currently imaged person HM1 can be effectively protected. Likewise, when outputting sound collected by the omnidirectional microphone array device M1 over a certain period, the output control unit 34b applies the voice change process to the sound before outputting it, in response to, for example, a simple input operation by the user, so that the acoustic privacy of the person HM1 can be effectively protected in the same way.
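 One naive way to realize such a pitch change is to resample the waveform: reading the samples at a different rate raises or lowers the pitch (and, in this crude form, also shortens or stretches the duration). A minimal sketch with NumPy; the factor value is an assumption, and a production implementation would more likely use a duration-preserving pitch shifter:

    import numpy as np

    def change_pitch(samples: np.ndarray, factor: float = 1.3) -> np.ndarray:
        """Naive voice change by resampling: factor > 1 raises the pitch,
        factor < 1 lowers it (signal duration changes as a side effect)."""
        src_positions = np.arange(0, len(samples), factor)
        return np.interp(src_positions, np.arange(len(samples)), samples)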
 In FIG. 21(B), the tracking processing unit 34c determines whether the image privacy protection mode is on (S2-2-1). When it is determined that the image privacy protection mode is on (S2-2-1, YES), the image processing unit 37 detects (extracts) the contour DTL of the face of the person HM1 displayed on the tracking screen TRW of the display device 35 (S2-2-2) and applies a masking process to the face contour DTL (S2-2-3). Specifically, the image processing unit 37 calculates a rectangular region containing the detected face contour DTL and performs a process of applying a predetermined blur within the rectangular region (see FIG. 22(C)). The image processing unit 37 outputs the image data generated by the blurring process to the output control unit 34b.
 After step S2-2-3, or when it is determined that the image privacy protection mode is off (S2-2-1, NO), the output control unit 34b displays the image data obtained from the image processing unit 37 on the display device 35 (S2-2-4).
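 Such a masking process can be sketched with OpenCV: detect a face region, then blur its bounding rectangle in place. The Haar-cascade detector below stands in for the contour extraction DTL described above, and all parameter values are assumptions:

    import cv2

    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def mask_faces(frame):
        """Blur the bounding rectangle of each detected face (S2-2-2, S2-2-3)."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.1, 5):
            roi = frame[y:y + h, x:x + w]
            frame[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (51, 51), 0)
        return frame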
 Thus, in response to, for example, a simple input operation by the user, the image processing unit 37 applies the masking process to a part (for example, the face) of the person HM1 as the monitoring target displayed on the tracking screen TRW of the display device 35, so that privacy can be effectively protected by making it hard to tell who the monitored person HM1 is.
 The image privacy protection process shown in FIG. 21(B) may be performed even when the enlarged display mode is not on, as long as the image privacy protection mode of the directivity control devices 3 and 3A is on at the time the monitoring target (for example, the person HM1) appears on the camera screen.
 As described above, in the directivity control systems 100 and 100A of the present embodiment, the directivity control devices 3 and 3A form the directivity of the sound from the omnidirectional microphone array device M1 including a plurality of microphones in the direction toward the monitoring target (for example, the person HM1) corresponding to the designated position with respect to the image data on the tracking screen TRW of the display device 35, and further acquire the information on the designated position designating the moving monitoring target (for example, the person HM1) (for example, the tracking position and tracking time corresponding to the tracking point). The directivity control devices 3 and 3A then use the information on the designated position with respect to the image data on the tracking screen TRW of the display device 35 to switch the directivity of the sound so that it follows the direction toward the monitoring target (for example, the person HM1) corresponding to the designated position.
As a result, even when the monitoring target (for example, the person HM1) shown in the image data on the tracking screen TRW of the display device 35 moves, the directivity control device 3, 3A re-forms the sound directivity that was formed toward the position before the movement so that it points toward the position after the movement. The sound directivity is thus properly formed so as to follow the movement of the monitoring target, and degradation of the efficiency of the observer's monitoring work can be suppressed.
Moreover, through a simple manual operation of designating the moving monitoring target (for example, the person HM1) in the image data shown on the tracking screen TRW of the display device 35, the directivity control device 3, 3A can easily acquire accurate information on the position of the monitoring target after its movement.
Furthermore, since the directivity control device 3A can easily detect, from the image data shown on the tracking screen TRW of the display device 35, the source of the sound emitted by the monitoring target (for example, the person HM1) as well as the monitoring target itself, it can easily acquire information on the position of the sound source or of the monitoring target as information on the position of the monitoring target after its movement.
(Second Embodiment)
In the second embodiment, when the monitoring target (for example, a person) is about to move beyond the imaging area of the camera device or the sound collection area of the omnidirectional microphone array device, the directivity control device 3B switches the camera device used for capturing images of the monitoring target to another camera device, or switches the omnidirectional microphone array device used for picking up the sound emitted by the monitoring target to another omnidirectional microphone array device.
In this embodiment, it is assumed that the camera device used for capturing images of the monitoring target of the sound tracking process (for example, the person HM1) and the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 are associated with each other in advance, and that information on this association is stored in advance in the memory 33 of the directivity control device 3B.
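Such an association could be held as a simple lookup structure; the sketch below is a hypothetical illustration of what the memory 33 might store (the identifiers and the helper function are assumptions, not defined by the patent).

```python
# Hypothetical camera-to-microphone-array association, as might be kept in
# the memory 33 of the directivity control device 3B.
CAMERA_MIC_MAP = {
    "C1": ["M1", "M2", "M3", "M4"],  # arrays usable while camera C1 is in use
    "C2": ["M2"],
}

def mic_arrays_for_camera(camera_id):
    """Return the omnidirectional mic arrays associated with a camera."""
    return CAMERA_MIC_MAP[camera_id]
```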
FIG. 23 is a block diagram showing an example of the system configuration of the directivity control system 100B of the second embodiment. The directivity control system 100B shown in FIG. 23 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3B, and the recorder device 4. In the description of the units shown in FIG. 23, units having the same configuration and operation as those of the directivity control systems 100, 100A shown in FIGS. 2 and 3 are given the same reference numerals and their description is simplified or omitted; only the differences are described.
The directivity control device 3B may be, for example, a stationary PC installed in a monitoring control room (not shown), or a data communication terminal that the user can carry, such as a mobile phone, PDA, tablet terminal, or smartphone.
The directivity control device 3B includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34A, a display device 35, a speaker device 36, an image processing unit 37, and an operation switching control unit 38. The signal processing unit 34A includes at least a directivity direction calculation unit 34a, an output control unit 34b, a tracking processing unit 34c, and a sound source detection unit 34d.
Based on various information or data on the movement status of the monitoring target (for example, a person) acquired by the tracking processing unit 34c, the operation switching control unit 38 performs various operations for switching, among the plurality of camera devices C1 to Cn or the plurality of omnidirectional microphone array devices M1 to Mm, the camera device used for capturing images of the monitoring target of the directivity control system 100B or the omnidirectional microphone array device used for picking up the sound emitted by the monitoring target.
Next, the automatic camera switching process in the directivity control device 3B is described with reference to FIG. 24. FIG. 24 is an explanatory diagram showing the automatic switching process of the camera device used for capturing the images displayed on the display device 35. For simplicity, FIG. 24 illustrates an example in which the camera device used for capturing images of the person HM1 as the monitoring target is switched from the camera device C1 to the camera device C2 as the person HM1 moves from the tracking position A1 to the tracking position A2.
The tracking position A1 is within the imaging area C1RN of the camera device C1 and within the predetermined switching determination line JC1 of the camera device C1. The tracking position A2 is within the imaging area C2RN of the camera device C2 and outside the switching determination line JC1 of the camera device C1. Although not shown, the tracking positions A1 and A2 are both within the sound collection area of the omnidirectional microphone array device M1.
When the person HM1 is about to move beyond the imaging area C1RN of the camera device C1, the operation switching control unit 38 notifies the camera device C2, via the communication unit 31 and the network NW, of information indicating that the camera device used for capturing images of the person HM1 is to be switched from the camera device C1 to the camera device C2. In other words, the operation switching control unit 38 instructs the camera device C2 to prepare to capture images of the range within the angle of view of the camera device C2. At this point, however, the image data of the video captured by the camera device C1 is still displayed on the tracking screen TRW of the display device 35.
For example, when the person HM1 crosses the switching determination line JC1 of the camera device C1, the operation switching control unit 38 notifies the camera device C2, via the communication unit 31 and the network NW, of the information indicating that the camera device used for capturing images of the person HM1 is to be switched from the camera device C1 to the camera device C2.
The operation switching control unit 38 uses the distance information between the camera device C1 and the person HM1 measured by the camera device C1 to determine whether or not the person HM1 has crossed the switching determination line JC1. More specifically, the operation switching control unit 38 determines that the person HM1 has crossed the switching determination line JC1 when the person HM1 is within the angle of view of the camera device C1 and the distance from the camera device C1 to the person HM1 has become larger than the (known) distance from the camera device C1 to the switching determination line JC1. It is assumed that the operation switching control unit 38 knows in advance which camera devices can be switched to from the camera device C1 (for example, the camera device C2), and likewise which camera devices can be switched to from each of the other camera devices.
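The crossing test just described reduces to a single distance comparison. The sketch below is a minimal illustration (the type and field names are assumptions); the same pattern applies to the microphone-array switching determination line JM1 described later, with the array-to-person distance in place of the camera-to-person distance.

```python
from dataclasses import dataclass

@dataclass
class SwitchLine:
    dist_from_device: float  # known distance from device to its line (e.g. JC1)

def crossed_switch_line(line: SwitchLine, dist_to_person: float,
                        person_in_view: bool = True) -> bool:
    """True when the person is in view but farther away than the line."""
    return person_in_view and dist_to_person > line.dist_from_device
```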
When the operation switching control unit 38 determines that the person HM1, having crossed the switching determination line JC1, has moved beyond the imaging area C1RN of the camera device C1, it switches the camera device used for capturing images of the person HM1 from the camera device C1 to the camera device C2. Thereafter, the image data of the video captured by the camera device C2 (for example, image data of the moving person HM1) is displayed on the tracking screen TRW of the display device 35.
In this way, the operation switching control unit 38 can adaptively switch to a camera device capable of properly showing images of the moving monitoring target (for example, the person HM1), allowing the user to designate the image of the monitoring target easily.
Next, the automatic switching process of the omnidirectional microphone array device in the directivity control device 3B is described with reference to FIG. 25. FIG. 25 is an explanatory diagram showing the automatic switching process of the omnidirectional microphone array device used for picking up the sound of the monitoring target (for example, the person HM1). For simplicity, FIG. 25 illustrates an example in which the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 as the monitoring target is switched from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2 as the person HM1 moves from the tracking position A1 to the tracking position A2.
The tracking position A1 is within the sound collection area M1RN of the omnidirectional microphone array device M1 and within the predetermined switching determination line JM1 of the omnidirectional microphone array device M1. The tracking position A2 is within the sound collection area M2RN of the omnidirectional microphone array device M2 and outside the switching determination line JM1 of the omnidirectional microphone array device M1. Although not shown, the tracking positions A1 and A2 are both within the imaging area of the camera device C1.
When the person HM1 is about to move beyond the sound collection area M1RN of the omnidirectional microphone array device M1, the operation switching control unit 38 notifies the omnidirectional microphone array device M2, via the communication unit 31 and the network NW, of information indicating that the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 is to be switched from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2. In other words, the operation switching control unit 38 instructs the omnidirectional microphone array device M2 to prepare to pick up sound within the sound collection area of the omnidirectional microphone array device M2.
For example, when the person HM1 crosses the switching determination line JM1 of the omnidirectional microphone array device M1, the operation switching control unit 38 notifies the omnidirectional microphone array device M2, via the communication unit 31 and the network NW, of the information indicating that the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 is to be switched from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2.
The operation switching control unit 38 uses the distance information between the omnidirectional microphone array device M1 and the person HM1 to determine whether or not the person HM1 has crossed the switching determination line JM1. More specifically, the operation switching control unit 38 determines that the person HM1 has crossed the switching determination line JM1 when the distance from the omnidirectional microphone array device M1 to the person HM1 has become larger than the (known) distance from the omnidirectional microphone array device M1 to the switching determination line JM1. It is assumed that the operation switching control unit 38 knows in advance which omnidirectional microphone array devices can be switched to from the omnidirectional microphone array device M1 (for example, the omnidirectional microphone array device M2), and likewise which omnidirectional microphone array devices can be switched to from each of the other omnidirectional microphone array devices.
When the operation switching control unit 38 determines that the person HM1, having crossed the switching determination line JM1, has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1, it switches the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2.
In this way, the operation switching control unit 38 can adaptively switch to an omnidirectional microphone array device capable of properly picking up the sound emitted by the moving monitoring target (for example, the person HM1), so that the sound emitted by the monitoring target can be picked up with high accuracy.
Next, the manual camera switching process in the directivity control device 3B is described with reference to FIG. 26. FIG. 26 is an explanatory diagram showing the manual switching process of the camera device used for capturing the images displayed on the display device 35. In FIG. 26, in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, the display device 35 switches from the tracking screen TRW of the image captured by the camera device C1 currently used for capturing images of the person HM1 to a multi-camera screen including the camera screen C1W of the camera device C1 and the camera screens of the camera devices around the camera device C1 (for example, eight camera devices).
As in FIG. 24, the camera devices that can be switched to from the camera device C1 currently in use are determined in advance; here they are, for example, the camera devices C2, C3, and C4. On the multi-camera screen shown in FIG. 26, the camera screens C2W, C3W, and C4W captured by the camera devices C2, C3, and C4 are displayed (see the hatching in FIG. 26). The person HM1 is assumed to be moving in the movement direction MV1.
Suppose that the user, taking into account the movement direction MV1 of the person HM1 as the monitoring target, touches one of the three camera screens C2W, C3W, and C4W (for example, the camera screen C3W) with the finger FG on the multi-camera screen shown in FIG. 26.
In response to the touch operation of the user's finger FG, the operation switching control unit 38 switches the camera device used for capturing images of the person HM1 from the camera device C1 currently in use to the camera device C3 corresponding to the touched camera screen C3W.
In this way, with a simple user operation, the operation switching control unit 38 can adaptively switch to a camera device capable of properly showing images of the moving monitoring target (for example, the person HM1), allowing the user to designate the image of the monitoring target easily.
Next, the manual switching process of the omnidirectional microphone array device in the directivity control device 3B is described with reference to FIG. 27. FIG. 27 is an explanatory diagram showing the manual switching process of the omnidirectional microphone array device used for picking up the sound of the monitoring target (for example, the person HM1). In FIG. 27, the person HM1 as the monitoring target is displayed at the center of the tracking screen TRW. The omnidirectional microphone array devices that can be switched to from the omnidirectional microphone array device M1 currently in use are the three omnidirectional microphone array devices M2, M3, and M4 installed around the omnidirectional microphone array device M1.
In FIG. 27, in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 that can be switched to from the omnidirectional microphone array device M1 currently in use are displayed on the tracking screen TRW (see (1) in FIG. 27).
The user, taking into account the movement direction MV1 from the tracking position A1 corresponding to the tracking point of the person HM1 as the monitoring target, selects one of the three markers (for example, the marker M3R) by a touch operation with the finger FG (see (2) in FIG. 27). The operation switching control unit 38 instructs the omnidirectional microphone array device M3 corresponding to the selected marker M3R, via the communication unit 31 and the network NW, to start sound collection in place of the omnidirectional microphone array device M1 currently in use (see (3) in FIG. 27).
The output control unit 34b then switches the directivity so that it is formed from the omnidirectional microphone array device M3 corresponding to the selected marker M3R toward the current tracking position of the person HM1 (see (4) in FIG. 27). Thereafter, the output control unit 34b erases the markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 from the tracking screen TRW.
In this way, with a simple user operation on the markers M2R, M3R, and M4R displayed on the tracking screen TRW, the operation switching control unit 38 can adaptively switch to the omnidirectional microphone array device M3 capable of properly picking up the sound emitted by the moving monitoring target (for example, the person HM1), so that the sound emitted by the person HM1 can be picked up with high accuracy in accordance with the movement direction MV1 of the person HM1.
Next, the process of selecting the optimum omnidirectional microphone array device in the directivity control device 3B is described with reference to FIG. 28. FIG. 28 is an explanatory diagram showing the process of selecting the optimum omnidirectional microphone array device used for picking up the sound of the monitoring target. On the display device 35 shown at the upper left of FIG. 28, in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, the camera screens of all the camera devices under the jurisdiction of the directivity control system 100B (for example, nine camera devices) are displayed as a list.
Among the camera screens listed on the display device 35 at the upper left of FIG. 28, the camera screens showing the monitoring target of the sound tracking process (for example, the person HM1) are the camera screens C1W, C2W, and C3W. Suppose that, among these camera screens C1W, C2W, and C3W, the camera screen C1W showing the person HM1 best is selected in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG.
In accordance with the user's selection of the camera screen C1W, the operation switching control unit 38 selects the camera device C1 corresponding to the camera screen C1W as the camera device used for capturing images of the person HM1 and switches to it. The output control unit 34b then enlarges the image data captured by the camera device corresponding to the camera screen C1W and displays it on the tracking screen TRW1 of the display device 35 (see the lower left of FIG. 28).
The output control unit 34b also displays markers M1R, M2R, M3R, and M4R indicating the approximate positions of all the omnidirectional microphone array devices associated with the camera device C1 selected by the operation switching control unit 38 at the four corners of the tracking screen TRW1. The display positions of the markers M1R, M2R, M3R, and M4R are not limited to the four corners of the tracking screen TRW1.
Further, when the markers M1R, M2R, M3R, and M4R are designated one after another by an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, the output control unit 34b highlights the markers one at a time (for example, with a blink Br) and, for each marker, forms directivity in the direction from the omnidirectional microphone array device corresponding to that marker toward the position of the person HM1 and outputs the sound picked up for a fixed period of time.
When the marker indicating the approximate position of the omnidirectional microphone array device that the user judges to be optimum from among the sounds output for the fixed period (for example, the marker M3R) is selected, the operation switching control unit 38 selects the omnidirectional microphone array device M3 corresponding to the selected marker M3R as the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 and switches to it.
In this way, the operation switching control unit 38 can output, for a fixed period each, the picked-up sounds for which different directivities are formed by the plurality of omnidirectional microphone array devices M1, M2, M3, and M4 associated with the selected camera device C1. By the simple operation of selecting the picked-up sound that the user judges to be optimum, the optimum omnidirectional microphone array device M3 capable of properly picking up the sound emitted by the moving monitoring target (for example, the person HM1) can thus be selected, and the sound emitted by the monitoring target can be picked up with high accuracy.
Next, the automatic camera switching process in the directivity control system 100B of this embodiment is described with reference to FIG. 29(A). FIG. 29(A) is a flowchart explaining an example of the automatic switching process of the camera device in the directivity control system 100B of the second embodiment. The automatic camera switching process shown in FIG. 29(A) describes in detail the automatic camera switching process shown in FIG. 24, and is performed, for example, following step S3B-1 shown in FIG. 10(B).
In FIG. 29(A), the image processing unit 37 detects the position (that is, the tracking point) of the monitoring target (for example, the person HM1) by performing predetermined image processing on the image data shown on the tracking screen TRW of the display device 35 (S21). After step S21, the camera switching determination process is performed (S22). The details of the camera switching determination process are described later with reference to FIG. 29(B).
After step S22, when the camera switching mode has been set to on by the operation switching control unit 38 (S23, YES), the operation switching control unit 38 instructs, via the communication unit 31 and the network NW, all the switchable camera devices associated with the camera device currently in use (for example, the camera device C1) to capture images (S24). All the camera devices that have received the image capture instruction start capturing images. The camera switching mode is a flag used to control whether or not to switch the camera device when the multi-camera switching method is automatic.
The operation switching control unit 38 uses the distance information between the camera device C1 and the person HM1 measured by the camera device C1 currently in use to determine whether or not the person HM1 at the real-space tracking position A1 detected in step S21 has moved beyond the imaging area C1RN of the camera device C1 (S25). When the operation switching control unit 38 determines that the person HM1 has moved beyond the imaging area C1RN of the camera device C1 (S25, YES), it outputs to the image processing unit 37 the image data captured, in response to the instruction of step S24, by all the switchable camera devices associated with the camera device C1 currently in use. The image processing unit 37 performs predetermined image processing on all the image data output from the operation switching control unit 38 to determine whether or not the person HM1 as the monitoring target is detected in each (S26). The image processing unit 37 outputs the image processing results to the operation switching control unit 38.
Using the image processing results of the image processing unit 37, the operation switching control unit 38 selects one camera device (for example, the camera device C2) that has detected the person HM1 as the monitoring target and is closest to the real-space tracking position A1 detected in step S21, and switches the camera device used for capturing images of the person HM1 from the camera device C1 to the camera device C2 (S27). The output control unit 34b accordingly switches the tracking screen TRW displayed on the display device 35 to the camera screen of the camera device C2 selected by the operation switching control unit 38 (S27).
On the other hand, when the camera switching mode has been set to off by the operation switching control unit 38 (S23, NO), or when it is determined that the person HM1 has not moved beyond the imaging area C1RN of the camera device C1 (S25, NO), the automatic camera switching process shown in FIG. 29(A) ends and the process proceeds to the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A).
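As a rough picture of steps S23 to S27, the following sketch strings the checks together. It is an illustration only: the Camera type, its circular imaging area, and the use of that area as a stand-in for the image-based detection of S26 are all assumptions, not an interface defined by the patent.

```python
from dataclasses import dataclass
import math

@dataclass
class Camera:
    cam_id: str
    pos: tuple          # (x, y) camera position in the room plane
    area_radius: float  # radius standing in for its imaging area (e.g. C1RN)

    def distance_to(self, p):
        return math.dist(self.pos, p)

    def sees(self, p):
        return self.distance_to(p) <= self.area_radius

def auto_switch_camera(current, candidates, tracking_pos, switch_mode_on):
    """Return the camera to use next (cf. S23-S27 of FIG. 29(A))."""
    if not switch_mode_on or current.sees(tracking_pos):      # S23 / S25
        return current
    seeing = [c for c in candidates if c.sees(tracking_pos)]  # stand-in for S26
    # S27: among cameras that still see the target, take the closest one.
    return min(seeing, key=lambda c: c.distance_to(tracking_pos),
               default=current)
```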
Next, the camera switching determination process in the directivity control device 3B is described with reference to FIG. 29(B). FIG. 29(B) is a flowchart showing an example of the camera switching determination process shown in FIG. 29(A).
In FIG. 29(B), the operation switching control unit 38 sets the camera switching mode of the directivity control device 3B to off (S22-1). The operation switching control unit 38 uses the distance information between the camera device C1 and the person HM1 measured by the camera device C1 currently in use to determine whether or not the real-space tracking position A1 corresponding to the tracking point detected in step S21 has crossed the predetermined switching determination line JC1 of the camera device C1 currently in use (S22-2).
When the operation switching control unit 38 determines that the real-space tracking position A1 corresponding to the tracking point detected in step S21 has crossed the predetermined switching determination line JC1 of the camera device C1 currently in use (S22-2, YES), it sets the camera switching mode to on (automatic) (S22-3).
After step S22-3, or when it is determined that the tracking position A1 has not crossed the predetermined switching determination line JC1 of the camera device C1 currently in use (S22-2, NO), the camera switching determination process shown in FIG. 29(B) ends and the process proceeds to step S23 shown in FIG. 29(A).
Next, the automatic switching process of the omnidirectional microphone array device in the directivity control system 100B of this embodiment is described with reference to FIG. 30(A). FIG. 30(A) is a flowchart explaining an example of the automatic switching process of the omnidirectional microphone array device in the directivity control system 100B of the second embodiment. The automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) describes in detail the automatic switching process of the omnidirectional microphone array device shown in FIG. 25; it may be performed following step S27 shown in FIG. 29(A), or the automatic camera switching process shown in FIG. 29(A) may be performed after the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A).
In FIG. 30(A), the sound source detection unit 34d performs a predetermined sound source detection process to calculate the position (the position of the sound source) of the monitoring target (for example, the person HM1) in real space, or to calculate the coordinates indicating the position on the image data corresponding to the calculated position of the sound source (that is, the coordinates of the tracking position A1 corresponding to the tracking point) (S31). After step S31, the microphone switching determination process is performed (S32). The details of the microphone switching determination process are described later with reference to FIG. 30(B).
After step S32, when the microphone switching mode has been set to on by the operation switching control unit 38 (S33, YES), the operation switching control unit 38 instructs, via the communication unit 31 and the network NW, all the switchable omnidirectional microphone array devices associated with the omnidirectional microphone array device currently in use (for example, the omnidirectional microphone array device M1) to pick up the sound emitted by the person HM1 (S34). All the omnidirectional microphone array devices that have received the sound pickup instruction start picking up sound. The microphone switching mode is a flag used to control whether or not to switch the omnidirectional microphone array device when the multi-microphone switching method is automatic.
The operation switching control unit 38 uses the distance information between the omnidirectional microphone array device M1 currently in use and the person HM1 calculated by the sound source detection unit 34d to determine whether or not the person HM1 has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35). When it is determined that the person HM1 has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35, YES), the sound source detection unit 34d calculates the position (the position of the sound source) of the person HM1 as the monitoring target based on the strength or volume level of the sound picked up, in response to the instruction of step S34, by all the switchable omnidirectional microphone array devices associated with the omnidirectional microphone array device M1 currently in use (S36).
Using the sound source detection result of the sound source detection unit 34d, the operation switching control unit 38 selects, from among all the switchable omnidirectional microphone array devices associated with the omnidirectional microphone array device M1 currently in use, the one omnidirectional microphone array device (for example, the omnidirectional microphone array device M2) whose distance to the position of the person HM1 as the monitoring target (the position of the sound source) is smallest, and switches the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2 (S37). The output control unit 34b accordingly switches the sound directivity so that it is formed from the omnidirectional microphone array device M2 after the switching toward the position of the sound source calculated in step S36 (S37).
On the other hand, when the microphone switching mode has been set to off by the operation switching control unit 38 (S33, NO), or when it is determined that the person HM1 has not moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35, NO), the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) ends and the process proceeds, for example, to step S3B-2 shown in FIG. 10(B). The automatic camera switching process shown in FIG. 29(A) may also be started after the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) ends.
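The patent does not specify how S36 derives a position from the per-array sound levels; one crude stand-in, shown below purely for illustration, is an energy-weighted centroid of the array positions (the coordinates and levels in the usage example are assumptions).

```python
import numpy as np

def estimate_source_position(array_positions, rms_levels):
    """Crude source-position estimate from per-array sound levels:
    an energy-weighted centroid of the array positions (stand-in for S36)."""
    pos = np.asarray(array_positions, dtype=float)   # shape (num_arrays, 2)
    weights = np.asarray(rms_levels, dtype=float) ** 2
    return (weights[:, None] * pos).sum(axis=0) / weights.sum()

# Illustrative usage with assumed positions and measured levels:
positions = [(0.0, 0.0), (8.0, 0.0), (0.0, 8.0)]
levels = [0.2, 0.9, 0.3]
print(estimate_source_position(positions, levels))
```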
Next, the microphone switching determination process in the directivity control device 3B is described with reference to FIG. 30(B). FIG. 30(B) is a flowchart showing an example of the microphone switching determination process shown in FIG. 30(A).
In FIG. 30(B), the operation switching control unit 38 sets the microphone switching mode to off (S32-1). The operation switching control unit 38 uses the distance information between the omnidirectional microphone array device M1 currently in use and the person HM1 to determine whether or not the tracking position A1 calculated in step S31 has crossed the predetermined switching determination line JM1 of the omnidirectional microphone array device M1 currently in use (S32-2).
When the operation switching control unit 38 determines that the tracking position A1 has crossed the predetermined switching determination line JM1 of the omnidirectional microphone array device M1 currently in use (S32-2, YES), it sets the microphone switching mode to on (S32-3).
After step S32-3, or when it is determined that the tracking position A1 has not crossed the predetermined switching determination line JM1 of the omnidirectional microphone array device M1 currently in use (S32-2, NO), the microphone switching determination process shown in FIG. 30(B) ends and the process proceeds to step S33 shown in FIG. 30(A).
Next, the manual camera switching process in the directivity control system 100B of this embodiment is described with reference to FIG. 31(A). FIG. 31(A) is a flowchart explaining an example of the manual switching process of the camera device in the directivity control system 100B of the second embodiment. The manual camera switching process in the directivity control system 100B shown in FIG. 31(A) is performed following step S1 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
In FIG. 31(A), when an instruction for switching the camera device is input to the display device 35 in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S41), the output control unit 34b switches from the tracking screen TRW of the image captured by the camera device C1 currently used for capturing images of the person HM1 to a multi-camera screen including the camera screen C1W of the camera device C1 and the camera screens of the camera devices around the camera device C1 (for example, eight camera devices) (S42).
Suppose that, on the multi-camera screen displayed on the display device 35 in step S42, the user selects one of the camera screens by a touch operation, for example with the finger FG, taking into account the movement direction MV1 of the person HM1 as the monitoring target (see FIG. 26) (S43).
In response to the touch operation of the user's finger FG, the operation switching control unit 38 switches the camera device used for capturing images of the person HM1 from the camera device C1 currently in use to the camera device C3 corresponding to the camera screen C3W touched in step S43 (S44). The manual camera switching process shown in FIG. 31(A) then ends, and the process proceeds to one of steps S45, S51, S61, and S71 shown in FIG. 31(B), FIG. 32(A), FIG. 32(B), or FIG. 33.
Next, the manual switching process of the omnidirectional microphone array device in the directivity control system 100B of this embodiment is described with reference to FIG. 31(B). FIG. 31(B) is a flowchart explaining an example of the manual switching process of the omnidirectional microphone array device in the directivity control system 100B of the second embodiment.
In FIG. 31(B), when an instruction for switching the omnidirectional microphone array device is input in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S45), the output control unit 34b displays, on the tracking screen TRW, markers (for example, the markers M2R, M3R, and M4R) indicating the approximate positions of the omnidirectional microphone array devices that can be switched to from the omnidirectional microphone array device M1 currently in use (for example, the omnidirectional microphone array devices M2, M3, and M4) (S46).
The user, taking into account the movement direction MV1 of the person HM1 as the monitoring target from the tracking position A1, selects one of the three markers (for example, the marker M3R) by a touch operation with the finger FG (S47; see FIG. 27). The operation switching control unit 38 instructs the omnidirectional microphone array device M3 corresponding to the selected marker M3R, via the communication unit 31 and the network NW, to start sound collection in place of the omnidirectional microphone array device M1 currently in use (S47).
The output control unit 34b switches the directivity so that it is formed from the omnidirectional microphone array device M3 corresponding to the marker M3R selected in step S47 toward the current tracking position of the person HM1 (S48). The output control unit 34b also erases the markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 from the tracking screen TRW (S48).
After step S48, the manual switching process of the omnidirectional microphone array device shown in FIG. 31(B) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). The manual camera switching process shown in FIG. 31(A) may also be performed after the manual switching process of the omnidirectional microphone array device shown in FIG. 31(B).
Next, the process of selecting the optimum omnidirectional microphone array device in the directivity control system 100B of this embodiment is described with reference to FIGS. 32(A), 32(B), and 33. FIG. 32(A) is a flowchart explaining a first example of the process of selecting the optimum omnidirectional microphone array device in the directivity control system 100B of the second embodiment. FIG. 32(B) is a flowchart explaining a second example of the process. FIG. 33 is a flowchart explaining a third example of the process.
In FIG. 32(A), when a position in the movement direction of the person HM1 as the monitoring target (a tracking position corresponding to a tracking point) is designated on the tracking screen TRW displayed on the display device 35 in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S51), information on this designated position (for example, its coordinates) is input to the operation switching control unit 38 (S52).
The operation switching control unit 38 calculates the distance from each omnidirectional microphone array device to the position in real space corresponding to the position designated in step S51, that is, the distance from each omnidirectional microphone array device to the person HM1 as the monitoring target (S53).
The operation switching control unit 38 selects the omnidirectional microphone array device for which the smallest of the distances calculated in step S53 is obtained, and instructs the signal processing unit 34A to form directivity for the sound data of the sound picked up by the selected omnidirectional microphone array device (S54).
In response to the instruction in step S54, the output control unit 34b of the signal processing unit 34A forms sound directivity from the omnidirectional microphone array device selected by the operation switching control unit 38 in step S54 toward the position of the person HM1 as the monitoring target, and outputs the sound with the formed directivity from the speaker device 36 (S55).
In this way, when the user simply designates a position indicating the movement direction of the monitoring target (for example, the person HM1), the operation switching control unit 38 can select the optimum omnidirectional microphone array device capable of properly picking up the sound emitted by the moving monitoring target, so that the sound emitted by the monitoring target can be picked up with high accuracy.
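Steps S53 and S54 amount to a nearest-array search over the candidates; a minimal sketch under assumed inputs (array identifiers mapped to coordinates, plus the real-space position corresponding to the designated position) follows.

```python
import math

def select_nearest_array(array_positions, target_pos):
    """Return the id of the array closest to the designated position
    (cf. S53-S54 of FIG. 32(A)).

    array_positions: dict mapping array id to (x, y) coordinates
    target_pos: (x, y) position designated on the tracking screen
    """
    return min(array_positions,
               key=lambda mid: math.dist(array_positions[mid], target_pos))

# Illustrative usage with assumed coordinates:
arrays = {"M1": (0.0, 0.0), "M2": (8.0, 0.0), "M3": (0.0, 8.0)}
print(select_nearest_array(arrays, (6.5, 1.0)))  # -> M2
```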
 なお、ステップS55の後、図32(A)に示す最適な全方位マイクアレイ装置の選択処理が終了し、図9(A)、図9(B)又は図10(A)に示すステップS2に進む。なお、図32(A)に示す最適な全方位マイクアレイ装置の選択処理の後に、図31(A)に示すカメラ装置の手動切替処理が行われても良い。 After step S55, the optimum omnidirectional microphone array apparatus selection process shown in FIG. 32A is completed, and the process proceeds to step S2 shown in FIG. 9A, FIG. 9B, or FIG. move on. Note that the manual switching process of the camera apparatus shown in FIG. 31A may be performed after the selection process of the optimum omnidirectional microphone array apparatus shown in FIG.
In FIG. 32(B), when a position in the moving direction of the person HM1 as the monitoring target (a tracking position corresponding to a tracking point) is designated on the tracking screen TRW displayed on the display device 35 by an input operation with the cursor CSR via the user's mouse or with the user's finger FG (S61), information about the designated position (for example, its coordinates) is input to the operation switching control unit 38.
The image processing unit 37 detects the orientation of the face of the person HM1 as the monitoring target by performing predetermined image processing on the image data captured by the camera device currently in use (for example, the camera device C1) (S62). The image processing unit 37 outputs the detection result of the face orientation of the person HM1 to the operation switching control unit 38.
Using the information about the designated position specified in step S61 (for example, coordinates indicating a position on the image data) and the detection result of the face orientation of the person HM1 obtained from the image processing unit 37 in step S62, the operation switching control unit 38 calculates the relationship between the face orientation of the person HM1, the designated position, and each omnidirectional microphone array device (S63). For example, the operation switching control unit 38 calculates the distance between each omnidirectional microphone array device and the position of the monitoring target (for example, the person HM1) corresponding to the designated position on the image data designated in step S61.
The operation switching control unit 38 selects the omnidirectional microphone array device that lies in the direction along the face orientation of the monitoring target (for example, within 45 degrees horizontally) and has the minimum distance to the position of the monitoring target corresponding to the designated position specified in step S61 (S64). Furthermore, the operation switching control unit 38 instructs the signal processing unit 34 to form directivity for the audio data of the sound picked up by the omnidirectional microphone array device selected in step S64 (S64).
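Steps S63 and S64 amount to a constrained nearest-neighbor search: among the microphone array devices lying within the face-direction cone (for example, 45 degrees horizontally), pick the closest one. A hedged sketch of that rule, assuming a flat floor-coordinate layout and illustrative names:

```python
# Sketch of the S63/S64 selection rule; names and data layout are assumptions.
import math

def select_array(target_xy, face_angle_deg, arrays, cone_deg=45.0):
    """arrays: list of (array_id, (x, y)) positions in floor coordinates."""
    best_id, best_dist = None, float("inf")
    for array_id, (ax, ay) in arrays:
        # Bearing from the monitored person toward this array.
        bearing = math.degrees(math.atan2(ay - target_xy[1], ax - target_xy[0]))
        diff = abs((bearing - face_angle_deg + 180.0) % 360.0 - 180.0)
        if diff > cone_deg:          # not along the face direction
            continue
        dist = math.hypot(ax - target_xy[0], ay - target_xy[1])
        if dist < best_dist:         # keep the nearest qualifying array
            best_id, best_dist = array_id, dist
    return best_id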
In response to the instruction in step S64, the output control unit 34b of the signal processing unit 34 forms sound directivity from the omnidirectional microphone array device selected in step S64 toward the position of the person HM1 as the monitoring target, and outputs the directivity-formed sound from the speaker device 36 (S65).
In this way, based on the face orientation of the monitoring target (for example, the person HM1) on the image data and the distance between the monitoring target and each omnidirectional microphone array device, the operation switching control unit 38 can select the optimum omnidirectional microphone array device capable of accurately picking up the sound emitted by the moving monitoring target, so the sound emitted by the monitoring target can be picked up with high accuracy.
After step S65, the selection process for the optimum omnidirectional microphone array device shown in FIG. 32(B) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). The manual camera switching process shown in FIG. 31(A) may be performed after the selection process shown in FIG. 32(B).
In FIG. 33, in response to an input operation with the cursor CSR via the user's mouse or with the user's finger FG, the output control unit 34b displays a list of the camera screens of all the camera devices managed by the directivity control system 100B on the display device 35 (S71). Among the camera screens listed on the display device 35 that show the monitoring target (for example, the person HM1) subject to the voice tracking process, suppose that the camera screen C1W showing the person HM1 most clearly is selected by an input operation with the cursor CSR or the finger FG (S72).
In accordance with the user's selection of a camera screen in step S72, the operation switching control unit 38 selects and switches to the camera device corresponding to that camera screen as the camera device used for capturing the image of the person HM1. The output control unit 34b then enlarges the image data captured by the corresponding camera device and displays it on the tracking screen TRW1 of the display device 35 (S73; see the lower left of FIG. 28).
The output control unit 34b displays markers indicating the approximate positions of all the omnidirectional microphone array devices associated with the camera device selected by the operation switching control unit 38 (for example, the markers M1R, M2R, M3R, and M4R shown in FIG. 28) at the four corners of the tracking screen TRW1 (S74).
When the markers M1R, M2R, M3R, and M4R are designated in sequence by an input operation with the cursor CSR via the user's mouse or with the user's finger FG (S75), the output control unit 34b highlights the markers one at a time (for example, with a blink Br) and, for each marker, forms directivity in the direction from the corresponding omnidirectional microphone array device to the position of the person HM1 and outputs the picked-up sound for a fixed time (S76).
When the marker indicating the approximate position of the omnidirectional microphone array device that the user judges to be optimal among the sounds output for the fixed time (for example, the marker M3R) is selected, the operation switching control unit 38 selects and switches to the omnidirectional microphone array device M3 corresponding to the selected marker M3R as the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 (S77).
After step S77, the selection process for the optimum omnidirectional microphone array device shown in FIG. 33 ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). The manual camera switching process shown in FIG. 31(A) may be performed after the selection process shown in FIG. 33.
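The audition procedure of steps S75 to S77 can be summarized as the loop below. The ui and audio objects and every method on them are hypothetical stand-ins for the display and signal processing units; this is a sketch of the control flow only, not the patent's API.

```python
# Illustrative audition loop for S75-S77 with hypothetical ui/audio objects.
def audition_and_select(markers, target_pos, ui, audio, listen_sec=3.0):
    for marker in markers:
        ui.blink(marker)                       # emphasize the marker (Br)
        # Form directivity from this marker's array toward the person.
        beam = audio.form_directivity(marker.array_id, target_pos)
        audio.play(beam, duration=listen_sec)  # output for a fixed time
        ui.unblink(marker)
    chosen = ui.wait_for_marker_selection()    # e.g. the user picks M3R
    return chosen.array_id                     # array used from now on
```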
(Modification of the first embodiment)
In each of the embodiments described above, the voice tracking process that follows the movement of a single monitoring target (for example, the person HM1) was explained mainly for the case where that single target appears in the image data.
In this modification of the first embodiment (hereinafter, "this modification"), an operation example of the directivity control system 100 is described for the case where, in the first or second embodiment, a plurality of monitoring targets (for example, a plurality of persons) appear on the tracking screen TRW and are designated at the same timing or at different timings. Since the system configuration of the directivity control system of this modification is the same as that of the directivity control systems 100, 100A, and 100B of the first and second embodiments, the description of the system configuration is simplified or omitted, and only the differences are described. For simplicity, the following description refers to the system configuration of the directivity control system 100.
An operation example of the directivity control system 100 of this modification is described with reference to FIGS. 34 and 35. FIG. 34 is a flowchart explaining an example of the overall flow of manual tracking processing based on multiple simultaneous designations in the directivity control system 100 of the modification of the first embodiment. FIG. 35 is a flowchart explaining an example of automatic tracking processing for a plurality of monitoring targets in the same system. In FIG. 35, the directivity control devices 3A and 3B are used.
In FIG. 34, the tracking mode determination process in step S1, the tracking assist process in step S2, the tracking connection process in step S6, and the audio output process in step S7 are, for example, the same as the corresponding processes shown in FIG. 9(A), so their description is omitted.
In FIG. 34, if the tracking mode is off (S1, NO), the manual tracking process based on multiple simultaneous designations shown in FIG. 34 ends. If the tracking mode is on (S1, YES), the sound currently being output (reproduced) from the speaker device 36 is paused on the tracking screen TRW of the display device 35 by a click operation of the cursor CSR via the user's mouse or a touch operation of the user's finger FG (S81). After step S81, the tracking assist process is performed (S2).
After step S2, suppose that a plurality of tracking points corresponding to the tracking positions along the movement paths of a plurality of persons as monitoring targets are designated simultaneously by an input operation with the cursor CSR via the user's mouse or with the user's finger FG (S82).
For each person designated as a monitoring target in step S82, the tracking processing unit 34c distinguishes the positions in real space corresponding to the plurality of designated positions on the tracking screen TRW and the designated times, and stores them in the memory 33 in association with each other as the tracking position and tracking time of each tracking point (S83). Furthermore, via the output control unit 34b, the tracking processing unit 34c displays a point marker, distinguished per person, at each tracking point on the tracking screen TRW (S83).
The output control unit 34b forms directivity of the picked-up sound from the omnidirectional microphone array device currently in use (for example, the omnidirectional microphone array device M1) toward each person's position in real space (sound position, sound source position) corresponding to the tracking position of each of the plurality of monitoring targets designated simultaneously in step S82 (S84). After step S84, the tracking connection process is performed (S6).
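One way to picture the bookkeeping of steps S83 and S84: each designated person keeps an independent list of (tracking position, tracking time) entries, and one beam is formed per person. A minimal sketch, with assumed data structures and a hypothetical beamformer object:

```python
# Sketch of the per-person S83/S84 bookkeeping; structures are assumptions.
import time
from collections import defaultdict

tracking_log = defaultdict(list)  # person_id -> [(x, y, t), ...]

def on_simultaneous_designation(designations, beamformer):
    """designations: list of (person_id, (x, y)) positions already
    converted from screen coordinates to real-space coordinates."""
    now = time.time()
    beams = []
    for person_id, pos in designations:
        tracking_log[person_id].append((pos[0], pos[1], now))  # S83
        beams.append(beamformer.form_beam(pos))                # S84
    return beams  # one directed pickup per monitored person
```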
After step S6, the output control unit 34b resumes the output (reproduction) from the speaker device 36 of the sound paused in step S81 (S85). After step S85, the audio output process is performed (S7). After step S7, the operations from step S81 to step S7 (steps S81, S2, S82, S83, S84, S6, S85, and S7) are repeated until the tracking mode of the directivity control device 3B is turned off.
In FIG. 35, after step S3, the image processing unit 37 of the directivity control devices 3A and 3B performs known image processing to determine whether persons as monitoring targets are detected on the tracking screen TRW of the display device 35; when it determines that a plurality of persons have been detected, it outputs the determination result (including the detection position of each person (for example, a known representative point) and the detection time) to the tracking processing unit 34c of the signal processing unit 34 as an automatic designation result (S91). Likewise, the sound source detection unit 34d performs known sound source detection processing to determine whether the positions of the sounds (sound sources) emitted by the persons as monitoring targets are detected on the tracking screen TRW; when it determines that the positions of a plurality of sound sources have been detected, it outputs the determination result (including the detection positions and detection times of the sound sources) to the tracking processing unit 34c as an automatic designation result (S91).
Using the transition of the one or more immediately preceding automatic designation results in step S91, the tracking processing unit 34c calculates the movement vector of each person as a monitoring target and estimates the movement direction of each person (S91).
Using the estimation results of the movement directions of the persons as the plurality of monitoring targets in step S91, the tracking processing unit 34c associates the tracking positions corresponding to the plurality of automatically designated tracking points with the respective previous automatic designation results, and stores them in the memory 33 as pairs of tracking positions (S92). For each person as a monitoring target, the tracking processing unit 34c distinguishes the designated position and designated time of each person on the tracking screen TRW and stores them in the memory 33 in association with each other as the tracking position and tracking time of the tracking point (S92). Furthermore, via the output control unit 34b, the tracking processing unit 34c displays a point marker, distinguished per person, at each tracking position on the tracking screen TRW (S92).
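The association of steps S91 and S92 needs each new automatic detection to be matched with the right existing track. The sketch below uses a plain movement-vector prediction plus nearest-neighbor matching; the embodiment does not spell out the matching rule, so this particular rule is an assumption.

```python
# Sketch of movement-vector-based track association for S91/S92.
import math

def associate(tracks, detections):
    """tracks: {person_id: [(x, y), ...] history}; detections: [(x, y)]."""
    pairs, free = {}, list(detections)
    for person_id, history in tracks.items():
        if len(history) >= 2:
            # Extrapolate the last movement vector one step ahead.
            (x0, y0), (x1, y1) = history[-2], history[-1]
            predicted = (x1 + (x1 - x0), y1 + (y1 - y0))
        else:
            predicted = history[-1]
        if not free:
            break
        best = min(free, key=lambda d: math.hypot(d[0] - predicted[0],
                                                  d[1] - predicted[1]))
        pairs[person_id] = best   # saved as a tracking-position pair
        free.remove(best)
    return pairs
```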
In this way, no matter how the plurality of monitoring targets (for example, persons) shown in the image data on the tracking screen TRW of the display device 35 move, the directivity control devices 3, 3A, and 3B of this modification re-form the sound directivity, previously formed toward each person's position before the movement, in the direction toward each person's position after the movement. The sound directivity can thus be formed appropriately, following the movement of each person, and degradation of the efficiency of the supervisor's monitoring work can be suppressed.
The configurations, operations, and effects of the directivity control device, directivity control method, storage medium, and directivity control system according to the present invention described above are explained below.
One embodiment of the present invention is a directivity control device that controls the directivity of sound picked up by a first sound pickup unit including a plurality of microphones, comprising: a directivity forming unit that forms the directivity of the sound in a direction from the first sound pickup unit toward a monitoring target corresponding to a first designated position on an image of a display unit; and an information acquisition unit that acquires information about a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target, wherein the directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information about the second designated position acquired by the information acquisition unit.
In this configuration, the directivity control device forms sound directivity in the direction from the first sound pickup unit including a plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and further acquires information about a second designated position that designates the moving monitoring target. The directivity control device then switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position, using the information about the second designated position on the image of the display unit.
In this way, even if the monitoring target shown on the image of the display unit moves, the directivity control device re-forms the sound directivity, previously formed toward the position before the movement, in the direction toward the position after the movement. The sound directivity can thus be formed appropriately, following the movement of the monitoring target, and degradation of the efficiency of the supervisor's monitoring work can be suppressed.
In one embodiment of the present invention, the information acquisition unit acquires the information about the second designated position in response to a designation operation on the monitoring target moving on the image of the display unit.
According to this configuration, the directivity control device can easily acquire accurate information about the position of the monitoring target after its movement by a simple operation of designating the monitoring target moving on the image shown on the display unit.
One embodiment of the present invention further comprises: a sound source detection unit that detects, from the image of the display unit, a sound source position corresponding to the monitoring target; and an image processing unit that detects the monitoring target from the image of the display unit, wherein the information acquisition unit acquires, as the information about the second designated position, information about the sound source position detected by the sound source detection unit or information about the position of the monitoring target detected by the image processing unit.
According to this configuration, the directivity control device can easily detect, from the image shown on the display unit, the sound source of the sound emitted by the monitoring target and the monitoring target itself, so information about the position of the sound source or information about the position of the monitoring target can easily be acquired as information about the position of the monitoring target after its movement.
In one embodiment of the present invention, the sound source detection unit starts the detection process for the sound source position corresponding to the monitoring target around an initial position designated on the image of the display unit, and the image processing unit starts the detection process for the monitoring target around the initial position.
According to this configuration, the directivity control device starts the detection process for the sound source position or the monitoring target position around the initial position designated on the image shown on the display unit (for example, at the position of the monitoring target) by, for example, a user's designation operation, so the detection process for the sound source position or the monitoring target position can be performed at high speed.
In one embodiment of the present invention, in response to an operation for changing the information about the sound source position detected by the sound source detection unit or the information about the position of the monitoring target detected by the image processing unit, the information acquisition unit acquires information about the position on the image of the display unit designated by the change operation as the information about the second designated position.
According to this configuration, even if the sound source position or the monitoring target position detected by the sound source position detection process or the monitoring target position detection process is wrong, the directivity control device can easily correct and acquire the information about the position designated on the image by, for example, the user's position change operation as information about the position of the monitoring target after its movement.
In one embodiment of the present invention, when the distance between the sound source position detected by the sound source detection unit and the position of the monitoring target detected by the image processing unit is equal to or greater than a predetermined value, the information acquisition unit acquires, in response to an operation for changing the information about the sound source position or the information about the position of the monitoring target, information about the position on the image of the display unit designated by the change operation as the information about the second designated position.
According to this configuration, if the distance between the detected sound source position and the detected monitoring target position is equal to or greater than the predetermined value, the directivity control device can easily correct and acquire the information about the position designated on the image by, for example, the user's position change operation as information about the position of the monitoring target after its movement. Furthermore, if that distance is less than the predetermined value, the directivity control device can easily acquire the sound source position or the monitoring target position as information about the position of the monitoring target after its movement without requiring, for example, a user's position change operation.
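As a rough illustration of this threshold rule, under assumed helper names:

```python
# Sketch: reconcile the sound-source and image detections; fall back to a
# user-designated position only when they disagree by more than a preset
# distance. All names here are illustrative.
import math

def resolve_position(sound_pos, image_pos, threshold, ask_user):
    gap = math.hypot(sound_pos[0] - image_pos[0], sound_pos[1] - image_pos[1])
    if gap >= threshold:
        # Detections disagree: use the position the user designates.
        return ask_user()
    # Detections agree: either detected position can serve as the new
    # designated position without a user correction.
    return image_pos
```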
One embodiment of the present invention further comprises: an image storage unit that stores images captured over a certain period; and an image reproduction unit that reproduces the images stored in the image storage unit on the display unit, wherein the image reproduction unit reproduces the images at a speed value smaller than the initial value of the reproduction speed in response to a predetermined input operation.
According to this configuration, when reproducing images captured over a certain period on the display unit as video, the directivity control device can perform slow reproduction at a speed value smaller than the initial value of the reproduction speed (for example, the normal value used for video reproduction) in response to a user's predetermined input operation (for example, a slow-reproduction instruction operation).
One embodiment of the present invention further comprises a display control unit that displays captured images on the display unit, wherein the display control unit, in response to designation of a designated position on the image of the display unit, enlarges and displays the image on the same screen at a predetermined magnification centered on the designated position.
According to this configuration, the directivity control device enlarges and displays the image at a predetermined magnification within the same screen, centered on the designated position on the image shown on the display unit, by, for example, a user's simple designation operation, so the user's operation of designating the monitoring target on the same screen can be simplified.
One embodiment of the present invention further comprises a display control unit that displays captured images on the display unit, wherein the display control unit, in response to designation of a designated position on the image of the display unit, enlarges and displays the image on another screen at a predetermined magnification centered on the designated position.
According to this configuration, the directivity control device enlarges and displays the image at a predetermined magnification in a different screen, centered on the designated position on the image shown on the display unit, by, for example, a user's simple designation operation, so the user can easily designate the monitoring target by comparing the non-enlarged screen with the enlarged screen.
One embodiment of the present invention further comprises a display control unit that displays captured images on the display unit, wherein the display control unit enlarges and displays the image at a predetermined magnification relative to the center of the display unit in response to a predetermined input operation.
According to this configuration, the directivity control device enlarges and displays the image at a predetermined magnification relative to the center of the display unit by, for example, a user's simple input operation, so the user can easily designate the monitoring target when, for example, it is shown near the center of the display unit.
In one embodiment of the present invention, when the designated position exceeds a predetermined scroll determination line on the screen on which the image is enlarged and displayed in accordance with the movement of the monitoring target, the display control unit scrolls the screen by a predetermined amount in the direction beyond the scroll determination line.
According to this configuration, when the user's designated position exceeds the scroll determination line because the monitoring target shown on the enlarged screen has moved, the directivity control device automatically scrolls the screen by a predetermined amount in the direction beyond the scroll determination line, so even when the screen is enlarged, the user's designated position for the monitoring target can be prevented from going off the screen.
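A minimal sketch of such a scroll determination, assuming an axis-aligned viewport and illustrative margin and step values:

```python
# Sketch of the scroll-determination-line rule: when the designated point
# crosses a judgment line near the viewport edge, shift the viewport a
# fixed amount in that direction. Margin and step sizes are assumptions.
def maybe_scroll(viewport, point, margin=50, step=100):
    """viewport: dict with x, y, width, height; point: (px, py) on screen."""
    px, py = point
    if px > viewport["x"] + viewport["width"] - margin:
        viewport["x"] += step   # crossed the right-hand judgment line
    elif px < viewport["x"] + margin:
        viewport["x"] -= step   # crossed the left-hand judgment line
    if py > viewport["y"] + viewport["height"] - margin:
        viewport["y"] += step
    elif py < viewport["y"] + margin:
        viewport["y"] -= step
    return viewport
```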
In one embodiment of the present invention, when the designated position exceeds a predetermined scroll determination line on the screen on which the image is enlarged and displayed in accordance with the movement of the monitoring target, the display control unit scrolls the screen so that the designated position comes to the center.
According to this configuration, when the user's designated position exceeds the scroll determination line because the monitoring target shown on the enlarged screen has moved, the directivity control device automatically scrolls the screen so that the user's designated position comes to the center of the screen. Even when the screen is enlarged, the user's designated position for the monitoring target can thus be prevented from going off the screen, and the monitoring target that keeps moving on the screen can be designated easily.
In one embodiment of the present invention, on the screen on which the image is enlarged and displayed, the display control unit scrolls the screen so that the designated position comes to the center of the screen.
According to this configuration, the directivity control device automatically scrolls the screen so that the user's designated position is always at the center of the screen as the monitoring target shown on the enlarged screen moves. Even when the screen is enlarged, the user's designated position for the monitoring target can thus be prevented from going off the screen, and the monitoring target that keeps moving on the screen can be designated easily.
In one embodiment of the present invention, the image processing unit performs masking processing on a part of the monitoring target on the image of the display unit in response to a predetermined input operation.
According to this configuration, the directivity control device masks a part (for example, the face) of the monitoring target (for example, a person) shown on the screen of the display unit by, for example, a user's simple input operation, so privacy can be effectively protected by making it difficult to identify the person who is the monitoring target.
One embodiment of the present invention further comprises a sound output control unit that causes a sound output unit to output the sound picked up by the first sound pickup unit, wherein the sound output control unit applies voice change processing to the sound picked up by the first sound pickup unit and causes the sound output unit to output it in response to a predetermined input operation.
According to this configuration, the directivity control device applies voice change processing to the sound being picked up in real time by the first sound pickup unit and outputs it by, for example, a user's simple input operation, so the privacy of the voice of the person currently being imaged as the monitoring target can be effectively protected by making it difficult to identify whose voice it is.
One embodiment of the present invention further comprises: a sound storage unit that stores sound picked up by the first sound pickup unit over a certain period; and a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit, wherein the sound output control unit applies voice change processing to the sound picked up by the first sound pickup unit and causes the sound output unit to output it in response to a predetermined input operation.
According to this configuration, when outputting the sound picked up by the first sound pickup unit over a certain period, the directivity control device applies voice change processing to the sound before outputting it by, for example, a user's simple input operation, so the privacy of the voice of the person as the monitoring target can be effectively protected by making it difficult to identify whose voice it is.
One embodiment of the present invention further comprises a display control unit that displays a predetermined marker at one or more designated positions on the image of the display unit, designated in accordance with the movement of the monitoring target.
According to this configuration, when, for example, the user performs a designation operation on the monitoring target shown on the display unit, the directivity control device displays a predetermined marker at the designated position on the screen of the display unit, so the positions through which the moving monitoring target has passed can be shown explicitly as a trajectory.
One embodiment of the present invention further comprises a display control unit that connects and displays at least the current designated position and the immediately preceding designated position among two or more designated positions on the image of the display unit, designated in accordance with the movement of the monitoring target.
According to this configuration, when the monitoring target shown on the screen of the display unit moves, the directivity control device connects and displays at least the current designated position and the immediately preceding designated position among the plurality of positions designated by the user's designation operations, so part of the trajectory of the monitoring target's movement can be shown explicitly.
One embodiment of the present invention further comprises a display control unit that displays a flow line connecting, for all the designated positions on the image of the display unit designated in accordance with the movement of the monitoring target, the one or two designated positions adjacent to each designated position.
According to this configuration, when the monitoring target shown on the screen of the display unit moves, the directivity control device connects and displays, for all of the plurality of positions designated by the user's designation operations, the one or two designated positions adjacent to each designated position, so the entire trajectory of the monitoring target's movement can be shown explicitly.
One embodiment of the present invention further comprises: a designation list storage unit that stores a designation list including data on all the designated positions and designated times on the image of the display unit; and a reproduction time calculation unit that, in response to designation of an arbitrary position on the flow line connecting all the designated positions displayed by the display control unit, calculates the reproduction start time of the sound at the designated position on the flow line using the designation list stored in the designation list storage unit, wherein the directivity forming unit forms the directivity of the sound using the data of the designated position corresponding to the designated time closest to the reproduction start time of the sound calculated by the reproduction time calculation unit.
In this configuration, when all the positions designated by the user during the movement of the monitoring target are displayed connected, the directivity control device calculates the reproduction start time of the picked-up sound at a position designated arbitrarily by the user on the flow line, and forms the directivity of the sound corresponding to whichever designated time specified during the movement of the monitoring target is closest to that reproduction time.
In this way, the directivity control device can form the sound directivity in advance in the direction toward the designated position (tracking position) that was designated next after the arbitrarily designated position, in accordance with the position arbitrarily designated by the user on the flow line indicating the trajectory of the monitoring target's movement.
One embodiment of the present invention further comprises: a sound storage unit that stores sound picked up by the first sound pickup unit over a certain period; and a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit, wherein the sound output control unit causes the sound output unit to output the sound at the reproduction start time calculated by the reproduction time calculation unit, and the directivity forming unit, when there is a next designated time within a predetermined time from the reproduction start time of the sound, forms the directivity of the sound using the data of the designated position corresponding to the next designated time.
In this configuration, the directivity control device reproduces the sound from the reproduction start time at the position designated arbitrarily by the user on the flow line, and when there is a next designated time specified by the user during the movement of the monitoring target within a predetermined time from that reproduction time, forms the directivity of the sound using the data of the designated position corresponding to the next designated time.
In this way, the directivity control device can clearly output the picked-up sound emitted by the monitoring target at the reproduction start time calculated in accordance with the user's arbitrarily designated position, and when there is a next designated position within the predetermined time from the reproduction start time, can form the directivity of the sound at the next designated position in advance.
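A sketch of this reproduction-time logic, assuming the designation list is a time-ordered sequence of (x, y, t) tracking points and using straight-line interpolation along the flow line; the interpolation rule and the lookahead window are assumptions, not taken from the embodiment.

```python
# Sketch: from a point clicked on the flow line, interpolate a playback
# start time, pick the tracking point with the closest time for steering,
# and pre-steer to the next point if it falls within a lookahead window.
import math

def playback_plan(click_pos, points, lookahead=2.0):
    """points: designation list of (x, y, t), ordered by time t."""
    def seg_dist(p, a, b):
        # Distance from p to segment a-b, plus the parameter u along it.
        ax, ay, _ = a; bx, by, _ = b
        vx, vy = bx - ax, by - ay
        u = max(0.0, min(1.0, ((p[0] - ax) * vx + (p[1] - ay) * vy) /
                               ((vx * vx + vy * vy) or 1.0)))
        return math.hypot(p[0] - (ax + u * vx), p[1] - (ay + u * vy)), u

    best = min(range(len(points) - 1),
               key=lambda i: seg_dist(click_pos, points[i], points[i + 1])[0])
    _, u = seg_dist(click_pos, points[best], points[best + 1])
    t0, t1 = points[best][2], points[best + 1][2]
    start = t0 + u * (t1 - t0)                            # playback start time
    steer = min(points, key=lambda p: abs(p[2] - start))  # closest time
    nxt = next((p for p in points if start < p[2] <= start + lookahead), None)
    return start, steer, nxt                              # nxt: pre-steer target
```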
 また、本発明の一実施形態は、前記表示部への画像の表示に用いる第1の撮像部に対応する所定の切替範囲を前記監視対象物が超えた場合に、前記表示部への画像の表示に用いる撮像部を、前記第1の撮像部から第2の撮像部に切り替える動作切替制御部、を更に備える、指向性制御装置である。 In addition, according to an embodiment of the present invention, when the monitoring object exceeds a predetermined switching range corresponding to the first imaging unit used for displaying the image on the display unit, the image on the display unit is displayed. The directivity control device further includes an operation switching control unit that switches an imaging unit used for display from the first imaging unit to the second imaging unit.
 この構成では、指向性制御装置は、移動中の監視対象物が、表示部への画像の表示に用いる第1の撮像部に対応する所定の切替範囲を超えた場合には、表示部への画像の表示に用いる撮像部を、第1の撮像部から第2の撮像部に切り替える。 In this configuration, the directivity control device, when the moving monitoring object exceeds a predetermined switching range corresponding to the first imaging unit used for displaying the image on the display unit, The imaging unit used for image display is switched from the first imaging unit to the second imaging unit.
 これにより、指向性制御装置は、移動中の監視対象物の画像を的確に映し出すことが可能な撮像部に適応的に切り替えることができ、ユーザの監視対象物の画像を簡易に指定させることができる。 As a result, the directivity control device can adaptively switch to an imaging unit capable of accurately displaying an image of the moving monitoring object, and can easily specify the image of the user's monitoring object. it can.
 また、本発明の一実施形態は、前記第1の収音部に対応する所定の切替範囲を前記監視対象物が超えた場合に、前記監視対象物の音声の収音に用いる収音部を、前記第1の収音部から第2の収音部に切り替える動作切替制御部、を更に備える、指向性制御装置である。 Further, according to an embodiment of the present invention, when the monitoring object exceeds a predetermined switching range corresponding to the first sound collection unit, a sound collection unit used for collecting sound of the monitoring object is provided. The directivity control device further includes an operation switching control unit that switches from the first sound collecting unit to the second sound collecting unit.
 この構成では、指向性制御装置は、移動中の監視対象物が、監視対象物の音声の収音に用いる第1の収音部に対応する所定の切替範囲を超えた場合には、監視対象物の音声の収音に用いる収音部を、第1の収音部から第2の収音部に切り替える。 In this configuration, the directivity control device, when the moving monitoring object exceeds the predetermined switching range corresponding to the first sound collection unit used for collecting the sound of the monitoring object, The sound collection unit used for collecting the sound of the object is switched from the first sound collection unit to the second sound collection unit.
 これにより、指向性制御装置は、移動中の監視対象物の発する音声を的確に収音することが可能な収音部に適応的に切り替えることができ、監視対象物の発する音声を高精度に収音することができる。 As a result, the directivity control device can adaptively switch to a sound collection unit capable of accurately collecting the sound emitted by the moving monitoring object, and the sound emitted by the monitoring object can be accurately obtained. Sound can be collected.
 また、本発明の一実施形態は、所定の入力操作に応じて、複数の撮像部により撮像された各画像を異なる画面で前記表示部に一覧表示させる表示制御部と、前記表示制御部により前記表示部に一覧表示された各画面のうち、所定の選択可能な画面のうちいずれかの画面の選択操作に応じて、前記表示部への前記監視対象物の画像の表示に用いる撮像部を選択する動作切替制御部と、を更に備える、指向性制御装置である。 In addition, according to one embodiment of the present invention, a display control unit that displays a list of images captured by a plurality of imaging units on different screens according to a predetermined input operation, and the display control unit Select an imaging unit to be used for displaying the image of the monitoring object on the display unit in response to a selection operation on one of the predetermined selectable screens among the screens displayed in a list on the display unit A directivity control device further comprising an operation switching control unit.
 この構成では、指向性制御装置は、表示部への画像の表示に用いる撮像部を、表示部に一覧表示された複数の異なる画面から監視対象物の移動方向に合わせてユーザが指定した画面に対応する撮像部に切り替える。 In this configuration, the directivity control device changes the imaging unit used for displaying the image on the display unit from a plurality of different screens displayed in a list on the display unit to a screen specified by the user according to the moving direction of the monitoring target. Switch to the corresponding imaging unit.
 これにより、指向性制御装置は、ユーザの簡易な操作によって、移動中の監視対象物の画像を的確に映し出すことが可能な撮像部に適応的に切り替えることができ、ユーザの監視対象物の画像を簡易に指定させることができる。 Thus, the directivity control device can adaptively switch to an imaging unit capable of accurately displaying an image of the moving monitoring target object by a simple operation of the user, and the user's monitoring target image Can be specified easily.
 また、本発明の一実施形態は、所定の入力操作に応じて、前記第1の収音部から切替可能な周囲の複数の収音部の概略位置を示すマーカを前記表示部に表示させる表示制御部と、前記表示制御部により前記表示部に表示された複数の前記マーカのうち、いずれかのマーカの選択操作に応じて、前記監視対象物の音声の収音に用いる収音部を、前記第1の収音部から、選択された前記マーカに対応する他の収音部に切り替える動作切替制御部、を更に備える、指向性制御装置である。 Further, according to one embodiment of the present invention, a display that displays a marker indicating the approximate positions of a plurality of surrounding sound collection units that can be switched from the first sound collection unit in accordance with a predetermined input operation is displayed on the display unit. In accordance with a selection operation of any one of the plurality of markers displayed on the display unit by the control unit and the display control unit, a sound collection unit used for collecting sound of the monitoring target object, The directivity control device further includes an operation switching control unit that switches from the first sound collection unit to another sound collection unit corresponding to the selected marker.
 この構成では、指向性制御装置は、例えばユーザの入力操作によって、第1の収音部から切り替え可能な周囲の複数の収音部の概略位置を示すマーカを表示部に表示させ、ユーザにより選択されたいずれかのマーカに応じて、監視対象物の音声の収音に用いる収音部を、第1の収音部から、選択されたマーカに対応する他の収音部に切り替える。 In this configuration, the directivity control device causes the display unit to display markers indicating the approximate positions of a plurality of surrounding sound collection units that can be switched from the first sound collection unit, for example, by a user input operation, and is selected by the user In accordance with one of the markers, the sound collection unit used to collect the sound of the monitoring target is switched from the first sound collection unit to another sound collection unit corresponding to the selected marker.
 これにより、指向性制御装置は、ユーザの簡易な操作によって、移動中の監視対象物の発する音声を的確に収音することが可能な収音部に適応的に切り替えることができ、監視対象物の発する音声を高精度に収音することができる。 As a result, the directivity control device can adaptively switch to the sound collection unit capable of accurately collecting the sound emitted from the moving monitoring object by a simple operation of the user. Can be collected with high accuracy.
 また、本発明の一実施形態は、前記動作切替制御部は、前記動作切替制御部により選択された前記撮像部により撮像された前記監視対象物の画像上の位置の指定に応じて、前記第1の収音部を含む複数の収音部から前記監視対象物までの距離が最も近い収音部を、前記監視対象物の音声の収音に用いる収音部として選択する、指向性制御装置である。 Further, in one embodiment of the present invention, the operation switching control unit is configured to perform the operation according to designation of a position on the image of the monitoring object captured by the imaging unit selected by the operation switching control unit. A directivity control device that selects a sound collecting unit having the shortest distance from a plurality of sound collecting units including one sound collecting unit to the monitoring target as a sound collecting unit used for collecting sound of the monitoring target. It is.
 この構成では、指向性制御装置は、選択された撮像部により撮像された監視対象物の画像上の位置指定に応じて、第1の収音部を含む複数の収音部から監視対象物までの距離が最も近い収音部を、監視対象物の音声の収音に用いる収音部として選択する。 In this configuration, the directivity control device includes a plurality of sound collection units including the first sound collection unit to the monitoring target according to the position designation on the image of the monitoring target captured by the selected imaging unit. Is selected as the sound collecting unit used for collecting the sound of the object to be monitored.
 これにより、指向性制御装置は、ユーザが監視対象物の移動方向を示す位置を簡易に指定することにより、移動中の監視対象物の発する音声を的確に収音することが可能な最適な収音部を選択することができ、監視対象物の発する音声を高精度に収音することができる。 As a result, the directivity control device allows the user to easily specify the position indicating the moving direction of the monitored object, so that the optimum sound collecting sound that can be accurately picked up by the moving monitored object can be obtained. The sound part can be selected, and the sound emitted from the monitoring object can be collected with high accuracy.
 また、本発明の一実施形態は、前記表示部の画像から前記監視対象物の顔の向きを検出する画像処理部、を更に備え、前記動作切替制御部は、前記動作切替制御部により選択された前記撮像部により撮像された前記監視対象物の画像上の位置の指定に応じて、前記画像処理部により検出された前記監視対象物の顔の向きに対応する方向で、前記第1の収音部を含む複数の収音部から前記監視対象物までの距離が最も近い収音部を、前記監視対象物の音声の収音に用いる収音部として選択する、指向性制御装置である。 In addition, an embodiment of the present invention further includes an image processing unit that detects a face direction of the monitoring target object from the image of the display unit, and the operation switching control unit is selected by the operation switching control unit. In response to designation of a position on the image of the monitoring object imaged by the imaging unit, the first convergence is performed in a direction corresponding to the face direction of the monitoring object detected by the image processing unit. A directivity control apparatus that selects a sound collecting unit having a shortest distance from a plurality of sound collecting units including a sound unit to the monitoring target as a sound collecting unit used for collecting sound of the monitoring target.
 この構成では、指向性制御装置は、選択された撮像部により撮像された監視対象物の画像上の位置指定に応じて、この画像上の監視対象物の顔の向きが示す方向に存在し、かつ、第1の収音部を含む複数の収音部から監視対象物までの距離が最も近い収音部を、監視対象物の音声の収音に用いる収音部として選択する。 In this configuration, the directivity control device exists in the direction indicated by the orientation of the face of the monitoring object on the image according to the position designation on the image of the monitoring object imaged by the selected imaging unit, And the sound collection part with the shortest distance from the some sound collection part containing a 1st sound collection part to the monitoring target object is selected as a sound collection part used for the sound collection of the sound of the monitoring target object.
 これにより、指向性制御装置は、監視対象物の画像上の顔の向きと監視対象物と収音部との距離とによって、移動中の監視対象物の発する音声を的確に収音することが可能な最適な収音部を選択することができ、監視対象物の発する音声を高精度に収音することができる。 As a result, the directivity control device can accurately collect the sound emitted by the moving monitoring object according to the orientation of the face on the image of the monitoring object and the distance between the monitoring object and the sound collection unit. It is possible to select an optimal sound pickup unit that is possible, and it is possible to pick up sound generated by the monitoring object with high accuracy.
In one embodiment, the present invention is a directivity control device further including a sound output control unit that causes a sound output unit to output the sound collected by the first sound collection unit, wherein the display control unit causes the display unit to display markers indicating the approximate positions of a plurality of sound collection units, including the first sound collection unit, associated with the imaging unit selected by the operation switching control unit; in response to the designation of a position on the image of the monitoring target captured by the selected imaging unit, the sound output control unit sequentially outputs, each for a predetermined time, sound whose directivity is formed in the direction from the sound collection unit corresponding to each displayed marker toward the monitoring target; and, in response to an operation of selecting one of the markers based on the output sound, the operation switching control unit selects the sound collection unit corresponding to the selected marker as the sound collection unit used to collect the sound of the monitoring target.
 With this configuration, the directivity control device displays on the display unit markers indicating the approximate positions of the plurality of sound collection units, including the first sound collection unit, associated with the selected imaging unit; in response to the designation of a position on the image of the moving monitoring target, it sequentially outputs, each for a predetermined time, sound whose directivity is formed from the sound collection unit corresponding to each marker toward the monitoring target; and it then selects the sound collection unit corresponding to whichever marker is chosen as the sound collection unit used to collect the sound of the monitoring target.
 As a result, the directivity control device can play back, for a fixed time each, the sound collected with the different directivities formed by the plurality of sound collection units associated with the selected imaging unit. Through the simple operation of choosing the collected sound the user judges best, the optimum sound collection unit for accurately capturing the sound of the moving monitoring target can be selected, and that sound can be collected with high accuracy.
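 One conceivable realization of this audition-and-select flow is sketched below with hypothetical callables, since the patent does not prescribe an API; the three-second audition time and every function name are assumptions.

```python
def audition_and_select(unit_ids, beamform, play, wait_for_marker_click,
                        audition_seconds=3.0):
    """Play each candidate array's output steered at the target, one after
    another for a fixed time, then return the array whose on-screen marker
    the operator clicks.

    unit_ids              -- candidate sound collection units, in marker order
    beamform(unit_id)     -- returns audio from unit_id steered at the target
    play(audio, seconds)  -- plays audio for the given duration (blocking)
    wait_for_marker_click -- blocks until the operator selects a marker
    """
    for unit_id in unit_ids:
        play(beamform(unit_id), audition_seconds)
    return wait_for_marker_click()
```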
In one embodiment, the present invention is a directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the method comprising: forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit; acquiring information on a second designated position on the image on the display unit, designated in accordance with movement of the monitoring target; and switching the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
 In this method, the directivity control device forms the directivity of the sound from the first sound collection unit, which includes a plurality of microphones, toward the monitoring target corresponding to the first designated position on the image on the display unit, and then acquires information on a second designated position that designates the moving monitoring target. Using the information on the second designated position on the image on the display unit, the directivity control device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
 As a result, even when the monitoring target shown in the image on the display unit moves, the directivity control device re-forms the sound directivity that was aimed at the target's position before the movement so that it is aimed at the target's position after the movement. The directivity of the sound thus follows the movement of the monitoring target and remains properly formed, which suppresses any loss of efficiency in the observer's monitoring work.
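 The directivity formation and switching in these steps can be pictured as delay-and-sum beamforming that is simply re-steered at each newly designated position. Below is a minimal NumPy sketch under that assumption; it presumes the designated image position has already been converted into 3-D coordinates relative to the array (that camera-to-array calibration is outside the sketch), and all names are illustrative rather than taken from the patent.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air

def delay_and_sum(frames, mic_xyz, target_xyz, fs):
    """Steer a microphone array toward target_xyz by delay-and-sum.

    frames     -- ndarray (n_mics, n_samples), one row per microphone
    mic_xyz    -- ndarray (n_mics, 3), microphone positions in metres
    target_xyz -- (3,) point the designated image position maps to
    fs         -- sampling rate in Hz
    """
    dists = np.linalg.norm(mic_xyz - np.asarray(target_xyz, float), axis=1)
    # advance each channel so wavefronts from the target add in phase
    delays = (dists - dists.min()) / SPEED_OF_SOUND   # seconds
    shifts = np.round(delays * fs).astype(int)        # whole samples
    n = frames.shape[1] - int(shifts.max())
    aligned = np.stack([frames[i, s:s + n] for i, s in enumerate(shifts)])
    return aligned.mean(axis=0)

# Switching the directivity to the second designated position is then just
# another call with the new target:
#   enhanced = delay_and_sum(frames, mic_xyz, second_target_xyz, fs)
```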
In one embodiment, the present invention is a storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the program executing the steps of: forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit; acquiring information on a second designated position on the image on the display unit, designated in accordance with movement of the monitoring target; and switching the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
 A directivity control device capable of executing the program stored in this storage medium forms the directivity of the sound from the first sound collection unit, which includes a plurality of microphones, toward the monitoring target corresponding to the first designated position on the image on the display unit, and then acquires information on a second designated position that designates the moving monitoring target. Using that information, the device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
 As a result, even when the monitoring target shown in the image on the display unit moves, the directivity control device re-forms the sound directivity that was aimed at the target's position before the movement so that it is aimed at the target's position after the movement; the directivity thus follows the target's movement and remains properly formed, suppressing any loss of efficiency in the observer's monitoring work.
In one embodiment, the present invention is a directivity control system including: an imaging unit that images a sound collection area; a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and a directivity control device that controls the directivity of the sound collected by the first sound collection unit. The directivity control device includes a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit, and an information acquisition unit that acquires information on a second designated position on the image on the display unit, designated in accordance with movement of the monitoring target. The directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information on the second designated position acquired by the information acquisition unit.
 In this system, the directivity control device forms the directivity of the sound from the first sound collection unit, which includes a plurality of microphones, toward the monitoring target corresponding to the first designated position on the image on the display unit, then acquires information on a second designated position that designates the moving monitoring target, and switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position using that information.
 As a result, in the directivity control system, even when the monitoring target shown in the image on the display unit moves, the directivity control device re-forms the sound directivity that was aimed at the target's position before the movement so that it is aimed at the target's position after the movement; the directivity thus follows the target's movement and remains properly formed, suppressing any loss of efficiency in the observer's monitoring work.
Various embodiments have been described above with reference to the drawings, but it goes without saying that the present invention is not limited to these examples. It will be apparent to those skilled in the art that various changes and modifications can be conceived within the scope of the claims, and it is understood that these naturally belong to the technical scope of the present invention.
 The present invention is useful as a directivity control device, directivity control method, storage medium, and directivity control system that, even when a monitoring target on an image moves, properly form the directivity of sound toward the monitoring target so as to follow its movement, and thereby suppress any loss of efficiency in the observer's monitoring work.
3, 3A, 3B Directivity control device
4 Recorder device
31 Communication unit
32 Operation unit
33 Memory
34, 34A Signal processing unit
34a Directivity direction calculation unit
34b Output control unit
34c Tracking processing unit
34d Sound source detection unit
35 Display device
36 Speaker device
37 Image processing unit
38 Operation switching control unit
100, 100A, 100B Directivity control system
C1, Cn Camera device
C1RN, C2RN Imaging area
JC1, JM1 Switching determination line
JDL Scroll determination line
LN1, LN2, LNR, LNW Tracking line
LST Tracking list
NW Network
M1, Mm Omnidirectional microphone array device
MR1, MR2, MR2W, MR2R, MR3 Point marker
TP1, TP2 Tracking point
TRW Tracking screen

Claims (31)

  1.  A directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the device comprising:
     a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit; and
     an information acquisition unit that acquires information on a second designated position on the image on the display unit, the second designated position being designated in accordance with movement of the monitoring target,
     wherein the directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information on the second designated position acquired by the information acquisition unit.
  2.  The directivity control device according to claim 1, wherein the information acquisition unit acquires the information on the second designated position in response to a designation operation performed on the monitoring target moving on the image on the display unit.
  3.  The directivity control device according to claim 1, further comprising:
     a sound source detection unit that detects, from the image on the display unit, a sound source position corresponding to the monitoring target; and
     an image processing unit that detects the monitoring target from the image on the display unit,
     wherein the information acquisition unit acquires, as the information on the second designated position, information on the sound source position detected by the sound source detection unit or information on the position of the monitoring target detected by the image processing unit.
  4.  The directivity control device according to claim 3, wherein the sound source detection unit starts detection of the sound source position corresponding to the monitoring target, centered on an initial position designated on the image on the display unit, and the image processing unit starts detection of the monitoring target, centered on the initial position.
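     One way such detection "centered on the initial position" could be realized for the sound source is a steered-response-power search over a small grid around the designated point, reusing the delay_and_sum sketch shown earlier in this document; the grid radius and step here are arbitrary assumptions, not values from the patent.

```python
import numpy as np

def locate_source_near(frames, mic_xyz, init_xyz, fs, radius=1.0, step=0.25):
    """Estimate the sound source position by maximising beamformed output
    power over a horizontal grid centred on the initially designated point.
    Relies on the delay_and_sum() sketch defined earlier."""
    best_xyz, best_pow = tuple(init_xyz), -np.inf
    offsets = np.arange(-radius, radius + step, step)
    for dx in offsets:
        for dy in offsets:
            cand = (init_xyz[0] + dx, init_xyz[1] + dy, init_xyz[2])
            out = delay_and_sum(frames, mic_xyz, cand, fs)
            power = float(np.mean(out ** 2))
            if power > best_pow:
                best_xyz, best_pow = cand, power
    return best_xyz
```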
  5.  The directivity control device according to claim 3, wherein, in response to an operation of changing the information on the sound source position detected by the sound source detection unit or the information on the position of the monitoring target detected by the image processing unit, the information acquisition unit acquires, as the information on the second designated position, information on the position on the image on the display unit designated by the change operation.
  6.  The directivity control device according to claim 3, wherein, when the distance between the sound source position detected by the sound source detection unit and the position of the monitoring target detected by the image processing unit is equal to or greater than a predetermined value, the information acquisition unit acquires, in response to an operation of changing the information on the sound source position or the information on the position of the monitoring target, information on the position on the image on the display unit designated by the change operation, as the information on the second designated position.
  7.  The directivity control device according to claim 1, further comprising:
     an image storage unit that stores images captured over a certain period; and
     an image reproduction unit that reproduces the images stored in the image storage unit on the display unit,
     wherein the image reproduction unit reproduces the images at a speed value smaller than an initial value of the reproduction speed in response to a predetermined input operation.
  8.  The directivity control device according to claim 1, further comprising a display control unit that displays a captured image on the display unit, wherein, in response to the designation of a designated position on the image on the display unit, the display control unit enlarges and displays the image on the same screen at a predetermined magnification centered on the designated position.
  9.  The directivity control device according to claim 1, further comprising a display control unit that displays a captured image on the display unit, wherein, in response to the designation of a designated position on the image on the display unit, the display control unit enlarges and displays the image on another screen at a predetermined magnification centered on the designated position.
  10.  The directivity control device according to claim 1, further comprising a display control unit that displays a captured image on the display unit, wherein, in response to a predetermined input operation, the display control unit enlarges and displays the image at a predetermined magnification with reference to the center of the display unit.
  11.  The directivity control device according to claim 8, wherein, when the designated position crosses a predetermined scroll determination line on the screen on which the image is enlarged, in accordance with the movement of the monitoring target, the display control unit scrolls the screen by a predetermined amount in the direction in which the scroll determination line was crossed.
  12.  The directivity control device according to claim 8, wherein, when the designated position crosses a predetermined scroll determination line on the screen on which the image is enlarged, in accordance with the movement of the monitoring target, the display control unit scrolls the screen so that the designated position becomes the center.
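     Claims 11 and 12 differ only in what happens once the tracked position crosses the scroll determination line. A small sketch of both policies follows, assuming normalised view coordinates and a determination line set 10% inside each edge; these parameters and names are illustrative, not from the patent.

```python
def maybe_scroll(view, pos, margin=0.1, mode="fixed", amount=0.2):
    """Scroll the magnified view when the tracked position crosses the
    scroll-determination band near the edge of the screen.

    view   -- dict with top-left 'x', 'y' and size 'w', 'h' (normalised)
    pos    -- (x, y) designated position in the same coordinates
    margin -- fraction of the view between each edge and its determination line
    mode   -- "fixed": scroll by a set amount (claim 11 style);
              "center": recenter on pos (claim 12 style)
    """
    px, py = pos
    left = view["x"] + margin * view["w"]
    right = view["x"] + (1 - margin) * view["w"]
    top = view["y"] + margin * view["h"]
    bottom = view["y"] + (1 - margin) * view["h"]
    if mode == "center":
        if px < left or px > right or py < top or py > bottom:
            view["x"] = px - view["w"] / 2  # recenter on the designated position
            view["y"] = py - view["h"] / 2
        return view
    # fixed-amount scroll in the direction of the crossed line
    if px > right:
        view["x"] += amount * view["w"]
    elif px < left:
        view["x"] -= amount * view["w"]
    if py > bottom:
        view["y"] += amount * view["h"]
    elif py < top:
        view["y"] -= amount * view["h"]
    return view
```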
  13.  The directivity control device according to claim 8, wherein, on the screen on which the image is enlarged, the display control unit scrolls the screen so that the designated position becomes the center of the screen.
  14.  The directivity control device according to claim 3, wherein the image processing unit performs masking processing on a part of the monitoring target on the image on the display unit in response to a predetermined input operation.
  15.  The directivity control device according to claim 1, further comprising a sound output control unit that causes a sound output unit to output the sound collected by the first sound collection unit, wherein, in response to a predetermined input operation, the sound output control unit applies voice-change processing to the sound collected by the first sound collection unit and causes the sound output unit to output it.
  16.  The directivity control device according to claim 1, further comprising:
     a sound storage unit that stores sound collected by the first sound collection unit over a certain period; and
     a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit,
     wherein, in response to a predetermined input operation, the sound output control unit applies voice-change processing to the sound collected by the first sound collection unit and causes the sound output unit to output it.
  17.  The directivity control device according to claim 1, further comprising a display control unit that displays a predetermined marker at each of one or more designated positions on the image on the display unit, the positions being designated in accordance with the movement of the monitoring target.
  18.  The directivity control device according to claim 1, further comprising a display control unit that, of two or more designated positions on the image on the display unit designated in accordance with the movement of the monitoring target, connects and displays at least the current designated position and the immediately preceding designated position.
  19.  The directivity control device according to claim 1, further comprising a display control unit that displays, for all designated positions on the image on the display unit designated in accordance with the movement of the monitoring target, a flow line connecting each designated position to the one or two designated positions adjacent to it.
  20.  The directivity control device according to claim 19, further comprising:
     a designation list storage unit that stores a designation list containing data on all the designated positions and designation times on the image on the display unit; and
     a reproduction time calculation unit that, in response to the designation of an arbitrary position on the flow line connecting all the designated positions displayed by the display control unit, calculates a reproduction start time of the sound at the designated position on the flow line, using the designation list stored in the designation list storage unit,
     wherein the directivity forming unit forms the directivity of the sound using the data of the designated position corresponding to the designation time closest to the reproduction start time calculated by the reproduction time calculation unit.
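     The reproduction-start-time calculation of claim 20 can be pictured as projecting the clicked point onto the nearest segment of the flow line and interpolating linearly between the designation times at that segment's endpoints. The sketch below works under that assumption; the patent does not fix the interpolation, so this is one plausible reading, with illustrative names.

```python
import math

def playback_start_time(track, click):
    """Estimate the reproduction start time for a point clicked on the
    flow line.

    track -- designation list as [(t, (x, y)), ...] in time order
    click -- (x, y) position clicked on the flow line
    """
    def project(p, a, b):
        # foot of the perpendicular from p onto segment a-b, clamped to it
        ax, ay = a; bx, by = b; px, py = p
        seg2 = (bx - ax) ** 2 + (by - ay) ** 2
        if seg2 == 0:
            return a, 0.0
        u = max(0.0, min(1.0, ((px - ax) * (bx - ax) + (py - ay) * (by - ay)) / seg2))
        return (ax + u * (bx - ax), ay + u * (by - ay)), u

    best = None
    for (t0, p0), (t1, p1) in zip(track, track[1:]):
        foot, u = project(click, p0, p1)
        d = math.dist(click, foot)
        if best is None or d < best[0]:
            best = (d, t0 + u * (t1 - t0))
    return best[1]
```

     Per the claim, the directivity itself is then formed from the stored designated position whose designation time is closest to the returned start time.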
  21.  The directivity control device according to claim 20, further comprising:
     a sound storage unit that stores sound collected by the first sound collection unit over a certain period; and
     a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit,
     wherein the sound output control unit causes the sound output unit to output the sound at the reproduction start time calculated by the reproduction time calculation unit, and
     the directivity forming unit, when there is a next designation time within a predetermined time from the reproduction start time of the sound, forms the directivity of the sound using the data of the designated position corresponding to the next designation time.
  22.  The directivity control device according to claim 1, further comprising an operation switching control unit that, when the monitoring target moves beyond a predetermined switching range corresponding to a first imaging unit used for displaying an image on the display unit, switches the imaging unit used for displaying the image on the display unit from the first imaging unit to a second imaging unit.
  23.  The directivity control device according to claim 1, further comprising an operation switching control unit that, when the monitoring target moves beyond a predetermined switching range corresponding to the first sound collection unit, switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to a second sound collection unit.
  24.  The directivity control device according to claim 1, further comprising:
     a display control unit that, in response to a predetermined input operation, displays on the display unit a list of the images captured by a plurality of imaging units, each image on a different screen; and
     an operation switching control unit that, in response to an operation of selecting one of the selectable screens listed on the display unit by the display control unit, selects the imaging unit used for displaying the image of the monitoring target on the display unit.
  25.  The directivity control device according to claim 1, further comprising:
     a display control unit that, in response to a predetermined input operation, displays on the display unit markers indicating the approximate positions of a plurality of surrounding sound collection units switchable from the first sound collection unit; and
     an operation switching control unit that, in response to an operation of selecting one of the markers displayed on the display unit by the display control unit, switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to the other sound collection unit corresponding to the selected marker.
  26.  The directivity control device according to claim 24, wherein, in response to the designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the operation switching control unit selects, from among a plurality of sound collection units including the first sound collection unit, the sound collection unit closest to the monitoring target, as the sound collection unit used for collecting the sound of the monitoring target.
  27.  The directivity control device according to claim 24, further comprising an image processing unit that detects the orientation of the face of the monitoring target from the image on the display unit, wherein, in response to the designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the operation switching control unit selects, as the sound collection unit used for collecting the sound of the monitoring target, the sound collection unit closest to the monitoring target, from among a plurality of sound collection units including the first sound collection unit, in the direction corresponding to the face orientation detected by the image processing unit.
  28.  The directivity control device according to claim 24, further comprising a sound output control unit that causes a sound output unit to output the sound collected by the first sound collection unit, wherein:
     the display control unit displays on the display unit markers indicating the approximate positions of a plurality of sound collection units, including the first sound collection unit, associated with the imaging unit selected by the operation switching control unit;
     in response to the designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the sound output control unit sequentially outputs, each for a predetermined time, sound whose directivity is formed in the direction from the sound collection unit corresponding to each displayed marker toward the monitoring target; and
     in response to an operation of selecting one of the markers based on the sound output by the sound output control unit, the operation switching control unit selects the sound collection unit corresponding to the selected marker as the sound collection unit used for collecting the sound of the monitoring target.
  29.  A directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the method comprising:
     forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit;
     acquiring information on a second designated position on the image on the display unit, the second designated position being designated in accordance with movement of the monitoring target; and
     switching the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
  30.  A storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the program executing the steps of:
     forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit;
     acquiring information on a second designated position on the image on the display unit, the second designated position being designated in accordance with movement of the monitoring target; and
     switching the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
  31.  A directivity control system comprising:
     an imaging unit that images a sound collection area;
     a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and
     a directivity control device that controls the directivity of the sound collected by the first sound collection unit,
     wherein the directivity control device includes:
     a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit; and
     an information acquisition unit that acquires information on a second designated position on the image on the display unit, the second designated position being designated in accordance with movement of the monitoring target, and
     the directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information on the second designated position acquired by the information acquisition unit.
PCT/JP2014/002473 2014-05-09 2014-05-09 Directivity control apparatus, directivity control method, storage medium, and directivity control system WO2015170368A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201480045464.2A CN105474667B (en) 2014-05-09 2014-05-09 Directivity control method and directive property control system
JP2015526795A JP6218090B2 (en) 2014-05-09 2014-05-09 Directivity control method
PCT/JP2014/002473 WO2015170368A1 (en) 2014-05-09 2014-05-09 Directivity control apparatus, directivity control method, storage medium, and directivity control system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2014/002473 WO2015170368A1 (en) 2014-05-09 2014-05-09 Directivity control apparatus, directivity control method, storage medium, and directivity control system

Publications (1)

Publication Number Publication Date
WO2015170368A1 true WO2015170368A1 (en) 2015-11-12

Family

ID=54392238

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/002473 WO2015170368A1 (en) 2014-05-09 2014-05-09 Directivity control apparatus, directivity control method, storage medium, and directivity control system

Country Status (3)

Country Link
JP (1) JP6218090B2 (en)
CN (1) CN105474667B (en)
WO (1) WO2015170368A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016178652A (en) * 2013-07-09 2016-10-06 ノキア テクノロジーズ オーユー Audio processing apparatus
WO2023054047A1 * 2021-10-01 2023-04-06 Sony Group Corporation Information processing device, information processing method, and program

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106098075B (en) * 2016-08-08 2018-02-02 腾讯科技(深圳)有限公司 Audio collection method and apparatus based on microphone array
CN107491101A (en) * 2017-09-14 2017-12-19 歌尔科技有限公司 A kind of adjusting method, device and the electronic equipment of microphone array pickup angle
JP2019062448A (en) * 2017-09-27 2019-04-18 カシオ計算機株式会社 Image processing apparatus, image processing method, and program
US11209306B2 (en) 2017-11-02 2021-12-28 Fluke Corporation Portable acoustic imaging tool with scanning and analysis capability
WO2020023633A1 (en) 2018-07-24 2020-01-30 Fluke Corporation Systems and methods for tagging and linking acoustic images
CN110189764B (en) * 2019-05-29 2021-07-06 深圳壹秘科技有限公司 System and method for displaying separated roles and recording equipment
CN110493690B (en) * 2019-08-29 2021-08-13 北京搜狗科技发展有限公司 Sound collection method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0918849A (en) * 1995-07-04 1997-01-17 Matsushita Electric Ind Co Ltd Photographing device
JP2009182437A (en) * 2008-01-29 2009-08-13 Mitsubishi Electric Corp Monitoring camera apparatus, and focus aid device
JP2013168757A (en) * 2012-02-15 2013-08-29 Hitachi Ltd Video monitoring apparatus, monitoring system and monitoring system construction method
WO2013179335A1 * 2012-05-30 2013-12-05 Hitachi, Ltd. Monitoring camera control device and visual monitoring system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10191498A (en) * 1996-12-27 1998-07-21 Matsushita Electric Ind Co Ltd Sound signal processor
JP3575437B2 (en) * 2001-05-10 2004-10-13 NEC Corporation Directivity control device
JP4153208B2 (en) * 2002-01-22 2008-09-24 SoftBank Telecom Corp. Base station antenna directivity control apparatus in CDMA system and base station antenna directivity control apparatus in CDMA cellular system
JP2008271157A (en) * 2007-04-19 2008-11-06 Fuji Xerox Co Ltd Sound enhancement device and control program
JP2010187363A (en) * 2009-01-16 2010-08-26 Sanyo Electric Co Ltd Acoustic signal processing apparatus and reproducing device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0918849A (en) * 1995-07-04 1997-01-17 Matsushita Electric Ind Co Ltd Photographing device
JP2009182437A (en) * 2008-01-29 2009-08-13 Mitsubishi Electric Corp Monitoring camera apparatus, and focus aid device
JP2013168757A (en) * 2012-02-15 2013-08-29 Hitachi Ltd Video monitoring apparatus, monitoring system and monitoring system construction method
WO2013179335A1 * 2012-05-30 2013-12-05 Hitachi, Ltd. Monitoring camera control device and visual monitoring system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016178652A (en) * 2013-07-09 2016-10-06 ノキア テクノロジーズ オーユー Audio processing apparatus
WO2023054047A1 * 2021-10-01 2023-04-06 Sony Group Corporation Information processing device, information processing method, and program

Also Published As

Publication number Publication date
CN105474667A (en) 2016-04-06
JP6218090B2 (en) 2017-10-25
JPWO2015170368A1 (en) 2017-04-20
CN105474667B (en) 2018-11-27

Similar Documents

Publication Publication Date Title
JP6218090B2 (en) Directivity control method
US10142727B2 (en) Directivity control apparatus, directivity control method, storage medium and directivity control system
EP2942975A1 (en) Directivity control apparatus, directivity control method, storage medium and directivity control system
JP6202277B2 (en) Voice processing system and voice processing method
JP5338498B2 (en) Control device, camera system and program used in surveillance camera system
JP5958717B2 (en) Directivity control system, directivity control method, sound collection system, and sound collection control method
CN102202168B (en) control device, camera system and program
JP4378636B2 (en) Information processing system, information processing apparatus, information processing method, program, and recording medium
JP2007295335A (en) Camera device and image recording and reproducing method
JP6145736B2 (en) Directivity control method, storage medium, and directivity control system
KR20110093040A (en) Apparatus and method for monitoring an object
KR102474729B1 (en) The Apparatus For Mornitoring
JP6388144B2 (en) Directivity control device, directivity control method, storage medium, and directivity control system
US9426408B2 (en) Method and apparatus for recording video sequences
WO2014064878A1 (en) Information-processing device, information-processing method, program, and information-processng system
JP2016181770A (en) Sound collection system
KR20120125037A (en) Method for controlling surveillance system
JP5229141B2 (en) Display control apparatus and display control method
JP5464290B2 (en) Control device, control method, and camera system
JP4595322B2 (en) Image processing system, remote controller and method, image processing apparatus and method, recording medium, and program
WO2023122511A1 (en) Apparatus and method for controlling an online meeting
KR20080045319A (en) Method for controlling cursor according to input device except mouse and system therefor

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201480045464.2

Country of ref document: CN

ENP Entry into the national phase

Ref document number: 2015526795

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14891490

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14891490

Country of ref document: EP

Kind code of ref document: A1