WO2015170368A1 - Directivity control apparatus, directivity control method, storage medium, and directivity control system - Google Patents


Info

Publication number
WO2015170368A1
Authority
WO
WIPO (PCT)
Prior art keywords
unit
directivity
sound
image
tracking
Prior art date
Application number
PCT/JP2014/002473
Other languages
French (fr)
Japanese (ja)
Inventor
信一 重永
昭年 泉
林 和典
徳田 肇道
裕隆 澤
Original Assignee
Panasonic Intellectual Property Management Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Panasonic Intellectual Property Management Co., Ltd.
Priority to CN201480045464.2A (CN105474667B)
Priority to JP2015526795A (JP6218090B2)
Priority to PCT/JP2014/002473 (WO2015170368A1)
Publication of WO2015170368A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones

Definitions

  • The present invention relates to a directivity control device, a directivity control method, a storage medium, and a directivity control system that control the directivity of sound.
  • In a surveillance system installed at a predetermined position (for example, a ceiling surface) of a factory, a store (for example, a retail store or a bank), or a public place (for example, a library), one or more camera devices (for example, PTZ (Pan-Tilt-Zoom) camera devices or omnidirectional camera devices) are connected in order to widen the angle of view of the image data (including still images and moving images; the same applies hereinafter) of the video of the monitoring target range.
  • The pan/tilt head control device shown in Patent Document 1 displays, on a monitor TV, an image captured by a TV camera mounted on a pan/tilt head provided with pan and tilt driving means. When trajectory points from a movement start point to an end point are input on the monitor TV screen, the successive trajectory points are connected to obtain a continuous trajectory line, and the trajectory data from the start point to the end point of the trajectory line is read sequentially so that automatic shooting is executed with the read point positioned at the center of the shooting screen. Thus, the pan/tilt head control device of the TV camera can obtain pan and tilt drive trajectory data by a simple input operation of entering trajectory points on the monitor TV screen, and can perform accurate drive control.
  • However, Patent Document 1 does not disclose a configuration for picking up sound produced by a person shown on the monitor TV. Even if the configuration of Patent Document 1 is applied to the above-described monitoring system, there is a problem in that it is difficult to pick up, with high accuracy, the voice of a person on a trajectory point between the movement start point and the end point.
  • An object of the present invention is to provide a directivity control device, a directivity control method, a storage medium, and a directivity control system that suppress degradation of the efficiency of monitoring work.
  • The present invention is a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, including: a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; and an information acquisition unit that acquires information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target, wherein the directivity forming unit switches the directivity of the sound in a direction toward the monitoring target corresponding to the second designated position, using the information on the second designated position acquired by the information acquisition unit.
  • The present invention is also a directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the method including: a step of forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching the directivity of the sound in a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
  • The present invention is also a storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the processing including: a step of forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching the directivity of the sound in a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
  • The present invention is also a directivity control system including: an imaging unit that images a sound collection area; a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and a directivity control device that controls the directivity of the sound collected by the first sound collection unit, wherein the directivity control device forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit and, using information on a second designated position designated in accordance with the movement of the monitoring target, switches the directivity of the sound.
  • According to the present invention, even when the monitoring target on the image moves, the directivity of the sound can be appropriately formed so as to follow the monitoring target, and degradation of the efficiency of the supervisor's monitoring work can be suppressed.
  • An explanatory drawing showing an operation example of the manual tracking process
  • An explanatory drawing showing an operation example of changing a tracking point by the manual tracking process when the tracking point automatically designated in the automatic tracking process is incorrect
  • An explanatory drawing showing the slow playback process in the recording/playback mode and the slow playback mode
  • An explanatory drawing showing the enlarged display process in the enlarged display mode
  • An explanatory drawing showing the automatic scroll process after the enlarged display process in the enlarged display mode
  • (A) A flowchart explaining a first example of the overall flow of the manual tracking process in the directivity control system of the first embodiment, and (B) a flowchart explaining a second example of the overall flow of the manual tracking process in the directivity control system of the first embodiment
  • A flowchart explaining a second example of the automatic tracking process shown in FIG. 10
  • A flowchart explaining an example of the tracking correction process shown in (A)
  • A flowchart explaining a third example of the automatic tracking process shown in FIG. 10
  • A flowchart explaining an example of the tracking assist process
  • (A) A flowchart illustrating an example of the automatic scroll process necessity determination process, and (B) an explanatory diagram of the scroll necessity determination line in the automatic scroll process necessity determination process
  • (A) A flowchart explaining an example of the overall flow of the flow line display reproduction process using the tracking list in the directivity control system of the first embodiment, and (B) a flowchart explaining an example of the reproduction start time calculation process shown in (A)
  • A flowchart explaining an example of the flow line display process
  • (A) A flowchart explaining an example of the audio output process, and (B) a flowchart explaining an example of the image privacy protection process
  • (A) A diagram showing an example of the waveform of an audio signal corresponding to the pitch before the voice change process, (B) a diagram showing an example of the waveform of an audio signal corresponding to the pitch after the voice change process, and (C) an explanatory diagram of the process of blurring the outline of a detected person's face
  • A block diagram showing a system configuration example of the directivity control system of the second embodiment
  • An explanatory drawing showing the automatic switching process of the camera device used for capturing the image displayed on the display device
  • An explanatory drawing showing the automatic switching process of the omnidirectional microphone array device used for collecting the sound of the monitoring target
  • An explanatory drawing showing the manual switching process of the camera device used for capturing the image displayed on the display device
  • An explanatory drawing showing the manual switching process of the omnidirectional microphone array device used for collecting the sound of the monitoring target
  • An explanatory drawing showing the selection process of the optimal omnidirectional microphone array device used for collecting the sound of the monitoring target
  • (A) A flowchart explaining an example of the automatic switching process of the omnidirectional microphone array device in the directivity control system of the second embodiment, and (B) a flowchart showing an example of the microphone switching determination process shown in (A)
  • (A) A flowchart explaining an example of the manual switching process of the camera device in the directivity control system of the second embodiment, and (B) a flowchart explaining an example of the manual switching process of the omnidirectional microphone array device in the directivity control system of the second embodiment
  • (A) A flowchart explaining a first example of the optimal omnidirectional microphone array device selection process in the directivity control system of the second embodiment, and (B) a flowchart explaining a second example of the optimal omnidirectional microphone array device selection process in the directivity control system of the second embodiment
  • The directivity control system of each embodiment is used, for example, as a monitoring system (including manned and unmanned monitoring systems) installed in a factory, a public facility (for example, a library or an event venue), or a store (for example, a retail store or a bank).
  • The present invention can also be expressed as a program for causing a directivity control device, which is a computer, to execute the operations defined by the directivity control method, or as a computer-readable recording medium on which such a program is recorded.
  • FIG. 1 is an explanatory diagram illustrating an outline of operations of the directivity control systems 100 and 100A according to the first embodiment.
  • FIG. 2 is a block diagram illustrating a first system configuration example of the directivity control system 100 according to the first embodiment.
  • FIG. 3 is a block diagram illustrating a second system configuration example of the directivity control system 100A according to the first embodiment.
  • The camera device C1 images a monitoring target (for example, the person HM1) of the directivity control systems 100 and 100A used as a monitoring system, and transmits the image data obtained by the imaging via the network NW.
  • The person HM1 may be stationary or moving, but is described here as moving.
  • The person HM1 moves from the tracking position A1 (x1, y1, z0) at the tracking time t1 to the tracking position A2 (x2, y2, z0) by the tracking time t2.
  • The tracking point refers to the position at which the user designates the person HM1 on the tracking screen TRW (that is, a position on the tracking screen TRW) when an image of the moving person HM1 captured by the camera device C1 is displayed on the tracking screen TRW of the display device 35.
  • Data of the tracking position and the tracking time are associated with each tracking point (see, for example, FIG. 16B described later).
  • The tracking position is a three-dimensional coordinate indicating the position in real space corresponding to the position on the tracking screen TRW at which the person HM1 is designated.
  • The tracking screen TRW is, among the screens on which images captured by a camera device (for example, the camera device C1) are displayed on the display device 35 (hereinafter referred to as "camera screens"), a screen showing the monitoring target (for example, the person HM1) that is subject to the voice tracking process described later.
  • In other words, a screen on which no monitoring target such as the person HM1 is shown is referred to as a camera screen, and a screen on which the monitoring target is shown is referred to as a tracking screen.
  • The omnidirectional microphone array device M1 picks up the sound emitted by the person HM1 and transmits the collected sound data to the directivity control device 3 connected via the network NW.
  • The directivity control device 3 forms the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the tracking position A1. When the person HM1 moves from the tracking position A1 to the tracking position A2, the directivity control device 3 switches the directivity of the collected sound to the direction from the omnidirectional microphone array device M1 toward the tracking position A2.
  • In other words, as the person HM1, the monitoring target, moves from the tracking position A1 to the tracking position A2, the directivity control device 3 makes the directivity of the collected sound follow, from the direction from the omnidirectional microphone array device M1 toward the tracking position A1 to the direction from the omnidirectional microphone array device M1 toward the tracking position A2; that is, it performs the sound tracking process (sketched below).
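  • As an illustrative aside (not part of the patent text), the sound tracking process can be modeled in a few lines: each designation yields a tracking point carrying its screen position, its real-space coordinates, and its tracking time, and the beam is re-steered toward the newest point. The following Python sketch uses hypothetical names throughout.

      import math
      import time
      from dataclasses import dataclass, field

      @dataclass
      class TrackingPoint:
          """One designation on the tracking screen TRW (hypothetical model)."""
          screen_xy: tuple      # position on the tracking screen TRW (pixels)
          world_xyz: tuple      # corresponding real-space position, e.g. (x1, y1, z0)
          tracking_time: float  # time at which the position was designated

      @dataclass
      class TrackingList:
          """Simplified stand-in for the tracking list LST kept in the memory 33."""
          points: list = field(default_factory=list)

          def add(self, screen_xy, world_xyz):
              self.points.append(TrackingPoint(screen_xy, world_xyz, time.time()))

      def direction_to(mic_xyz, target_xyz):
          """Horizontal/vertical angles (degrees) from the array to the target."""
          dx, dy, dz = (t - m for t, m in zip(target_xyz, mic_xyz))
          theta_h = math.degrees(math.atan2(dy, dx))
          theta_v = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
          return theta_h, theta_v

      # Sound tracking: when a new tracking point is designated (A1 -> A2),
      # the directivity is switched toward the newest point.
      mic_pos = (0.0, 0.0, 3.0)            # e.g. ceiling-mounted array (assumed)
      lst = TrackingList()
      lst.add((120, 80), (2.0, 1.0, 0.0))  # tracking position A1 at time t1
      lst.add((160, 90), (3.5, 1.2, 0.0))  # tracking position A2 at time t2
      print("beam (theta_h, theta_v) =", direction_to(mic_pos, lst.points[-1].world_xyz))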
  • The directivity control system 100 shown in FIG. 2 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3, and the recorder device 4. Here, n and m are integers of 1 or more and may be equal or different. The same applies to the following embodiments.
  • the camera devices C1, ..., Cn, the omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3, and the recorder device 4 are connected to each other via a network NW.
  • the network NW may be a wired network (for example, an intranet or the Internet), or a wireless network (for example, a wireless LAN (Local Area Network), WiMAX (registered trademark), or a wireless WAN (Wide Area Network)).
  • In the following, in order to simplify the description, it is assumed that one camera device C1 and one omnidirectional microphone array device M1 are provided.
  • In the description below, the housing of the camera device C1 and the housing of the omnidirectional microphone array device M1 are separately attached at different positions, but they may be integrally attached at the same position.
  • The camera device C1, as an example of the imaging unit, is fixedly installed on, for example, the ceiling surface of an event venue, functions as a monitoring camera in the monitoring system, and, by remote operation from a monitoring control room (not shown) connected to the network NW, captures images within a predetermined angle of view in a predetermined sound collection area (for example, a predetermined area in the event venue).
  • The camera device C1 may be a camera having a PTZ function or a camera capable of capturing an omnidirectional image. When the camera device C1 is a camera capable of capturing an omnidirectional image, it transmits the image data showing the omnidirectional video of the sound collection area (that is, the omnidirectional image data), or planar image data generated by applying correction processing and panorama conversion to the omnidirectional image data, to the directivity control device 3 or the recorder device 4 via the network NW.
  • When a position in the image data is designated, the coordinate data of the designated position is passed to the camera device C1, and the camera device C1 transmits to the directivity control device 3 data on the distance and direction (including the horizontal angle and the vertical angle; the same applies hereinafter) from the camera device C1 to the sound position in the real space corresponding to the designated position (hereinafter simply abbreviated as "sound position").
  • The omnidirectional microphone array device M1, as an example of the sound collection unit, is fixedly installed on, for example, the ceiling surface of the event venue, and includes at least a microphone unit in which a plurality of microphone units 22 and 23 (see FIGS. 36A to 36E) are provided at even intervals, and a CPU (Central Processing Unit) that controls the operation of the microphone units 22 and 23 of the microphone unit.
  • When the power is turned on, the omnidirectional microphone array device M1 performs predetermined audio signal processing (for example, amplification, filtering, and addition) on the audio data of the sound collected by the microphone elements of the microphone unit, and transmits the resulting audio data to the directivity control device 3 or the recorder device 4 via the network NW.
  • FIGS. 36A to 36E are external views of the casing of the omnidirectional microphone array apparatus M1.
  • The omnidirectional microphone array devices M1C, M1A, M1B, M1, and M1D shown in FIGS. 36A to 36E differ in appearance and in the arrangement of the plurality of microphone units, but their functions as omnidirectional microphone array devices are the same.
  • FIG. 36A shows the omnidirectional microphone array device M1C having a disk-shaped casing 21.
  • In the casing 21, a plurality of microphone units 22 and 23 are arranged concentrically. Specifically, the plurality of microphone units 22 are arranged along the circumference of a circle concentric with the casing 21, and the plurality of microphone units 23 are arranged along the circumference of a smaller concentric circle inside the casing 21.
  • The microphone units 22 are spaced widely apart, have large diameters, and have characteristics suitable for the low frequency range, whereas the microphone units 23 are spaced narrowly, have small diameters, and have characteristics suitable for the high frequency range.
  • FIG. 36B shows the omnidirectional microphone array device M1A having a disk-shaped casing 21.
  • In the casing 21, a plurality of microphone units 22 are arranged in a cross shape at equal intervals along the vertical and horizontal directions, the two arrays crossing at the center of the casing 21.
  • In the omnidirectional microphone array device M1A, since the plurality of microphone units 22 are arranged linearly in two directions, the amount of calculation required to form the directivity of the audio data can be reduced.
  • The plurality of microphone units 22 may also be arranged in only one row, either vertical or horizontal.
  • The omnidirectional microphone array device M1B shown in FIG. 36C has a disk-shaped casing 21B with a smaller diameter than that of the omnidirectional microphone array device M1C shown in FIG. 36A.
  • In the casing 21B, a plurality of microphone units 22 are arranged at equal intervals along the circumference.
  • Since the distance between the microphone units 22 is short, the omnidirectional microphone array device M1B shown in FIG. 36C has characteristics suitable for the high frequency range.
  • FIG. 36D shows the omnidirectional microphone array device M1 having a donut-shaped or ring-shaped casing 21C, in which an opening 21a with a predetermined diameter is formed at the center.
  • In the embodiments, this omnidirectional microphone array device M1 shown in FIG. 36D is used.
  • In the casing 21C, a plurality of microphone units 22 are arranged concentrically at equal intervals along the circumferential direction of the casing 21C.
  • The omnidirectional microphone array device M1D shown in FIG. 36E has a rectangular casing 21D.
  • In the casing 21D, a plurality of microphone units 22 are arranged at equal intervals along the outer periphery.
  • Since the casing 21D is rectangular, the omnidirectional microphone array device M1D can be installed easily even at, for example, a corner or on a wall surface.
  • The microphone units 22 and 23 of the omnidirectional microphone array device M1 may be omnidirectional microphones, bidirectional microphones, unidirectional microphones, sharp-directional microphones, super-directional microphones (for example, shotgun microphones), or a combination of these.
  • The directivity control devices 3 and 3A may each be, for example, a stationary PC (Personal Computer) installed in a monitoring control room (not shown), or a data communication terminal that can be carried by the user, such as a mobile phone, a PDA (Personal Digital Assistant), a tablet terminal, or a smartphone.
  • the directivity control device 3 includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34, a display device 35, and a speaker device 36.
  • the signal processing unit 34 includes at least a directivity direction calculation unit 34a, an output control unit 34b, and a tracking processing unit 34c.
  • The communication unit 31 receives the image data transmitted from the camera device C1 or the audio data transmitted from the omnidirectional microphone array device M1, and outputs the received data to the signal processing unit 34.
  • The operation unit 32 is a user interface (UI) for notifying the signal processing unit 34 of the user's input operations, and is, for example, a pointing device such as a mouse and a keyboard.
  • The operation unit 32 may also be configured using a touch panel arranged over the display screen of the display device 35, capable of detecting input operations with the user's finger FG or a stylus pen.
  • The operation unit 32 outputs to the signal processing unit 34 the coordinate data of a position designated with the cursor CSR through the user's mouse operation or with the user's finger FG in the image data displayed on the display device 35 (that is, the image data captured by the camera device C1).
  • the memory 33 is configured by using, for example, a RAM (Random Access Memory), and functions as a work memory during operation of each unit of the directivity control device 3.
  • The memory 33, as an example of the image storage unit or the audio storage unit, is configured using, for example, a hard disk or a flash memory, and stores the image data or audio data held in the recorder device 4, that is, image data captured by the camera device C1 over a certain period or audio data collected by the omnidirectional microphone array device M1 over a certain period.
  • The memory 33, as an example of the designation list storage unit, stores the data of the tracking list LST (see, for example, FIG. 16B), a designation list containing the data of all designated positions and designation times (described later) on the tracking screen TRW of the image data displayed on the display device 35.
  • The signal processing unit 34 is configured using, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a DSP (Digital Signal Processor), and performs control processing that supervises the operation of each unit of the directivity control device 3, data input/output processing with the other units, data calculation processing, and data storage processing.
  • When the coordinate data of the position in the image data designated with the cursor CSR through the user's mouse operation or with the user's finger FG is obtained from the operation unit 32, the directivity direction calculation unit 34a has the coordinate data transmitted from the communication unit 31 to the camera device C1, and acquires from the communication unit 31 the data on the distance and direction from the installation position of the camera device C1 to the sound (sound source) position in the real space corresponding to the designated position of the image data.
  • Using the data on the distance and direction from the installation position of the camera device C1 to the sound position, the directivity direction calculation unit 34a calculates the directivity direction coordinates (θMAh, θMAv) from the installation position of the omnidirectional microphone array device M1 to the sound position.
  • When the housings of the camera device C1 and the omnidirectional microphone array device M1 are attached at different positions, the directivity direction calculation unit 34a calculates the directivity direction coordinates (θMAh, θMAv) from the omnidirectional microphone array device M1 to the sound position (sound source position), using predetermined calibration parameter data calculated in advance and the data on the direction (horizontal angle, vertical angle) from the camera device C1 to the sound position (sound source position).
  • The calibration is an operation for calculating or acquiring the predetermined calibration parameters necessary for the directivity direction calculation unit 34a of the directivity control device 3 to calculate the directivity direction coordinates (θMAh, θMAv).
  • Specific contents of the calibration method and calibration parameters are not particularly limited, and can be realized, for example, within the scope of known techniques.
  • When the omnidirectional microphone array device M1 is integrally attached so as to surround the camera device C1, the direction (horizontal angle, vertical angle) from the camera device C1 to the sound position (sound source position) can be used as the directivity direction coordinates (θMAh, θMAv) from the omnidirectional microphone array device M1 to the sound position.
  • Here, θMAh denotes the horizontal angle of the directivity direction from the installation position of the omnidirectional microphone array device M1 to the sound position, and θMAv denotes the vertical angle of the directivity direction from the installation position of the omnidirectional microphone array device M1 to the sound position.
  • It is assumed that the reference directions (0-degree directions) of the horizontal angles of the camera device C1 and the omnidirectional microphone array device M1 coincide (a numeric sketch of the conversion follows below).
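  • To make the preceding calculation concrete, here is a minimal numeric sketch, under the assumption (not stated explicitly above) that the calibration yields a rigid transform (rotation R, translation t) from the camera frame to the microphone-array frame; all names are illustrative.

      import numpy as np

      def angles_to_unit(theta_h, theta_v):
          """Unit vector for a (horizontal, vertical) angle pair in degrees."""
          h, v = np.radians(theta_h), np.radians(theta_v)
          return np.array([np.cos(v) * np.cos(h), np.cos(v) * np.sin(h), np.sin(v)])

      def unit_to_angles(u):
          """Inverse of angles_to_unit: returns (theta_h, theta_v) in degrees."""
          return (np.degrees(np.arctan2(u[1], u[0])),
                  np.degrees(np.arctan2(u[2], np.hypot(u[0], u[1]))))

      # Hypothetical calibration parameters mapping camera-frame coordinates
      # into the microphone-array frame.
      R = np.eye(3)                  # assume aligned axes for this sketch
      t = np.array([0.5, 0.0, 0.0])  # array mounted 0.5 m from the camera (assumed)

      def directivity_direction(cam_theta_h, cam_theta_v, distance_m):
          """(theta_MAh, theta_MAv): direction from the array to the sound position."""
          p_cam = distance_m * angles_to_unit(cam_theta_h, cam_theta_v)
          p_arr = R @ p_cam + t      # sound position in the array frame
          return unit_to_angles(p_arr)

      # When the array surrounds the camera (t ~ 0), the camera's own angles can
      # be used directly, as noted above.
      print(directivity_direction(30.0, -20.0, 5.0))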
  • the output control unit 34b controls the operations of the display device 35 and the speaker device 36.
  • The output control unit 34b, as an example of the display control unit, displays on the display device 35 the image data transmitted from the camera device C1, for example in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG.
  • When the output control unit 34b, as an example of the audio output control unit, acquires the audio data transmitted from the omnidirectional microphone array device M1, or acquires from the recorder device 4 audio data collected by the omnidirectional microphone array device M1 over a certain period, it outputs the audio data to the speaker device 36, for example in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG.
  • When the output control unit 34b, as an example of the image reproduction unit, acquires from the recorder device 4 image data captured by the camera device C1 over a certain period, it causes the display device 35 to reproduce the image data, for example in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG.
  • The output control unit 34b, as an example of the directivity forming unit, uses the audio data transmitted from the omnidirectional microphone array device M1 or the audio data acquired from the recorder device 4 to form the directivity (beam) of the collected sound in the directivity direction indicated by the directivity direction coordinates (θMAh, θMAv) calculated by the directivity direction calculation unit 34a.
  • Thereby, the directivity control device 3 can relatively increase the volume level of the sound emitted by the monitoring target (for example, the person HM1) present in the directivity direction in which the directivity is formed, and can relatively decrease the volume level of sound from directions in which no directivity is formed by suppressing it.
  • The tracking processing unit 34c, as an example of the information acquisition unit, acquires information related to the above-described voice tracking process. For example, when a new position is designated, in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG, on the tracking screen TRW of the display device 35 on which the image data captured by the camera device C1 is displayed, the tracking processing unit 34c acquires information on the newly designated position.
  • The information on the newly designated position includes the time of the new designation (designation time), the coordinate information of the sound position (sound source position) at which the monitoring target (for example, the person HM1) in the real space corresponding to the designated position on the image data exists, and the information on the distance from the omnidirectional microphone array device M1 to that sound position (sound source position).
  • The tracking processing unit 34c, as an example of the reproduction time calculation unit, uses the data of the tracking list LST stored in the memory 33 to calculate the reproduction time of the sound at a position on the flow line designated, for example, by an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG (described later).
  • The display device 35, as an example of the display unit, is configured using, for example, an LCD (Liquid Crystal Display) or an organic EL (Electroluminescence) display, and displays the image data captured by the camera device C1 under the control of the output control unit 34b.
  • The speaker device 36, as an example of the sound output unit, outputs the audio data of the sound collected by the omnidirectional microphone array device M1, or that audio data with directivity formed in the directivity direction indicated by the directivity direction coordinates (θMAh, θMAv). The display device 35 and the speaker device 36 may be configured separately from the directivity control device 3.
  • the recorder device 4 stores the image data picked up by the camera device C1 and the sound data of the sound collected by the omnidirectional microphone array device M1 in association with each other.
  • The directivity control system 100A shown in FIG. 3 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3A, and the recorder device 4.
  • In FIG. 3, the same components and operations as those in FIG. 2 are denoted by the same reference numerals; their description is simplified or omitted, and only the differences are described.
  • the directivity control device 3A includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34A, a display device 35, a speaker device 36, and an image processing unit 37.
  • The signal processing unit 34A includes at least a directivity direction calculation unit 34a, an output control unit 34b, a tracking processing unit 34c, and a sound source detection unit 34d.
  • The sound source detection unit 34d detects, from the image data displayed on the display device 35, the sound position (sound source position) in the real space corresponding to the sound uttered by the person HM1, the monitoring target. For example, the sound source detection unit 34d divides the sound collection area of the omnidirectional microphone array device M1 into a plurality of grid areas and measures the sound intensity or volume level with the directivity formed from the omnidirectional microphone array device M1 toward the center position of each grid area. The sound source detection unit 34d estimates that the sound source exists in the grid area with the highest sound intensity or volume level among all the grid areas (sketched below). The detection result of the sound source detection unit 34d includes, for example, the information on the distance from the omnidirectional microphone array device M1 to the center position of the grid area with the highest sound intensity or volume level.
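  • The grid search performed by the sound source detection unit 34d can be sketched as follows; beamform_power stands in for forming directivity toward a grid-area center and measuring the resulting volume level (a real implementation would use the delay-and-sum processing described for FIG. 37), and all names are hypothetical.

      import numpy as np

      def beamform_power(mic_signals, grid_center):
          """Placeholder: steer the array toward grid_center and return the power
          of the beamformed signal. Here the channels are simply summed so that
          the sketch runs; the steering itself is omitted."""
          return float(np.mean(np.sum(mic_signals, axis=0) ** 2))

      def detect_sound_source(mic_signals, grid_centers):
          """Estimate that the sound source lies in the grid area whose center
          yields the highest measured sound intensity or volume level."""
          powers = [beamform_power(mic_signals, c) for c in grid_centers]
          best = int(np.argmax(powers))
          return grid_centers[best], powers[best]

      signals = np.random.randn(4, 16000)  # 4 dummy microphone channels, 1 s
      grid = [(x, y) for x in (1.0, 2.0, 3.0) for y in (1.0, 2.0, 3.0)]
      center, level = detect_sound_source(signals, grid)
      print("estimated sound source near grid center", center)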
  • In response to an instruction from the signal processing unit 34A, the image processing unit 37 performs predetermined image processing (for example, VMD (Video Motion Detector) processing for detecting the motion of the person HM1, detection of a person's face and face orientation, or person detection) on the image data displayed on the display device 35, and outputs the image processing result to the signal processing unit 34A.
  • The image processing unit 37 detects the outline DTL of the face of the monitoring target (for example, the person HM1) displayed on the display device 35, for example in response to an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG, and masks the face. Specifically, the image processing unit 37 calculates a rectangular region containing the detected face outline DTL and adds a predetermined blur to the rectangular region (see FIG. 22C).
  • FIG. 22C is an explanatory diagram of the process of blurring the outline DTL of a detected person's face.
  • The image processing unit 37 outputs the image data generated by the blurring process to the signal processing unit 34A (a sketch of the masking step follows below).
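  • The masking step can be sketched with OpenCV's Gaussian blur (the patent does not name a specific blur algorithm; the rectangle is assumed to come from the face outline detection, and the names are illustrative).

      import cv2
      import numpy as np

      def blur_face(image, rect, ksize=31):
          """Add a blur to the rectangular region containing the detected face
          outline DTL. rect = (x, y, w, h); ksize must be odd."""
          x, y, w, h = rect
          roi = image[y:y + h, x:x + w]
          image[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (ksize, ksize), 0)
          return image

      frame = np.zeros((480, 640, 3), dtype=np.uint8)  # dummy frame
      masked = blur_face(frame, (200, 100, 120, 150))  # hypothetical face rectangle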
  • FIG. 37 is a simplified explanatory diagram of the delay-and-sum method by which the omnidirectional microphone array device M1 forms the directivity of the audio data in the direction of the angle θ.
  • Suppose the microphone elements 221 to 22n are arranged on a straight line. In this case the directivity covers a two-dimensional region within a plane; to form directivity in three-dimensional space, the microphones may be arranged in a two-dimensional array and the same processing method applied.
  • The sound source 80 is, for example, a monitoring target (for example, the person HM1) present in the direction at a predetermined angle θ with respect to the surface of the casing 21 of the omnidirectional microphone array device M1. The distance d between the microphone elements 221, 222, 223, ..., 22(n−1), 22n is constant.
  • The sound wave emitted from the sound source 80 first reaches the microphone element 221 and is collected, then reaches the microphone element 222 and is collected, and so on, until it finally reaches the microphone element 22n and is collected.
  • When the sound source 80 is, for example, the monitoring target (for example, the person HM1), the direction from each microphone element 221, 222, 223, ..., 22(n−1), 22n of the omnidirectional microphone array device M1 toward the sound source 80 is the same as the direction from each microphone (microphone element) of the omnidirectional microphone array device M1 toward the sound position (sound source position) corresponding to the position designated by the user on the display device 35.
  • Here, τ1 is the difference between the time at which the sound wave reaches the microphone element 221 and the time at which it reaches the microphone element 22n, τ2 is the difference between the time at which the sound wave reaches the microphone element 222 and the time at which it reaches the microphone element 22n, and likewise τ(n−1) is the difference between the time at which the sound wave reaches the microphone element 22(n−1) and the time at which it reaches the microphone element 22n.
  • The omnidirectional microphone array device M1 includes A/D converters 241, 242, 243, ..., 24(n−1), 24n provided corresponding to the microphone elements 221, 222, 223, ..., 22(n−1), 22n, delay units 251, 252, 253, ..., 25(n−1), 25n, and an adder 26 (see FIG. 37).
  • The omnidirectional microphone array device M1 AD-converts the analog audio data collected by the microphone elements 221, 222, 223, ..., 22(n−1), 22n into digital audio data in the A/D converters 241, 242, 243, ..., 24(n−1), 24n.
  • Then, in the delay units 251, 252, 253, ..., 25(n−1), 25n, the omnidirectional microphone array device M1 gives each channel a delay time corresponding to the arrival time difference at the corresponding microphone element 221, 222, 223, ..., 22(n−1), 22n to align the phases of all the sound waves, and the adder 26 then adds the delayed audio data. Thereby, the omnidirectional microphone array device M1 can form the directivity of the audio data in the direction of the predetermined angle θ for the microphone elements 221, 222, 223, ..., 22(n−1), 22n.
  • The delay times D1, D2, D3, ..., D(n−1), Dn set in the delay units 251, 252, 253, ..., 25(n−1), 25n correspond to the arrival time differences τ1, τ2, τ3, ..., τ(n−1), respectively, and are expressed by Equation (1):

      D1 = L1 / Vs, D2 = L2 / Vs, D3 = L3 / Vs, ..., D(n−1) = L(n−1) / Vs, Dn = 0   (1)
  • Here, L1 is the difference in the sound-wave arrival distance between the microphone element 221 and the microphone element 22n, L2 is the difference in the sound-wave arrival distance between the microphone element 222 and the microphone element 22n, L3 is the difference in the sound-wave arrival distance between the microphone element 223 and the microphone element 22n, and L(n−1) is the difference in the sound-wave arrival distance between the microphone element 22(n−1) and the microphone element 22n.
  • Vs is the speed of the sound wave (sound speed), and L1, L2, L3, ..., L(n−1) and Vs are known values.
  • The delay time Dn set in the delay unit 25n is 0 (zero).
  • In this way, by changing the delay times D1, D2, D3, ..., D(n−1), Dn set in the delay units 251, 252, 253, ..., 25(n−1), 25n, the omnidirectional microphone array device M1 can easily form the directivity of the audio data collected by the microphone elements 221, 222, 223, ..., 22(n−1), 22n built into the microphone units 22 and 23.
  • The description of the directivity forming process shown in FIG. 37 assumes, for simplicity, that the omnidirectional microphone array device M1 performs the process; the same applies to the other omnidirectional microphone array devices (for example, the omnidirectional microphone array device Mm). However, when the output control unit 34b of the signal processing unit 34 or 34A of the directivity control device 3 or 3A includes the same number of A/D converters 241 to 24n and delay units 251 to 25n as the number of microphones of the omnidirectional microphone array device M1, together with the adder 26, the output control unit 34b may perform the directivity forming process shown in FIG. 37 on the audio collected by each microphone element of the omnidirectional microphone array device M1 (a runnable sketch follows below).
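  • The delay-and-sum processing of FIG. 37 and Equation (1) can be condensed into a short runnable sketch for a linear array. Under the geometry described above, Lk = (n − k) · d · cosθ for microphone spacing d, so channel k receives the delay Dk = Lk / Vs; integer-sample delays are used here for simplicity, and the names are illustrative.

      import numpy as np

      def delay_and_sum(channels, d, theta_deg, fs, vs=343.0):
          """Delay-and-sum beamformer for a linear microphone array.
          channels: (n, samples) array; channel 0 corresponds to microphone
          element 221 (reached first by the sound wave), channel n-1 to element
          22n (reached last, so its delay Dn is 0 per Equation (1))."""
          n, length = channels.shape
          cos_t = np.cos(np.radians(theta_deg))
          out = np.zeros(length)
          for k in range(n):
              delay_sec = (n - 1 - k) * d * cos_t / vs  # Dk = Lk / Vs
              shift = int(round(delay_sec * fs))        # integer-sample delay
              if shift:
                  out[shift:] += channels[k, :length - shift]
              else:
                  out += channels[k]
          return out / n

      fs = 16000
      mics = np.random.randn(8, fs)  # 8 dummy channels; 2 cm spacing assumed
      beam = delay_and_sum(mics, d=0.02, theta_deg=45.0, fs=fs)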
  • On the display device 35, image data of the video captured in real time by the camera device C1 is displayed to the user (for example, a supervisor; the same applies hereinafter).
  • The tracking mode is used when performing follow-up control (the voice tracking process) of the directivity of the sound collected by the omnidirectional microphone array device M1 as the monitoring target (for example, the person HM1) moves.
  • The tracking processing method is the method of setting the position of the monitoring target (for example, a designated position on the tracking screen TRW of the display device 35, or a position in real space) when performing follow-up control (the voice tracking process) of the directivity of the sound collected by the omnidirectional microphone array device M1 as the monitoring target (for example, the person HM1) moves, and is divided into the manual tracking process and the automatic tracking process. Details of each will be described later.
  • The number of tracking targets indicates how many monitoring targets are subject to the follow-up control (the voice tracking process) of the directivity of the sound collected by the omnidirectional microphone array device M1, for example one person, or two or more persons.
  • The manual designation method refers to the way the user designates a tracking point on the tracking screen TRW in the manual tracking process (described later); for example, a click or drag operation of the cursor CSR through a mouse operation, or a touch or touch-slide operation with the user's finger FG.
  • The slow playback mode presupposes that the recording/playback mode is on, and is used when the image data reproduced on the display device 35 is played back at a speed value smaller than the initial value (for example, the normal value).
  • the enlarged display mode is used when the monitored object (for example, the person HM1) displayed on the tracking screen TRW of the display device 35 is enlarged and displayed.
  • The voice privacy protection mode is used when voice processing (for example, voice change processing) is performed to make it difficult to identify whose voice is being output when the voice data collected by the omnidirectional microphone array device M1 is output from the speaker device 36.
  • The image privacy protection mode is used when image processing is performed to make it difficult to identify the monitoring target (for example, the person HM1) displayed on the tracking screen TRW of the display device 35 when the enlarged display mode is on.
  • The connection mode is used when connecting the designated positions (see, for example, the point marker MR1 described later) designated on the tracking screen TRW by manual or automatic designation during the movement of the monitoring target. If the connection mode is "every time", adjacent point markers are connected each time a position is designated during the movement of the monitoring target. If the connection mode is "batch", the point markers corresponding to all the designated positions obtained during the movement of the monitoring target are connected to their adjacent point markers at once.
  • The correction mode is used to switch from the automatic tracking process to the manual tracking process when a position automatically designated in the automatic tracking process deviates from the movement path of the monitoring target.
  • The multiple-camera switching method is used to switch the camera device used for capturing the image of the monitoring target among the plurality of camera devices C1 to Cn. Details of the multiple-camera switching method are described in the second embodiment.
  • The multiple-microphone switching method is used to switch the omnidirectional microphone array device used for collecting the sound emitted by the monitoring target among the plurality of omnidirectional microphone array devices M1 to Mm. Details of the multiple-microphone switching method are described in the second embodiment.
  • The tracking point upper limit setting mode is used to set an upper limit on the number of tracking points. For example, when the tracking point upper limit setting mode is on and the number of tracking points reaches the upper limit, the tracking processing unit 34c may reset (erase) all the tracking points, or may display on the tracking screen TRW that the number of tracking points has reached the upper limit. Furthermore, a plurality of voice tracking processes can be executed as long as the number of tracking points is within the upper limit.
  • A predetermined setting button or setting menu of the monitoring system application (not shown) for these settings is displayed on the tracking screen TRW, and the setting button or setting menu is operated by a click of the cursor CSR through the user's mouse operation or by a touch of the user's finger FG (a configuration sketch follows below).
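  • Purely as an illustration, the settings above could be grouped into a single configuration object confirmed from the setting menu; none of the names or defaults below appear in the patent.

      from dataclasses import dataclass
      from enum import Enum
      from typing import Optional

      class TrackingMethod(Enum):
          MANUAL = "manual"     # manual tracking process
          AUTO = "auto"         # automatic tracking process

      class ConnectionMode(Enum):
          EVERY_TIME = "every"  # connect adjacent point markers at each designation
          BATCH = "batch"       # connect all point markers at once afterwards

      @dataclass
      class TrackingSettings:
          tracking_mode: bool = False
          tracking_method: TrackingMethod = TrackingMethod.MANUAL
          num_targets: int = 1                    # number of monitoring targets
          recording_playback_mode: bool = False
          slow_playback_mode: bool = False        # requires recording/playback mode
          enlarged_display_mode: bool = False
          voice_privacy_protection: bool = False  # e.g. voice change processing
          image_privacy_protection: bool = False  # requires enlarged display mode
          connection_mode: ConnectionMode = ConnectionMode.EVERY_TIME
          correction_mode: bool = False
          tracking_point_upper_limit: Optional[int] = None  # None = no limit set

      settings = TrackingSettings(tracking_mode=True,
                                  recording_playback_mode=True,
                                  slow_playback_mode=True)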
  • FIG. 4 is an explanatory diagram illustrating an operation example of the manual tracking process.
  • In FIG. 4, the movement of the person HM1, the monitoring target, is shown on the tracking screen TRW displayed on the display device 35, and three tracking points b1, b2, and b3 are designated by click or drag operations of the cursor CSR through the user's mouse operation.
  • The tracking processing unit 34c acquires the tracking time t1 at which the cursor CSR designates the tracking point b1, the tracking time t2 at which it designates the tracking point b2, and the tracking time t3 at which it designates the tracking point b3. The tracking processing unit 34c stores in the memory 33 the coordinate information of the tracking point b1 on the tracking screen TRW, or the three-dimensional coordinates indicating the position in real space corresponding to that coordinate information, in association with the information of the tracking time t1; likewise, it stores the coordinates of the tracking point b2 in association with the information of the tracking time t2, and the coordinates of the tracking point b3 in association with the information of the tracking time t3.
  • The output control unit 34b displays the point marker MR1 at the tracking point b1 on the tracking screen TRW, the point marker MR2 at the tracking point b2, and the point marker MR3 at the tracking point b3.
  • Thereby, the output control unit 34b can explicitly show, as a trajectory on the tracking screen TRW, the tracking points through which the moving person HM1 has passed.
  • Further, the output control unit 34b displays the flow line LN1 by connecting the point markers MR1 and MR2, and displays the flow line LN2 by connecting the point markers MR2 and MR3.
  • FIG. 5 is an explanatory diagram illustrating an operation example of changing the tracking point by the manual tracking process when the tracking point automatically designated in the automatic tracking process is incorrect.
  • In FIG. 5, the tracking point automatically designated by the image processing unit 37 or the sound source detection unit 34d deviates from the movement path of the person HM1, so an incorrect flow line LNW connecting the point markers MR1 and MR2W is displayed.
  • In this case, the automatic tracking process is switched to the manual tracking process, and when the correct tracking point is designated, for example by a click operation with the cursor CSR, the output control unit 34b connects the point markers MR1 and MR2R and displays the correct flow line LNR on the tracking screen TRW.
  • FIG. 6 is an explanatory diagram showing the slow playback process in the recording playback mode and the slow playback mode.
  • In FIG. 6, when the recording/playback mode and the slow playback mode are on, the output control unit 34b plays back slowly, on the tracking screen TRW, the image data of the video showing the movement of the person HM1 at a speed value smaller than the initial playback speed (the normal value) (see the tracking screen TRW on the lower side of FIG. 6).
  • Since the output control unit 34b can slow down the movement of the person HM1 on the tracking screen TRW, the tracking point can be designated easily in the manual tracking process or the automatic tracking process.
  • The output control unit 34b may also perform the slow playback process without waiting for a touch operation of the user's finger FG when the moving speed of the person HM1 is equal to or higher than a predetermined value.
  • The playback speed during slow playback may be a fixed value, or may be changed as appropriate by an input operation with the cursor CSR through the user's mouse operation or with the user's finger FG.
  • FIG. 7 is an explanatory diagram showing an enlarged display process in the enlarged display mode.
  • In FIG. 7, when a position on the tracking screen TRW is clicked in the enlarged display mode, the output control unit 34b enlarges and displays the tracking screen TRW at a predetermined magnification, centered on the clicked position (see the tracking screen TRW on the lower side of FIG. 7).
  • The output control unit 34b may instead enlarge and display the content of the tracking screen TRW on a separate pop-up screen (not shown) centered on the clicked position. This makes it easy for the user to compare the non-enlarged tracking screen TRW with the enlarged pop-up screen through a simple designation operation, so the user can easily designate the monitoring target (the person HM1).
  • The output control unit 34b may also enlarge and display the contents of the displayed camera screen with the center of the display device 35 as the reference. Thereby, when the monitoring target (the person HM1) appears near the center of the display device 35, the user can designate the monitoring target with a simple designation operation.
  • The output control unit 34b may also enlarge the display centered on the position corresponding to the geometric mean of a plurality of designated positions on the tracking screen TRW. Thereby, the output control unit 34b makes it easy for the user to select among the plurality of monitoring targets shown on the tracking screen TRW.
  • FIG. 8A is an explanatory diagram showing an automatic scroll process after the enlargement display process in the enlargement display mode.
  • After the enlargement display process, an image of the entire imaging area C1RN may no longer fit on the tracking screen TRW.
  • In this case, the output control unit 34b automatically scrolls the tracking screen TRW so that the designated position is displayed at the center of the tracking screen TRW; as the person HM1 shown on the enlarged tracking screen TRW moves, the screen is scrolled automatically so that the user's designated position always remains at the center.
  • Since the person HM1, the monitoring target, is displayed at the center of the tracking screen TRW during the automatic scroll process, the user can select it easily.
  • FIG. 9A is a flowchart illustrating a first example of the overall flow of manual tracking processing in the directivity control system 100 of the first embodiment.
  • FIG. 9B is a flowchart illustrating a second example of the overall flow of manual tracking processing in the directivity control system 100 of the first embodiment.
  • FIGS. 9A and 9B show the overall flow of the manual tracking process in the directivity control system 100 of the present embodiment. The detailed contents of the individual processes are described, where relevant, with reference to the drawings mentioned later.
  • In FIG. 9B, the same contents as those in FIG. 9A are denoted by the same step numbers; their description is simplified or omitted, and only the differences are described.
  • FIGS. 9A and 9B show the operation of the directivity control device 3.
  • As a precondition, the output control unit 34b has formed the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the position designated, by the cursor CSR through the user's mouse operation or by an input operation with the user's finger FG, on the tracking screen TRW of the display device 35 on which the image of the person HM1, the monitoring target captured by the camera device C1, is displayed.
  • In FIG. 9A, if the tracking mode is off (S1, NO), the manual tracking process shown in FIG. 9A ends; if the tracking mode is on (S1, YES), the tracking assist process is started (S2). Details of the tracking assist process will be described later.
  • After step S2, on the tracking screen TRW of the display device 35, a position along the movement path of the person HM1 is designated by a click operation of the cursor CSR through the user's mouse operation or by a touch operation of the user's finger FG (S3).
  • The tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the position designated in step S3 and the designation time, associated with each other as the tracking position and tracking time of the tracking point, and displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S4).
• Note that the point marker may be displayed by the tracking processing unit 34c; the same applies to the following embodiments.
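• As a minimal sketch, the tracking-point storage of step S4 can be modeled as an append to a time-ordered list; the class and function names below are illustrative assumptions, not from the patent.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TrackingPoint:
    position: Tuple[float, float, float]  # (x, y, z) tracking position in real space
    time: float                           # tracking time (designation time)

tracking_list: List[TrackingPoint] = []   # corresponds to the tracking list LST held in memory 33

def store_tracking_point(pos_3d: Tuple[float, float, float], t: float) -> None:
    """Store the tracking position and tracking time in association (cf. step S4)."""
    tracking_list.append(TrackingPoint(pos_3d, t))
```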
• The output control unit 34b forms the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the tracking point designated in step S3 (S5).
• Note that when the tracking processing unit 34c only needs to acquire the tracking position and tracking time data of the tracking points designated along the movement path of the person HM1 in response to input operations with the cursor CSR by the user's mouse operation or the user's finger FG, the operation of step S5 may be omitted.
• In that case, the output control unit 34b does not switch the directivity from the omnidirectional microphone array apparatus M1 to the direction toward the position (sound position, sound source position) of the person HM1 corresponding to the tracking point designated in step S3. The same applies to the following embodiments.
• After step S5, the output control unit 34b performs tracking connection processing (S6). Details of the tracking connection process will be described later with reference to FIG. 15A. After step S6, the output control unit 34b outputs the collected sound having the directivity formed in step S5 from the speaker device 36 (S7). Details of the audio output process will be described later with reference to FIG. 21A. After step S7, the operation of the directivity control device 3 returns to step S1, and the processes of steps S1 to S7 are repeated until the tracking mode is turned off.
• In FIG. 9B, after the tracking mode is turned on (S1, YES), the tracking assist process is started (S2). Details of the tracking assist process will be described later with reference to FIG. 13A.
• After step S2, it is assumed that designation of positions (tracking points) on the movement path of the person HM1 is started on the tracking screen TRW of the display device 35 by a drag operation of the cursor CSR by the user's mouse operation or by a touch-slide operation of the user's finger FG (S3A).
• After step S3A, if the predetermined time (for example, about several seconds) has not elapsed since the storage of the tracking position and tracking time data corresponding to the previous tracking point ended (S8, NO), the drag operation or touch-slide operation started in step S3A is considered not to have ended, and the operation of the directivity control device 3 proceeds to step S7.
• On the other hand, when the predetermined time (for example, about several seconds) has elapsed since the storage of the tracking position and tracking time data corresponding to the previous tracking point ended (S8, YES), the drag operation or touch-slide operation started in step S3A is considered to have ended, and a new tracking point is designated; this elapsed-time check is sketched below. That is, the tracking processing unit 34c stores, in the memory 33, the three-dimensional coordinates indicating the position in real space corresponding to the designated position at the end of the drag operation or touch-slide operation and the designation time, in association with each other, as the tracking position and tracking time of the new tracking point, and displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S4).
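• A minimal sketch of the elapsed-time check of step S8, with an assumed threshold value and an in-memory tracking list; none of the names below come from the patent.

```python
import time

MIN_INTERVAL_S = 3.0   # "about several seconds" (assumed value)
_tracking_list = []    # (position, time) records
_last_store = None     # monotonic time of the previous storage

def on_drag_position(pos_3d):
    """Register a new tracking point only when the predetermined time has
    elapsed since the previous tracking point was stored (cf. step S8)."""
    global _last_store
    now = time.monotonic()
    if _last_store is not None and now - _last_store < MIN_INTERVAL_S:
        return False                       # drag/touch-slide considered still in progress
    _tracking_list.append((pos_3d, now))   # step S4: store position and time in association
    _last_store = now
    return True
```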
• The operation after step S4 is the same as the operation after step S4 shown in FIG. 9A.
  • FIG. 10A is a flowchart for explaining a first example of the entire flow of the automatic tracking process in the directivity control system 100A of the first embodiment.
• FIG. 10B is a flowchart for explaining a first example of the automatic tracking process shown in FIG. 10A.
• FIG. 11A is a flowchart for explaining a second example of the automatic tracking process shown in FIG. 10A.
• FIG. 11B is a flowchart illustrating an example of the tracking correction process illustrated in FIG. 11A.
• FIG. 12 is a flowchart for explaining a third example of the automatic tracking process shown in FIG. 10A.
• As in FIGS. 9A and 9B, in order to avoid complicating the explanation, the overall flow of the automatic tracking process in the directivity control system 100A of the present embodiment will be described first with reference to FIG. 10A, and the detailed contents of each process will be described with reference to the drawings described later.
  • FIG. 10A also shows the operation of the directivity control device 3.
• It is assumed that the output control unit 34b has formed the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the position (speech position, sound source position) of the person HM1 corresponding to the position automatically designated using the detection processing result of the sound source detection unit 34d or the image processing unit 37, on the tracking screen TRW of the display device 35 on which an image of the person HM1 as the monitoring target captured by the camera device C1 is displayed.
• In FIG. 10A, if the tracking mode is on (S1, YES), the tracking assist process is started (S2). Details of the tracking assist process will be described later with reference to FIG. 13A. After step S2, automatic tracking processing is performed (S3B). Details of the automatic tracking process will be described later with reference to FIGS. 10B, 11A, and 12.
• After step S3B, the output control unit 34b forms the directivity of the collected sound in the direction from the omnidirectional microphone array apparatus M1 toward the position (speech position, sound source position) of the person HM1 corresponding to the tracking point automatically designated in step S3B (S5).
• The operation after step S5 is the same as the operation after step S4 shown in FIG. 9A.
• In FIG. 10B, the image processing unit 37 determines, by performing known image processing, whether or not the person HM1 as the monitoring target is detected on the tracking screen TRW of the display device 35; if it determines that the person HM1 has been detected, it outputs the determination result (including the detection position (for example, a known representative point) of the person HM1 and detection time data) to the tracking processing unit 34c of the signal processing unit 34 (S3B-1).
• Alternatively, the sound source detection unit 34d determines, by performing known sound source detection processing, whether or not the position of the sound (sound source) emitted by the person HM1 as the monitoring target is detected on the tracking screen TRW of the display device 35; if it determines that the position of the sound source has been detected, it outputs the determination result (including the sound source detection position and detection time data) to the tracking processing unit 34c (S3B-1). In order to simplify the description of step S3B-1, it is assumed that there is no monitoring target other than the person HM1 on the tracking screen TRW.
• The tracking processing unit 34c automatically sets the designated position of the person HM1 in the automatic tracking process, that is, the tracking point, using the determination result of the image processing unit 37 or the sound source detection unit 34d (S3B-1).
• The tracking processing unit 34c stores, in the memory 33, the three-dimensional coordinates indicating the position in real space corresponding to the detection position automatically designated in step S3B-1 and the detection time, in association with each other, as the tracking position and tracking time of the tracking point, and further displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-2).
• After step S3B-2, the automatic tracking process shown in FIG. 10B ends, and the process proceeds to step S5 shown in FIG. 10A.
• In FIG. 11A, when the first tracking point (initial position) has already been designated (S3B-3, YES), the operation of step S3B-4 is omitted. Otherwise, an input operation (for example, a click operation of the cursor CSR by the user's mouse operation or a touch operation of the user's finger FG) designates a position (tracking point) on the movement path of the person HM1 (S3B-4).
• When the first tracking point has already been designated, or after the first tracking point has been designated in step S3B-4, the tracking processing unit 34c automatically designates the next tracking point centered on the first tracking point, using the determination result of the image processing unit 37 or the sound source detection unit 34d (S3B-5). Since the user's designation of the first tracking point starts the detection processing of the information related to the position of the sound (sound source) emitted by the person HM1, or of the information related to the position of the person HM1, around the first tracking point (initial position) on the tracking screen TRW, each detection process can be performed at high speed.
• The tracking processing unit 34c stores, in the memory 33, the three-dimensional coordinates indicating the position in real space corresponding to the detection position automatically designated in step S3B-5 and the detection time, in association with each other, as the tracking position and tracking time of the tracking point, and further displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-2).
• If an operation for correcting the tracking point is not performed after step S3B-2 (S3B-6, NO), the automatic tracking process shown in FIG. 11A ends, and the process proceeds to step S5 shown in FIG. 10A.
• On the other hand, when an operation for correcting the tracking position corresponding to the tracking point is performed after step S3B-2, for example because the determination result of the image processing unit 37 or the sound source detection unit 34d is incorrect (S3B-6, YES), the tracking correction process shown in FIG. 11B is performed (S3B-7).
• In FIG. 11B, while the voice uttered by the person HM1 moving on the tracking screen TRW is being output, the output of the voice is temporarily stopped by an input operation with the cursor CSR by the user's mouse operation or by the user's finger FG (S3B-7-1).
• After step S3B-7-1, the correction mode is turned on by an input operation with the cursor CSR by the user's mouse operation or the user's finger FG, so that the automatic tracking process temporarily shifts to the manual tracking process, and it is assumed that the correct tracking point is designated (S3B-7-2).
• The output control unit 34b deletes the wrong point marker that was displayed on the tracking screen TRW immediately before the designation in step S3B-7-2, displays a point marker at the changed tracking point, that is, the tracking point designated in step S3B-7-2, and resumes the output of the voice that was temporarily stopped in step S3B-7-1 (S3B-7-3). Further, the tracking processing unit 34c overwrites and stores the position designated in step S3B-7-2 as the tracking point (S3B-7-3). After step S3B-7-3, the tracking correction process shown in FIG. 11B ends, and the process proceeds to step S5 shown in FIG. 10A.
• In FIG. 12, the image processing unit 37 determines, by performing known image processing, whether or not the person HM1 as the monitoring target is detected on the tracking screen TRW of the display device 35 (S3B-8). When it is determined that the person HM1 has been detected (S3B-9, YES), the image processing unit 37 calculates the detection position (for example, a known representative point) of the person HM1 and outputs each data of the detection time and the detection position as a determination result to the tracking processing unit 34c of the signal processing unit 34 (S3B-10).
• In parallel, the sound source detection unit 34d determines, by performing known sound source detection processing, whether or not the position of the sound (sound source) emitted by the person HM1 as the monitoring target is detected on the tracking screen TRW of the display device 35; if it determines that the position has been detected, it calculates the detection position of the sound source and outputs each data of the detection time and the detection position as a determination result to the tracking processing unit 34c (S3B-11).
• The tracking processing unit 34c stores, in the memory 33, the sound source detection position on the tracking screen TRW calculated in step S3B-11 and the detection time, in association with each other, as the tracking position and tracking time of the tracking point, and further displays a point marker at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-12).
• After step S3B-12, the tracking processing unit 34c determines whether or not the distance between the detection position of the person HM1 calculated in step S3B-10 and the detection position of the sound source calculated in step S3B-11 is within a predetermined value (S3B-13). If the distance is within the predetermined value (S3B-13, YES), the automatic tracking process shown in FIG. 12 ends, and the process proceeds to step S5 shown in FIG. 10A.
• On the other hand, if the distance is not within the predetermined value (S3B-13, NO), the tracking correction process shown in FIG. 11B is performed (S3B-7). Since the tracking correction process has already been described with reference to FIG. 11B, its description is omitted here. After step S3B-7, the automatic tracking process shown in FIG. 12 ends, and the process proceeds to step S5 shown in FIG. 10A.
• Thereby, when the distance between the sound source position detected by the sound source detection process and the position of the person HM1 detected by the image processing is equal to or greater than the predetermined value, the tracking processing unit 34c can easily correct and acquire, as the information related to the position of the person HM1, the information related to the position designated by the user's position changing operation in the tracking correction process; the consistency check is sketched below.
• When the distance is within the predetermined value, the tracking processing unit 34c can easily acquire the position of the sound source or the position of the person HM1 as information regarding the position of the person HM1 after the movement, without requiring a change operation by the user.
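• A minimal sketch of the consistency check of step S3B-13, with an assumed threshold; it simply compares the two detected positions and reports whether manual tracking correction is needed.

```python
import math

DIST_THRESHOLD = 1.0  # predetermined value (assumed, in metres)

def detections_consistent(person_pos, source_pos) -> bool:
    """Step S3B-13: accept the automatic tracking result only when the
    image-based person position and the detected sound-source position agree
    to within the predetermined value; otherwise the tracking correction
    process (S3B-7) is performed."""
    return math.dist(person_pos, source_pos) <= DIST_THRESHOLD
```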
• FIG. 13A is a flowchart for explaining an example of the tracking assist process shown in FIGS. 9A, 9B, and 10A.
• In FIG. 13A, when the enlarged display mode of the directivity control devices 3 and 3A is off (S2-1, NO), the operation of the directivity control devices 3 and 3A proceeds to step S2-5.
• On the other hand, when the enlarged display mode of the directivity control devices 3 and 3A is on (S2-1, YES), the directivity control devices 3 and 3A perform image privacy protection processing (S2-2) and then automatic scroll processing (S2-3). Details of the image privacy protection process will be described later with reference to FIG. 21B. Details of the automatic scroll process will be described later with reference to FIGS. 13B, 14A, and 14B.
• After step S2-3, the output control unit 34b enlarges and displays the contents of the tracking screen TRW at a predetermined magnification, centered on the tracking position corresponding to the nearest tracking point on the tracking screen TRW (S2-4).
• The output control unit 34b then plays back the image data of the video showing the movement process of the person HM1 on the tracking screen TRW at a speed value smaller than the initial value (normal value) of the playback speed (S2-6).
• FIG. 13B is a flowchart illustrating an example of the automatic scroll process shown in FIG. 13A.
• FIG. 14A is a flowchart illustrating an example of the automatic scroll process necessity determination process shown in FIG. 13B.
  • FIG. 14B is an explanatory diagram of a scroll necessity determination line in the automatic scroll processing necessity determination processing.
• In FIG. 13B, the tracking processing unit 34c performs an automatic scroll process necessity determination process (S2-3-1). Details of the automatic scroll process necessity determination process will be described later with reference to FIG. 14A.
• After step S2-3-1, if it is determined as a result of the automatic scroll process necessity determination process that the automatic scroll process is necessary (S2-3-2, YES), the output control unit 34b performs a predetermined automatic scroll process on the tracking screen TRW (S2-3-3).
• Specifically, the output control unit 34b automatically scrolls the tracking screen TRW so that the person HM1 is always displayed at the center of the tracking screen TRW along the movement path of the person HM1, in accordance with the input operation with the cursor CSR by the user's mouse operation or the user's finger FG.
• Thereby, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from moving out of the tracking screen TRW, and the user can easily continue to designate the moving person HM1 on the tracking screen TRW.
• Note that when the output control unit 34b displays the tracking screen TRW so that the person HM1 is always displayed at the center of the tracking screen TRW, the automatic scroll process necessity determination process shown in step S2-3-1 may be omitted.
• Alternatively, the output control unit 34b may automatically scroll by a predetermined amount in the moving direction of the person HM1 (for example, the direction beyond the scroll determination line JDL, which will be described later). Thereby, even when the tracking screen TRW is enlarged, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from deviating from the tracking screen TRW.
• As yet another alternative, the output control unit 34b may automatically scroll the tracking screen TRW so that the position designated next by the cursor CSR by the user's mouse operation or by the input operation of the user's finger FG (the next tracking point) becomes the center of the tracking screen TRW. Thereby, the output control unit 34b can likewise prevent the designated position of the person HM1 from moving out of the tracking screen TRW, and the user can easily continue to designate the moving person HM1 on the tracking screen TRW.
• After step S2-3-3, or when it is determined as a result of the automatic scroll process necessity determination process that the automatic scroll process is not necessary (S2-3-2, NO), the automatic scroll process shown in FIG. 13B ends, and the process proceeds to step S2-4 shown in FIG. 13A.
• In FIG. 14A, the tracking processing unit 34c determines whether or not the tracking position corresponding to the designated tracking point TP1 exceeds any one of the scroll determination lines JDL on the upper, lower, left, and right sides of the enlarged tracking screen XTRW (S2-3-1-1).
• If the tracking processing unit 34c determines that the tracking position does not exceed any of the scroll determination lines JDL (S2-3-1-1, NO), it determines that the automatic scroll process is unnecessary (S2-3-1-2). On the other hand, when the tracking processing unit 34c determines that the tracking position exceeds one of the scroll determination lines JDL (S2-3-1-1, YES), it determines that the automatic scroll process is necessary, and further stores in the memory 33 the type of the scroll determination line JDL that applies (for example, information indicating one of the four scroll determination lines JDL shown in FIG. 14B) (S2-3-1-3). After step S2-3-1-2 or S2-3-1-3, the automatic scroll process necessity determination process shown in FIG. 14A ends, and the process proceeds to step S2-3-2 shown in FIG. 13B. A sketch of this determination follows.
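• A minimal sketch of the scroll determination of step S2-3-1-1, assuming the four JDLs are inset from the screen edges by a fixed margin; the data layout is illustrative, not from the patent.

```python
def crossed_jdl(tracking_pos, screen_rect, jdl_margin):
    """Return which scroll determination line JDL (if any) the tracking
    position has crossed on the enlarged tracking screen.

    tracking_pos: (x, y) on-screen tracking position
    screen_rect:  (left, top, width, height) of the tracking screen
    jdl_margin:   inset of each JDL from the corresponding screen edge
    """
    x, y = tracking_pos
    left, top, w, h = screen_rect
    if x < left + jdl_margin:      return "left"
    if x > left + w - jdl_margin:  return "right"
    if y < top + jdl_margin:       return "top"
    if y > top + h - jdl_margin:   return "bottom"
    return None  # inside all JDLs: automatic scrolling unnecessary
```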
• FIG. 15A is a flowchart illustrating an example of the tracking connection process shown in FIGS. 9A, 9B, and 10A.
• FIG. 15B is a flowchart illustrating an example of the batch connection process shown in FIG. 15A.
• In FIG. 15A, when a tracking point has been designated (S6-1, YES), the tracking processing unit 34c determines whether or not the connection mode is the each-time connection mode (S6-2).
• If the connection mode is the each-time connection mode (S6-2, YES), the output control unit 34b connects and displays the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S6-3).
• Thereby, when the person HM1 displayed on the tracking screen TRW of the display device 35 moves, the output control unit 34b connects and displays at least the current designated position and the immediately preceding designated position among the plurality of designated positions designated by the user's designation operation, so that a part of the trajectory of the movement of the person HM1 can be shown explicitly.
• Note that the operation of step S6-3 is not limited to single designation, in which tracking points are designated one by one, but also covers the case where a plurality of tracking points are designated at the same time; the same applies to step S6-4-3.
• After step S6-3, or when a tracking point has not yet been designated (S6-1, NO), the tracking connection process shown in FIG. 15A ends, and the process proceeds to step S7 shown in FIG. 9A, FIG. 9B, or FIG. 10A.
• On the other hand, if the connection mode is not the each-time connection mode (S6-2, NO), a batch connection process is performed (S6-4). The batch connection process will be described with reference to FIG. 15B.
• In FIG. 15B, the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16B) stored in the memory 33 (S6-4-1). When it is determined that the read data is the start point of the tracking points (S6-4-2, YES), the tracking processing unit 34c again reads the data of the tracking list LST (S6-4-1).
• When the read data is not the start point (S6-4-2, NO), the output control unit 34b uses the read data of the tracking list to connect and display the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S6-4-3).
• After step S6-4-3, if the connection has been made up to the end point of the tracking points (S6-4-4, YES), the batch connection process shown in FIG. 15B ends, and the process proceeds to step S7 shown in FIG. 9A, FIG. 9B, or FIG. 10A.
• After step S6-4-3, if the end point of the tracking points has not yet been connected (S6-4-4, NO), the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16B) stored in the memory 33, and the operations from step S6-4-1 to step S6-4-4 are repeated until point markers corresponding to all tracking points in the tracking list LST have been connected and displayed. Thereby, when the person HM1 shown on the tracking screen TRW of the display device 35 moves, the output control unit 34b connects and displays, for all of the plurality of designated positions designated by the user's designation operation, each designated position with its one or two adjacent designated positions, so that the entire trajectory of the movement of the person HM1 can be shown explicitly.
  • FIG. 16A is an explanatory diagram of the collected sound reproduction start time PT corresponding to the user's designated position P0 on the flow line between the tracking points displayed for one movement of the person HM1.
  • FIG. 16B is a diagram illustrating a first example of a tracking list.
• TP1, TP2, TP3, and TP4 are tracking points designated during one movement of the person HM1, as also shown in the tracking list LST of FIG. 16B, and for each tracking point the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. For simplicity, the z coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
• When the designated position P0 is designated at an arbitrary position on the flow line between the tracking points shown in FIG. 16A in accordance with an input operation with the cursor CSR by the user's mouse operation or the user's finger FG, the tracking processing unit 34c extracts the two tracking points TP1 and TP2 before and after the designated position P0, and calculates the reproduction start time PT at the designated position P0 according to equation (2) using the coordinates indicating the tracking positions of the tracking points TP1 and TP2 and the tracking time data (a sketch of this interpolation is given below).
• When outputting (reproducing) the sound to the speaker device 36, the output control unit 34b forms directivity in the directivity direction corresponding to the tracking positions, in tracking-time order starting from the designated position P0 designated by the cursor CSR by the user's mouse operation or by the input operation of the user's finger FG, and then outputs (reproduces) the sound having that directivity.
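• Equation (2) itself is not reproduced in this excerpt; from the surrounding description it can be read as a linear interpolation of the tracking times of TP1 and TP2, weighted by how far P0 lies along the flow-line segment. A minimal sketch under that assumption:

```python
import math

def playback_start_time(p0, tp1, tp2):
    """Interpolate the reproduction start time PT at designated position p0.

    p0:       (x, y, z) coordinates of the designated position on the flow line
    tp1, tp2: ((x, y, z), tracking_time) records of the tracking points
              immediately before and after p0 (e.g. TP1 and TP2)
    """
    pos1, t1 = tp1
    pos2, t2 = tp2
    seg = math.dist(pos1, pos2)        # flow-line segment length TP1 -> TP2
    if seg == 0.0:
        return t1
    frac = math.dist(pos1, p0) / seg   # fraction of the segment covered at p0
    return t1 + (t2 - t1) * frac
```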
  • FIG. 17A is an explanatory diagram of the reproduction start time PT of the collected sound corresponding to the user's designated position P0 on the flow line between different tracking points based on a plurality of simultaneous designations.
  • FIG. 17B is a diagram showing a second example of the tracking list LST.
• (TP11, TP21), (TP12, TP22), (TP13, TP23), and (TP14, TP24) are, as shown for example in the tracking list LST of FIG. 17B, pairs of tracking points designated simultaneously during the movement of different persons as a plurality of monitoring objects, and for each tracking point the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. The tracking points (TP11, TP21) are start points, and the tracking points (TP14, TP24) are end points. For simplicity, the z coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
• When the designated position P0 is designated at an arbitrary position on one of the different flow lines between the tracking points shown in FIG. 17A in accordance with an input operation with the cursor CSR by the user's mouse operation or the user's finger FG, the tracking processing unit 34c extracts the two tracking points TP11 and TP12 before and after the designated position P0, and calculates the reproduction start time PT at the designated position P0 according to equation (3) using the coordinates indicating the tracking positions of the tracking points TP11 and TP12 and the tracking time data.
• When outputting (reproducing) the sound to the speaker device 36, the output control unit 34b forms directivity in the directivity direction corresponding to the tracking positions, in tracking-time order starting from the designated position P0 designated by the cursor CSR by the user's mouse operation or by the input operation of the user's finger FG, and then outputs (reproduces) the sound having that directivity.
• FIG. 18A is an explanatory diagram of the reproduction start times PT and PT' of the collected sound corresponding to the user's designated positions P0 and P0' on the flow lines between different tracking points based on designations made a plurality of times.
  • FIG. 18B is a diagram showing a third example of the tracking list LST.
• (TP11, TP12, TP13, TP14) are tracking points designated during the movement of a person as the first monitoring target, as shown for example in the tracking list LST of FIG. 18B, and (TP21, TP22, TP23) are tracking points designated during the movement of a person as the second monitoring target. The person as the second monitoring target may be the same person as, or a different person from, the person as the first monitoring target. For each of the tracking points TP11, TP12, TP13, TP14, TP21, TP22, and TP23, the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. The tracking points TP11 and TP21 are start points, and the tracking points TP14 and TP23 are end points. For simplicity, the z coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
• When the designated positions P0 and P0' are designated at arbitrary positions on the respective flow lines between the tracking points shown in FIG. 18A in accordance with input operations with the cursor CSR by the user's mouse operation or the user's finger FG, the tracking processing unit 34c extracts the two pairs of tracking points (TP11, TP12) and (TP21, TP22) before and after the designated positions P0 and P0', and calculates the reproduction start times PT and PT' using the coordinates indicating the tracking positions of the tracking points (TP11, TP12) and (TP21, TP22) and the tracking time data. Here, the coordinates of the designated position P0 are (x0, y0, z0), and the coordinates of the designated position P0' are (x0', y0', z0).
• When outputting (reproducing) the sound to the speaker device 36, the output control unit 34b forms directivity in the directivity direction corresponding to the tracking positions, in tracking-time order starting from the designated position P0 or the designated position P0' designated by the input operation with the cursor CSR by the user's mouse operation or the user's finger FG, and then outputs (reproduces) the sound having that directivity.
  • FIG. 19A is a flowchart for explaining an example of the entire flow of the flow line display reproduction process using the tracking list LST in the directivity control systems 100 and 100A of the first embodiment.
• In FIG. 19A, a flow line display process is first performed (S11). Details of the flow line display process will be described later with reference to FIG. 20.
• After step S11, when the designated position P0 is designated on the flow line between the tracking points displayed in step S11 in accordance with an input operation with the cursor CSR by the user's mouse operation or the user's finger FG (S12), a reproduction start time calculation process is performed (S13). Details of the reproduction start time calculation process will be described later with reference to FIG. 19B.
• The tracking processing unit 34c refers to the tracking list LST stored in the memory 33 and reads the coordinates of all (or even only one) tracking positions corresponding to the tracking time closest to the reproduction start time PT of the designated position P0 calculated in the reproduction start time calculation process of step S13 (S14).
• Using the tracking position coordinate data read by the tracking processing unit 34c, the output control unit 34b forms the directivity of the collected sound in the directions from the omnidirectional microphone array apparatus M1 toward all (or the one) tracking positions (S14).
• Thereby, in accordance with the position arbitrarily designated by the user on the flow line indicating the movement trajectory of the person HM1, the output control unit 34b can form the directivity of the voice in advance in the direction toward the tracking position designated next after the arbitrarily designated position; the lookup of step S14 is sketched below.
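• A minimal sketch of the step S14 lookup, assuming the tracking list is a list of records each holding a tracking position and a tracking time; the names are illustrative, not from the patent.

```python
def positions_at_nearest_time(tracking_list, pt):
    """Read the coordinates of all tracking positions whose tracking time is
    closest to the reproduction start time pt (cf. step S14).

    tracking_list: list of dicts like {"pos": (x, y, z), "time": t}
    """
    nearest_time = min(tracking_list, key=lambda rec: abs(rec["time"] - pt))["time"]
    return [rec["pos"] for rec in tracking_list if rec["time"] == nearest_time]
```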
• After step S14, the output control unit 34b starts reproduction of the collected voice data stored in the recorder device 4 or the memory 33 from the reproduction start time PT calculated in step S13 (S15).
• After step S15, when there is a next tracking time within a predetermined time from the reproduction start time PT (S16, YES), the output control unit 34b forms the directivity of the collected sound in the directions from the omnidirectional microphone array apparatus M1 toward all (or even only one) tracking positions corresponding to the next tracking time (S17).
• After step S17, or when there is no next tracking time within the predetermined time from the reproduction start time PT (S16, NO), an audio output process is performed (S7). Details of the audio output process will be described later with reference to FIG. 21A. After step S7, when the audio output process at the tracking time corresponding to the end point of the tracking points is completed (S18, YES), the flow line display reproduction process shown in FIG. 19A ends. Thereby, the output control unit 34b can clearly output the collected sound that the monitoring target emitted at the reproduction start time calculated according to the user's arbitrarily designated position, and when there is a next designated position within the predetermined time from the reproduction start time, the directivity of the voice at the next designated position can be formed in advance.
• After step S7, if the audio output process at the tracking time corresponding to the end point of the tracking points has not ended (S18, NO), the operations from step S16 to step S18 are repeated until the audio output process at the tracking time corresponding to the end point of the tracking points ends.
• FIG. 19B is a flowchart for explaining an example of the reproduction start time calculation process shown in FIG. 19A.
• In FIG. 19B, the tracking processing unit 34c reads the tracking list LST (see, for example, FIG. 16B) stored in the memory 33 (S13-1).
• The tracking processing unit 34c extracts the two tracking points TP1 and TP2 before and after the designated position P0 designated in step S12 from the data of the tracking list LST read in step S13-1 (S13-2).
• The tracking processing unit 34c calculates the reproduction start time PT at the designated position P0 using the coordinates indicating the tracking positions of the tracking points TP1 and TP2 and the tracking time data (S13-3; see, for example, equation (2)). After step S13-3, the reproduction start time calculation process shown in FIG. 19B ends, and the process proceeds to step S14 shown in FIG. 19A.
• FIG. 20 is a flowchart for explaining an example of the flow line display process shown in FIG. 19A.
• In FIG. 20, the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16B) stored in the memory 33 (S11-1).
• When all the data of the tracking list LST have been read (S11-2, YES), the flow line display process shown in FIG. 20 ends, and the process proceeds to step S12 shown in FIG. 19A. Otherwise, the tracking processing unit 34c continues to sequentially read the data of the tracking list LST (see, for example, FIG. 16B).
• The output control unit 34b displays a point marker on each monitoring object at the one or more tracking points read by the tracking processing unit 34c (S11-3).
• In accordance with an input operation (for example, a right-click or left-click operation of the user's mouse, a keyboard operation, or a touch operation of the user's finger FG), the output control unit 34b displays the point markers in a mode that distinguishes each monitoring object (for example, by a symbol, an identification number, a combination of symbol and identification number, or a frame of a predetermined shape). The frame of the predetermined shape is, for example, a rectangle, a circle, or a triangle, and the line type (for example, solid line or dotted line) may also be varied.
• After step S11-3, when it is determined that the tracking point data read in step S11-3 is the start point of the tracking points (S11-4, YES), the tracking processing unit 34c again reads the data of the tracking list LST (see, for example, FIG. 16B) (S11-3).
• Otherwise (S11-4, NO), the output control unit 34b uses the read tracking list data to connect and display the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S11-5).
• After step S11-5, if the end point of the tracking points of the tracking list LST read in step S11-1 has been connected (S11-6, YES), the operation proceeds to step S11-2.
• After step S11-5, if the end point of the tracking points of the tracking list LST read in step S11-1 has not been connected (S11-6, NO), the operations from step S11-3 to step S11-6 are repeated until the end point of the tracking points in the tracking list LST read in step S11-1 is connected.
• FIG. 21A is a flowchart illustrating an example of the audio output process shown in FIGS. 9A, 9B, and 10A.
• FIG. 21B is a flowchart for explaining an example of the image privacy protection process shown in FIG. 13A.
  • FIG. 22A is a diagram illustrating an example of a waveform of an audio signal corresponding to the pitch before the voice change process.
  • FIG. 22B is a diagram illustrating an example of a waveform of an audio signal corresponding to the pitch after the voice change process.
  • FIG. 22C is an explanatory diagram of processing for blurring the detected outline of a person's face.
• In FIG. 21A, the output control unit 34b determines whether or not the voice privacy protection mode is on (S7-1). If it determines that the voice privacy protection mode is on (S7-1, YES), the output control unit 34b performs voice change processing on the collected voice data to be output from the speaker device 36 (S7-2).
• After step S7-2, or when it is determined that the voice privacy protection mode is off (S7-1, NO), the output control unit 34b outputs the collected sound from the speaker device 36 (S7-3).
• After step S7-3, the audio output process shown in FIG. 21A ends, and the process returns to step S1 shown in FIG. 9A, FIG. 9B, or FIG. 10A.
• In the voice change processing, the output control unit 34b, for example, raises or lowers the pitch of the waveform of the voice data collected by the omnidirectional microphone array device M1, or of the voice data whose directivity has been formed by the output control unit 34b itself (see, for example, FIGS. 22A and 22B).
• Thereby, when the sound collected in real time by the omnidirectional microphone array apparatus M1 is output, the output control unit 34b applies voice change processing to the sound by a simple input operation of the user, so that it becomes difficult to tell whose voice the sound emitted by the person HM1 is, and the privacy of the voice of the person HM1 can be effectively protected.
• Likewise, when the sound collected by the omnidirectional microphone array apparatus M1 for a certain period of time is output, the output control unit 34b applies voice change processing to the sound by a simple input operation of the user, so that it becomes difficult to tell whose voice the sound emitted by the person HM1 is, and the privacy of the voice of the person HM1 can be effectively protected. A sketch of a simple voice change follows.
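• A minimal sketch of a pitch-raising/lowering voice change, implemented here as naive resampling of the waveform (which also changes duration); this is one plausible reading of FIGS. 22A and 22B, not the patent's exact algorithm.

```python
import numpy as np

def voice_change(samples: np.ndarray, pitch_factor: float) -> np.ndarray:
    """Shift the pitch of a mono waveform by resampling.

    pitch_factor > 1 raises the pitch (waveform period compressed, cf. FIG. 22B);
    pitch_factor < 1 lowers it. The duration changes accordingly.
    """
    n_out = int(len(samples) / pitch_factor)
    # Input positions to sample for each output sample
    src_idx = np.linspace(0.0, len(samples) - 1.0, n_out)
    return np.interp(src_idx, np.arange(len(samples)), samples)
```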
• In FIG. 21B, the tracking processing unit 34c determines whether or not the image privacy protection mode is on (S2-2-1).
• When the image privacy protection mode is on (S2-2-1, YES), the image processing unit 37 detects (extracts) the outline DTL of the face of the person HM1 displayed on the tracking screen TRW of the display device 35 (S2-2-2) and masks the face outline DTL (S2-2-3). Specifically, the image processing unit 37 calculates a rectangular region including the detected face outline DTL and performs a process of adding a predetermined blur to the rectangular region (see FIG. 22C). The image processing unit 37 outputs the image data generated by the blurring process to the output control unit 34b.
• After step S2-2-3, or when it is determined that the image privacy protection mode is off (S2-2-1, NO), the output control unit 34b displays the image data obtained from the image processing unit 37 on the display device 35 (S2-2-4).
• Thereby, by a simple input operation of the user, the image processing unit 37 masks a part (for example, the face) of the person HM1 as the monitoring target displayed on the tracking screen TRW of the display device 35, so that it becomes difficult to tell who the person HM1 is, and privacy can be effectively protected; a sketch of the masking step follows.
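• A minimal sketch of the rectangular-region blurring of step S2-2-3 using OpenCV; the face rectangle is assumed to come from whatever face/outline detector the image processing unit uses, and the kernel size is an illustrative choice.

```python
import cv2
import numpy as np

def mask_face_region(frame: np.ndarray, face_rect, ksize=(31, 31)) -> np.ndarray:
    """Blur the rectangular region that contains the detected face outline DTL.

    frame:     BGR image shown on the tracking screen
    face_rect: (x, y, w, h) rectangle enclosing the detected face outline
    """
    x, y, w, h = face_rect
    roi = frame[y:y + h, x:x + w]
    frame[y:y + h, x:x + w] = cv2.GaussianBlur(roi, ksize, 0)
    return frame
```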
• Note that the image privacy protection process shown in FIG. 21B may be performed whenever the image privacy protection mode of the directivity control devices 3 and 3A is turned on at the time the monitoring object (for example, the person HM1) appears on the camera screen, even if the enlarged display mode is not turned on.
• As described above, in the first embodiment, when the monitoring target (for example, the person HM1) shown in the image data on the tracking screen TRW of the display device 35 moves, the directivity control devices 3 and 3A use information related to the designated position (for example, a tracking point) with respect to the image data on the tracking screen TRW to form the directivity of the sound collected by the omnidirectional microphone array device M1 including a plurality of microphones in the direction toward the monitoring target corresponding to the designated position.
• Thereby, since the directivity control devices 3 and 3A re-form the directivity of the sound, formed in the direction toward the position before the movement of the monitoring target (for example, the person HM1), in the direction toward the position after the movement, the directivity of the voice can be formed appropriately as the monitoring target moves, and a decrease in the efficiency of the monitoring work of the supervisor can be suppressed.
• Further, the directivity control devices 3 and 3A can easily acquire accurate information regarding the position after the movement of the monitoring target (for example, the person HM1) by a simple manual operation of designating the moving monitoring target in the image data displayed on the tracking screen TRW of the display device 35.
• Further, since the directivity control device 3A easily detects, from the image data displayed on the tracking screen TRW of the display device 35, the sound source of the sound emitted from the monitoring target (for example, the person HM1) or the monitoring target itself, information regarding the position of the sound source or information regarding the position of the monitoring target can be easily obtained as information regarding the position after the movement of the monitoring target.
• In the second embodiment, when the monitoring target (for example, a person) moves beyond the imaging area of the camera device or the sound collection area of the omnidirectional microphone array device, the directivity control device 3B switches the camera device used to capture an image of the monitoring target to another camera device, or switches the omnidirectional microphone array device used to collect the sound emitted from the monitoring target to another omnidirectional microphone array device, in accordance with the movement state of the monitoring target.
• It is assumed that the camera device used for capturing an image of the monitoring target (for example, the person HM1) and the omnidirectional microphone array apparatus used for collecting the sound emitted by the person HM1 are associated with each other in advance, and that this association information is stored in advance in the memory 33 of the directivity control device 3B.
  • FIG. 23 is a block diagram illustrating a system configuration example of the directivity control system 100B of the second embodiment.
• The directivity control system 100B shown in FIG. 23 is configured to include one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3B, and the recorder device 4.
• In FIG. 23, the same reference numerals are given to the parts having the same configuration and operation as those of the directivity control systems 100 and 100A, and their description is simplified or omitted; only the different contents will be described.
• The directivity control device 3B may be, for example, a stationary PC installed in a monitoring control room (not shown), or a data communication terminal such as a mobile phone, PDA, tablet terminal, or smartphone that can be carried by the user.
• The directivity control device 3B is configured to include at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34A, a display device 35, a speaker device 36, an image processing unit 37, and an operation switching control unit 38.
• The signal processing unit 34A includes at least a pointing direction calculation unit 34a, an output control unit 34b, a tracking processing unit 34c, and a sound source detection unit 34d.
• Based on various information or data regarding the movement status of the monitoring target (for example, a person) acquired by the tracking processing unit 34c, the operation switching control unit 38 performs various operations for switching, among the plurality of camera devices C1 to Cn or the plurality of omnidirectional microphone array devices M1 to Mm, the camera device used for capturing an image of the monitoring target of the directivity control system 100B or the omnidirectional microphone array device used for collecting the sound emitted from the monitoring target.
  • FIG. 24 is an explanatory diagram showing an automatic switching process of a camera device used for capturing an image displayed on the display device 35.
• In FIG. 24, an example will be described in which the camera device used for capturing an image of the person HM1 is switched from the camera device C1 to the camera device C2 as the person HM1 as the monitoring target moves from the tracking position A1 to the tracking position A2.
• The tracking position A1 is within the range of the imaging area C1RN of the camera device C1 and within the range of the predetermined switching determination line JC1 of the camera device C1. The tracking position A2 is within the range of the imaging area C2RN of the camera device C2 and outside the range of the switching determination line JC1 of the camera device C1. Both tracking positions A1 and A2 are within the sound collection area of the omnidirectional microphone array apparatus M1.
• When the person HM1 is about to exceed the imaging area C1RN of the camera device C1, the operation switching control unit 38 notifies the camera device C2, via the communication unit 31 and the network NW, of information indicating that the camera device used for capturing an image of the person HM1 will be switched from the camera device C1 to the camera device C2.
• Further, the operation switching control unit 38 instructs the camera device C2 to prepare for capturing an image in the range within the angle of view of the camera device C2.
• Meanwhile, the image data of the video captured by the camera device C1 is displayed on the tracking screen TRW of the display device 35.
• When the person HM1 exceeds the switching determination line JC1, the operation switching control unit 38 notifies the camera device C2, via the communication unit 31 and the network NW, of information indicating that the camera device used for capturing an image of the person HM1 is switched from the camera device C1 to the camera device C2.
• The operation switching control unit 38 uses the distance information between the camera device C1 and the person HM1 measured by the camera device C1 to determine whether or not the person HM1 has exceeded the switching determination line JC1. More specifically, when the person HM1 is within the angle of view of the camera device C1 and the distance from the camera device C1 to the person HM1 becomes larger than the (known) distance from the camera device C1 to the switching determination line JC1, the operation switching control unit 38 determines that the person HM1 has exceeded the switching determination line JC1. It is assumed that the operation switching control unit 38 knows in advance the camera device (for example, the camera device C2) that can be switched to from the camera device C1, and likewise knows in advance the camera devices that can be switched to from the other camera devices.
• When the person HM1 exceeds the switching determination line JC1, the operation switching control unit 38 switches the camera device used for capturing an image of the person HM1 from the camera device C1 to the camera device C2. Thereafter, the image data of the video captured by the camera device C2 (for example, image data of the moving person HM1) is displayed on the tracking screen TRW of the display device 35.
• Thereby, the operation switching control unit 38 can adaptively switch to a camera device that can accurately display an image of the moving monitoring target (for example, the person HM1), and the user can easily designate the image of the monitoring target; the handover criterion is sketched below.
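• A minimal sketch of the crossing test and camera handover, assuming per-camera knowledge of the distance to its switching determination line and of its switchable neighbor; the table contents are illustrative, not from the patent.

```python
import math

# Hypothetical layout data: per camera, the known distance from the camera to
# its switching determination line JC1 and the camera it can hand over to.
SWITCH_TABLE = {
    "C1": {"line_dist": 8.0, "next_camera": "C2"},
    "C2": {"line_dist": 8.0, "next_camera": "C3"},
}

def select_camera(current: str, camera_pos, person_pos) -> str:
    """Hand over image capture when the camera-to-person distance exceeds the
    known camera-to-determination-line distance (cf. FIG. 24)."""
    entry = SWITCH_TABLE[current]
    if math.dist(camera_pos, person_pos) > entry["line_dist"]:
        return entry["next_camera"]
    return current
```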
  • FIG. 25 is an explanatory diagram showing an automatic switching process of the omnidirectional microphone array device used for collecting the sound of the monitoring target (for example, the person HM1).
• In FIG. 25, an example will be described in which the omnidirectional microphone array device used for collecting the sound emitted by the person HM1 is switched from the omnidirectional microphone array apparatus M1 to the omnidirectional microphone array apparatus M2 as the person HM1 as the monitoring target moves from the tracking position A1 to the tracking position A2.
• The tracking position A1 is within the range of the sound collection area M1RN of the omnidirectional microphone array apparatus M1 and within the range of the predetermined switching determination line JM1 of the omnidirectional microphone array apparatus M1. The tracking position A2 is within the range of the sound collection area M2RN of the omnidirectional microphone array apparatus M2 and outside the range of the switching determination line JM1 of the omnidirectional microphone array apparatus M1. Both tracking positions A1 and A2 are within the imaging area of the camera device C1.
• When the person HM1 is about to exceed the sound collection area M1RN of the omnidirectional microphone array apparatus M1, the operation switching control unit 38 notifies the omnidirectional microphone array apparatus M2, via the communication unit 31 and the network NW, of information indicating that the omnidirectional microphone array apparatus used for collecting the sound emitted from the person HM1 will be switched from the omnidirectional microphone array apparatus M1 to the omnidirectional microphone array apparatus M2.
• Further, the operation switching control unit 38 instructs the omnidirectional microphone array apparatus M2 to prepare to collect sound within the sound collection area of the omnidirectional microphone array apparatus M2.
• When the person HM1 exceeds the switching determination line JM1, the operation switching control unit 38 notifies the omnidirectional microphone array apparatus M2, via the communication unit 31 and the network NW, of information indicating that the omnidirectional microphone array apparatus used for collecting the sound emitted by the person HM1 is switched from the omnidirectional microphone array apparatus M1 to the omnidirectional microphone array apparatus M2.
• The operation switching control unit 38 uses the distance information between the omnidirectional microphone array apparatus M1 and the person HM1 to determine whether or not the person HM1 has exceeded the switching determination line JM1. More specifically, when the distance from the omnidirectional microphone array apparatus M1 to the person HM1 becomes larger than the (known) distance from the omnidirectional microphone array apparatus M1 to the switching determination line JM1, the operation switching control unit 38 determines that the person HM1 has exceeded the switching determination line JM1.
• It is assumed that the operation switching control unit 38 knows in advance the omnidirectional microphone array apparatus (for example, the omnidirectional microphone array apparatus M2) that can be switched to from the omnidirectional microphone array apparatus M1, and likewise knows in advance the omnidirectional microphone array apparatuses that can be switched to from the other omnidirectional microphone array apparatuses.
• When the person HM1 exceeds the switching determination line JM1, the operation switching control unit 38 switches the omnidirectional microphone array device used for collecting the sound emitted by the person HM1 from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2.
• Thereby, the operation switching control unit 38 can adaptively switch to the omnidirectional microphone array device capable of accurately collecting the sound emitted from the moving monitoring target (for example, the person HM1), so that the sound emitted by the monitoring target can be picked up with high accuracy.
  • FIG. 26 is an explanatory diagram illustrating manual switching processing of the camera device used for capturing an image displayed on the display device 35.
• In FIG. 26, the tracking screen TRW is switched to a multi-camera screen including the camera screen C1W of the camera device C1 and the camera screens of the camera devices (for example, eight camera devices) around the camera device C1.
• It is assumed that the camera devices that can be switched to from the camera device C1 currently in use are determined in advance, for example the camera devices C2, C3, and C4, and that the camera screens C2W, C3W, and C4W captured by the camera devices C2, C3, and C4 are displayed accordingly (see the hatching shown in FIG. 26). It is also assumed that the person HM1 is moving in the movement direction MV1.
• Suppose that, on the multi-camera screen shown in FIG. 26, the user performs a touch operation with the finger FG on one of the three camera screens C2W, C3W, and C4W (for example, the camera screen C3W), taking the movement direction MV1 of the person HM1 into consideration.
• In response to the touch operation, the operation switching control unit 38 switches the camera device used for capturing an image of the person HM1 from the currently used camera device C1 to the camera device C3 corresponding to the camera screen C3W that is the target of the touch operation.
• Thereby, the operation switching control unit 38 can adaptively switch, by a simple operation of the user, to a camera device that can accurately display an image of the moving monitoring target (for example, the person HM1), and the user can easily designate the image of the monitoring target.
  • FIG. 27 is an explanatory diagram illustrating a manual switching process of the omnidirectional microphone array device used for collecting the sound of the monitoring target (for example, the person HM1).
  • the person HM1 as the monitoring target is displayed in the center on the tracking screen TRW.
• The omnidirectional microphone array devices that can be switched to from the currently used omnidirectional microphone array device M1 are the three omnidirectional microphone array devices M2, M3, and M4 installed around the omnidirectional microphone array device M1.
• On the tracking screen TRW, in response to an input operation with the cursor CSR by the user's mouse operation or the user's finger FG, markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 that can be switched to from the currently used omnidirectional microphone array apparatus M1 are displayed (see (1) shown in FIG. 27).
• The user, taking into consideration the movement direction MV1 from the tracking position A1 corresponding to the tracking point of the person HM1 as the monitoring target, selects one of the three markers (for example, the marker M3R) by a touch operation with the user's finger FG (see (2) shown in FIG. 27).
• The operation switching control unit 38 switches from the omnidirectional microphone array apparatus M1 currently in use to the omnidirectional microphone array apparatus M3 corresponding to the marker M3R selected by the touch operation of the user's finger FG, and causes it to start sound collection via the communication unit 31 and the network NW (see (3) in FIG. 27).
• Further, the output control unit 34b switches the directivity so that it is formed from the omnidirectional microphone array apparatus M3 corresponding to the selected marker M3R toward the current tracking position of the person HM1 (see (4) shown in FIG. 27). Thereafter, the markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 displayed on the tracking screen TRW are deleted by the output control unit 34b.
• Thereby, by a simple user operation on the markers M2R, M3R, and M4R displayed on the tracking screen TRW, the operation switching control unit 38 can adaptively switch to the omnidirectional microphone array apparatus M3 capable of accurately collecting the sound emitted by the moving monitoring target (for example, the person HM1), and the sound emitted by the person HM1 can be picked up with high accuracy in accordance with the movement direction MV1 of the person HM1.
FIG. 28 is an explanatory diagram showing a selection process of the optimum omnidirectional microphone array device used for collecting the sound of the monitoring target. In response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, the camera screens of all the camera devices (for example, nine camera devices) managed by the directivity control system 100B are displayed as a list on the display device 35 (see the upper left side of FIG. 28).
It is assumed that, among the camera screens displayed as a list on the display device 35, the camera screen C1W in which the monitoring target (for example, the person HM1) that is the target of the sound tracking process is best shown is selected in accordance with an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG.
In accordance with the user's selection of the camera screen C1W, the operation switching control unit 38 selects and switches to the camera device C1 corresponding to the camera screen C1W as the camera device used for capturing an image of the person HM1. The output control unit 34b enlarges the image data captured by the camera device corresponding to the camera screen C1W and displays it on the tracking screen TRW1 of the display device 35 (see the lower left side of FIG. 28).
The output control unit 34b displays the markers M1R, M2R, M3R, and M4R indicating the approximate positions of all the omnidirectional microphone array devices associated with the camera device C1 selected by the operation switching control unit 38 at the four corners of the tracking screen TRW1. The display positions of the markers M1R, M2R, M3R, and M4R are not limited to the four corners of the tracking screen TRW1.
The output control unit 34b highlights the markers one by one (for example, with the blinking Br) and, for each marker, forms directivity in the direction from the omnidirectional microphone array device corresponding to that marker toward the position of the person HM1 and outputs the sound collected for a certain time. When the user selects the marker (for example, the marker M3R) corresponding to the sound that the user determines to be optimum, the operation switching control unit 38 selects and switches to the omnidirectional microphone array device M3 corresponding to the selected marker M3R as the omnidirectional microphone array device used for collecting the sound emitted by the person HM1.
Thereby, the operation switching control unit 38 can output, for each of the plurality of omnidirectional microphone array devices M1, M2, M3, and M4 associated with the selected camera device C1, the sound collected for a certain time with a different directivity, so that the sound emitted by the moving monitoring target (for example, the person HM1) can be accurately collected by the user's simple operation of selecting the collected sound that the user determines to be optimum. In other words, the optimum omnidirectional microphone array device M3 can be selected, and the sound emitted by the monitoring target (for example, the person HM1) can be collected with high accuracy.
FIG. 29(A) is a flowchart for explaining an example of the automatic switching process of the camera device in the directivity control system 100B of the second embodiment. The automatic switching process of the camera device shown in FIG. 29(A) explains in detail the automatic switching process of the camera device shown in FIG. 24, and is performed, for example, following step S3B-1 shown in FIG. 10(B).
In FIG. 29(A), the image processing unit 37 detects the position (that is, the tracking point) of the monitoring target (for example, the person HM1) by performing predetermined image processing on the image data displayed on the tracking screen TRW of the display device 35 (S21). After step S21, the camera switching determination process is performed (S22). Details of the camera switching determination process will be described later with reference to FIG. 29(B).
After step S22, when the camera switching mode is set to ON by the operation switching control unit 38 (S23, YES), the operation switching control unit 38 instructs, via the communication unit 31 and the network NW, all the switchable camera devices associated with the currently used camera device (for example, the camera device C1) to capture images (S24). All the camera devices that have received the instruction to capture images start capturing images. Note that the camera switching mode is a flag used for controlling whether to switch the camera device when the multiple-camera switching method is automatic.
The operation switching control unit 38 determines, using the distance information between the camera device C1 and the person HM1 measured by the currently used camera device C1, whether or not the person HM1 at the tracking position A1 in the real space detected in step S21 has moved beyond the imaging area C1RN of the camera device C1 (S25). When it is determined that the person HM1 has moved beyond the imaging area C1RN of the camera device C1 (S25, YES), the operation switching control unit 38 outputs to the image processing unit 37 the image data captured, in accordance with the instruction in step S24, by all the switchable camera devices associated with the currently used camera device C1.
The image processing unit 37 determines whether or not the person HM1 as the monitoring target is detected by performing predetermined image processing on all the image data output from the operation switching control unit 38 (S26), and outputs the image processing result to the operation switching control unit 38.
Using the image processing result of the image processing unit 37, the operation switching control unit 38 selects the one camera device (for example, the camera device C2) that can detect the person HM1 as the monitoring target and is closest to the tracking position A1 in the real space detected in step S21, and switches the camera device used for capturing an image of the person HM1 from the camera device C1 to the camera device C2 (S27). The output control unit 34b switches the tracking screen TRW displayed on the display device 35 to the camera screen of the camera device C2 selected by the operation switching control unit 38 and displays it (S27).
Thereby, the automatic switching process of the camera device shown in FIG. 29(A) ends, and the process proceeds to the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A).
FIG. 29(B) is a flowchart illustrating an example of the camera switching determination process shown in FIG. 29(A). The operation switching control unit 38 first sets the camera switching mode in the directivity control device 3B to OFF (S22-1).
The operation switching control unit 38 determines, using the distance information between the camera device C1 and the person HM1 measured by the currently used camera device C1, whether or not the tracking position A1 in the real space corresponding to the tracking point detected in step S21 has moved beyond the predetermined switching determination line JC1 of the currently used camera device C1 (S22-2).
When it is determined that the tracking position A1 in the real space corresponding to the tracking point detected in step S21 has moved beyond the predetermined switching determination line JC1 of the currently used camera device C1 (S22-2, YES), the operation switching control unit 38 sets the camera switching mode to ON (automatic) (S22-3).
After step S22-3, or when it is determined that the tracking position A1 has not moved beyond the predetermined switching determination line JC1 (S22-2, NO), the camera switching determination process shown in FIG. 29(B) ends, and the process proceeds to step S23 shown in FIG. 29(A).
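The switching logic above reduces to two checks per detection cycle: a determination line that arms the camera switching mode, and an imaging-area boundary that triggers the actual handover to the nearest camera that still detects the target. The following is a minimal sketch of that control flow, not an implementation from this disclosure; the class and field names (`Camera`, `switch_line_m`, `coverage_m`, `detected_by`) are illustrative assumptions.

```python
import math

class Camera:
    """Illustrative stand-in for one camera device (e.g., C1, C2, ...)."""
    def __init__(self, name, position, switch_line_m, coverage_m):
        self.name = name
        self.position = position            # (x, y) in real space
        self.switch_line_m = switch_line_m  # distance of determination line JC1
        self.coverage_m = coverage_m        # radius of imaging area C1RN

    def distance_to(self, point):
        return math.dist(self.position, point)

def camera_switch_determination(current, tracking_pos):
    """S22: arm the camera switching mode when the tracking position
    crosses the predetermined switching determination line (JC1)."""
    return current.distance_to(tracking_pos) > current.switch_line_m

def auto_switch_camera(current, candidates, tracking_pos, detected_by):
    """S23-S27: if the target left the imaging area, hand over to the
    nearest associated camera that still detects the target."""
    if not camera_switch_determination(current, tracking_pos):
        return current                      # switching mode stays OFF
    if current.distance_to(tracking_pos) <= current.coverage_m:
        return current                      # still inside the imaging area
    visible = [c for c in candidates if c in detected_by]
    if not visible:
        return current                      # no candidate sees the target
    return min(visible, key=lambda c: c.distance_to(tracking_pos))

# Usage: the target at (9, 0) has left C1's area; C2 is nearest and sees it.
c1 = Camera("C1", (0, 0), switch_line_m=6.0, coverage_m=8.0)
c2 = Camera("C2", (10, 0), switch_line_m=6.0, coverage_m=8.0)
c3 = Camera("C3", (20, 0), switch_line_m=6.0, coverage_m=8.0)
print(auto_switch_camera(c1, [c2, c3], (9, 0), {c2}).name)  # -> C2
```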
FIG. 30(A) is a flowchart for explaining an example of the automatic switching process of the omnidirectional microphone array device in the directivity control system 100B of the second embodiment. The automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) explains in detail the automatic switching process of the omnidirectional microphone array device shown in FIG. 25, and is performed, for example, following step S27 shown in FIG. 29(A).
In FIG. 30(A), the sound source detection unit 34d calculates the position (sound source position) of the monitoring target (for example, the person HM1) in the real space by performing a predetermined sound source detection process, or calculates the coordinates indicating the position on the image data corresponding to the sound source position (that is, the coordinates of the tracking position A1 corresponding to the tracking point) (S31). After step S31, the microphone switching determination process is performed (S32). Details of the microphone switching determination process will be described later with reference to FIG. 30(B).
After step S32, when the microphone switching mode is set to ON by the operation switching control unit 38 (S33, YES), the operation switching control unit 38 instructs, via the communication unit 31 and the network NW, all the switchable omnidirectional microphone array devices associated with the currently used omnidirectional microphone array device (for example, the omnidirectional microphone array device M1) to collect the sound emitted by the person HM1 (S34). All the omnidirectional microphone array devices that have received the instruction to collect sound start collecting sound. Note that the microphone switching mode is a flag used for controlling whether to switch the omnidirectional microphone array device when the multiple-microphone switching method is automatic.
The operation switching control unit 38 determines, using the distance information between the currently used omnidirectional microphone array device M1 and the person HM1 calculated by the sound source detection unit 34d, whether or not the person HM1 has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35). When it is determined that the person HM1 has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35, YES), the sound source detection unit 34d calculates the position (sound source position) of the person HM1 as the monitoring target based on the strength or volume level of the sound collected, in accordance with the instruction in step S34, by all the switchable omnidirectional microphone array devices associated with the currently used omnidirectional microphone array device M1 (S36).
Using the sound source detection result of the sound source detection unit 34d, the operation switching control unit 38 selects, among all the switchable omnidirectional microphone array devices associated with the currently used omnidirectional microphone array device M1, the one omnidirectional microphone array device (for example, the omnidirectional microphone array device M2) that minimizes the distance between the position (sound source position) of the person HM1 as the monitoring target and the omnidirectional microphone array device, and switches the omnidirectional microphone array device used for collecting the sound emitted by the person HM1 from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2 (S37). The output control unit 34b switches the sound directivity so that it is formed from the omnidirectional microphone array device M2 after the switching toward the direction of the sound source position calculated in step S36 (S37).
Thereby, the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) ends, and the process proceeds to, for example, step S3B-2 shown in FIG. 10(B). Note that the automatic switching process of the camera device shown in FIG. 29(A) may be started after the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A).
FIG. 30(B) is a flowchart illustrating an example of the microphone switching determination process shown in FIG. 30(A). The operation switching control unit 38 first sets the microphone switching mode to OFF (S32-1).
The operation switching control unit 38 determines, using the distance information between the currently used omnidirectional microphone array device M1 and the person HM1, whether or not the tracking position A1 calculated in step S31 has moved beyond the predetermined switching determination line JM1 of the currently used omnidirectional microphone array device M1 (S32-2).
When it is determined that the tracking position A1 has moved beyond the predetermined switching determination line JM1 of the currently used omnidirectional microphone array device M1 (S32-2, YES), the operation switching control unit 38 sets the microphone switching mode to ON (S32-3).
After step S32-3, or when it is determined that the tracking position A1 has not moved beyond the predetermined switching determination line JM1 of the currently used omnidirectional microphone array device M1 (S32-2, NO), the microphone switching determination process shown in FIG. 30(B) ends, and the process proceeds to step S33 shown in FIG. 30(A).
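The microphone-side handover mirrors the camera logic, except that the new sound source position is estimated from the sound levels observed by the candidate arrays rather than from image processing. The disclosure only states that the position is derived from strength or volume level, so the level-weighted centroid below is one plausible reading, offered purely for illustration; the data layout is an assumption.

```python
import math

def estimate_source_position(arrays):
    """S36 (illustrative): approximate the sound source position from the
    levels observed by candidate arrays, here as a level-weighted centroid
    of the array positions. `arrays` maps name -> ((x, y), rms_level)."""
    total = sum(level for _, level in arrays.values())
    x = sum(pos[0] * level for pos, level in arrays.values()) / total
    y = sum(pos[1] * level for pos, level in arrays.values()) / total
    return (x, y)

def switch_to_nearest_array(arrays, source_pos):
    """S37: choose the candidate array with the minimum distance to the
    estimated sound source position."""
    return min(arrays, key=lambda name: math.dist(arrays[name][0], source_pos))

# Usage: M2 sits closest to where the observed levels say the target is.
candidates = {
    "M2": ((10.0, 0.0), 0.80),   # (array position, observed RMS level)
    "M3": ((10.0, 10.0), 0.35),
    "M4": ((0.0, 10.0), 0.20),
}
src = estimate_source_position(candidates)
print(switch_to_nearest_array(candidates, src))  # -> M2
```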
FIG. 31(A) is a flowchart illustrating an example of the manual switching process of the camera device in the directivity control system 100B of the second embodiment. The manual switching process of the camera device shown in FIG. 31(A) is performed following step S1 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
In FIG. 31(A), when an instruction for switching the camera device is input in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S41), the output control unit 34b switches the display on the display device 35 from the tracking screen TRW of the image captured by the camera device C1 currently used for capturing an image of the person HM1 to a multi-camera screen including the camera screen C1W of the camera device C1 and the camera screens of the camera devices (for example, eight camera devices) around the camera device C1 (S42).
On the multi-camera screen displayed on the display device 35 in step S42, the user selects one of the camera screens by a touch operation with the finger FG, considering the movement direction MV1 of the person HM1 as the monitoring target (see FIG. 26) (S43).
In response to the touch operation of the user's finger FG in step S43, the operation switching control unit 38 switches the camera device used for capturing an image of the person HM1 from the currently used camera device C1 to the camera device C3 corresponding to the selected camera screen C3W (S44). Thereby, the manual switching process of the camera device shown in FIG. 31(A) ends, and the process proceeds to one of steps S45, S51, S61, and S71 shown in FIG. 31(B), FIG. 32(A), FIG. 32(B), and FIG. 33.
FIG. 31(B) is a flowchart for explaining an example of the manual switching process of the omnidirectional microphone array device in the directivity control system 100B of the second embodiment.
In FIG. 31(B), when an instruction for switching the omnidirectional microphone array device is input in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S45), the output control unit 34b displays, on the tracking screen TRW, the markers (for example, the markers M2R, M3R, and M4R) indicating the approximate positions of the omnidirectional microphone array devices (for example, the omnidirectional microphone array devices M2, M3, and M4) that can be switched to from the currently used omnidirectional microphone array device M1 (S46).
Considering the movement direction MV1 from the tracking position A1 of the person HM1 as the monitoring target, the user selects one of the three markers (for example, the marker M3R) by a touch operation with the finger FG (S47, see FIG. 27). The operation switching control unit 38 instructs, via the communication unit 31 and the network NW, the omnidirectional microphone array device M3 corresponding to the marker M3R selected by the touch operation of the user's finger FG to start sound collection in place of the currently used omnidirectional microphone array device M1 (S47).
The output control unit 34b switches the directivity so that it is formed from the omnidirectional microphone array device M3 corresponding to the marker M3R selected in step S47 toward the current tracking position of the person HM1 (S48). Further, the output control unit 34b deletes the markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 displayed on the tracking screen TRW (S48).
After step S48, the manual switching process of the omnidirectional microphone array device shown in FIG. 31(B) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). Note that the manual switching process of the camera device shown in FIG. 31(A) may be performed after the manual switching process of the omnidirectional microphone array device shown in FIG. 31(B).
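The marker flow (S45 to S48) is a thin user-interface layer over the same switch: show one marker per candidate array, wait for a selection, start collection on the chosen array, and re-aim the directivity at the current tracking position. The sketch below assumes hypothetical helper names (`ArrayMarker`, `MicController`, `draw`, `erase`); none of them come from this disclosure.

```python
from dataclasses import dataclass

@dataclass
class ArrayMarker:
    """Illustrative pairing of an on-screen marker (e.g., M3R) with the
    omnidirectional microphone array device it stands for (e.g., M3)."""
    marker_id: str
    array_id: str
    screen_pos: tuple  # where the marker is drawn on the tracking screen

class MicController:
    """Minimal stub standing in for the signal processing side."""
    def start_collection(self, array_id): print("collecting on", array_id)
    def form_directivity(self, array_id, pos): print(array_id, "aimed at", pos)
    def current_tracking_position(self): return (4.2, 7.7)

def show_switchable_markers(candidates, draw):
    """S46: draw one marker per switchable array on the tracking screen."""
    for m in candidates:
        draw(m.marker_id, m.screen_pos)

def on_marker_touched(touched, candidates, mic_ctrl, erase):
    """S47-S48: start collection on the selected array, aim its directivity
    at the current tracking position, then remove all markers."""
    chosen = next(m for m in candidates if m.marker_id == touched)
    mic_ctrl.start_collection(chosen.array_id)                        # S47
    mic_ctrl.form_directivity(chosen.array_id,
                              mic_ctrl.current_tracking_position())   # S48
    for m in candidates:
        erase(m.marker_id)                                            # S48
    return chosen.array_id

# Usage with trivial draw/erase stubs.
markers = [ArrayMarker("M2R", "M2", (40, 40)),
           ArrayMarker("M3R", "M3", (600, 40)),
           ArrayMarker("M4R", "M4", (600, 420))]
show_switchable_markers(markers, draw=lambda mid, pos: print("draw", mid, pos))
on_marker_touched("M3R", markers, MicController(),
                  erase=lambda mid: print("erase", mid))
```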
FIG. 32(A) is a flowchart for explaining a first example of the selection process of the optimum omnidirectional microphone array device in the directivity control system 100B of the second embodiment. FIG. 32(B) is a flowchart illustrating a second example of the selection process of the optimum omnidirectional microphone array device in the directivity control system 100B of the second embodiment. FIG. 33 is a flowchart for explaining a third example of the selection process of the optimum omnidirectional microphone array device in the directivity control system 100B of the second embodiment.
In FIG. 32(A), when a position in the movement direction of the person HM1 as the monitoring target is designated on the tracking screen TRW (S51), the operation switching control unit 38 calculates each distance from each omnidirectional microphone array device to the position in the real space corresponding to the designated position designated in step S51, that is, each distance from each omnidirectional microphone array device to the person HM1 as the monitoring target (S53).
The operation switching control unit 38 selects the omnidirectional microphone array device for which the minimum distance is obtained among the distances calculated in step S53, and instructs the signal processing unit 34 to form directivity with respect to the sound data of the sound collected by the selected omnidirectional microphone array device (S54).
In response to the instruction, the output control unit 34b of the signal processing unit 34 forms the sound directivity from the omnidirectional microphone array device selected by the operation switching control unit 38 in step S54 toward the position of the person HM1 as the monitoring target, and outputs the sound having the directivity from the speaker device 36 (S55).
Thereby, the operation switching control unit 38 can select, by the user's simple operation of designating a position indicating the movement direction of the monitoring target (for example, the person HM1), the optimum omnidirectional microphone array device that can accurately collect the sound emitted by the moving monitoring target, and the sound emitted by the monitoring target (for example, the person HM1) can be collected with high accuracy.
After step S55, the selection process of the optimum omnidirectional microphone array device shown in FIG. 32(A) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). Note that the manual switching process of the camera device shown in FIG. 31(A) may be performed after the selection process of the optimum omnidirectional microphone array device shown in FIG. 32(A).
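The first selection example is a pure nearest-array rule: map the designated screen position to real-space coordinates, then take the minimum over array-to-target distances. A compact sketch follows; the coordinate conversion `to_real_space` is an assumed helper (the disclosure presumes such a mapping without spelling it out), and the array positions are illustrative.

```python
import math

ARRAY_POSITIONS = {"M1": (0, 0), "M2": (10, 0), "M3": (10, 10), "M4": (0, 10)}

def to_real_space(screen_pos):
    """Assumed helper: convert a designated position on the tracking screen
    into real-space coordinates; the scale factors here are placeholders."""
    sx, sy = screen_pos
    return (sx / 64.0, sy / 48.0)

def select_optimum_array(screen_pos):
    """S53-S54: compute the distance from every array to the designated
    real-space position; the array with the minimum distance wins."""
    target = to_real_space(screen_pos)
    return min(ARRAY_POSITIONS,
               key=lambda a: math.dist(ARRAY_POSITIONS[a], target))

print(select_optimum_array((600, 430)))  # -> M3: (9.4, 9.0) is nearest to M3
```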
In FIG. 32(B), when a position in the movement direction of the person HM1 as the monitoring target (a tracking position corresponding to the tracking point) is designated on the tracking screen TRW displayed on the display device 35 in accordance with an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S61), information (for example, coordinates) regarding the designated position is input to the operation switching control unit 38.
The image processing unit 37 detects the orientation of the face of the person HM1 as the monitoring target by performing predetermined image processing on the image data captured by the currently used camera device (for example, the camera device C1) (S62). The image processing unit 37 outputs the detection result of the face orientation of the person HM1 as the monitoring target to the operation switching control unit 38.
The operation switching control unit 38 calculates the relationship among the face orientation of the person HM1, the designated position, and each omnidirectional microphone array device, using the information regarding the designated position designated in step S61 (for example, the coordinates indicating the position on the image data) and the detection result of the face orientation of the person HM1 obtained from the image processing unit 37 in step S62 (S63). For example, the operation switching control unit 38 calculates the distance between the position of the monitoring target (for example, the person HM1) corresponding to the designated position on the image data designated in step S61 and each omnidirectional microphone array device.
The operation switching control unit 38 selects the omnidirectional microphone array device that lies in the direction along the face orientation of the monitoring target (for example, the person HM1) (for example, within 45 degrees in the horizontal direction) and for which the distance from the position of the monitoring target (for example, the person HM1) corresponding to the designated position on the image data designated in step S61 is minimum, and instructs the signal processing unit 34 to form directivity with respect to the sound data of the sound collected by the selected omnidirectional microphone array device (S64).
In response to the instruction, the output control unit 34b of the signal processing unit 34 forms the sound directivity from the omnidirectional microphone array device selected in step S64 toward the position of the person HM1 as the monitoring target, and outputs the sound having the directivity from the speaker device 36 (S65).
Thereby, the operation switching control unit 38 can select, based on the face orientation of the monitoring target (for example, the person HM1) on the image data and the distance between the monitoring target and each omnidirectional microphone array device, the optimum omnidirectional microphone array device that can accurately collect the sound emitted by the moving monitoring target, and the sound emitted by the monitoring target (for example, the person HM1) can be collected with high accuracy.
After step S65, the selection process of the optimum omnidirectional microphone array device shown in FIG. 32(B) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). Note that the manual switching process of the camera device shown in FIG. 31(A) may be performed after the selection process of the optimum omnidirectional microphone array device shown in FIG. 32(B).
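The second example adds one constraint to the nearest-array rule: only arrays lying within a tolerance (the text cites 45 degrees in the horizontal direction) of the detected face orientation are eligible, on the premise that speech radiates mostly forward. A sketch under that reading follows; the bearing helpers and the fallback to plain nearest-array when nothing is eligible are assumptions, not behavior stated in the disclosure.

```python
import math

def bearing_deg(src, dst):
    """Horizontal bearing from src to dst, in degrees."""
    return math.degrees(math.atan2(dst[1] - src[1], dst[0] - src[0]))

def angle_diff_deg(a, b):
    """Smallest absolute difference between two bearings."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def select_array_by_face(target_pos, face_bearing_deg, arrays, tol_deg=45.0):
    """S63-S64: keep only arrays within tol_deg of the face orientation,
    then pick the nearest of those; fall back to plain nearest if none fit
    (the fallback is an assumption)."""
    eligible = {name: pos for name, pos in arrays.items()
                if angle_diff_deg(bearing_deg(target_pos, pos),
                                  face_bearing_deg) <= tol_deg}
    pool = eligible or arrays
    return min(pool, key=lambda n: math.dist(arrays[n], target_pos))

arrays = {"M1": (0, 0), "M2": (10, 0), "M3": (10, 10), "M4": (0, 10)}
# A target at (4, 4) facing toward +x (0 degrees): M2 wins over the nearer
# M1, which sits behind the speaker and is therefore excluded.
print(select_array_by_face((4, 4), 0.0, arrays))  # -> M2
```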
In FIG. 33, the output control unit 34b displays, as a list on the display device 35, the camera screens of all the camera devices managed by the directivity control system 100B in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S71). It is assumed that, among the camera screens displayed as a list on the display device 35, the camera screen C1W in which the monitoring target (for example, the person HM1) that is the target of the sound tracking process is best shown is selected in accordance with an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S72).
In accordance with the user's selection of the camera screen in step S72, the operation switching control unit 38 selects and switches to the camera device corresponding to the selected camera screen as the camera device used for capturing an image of the person HM1. The output control unit 34b enlarges the image data captured by the camera device corresponding to the selected camera screen and displays it on the tracking screen TRW1 of the display device 35 (S73, see the lower left side of FIG. 28).
The output control unit 34b displays the markers (for example, the markers M1R, M2R, M3R, and M4R shown in FIG. 28) indicating the approximate positions of all the omnidirectional microphone array devices associated with the camera device selected by the operation switching control unit 38 at the four corners of the tracking screen TRW1 (S74).
The output control unit 34b highlights the markers one by one (for example, with the blinking Br) and, for each marker, forms directivity in the direction from the omnidirectional microphone array device corresponding to that marker toward the position of the person HM1 and outputs the sound collected for a certain time (S76).
When the user selects the marker (for example, the marker M3R) corresponding to the sound that the user determines to be optimum, the operation switching control unit 38 selects and switches to the omnidirectional microphone array device M3 corresponding to the selected marker M3R as the omnidirectional microphone array device used for collecting the sound emitted by the person HM1 (S77).
After step S77, the selection process of the optimum omnidirectional microphone array device shown in FIG. 33 ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). Note that the manual switching process of the camera device shown in FIG. 31(A) may be performed after the selection process of the optimum omnidirectional microphone array device shown in FIG. 33.
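The third example is an audition loop: each candidate array gets a turn to be heard with its beam aimed at the target, and the user's pick decides the switch. A minimal sketch of that loop under assumed interface names (`ui`, `mic_ctrl`, and their methods are stubs, not API from this disclosure):

```python
import time

def audition_arrays(markers, mic_ctrl, ui, target_pos, listen_s=3.0):
    """S74-S76 (illustrative): blink each marker in turn, beam the matching
    array at the target, and play what it collects for a fixed interval."""
    for marker_id, array_id in markers.items():
        ui.blink(marker_id)                      # highlight, e.g. blinking Br
        mic_ctrl.form_directivity(array_id, target_pos)
        mic_ctrl.play_collected_audio(array_id, seconds=listen_s)
        time.sleep(listen_s)                     # let the user listen

def pick_after_audition(markers, ui):
    """S77: the marker the user touches decides the array to switch to."""
    return markers[ui.wait_for_marker_touch()]

class _StubUI:
    def blink(self, m): print("blink", m)
    def wait_for_marker_touch(self): return "M3R"

class _StubMic:
    def form_directivity(self, a, p): print(a, "aimed at", p)
    def play_collected_audio(self, a, seconds): print("playing", a, "for", seconds, "s")

markers = {"M1R": "M1", "M2R": "M2", "M3R": "M3", "M4R": "M4"}
audition_arrays(markers, _StubMic(), _StubUI(), (5.0, 5.0), listen_s=0.0)
print("switch to", pick_after_audition(markers, _StubUI()))  # -> M3
```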
In a modification of the first embodiment (hereinafter referred to as “this modification”), an operation example of the directivity control system 100 will be described for the case where, in the first embodiment or the second embodiment, a plurality of monitoring targets (for example, a plurality of persons) appear on the tracking screen TRW and are designated at the same timing or at different timings. Since the system configuration of the directivity control system according to this modification is the same as that of the directivity control systems 100, 100A, and 100B according to the first or second embodiment, the description of the system configuration is simplified or omitted, and only the differing contents will be described. Hereinafter, in order to simplify the explanation, the description is given using the system configuration of the directivity control system 100; the same applies when the directivity control devices 3A and 3B are used.
FIG. 34 is a flowchart for explaining an example of the overall flow of the manual tracking process based on simultaneous designation of a plurality of monitoring targets in the directivity control system 100 according to the modification of the first embodiment. FIG. 35 is a flowchart illustrating an example of the automatic tracking process for a plurality of monitoring targets in the directivity control system 100 according to the modification of the first embodiment.
In FIG. 34, the tracking mode determination process in step S1, the tracking assist process in step S2, the tracking connection process in step S6, and the audio output process in step S7 are the same as, for example, the corresponding processes shown in FIG. 9(A), and therefore their descriptions are simplified or omitted.
After step S2, it is assumed that a plurality of tracking points corresponding to the tracking positions on the movement courses (movement paths) of the plurality of persons as the monitoring targets are simultaneously designated in accordance with an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S82).
For each person as the monitoring target designated in step S82, the tracking processing unit 34c stores in the memory 33 the positions in the real space corresponding to the plurality of designated positions on the tracking screen TRW and the designated times, distinguished for each person and associated with the tracking position and the tracking time of each tracking point (S83). Further, the tracking processing unit 34c displays, via the output control unit 34b, a point marker for each person as the monitoring target on the tracking screen TRW, distinguished for each tracking point (S83).
The output control unit 34b forms the directivity of the collected sound from the currently used omnidirectional microphone array device (for example, the omnidirectional microphone array device M1) in the directions toward the positions in the real space (sound positions, sound source positions) corresponding to the tracking positions of the respective persons as the plurality of monitoring targets simultaneously designated in step S82 (S84). After step S84, the tracking connection process is performed (S6).
After step S6, the output control unit 34b resumes, from the speaker device 36, the output (reproduction) of the sound that was paused in step S81 (S85). After step S85, the audio output process is performed (S7).
After step S7, the operations from step S81 to step S7 (steps S81, S2, S82, S83, S84, S6, S85, and S7) are repeated until the tracking mode of the directivity control device 3 is turned off.
In FIG. 35, the image processing unit 37 of the directivity control device 3A or 3B determines, by performing known image processing, whether or not persons as monitoring targets are detected on the tracking screen TRW of the display device 35. When it is determined that a plurality of persons have been detected, the image processing unit 37 outputs the determination result (including the detected position of each person (for example, a known representative point) and the detection time data) to the tracking processing unit 34c of the signal processing unit 34 as an automatic designation result (S91).
Alternatively, the sound source detection unit 34d determines, by performing a known sound source detection process, whether or not the positions of the sounds (sound sources) emitted by the persons as the monitoring targets are detected on the tracking screen TRW of the display device 35. When it is determined that the positions of a plurality of sound sources have been detected, the sound source detection unit 34d outputs the determination result (including the sound source detection positions and the detection time data) to the tracking processing unit 34c as an automatic designation result (S91).
The tracking processing unit 34c calculates the movement vector of each person as the plurality of monitoring targets using the transition of one or more immediately preceding automatic designation results in step S91, and estimates the movement direction of each person (S91).
Using the estimation result of the movement direction of each person as the plurality of monitoring targets in step S91, the tracking processing unit 34c associates the tracking positions corresponding to the plurality of automatically designated tracking points with the immediately preceding automatic designation results, and stores them in the memory 33 as pairs of tracking positions (S92).
For each person as the monitoring target, the tracking processing unit 34c stores the designated position and the designated time of each person on the tracking screen TRW in the memory 33, distinguished for each person and associated with the tracking position and the tracking time of the tracking point (S92). Further, the tracking processing unit 34c displays, via the output control unit 34b, a point marker for each person as the monitoring target on the tracking screen TRW, distinguished for each tracking position (S92).
As described above, even when the plurality of monitoring targets (for example, persons) displayed on the image data on the tracking screen TRW of the display device 35 move, the directivity control devices 3, 3A, and 3B form the directivity of the sound, which had been formed in the direction toward the position of each person before the movement, in the direction toward the position of each person after the movement. Accordingly, the directivity of the sound can be appropriately formed so as to follow the movement of each person, and degradation of the efficiency of the monitoring work of the observer can be suppressed.
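Tracking several targets at once hinges on the association step in S91 and S92: each new automatic designation result must be paired with the right person's history, which is why the movement vector is estimated from the preceding results. A minimal sketch of constant-velocity prediction plus greedy nearest-neighbor association follows; the class and field names are illustrative, and the greedy pairing is only one simple policy consistent with the text.

```python
import math

class Track:
    """Illustrative per-person track: stores (position, time) pairs, as the
    tracking positions and times are stored per target in the memory 33."""
    def __init__(self, track_id, pos, t):
        self.track_id = track_id
        self.history = [(pos, t)]

    def predicted(self, t):
        """Extrapolate with the movement vector of the last two samples."""
        if len(self.history) < 2:
            return self.history[-1][0]
        (p1, t1), (p2, t2) = self.history[-2], self.history[-1]
        dt = (t2 - t1) or 1e-9
        vx, vy = (p2[0] - p1[0]) / dt, (p2[1] - p1[1]) / dt
        return (p2[0] + vx * (t - t2), p2[1] + vy * (t - t2))

def associate(tracks, detections, t):
    """S91-S92 (sketch): greedily pair each track with the detection nearest
    to its predicted position, then append the pair to that track's history."""
    free = list(detections)
    for tr in tracks:
        if not free:
            break
        best = min(free, key=lambda d: math.dist(tr.predicted(t), d))
        tr.history.append((best, t))
        free.remove(best)

# Usage: two people crossing; the movement vectors keep identities apart.
a, b = Track("HM1", (0, 0), 0.0), Track("HM2", (10, 0), 0.0)
associate([a, b], [(1, 1), (9, 1)], 1.0)
associate([a, b], [(8, 2), (2, 2)], 2.0)
print(a.history[-1], b.history[-1])  # HM1 -> (2, 2), HM2 -> (8, 2)
```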
One embodiment of the present invention is a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the directivity control device including: a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; and an information acquisition unit that acquires information regarding a second designated position on the image of the display unit designated in accordance with the movement of the monitoring target, wherein the directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information regarding the second designated position acquired by the information acquisition unit.
With this configuration, the directivity control device forms the directivity of the sound in the direction from the first sound collection unit including the plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and acquires the information regarding the second designated position that designates the moving monitoring target. The directivity control device then switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position, using the information regarding the second designated position on the image of the display unit.
Thereby, even if the monitoring target projected on the image of the display unit moves, the directivity control device re-forms the directivity of the sound, which had been formed in the direction toward the position of the monitoring target before the movement, in the direction toward the position after the movement. Accordingly, the directivity of the sound can be appropriately formed so as to follow the movement of the monitoring target, and degradation of the efficiency of the monitoring work of the observer can be suppressed.
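Forming directivity "in a direction toward" a position with a microphone array is classically done by delay-and-sum beamforming: each channel is delayed so that sound arriving from the target aligns across microphones, and the channels are then summed so the target adds coherently while off-axis sound does not. The disclosure does not commit to a specific beamformer, so the following is a generic delay-and-sum sketch under stated assumptions (sample-accurate delays only, speed of sound 343 m/s):

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed

def delay_and_sum(signals, mic_positions, target, fs):
    """Steer an array toward `target` by delaying each channel so sound
    from the target aligns across microphones, then averaging.
    `signals[i]` is the sample list from the mic at mic_positions[i]."""
    delays_s = [math.dist(m, target) / SPEED_OF_SOUND for m in mic_positions]
    base = min(delays_s)                                 # nearest mic = 0 shift
    shifts = [round((d - base) * fs) for d in delays_s]  # samples to advance
    n = min(len(s) - k for s, k in zip(signals, shifts))
    out = []
    for i in range(n):
        out.append(sum(s[i + k] for s, k in zip(signals, shifts)) / len(signals))
    return out

# Usage: two mics 0.343 m apart, 8 kHz sampling; a source in line with the
# array reaches mic 0 exactly 8 samples before mic 1.
fs = 8000
mic0 = [0.0] * 32
mic0[10] = 1.0                    # nearer mic hears the pulse at sample 10
mic1 = [0.0] * 32
mic1[18] = 1.0                    # farther mic hears it 8 samples later
mics = [(0.0, 0.0), (0.343, 0.0)]
target = (-5.0, 0.0)              # on the far side of mic 0
out = delay_and_sum([mic0, mic1], mics, target, fs)
print(max(out), out.index(max(out)))  # -> 1.0 at sample 10: coherent sum
```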
Further, one embodiment of the present invention is the directivity control device, wherein the information acquisition unit acquires the information regarding the second designated position in response to a designation operation on the monitoring target that moves on the image of the display unit.
Thereby, the directivity control device can easily acquire accurate information regarding the position after the movement of the monitoring target by a simple operation of designating the monitoring target that moves on the image displayed on the display unit.
Further, one embodiment of the present invention is the directivity control device, further including: a sound source detection unit that detects, from the image of the display unit, a sound source position corresponding to the monitoring target; and an image processing unit that detects the monitoring target from the image of the display unit, wherein the information acquisition unit acquires the information regarding the sound source position detected by the sound source detection unit or the information regarding the position of the monitoring target detected by the image processing unit as the information regarding the second designated position.
Thereby, the directivity control device can easily detect, from the image displayed on the display unit, the sound source of the sound emitted by the monitoring target or the monitoring target itself, and can easily acquire the information regarding the position of the monitoring target as the information regarding the position after the movement of the monitoring target.
Further, one embodiment of the present invention is the directivity control device, wherein the sound source detection unit starts the detection process of the sound source position corresponding to the monitoring target centering on an initial position designated on the image of the display unit, and the image processing unit starts the detection process of the monitoring target centering on the initial position.
Thereby, since the directivity control device starts the detection process of the information regarding the sound source position or the information regarding the position of the monitoring target around the initial position (for example, the position of the monitoring target) designated on the image displayed on the display unit by, for example, the user's designation operation, the detection process of the sound source position or the detection process of the position of the monitoring target can be performed at high speed.
Further, one embodiment of the present invention is the directivity control device, wherein, in response to a change operation on the information regarding the sound source position detected by the sound source detection unit or the information regarding the position of the monitoring target detected by the image processing unit, the information acquisition unit acquires the information regarding the position on the image of the display unit designated by the change operation as the information regarding the second designated position.
Thereby, when the information regarding the sound source position or the information regarding the position of the monitoring target needs correction, the directivity control device can easily correct it by, for example, the user's position change operation, and can acquire the information regarding the position designated on the image by the change operation as the information regarding the position after the movement of the monitoring target.
Further, one embodiment of the present invention is the directivity control device, wherein, when the distance between the sound source position detected by the sound source detection unit and the position of the monitoring target detected by the image processing unit is equal to or larger than a predetermined value, the information acquisition unit acquires, in response to a change operation on the information regarding the sound source position or the information regarding the position of the monitoring target, the information regarding the position on the image of the display unit designated by the change operation as the information regarding the second designated position.
Thereby, when the distance between the sound source position detected by the sound source detection process and the position of the monitoring target detected by the monitoring target detection process is equal to or larger than the predetermined value, the directivity control device can easily correct the information regarding the designated position on the image by, for example, the user's position change operation, and can acquire it as the information regarding the position after the movement of the monitoring target. Conversely, when the distance is not equal to or larger than the predetermined value, the directivity control device can easily acquire the sound source position or the position of the monitoring target as the information regarding the position after the movement of the monitoring target without requiring the position change operation.
Further, one embodiment of the present invention is the directivity control device, further including: an image storage unit that stores images captured over a certain period; and an image reproduction unit that reproduces the images stored in the image storage unit on the display unit, wherein the image reproduction unit reproduces the images at a speed value smaller than an initial value of the reproduction speed in response to a predetermined input operation.
Thereby, the directivity control device can perform slow reproduction at a speed value smaller than the initial value of the reproduction speed (for example, a normal value used at the time of video reproduction) in response to the user's predetermined input operation (for example, a slow reproduction instruction operation).
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a captured image on the display unit, wherein, in response to designation of a designated position on the image of the display unit, the display control unit enlarges and displays the image on the same screen at a predetermined magnification centering on the designated position.
Thereby, since the directivity control device enlarges and displays the image on the same screen at the predetermined magnification centering on the designated position on the image displayed on the display unit by, for example, the user's simple designation operation, the user's operation of designating the monitoring target on the same screen can be simplified.
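An enlargement "at a predetermined magnification centering on the designated position" amounts to cropping a window of 1/magnification of the frame around that point, clamped so the window stays inside the frame, and scaling the crop back to full size. A sketch of the crop-rectangle arithmetic; the frame size and magnification in the usage line are illustrative values only.

```python
def zoom_rect(frame_w, frame_h, center, magnification):
    """Crop window realizing an enlargement of `magnification` centered on
    `center`, clamped so the window stays inside the frame."""
    win_w, win_h = frame_w / magnification, frame_h / magnification
    x = min(max(center[0] - win_w / 2, 0), frame_w - win_w)
    y = min(max(center[1] - win_h / 2, 0), frame_h - win_h)
    return (x, y, win_w, win_h)   # scale this region back up to frame size

# Usage: 2x zoom on a designated position near the right edge of 640x480;
# the window is clamped so it does not leave the frame.
print(zoom_rect(640, 480, (600, 240), 2.0))  # -> (320.0, 120.0, 320.0, 240.0)
```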
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a captured image on the display unit, wherein, in response to designation of a designated position on the image of the display unit, the display control unit enlarges and displays the image on another screen at a predetermined magnification centering on the designated position.
Thereby, since the directivity control device enlarges and displays the image on a different screen at the predetermined magnification centering on the designated position on the image displayed on the display unit by, for example, the user's simple designation operation, the user can easily designate the monitoring target by comparing the screen that is not enlarged with the screen that is enlarged.
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a captured image on the display unit, wherein the display control unit enlarges and displays the image at a predetermined magnification with the center of the display unit as a reference in response to a predetermined input operation.
Thereby, since the directivity control device enlarges and displays the image at the predetermined magnification with the center of the display unit as a reference by, for example, the user's simple input operation, the user can easily designate the monitoring target when, for example, the monitoring target is near the center of the display unit.
Further, one embodiment of the present invention is the directivity control device, wherein, when the designated position moves beyond a predetermined scroll determination line on the screen on which the image is enlarged and displayed in accordance with the movement of the monitoring target, the display control unit scrolls the screen by a predetermined amount in the direction beyond the scroll determination line.
Thereby, since the directivity control device automatically scrolls the screen by the predetermined amount in the direction beyond the scroll determination line when the position designated by the user moves beyond the scroll determination line due to the movement of the monitoring target displayed on the enlarged screen, the designated position of the user's monitoring target can be prevented from moving off the screen even when the screen is enlarged and displayed.
Further, one embodiment of the present invention is the directivity control device, wherein, when the designated position moves beyond a predetermined scroll determination line on the screen on which the image is enlarged and displayed in accordance with the movement of the monitoring target, the display control unit scrolls the screen so that the designated position comes to the center of the screen.
Thereby, since the directivity control device automatically scrolls the screen so that the position designated by the user comes to the center of the screen when the designated position moves beyond the scroll determination line due to the movement of the monitoring target displayed on the enlarged screen, the designated position of the user's monitoring target can be prevented from moving off the screen even when the screen is enlarged and displayed, and the moving monitoring target can be easily designated.
Further, one embodiment of the present invention is the directivity control device, wherein, on the screen on which the image is enlarged and displayed, the display control unit scrolls the screen so that the designated position is always at the center of the screen.
Thereby, since the directivity control device automatically scrolls the screen so that the position designated by the user is always at the center of the screen when the monitoring target displayed on the enlarged screen moves, the designated position of the user's monitoring target can be prevented from moving off the screen even when the screen is enlarged and displayed, and the monitoring target that continues to move can be easily designated.
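The three scrolling behaviors above differ only in the trigger and the scroll target: a determination line plus a fixed step, a determination line plus recentering, or unconditional recentering. One sketch can cover all three; the margin and step sizes are illustrative assumptions, as the disclosure leaves the "predetermined amount" and the placement of the determination line unspecified.

```python
def scroll(viewport, designated, frame, mode, margin=50, step=100):
    """Return a new viewport origin (vx, vy) for an enlarged screen.
    viewport = (vx, vy, vw, vh); `designated` is in frame coordinates.
    mode: "step"     - fixed scroll when the scroll determination line
                       (viewport edge inset by `margin`) is crossed;
          "recenter" - recenter only when that line is crossed;
          "always"   - keep the designated position centered at all times."""
    vx, vy, vw, vh = viewport
    fx, fy = designated
    def clamp(v, lo, hi): return min(max(v, lo), hi)
    crossed = not (vx + margin <= fx <= vx + vw - margin
                   and vy + margin <= fy <= vy + vh - margin)
    if mode == "always" or (mode == "recenter" and crossed):
        vx, vy = fx - vw / 2, fy - vh / 2
    elif mode == "step" and crossed:
        if fx > vx + vw - margin: vx += step
        elif fx < vx + margin:    vx -= step
        if fy > vy + vh - margin: vy += step
        elif fy < vy + margin:    vy -= step
    return (clamp(vx, 0, frame[0] - vw), clamp(vy, 0, frame[1] - vh))

# Usage: the target crosses the right-hand determination line of a 320x240
# view into a 1280x960 enlarged image.
print(scroll((0, 0, 320, 240), (300, 120), (1280, 960), "step"))      # (100, 0)
print(scroll((0, 0, 320, 240), (300, 120), (1280, 960), "recenter"))  # (140, 0)
```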
Further, one embodiment of the present invention is the directivity control device, wherein the image processing unit performs masking processing on a part of the monitoring target on the image of the display unit in response to a predetermined input operation.
Thereby, since the directivity control device masks a part (for example, the face) of the monitoring target (for example, a person) displayed on the screen of the display unit by, for example, the user's simple input operation, privacy can be effectively protected by making it difficult to identify who the person as the monitoring target is.
Further, one embodiment of the present invention is the directivity control device, further including a sound output control unit that causes a sound output unit to output the sound collected by the first sound collection unit, wherein the sound output control unit performs voice change processing on the sound collected by the first sound collection unit and causes the sound output unit to output the processed sound in response to a predetermined input operation.
Thereby, since the directivity control device performs the voice change processing on the sound collected in real time by the first sound collection unit and outputs it by, for example, the user's simple input operation, privacy can be effectively protected by making it difficult to identify the voice of the monitoring target (for example, a person).
Further, one embodiment of the present invention is the directivity control device, further including: a sound storage unit that stores the sound collected by the first sound collection unit over a certain period; and a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit, wherein the sound output control unit performs voice change processing on the sound collected by the first sound collection unit and causes the sound output unit to output the processed sound in response to a predetermined input operation.
Thereby, since the directivity control device performs the voice change processing on the sound collected over the certain period by the first sound collection unit when outputting it by, for example, the user's simple input operation, privacy regarding the voice of the monitoring target can be effectively protected by making it difficult to identify the voice of the monitoring target (for example, a person).
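A common way to realize such a voice change is a crude pitch shift by resampling: the samples are read back at a different rate so the voice no longer sounds like the original speaker. The disclosure does not specify the transform, so the following is only one plausible sketch, with the ratio chosen arbitrarily for illustration:

```python
import math

def voice_change(samples, ratio=1.3):
    """Crude voice changer: resample by `ratio` using linear interpolation,
    raising the pitch (ratio > 1) so the speaker is harder to identify.
    Note this also shortens the clip; a production system would use a
    time-scale-preserving pitch shifter instead."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        pos += ratio
    return out

# Usage: a 1 kHz tone sampled at 8 kHz comes out near 1.3 kHz.
tone = [math.sin(2 * math.pi * 1000 * n / 8000) for n in range(8000)]
shifted = voice_change(tone, ratio=1.3)
print(len(tone), len(shifted))  # 8000 -> about 6154 samples
```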
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a predetermined marker at each designated position on the image of the display unit designated one or more times in accordance with the movement of the monitoring target.
Thereby, when the user performs a designation operation for designating the monitoring target displayed on the display unit, the directivity control device displays the predetermined marker at each position designated on the screen of the display unit, so that the positions through which the moving monitoring target passes can be explicitly shown as a trajectory.
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that connects and displays at least the current designated position and the immediately preceding designated position among two or more designated positions on the image of the display unit designated in accordance with the movement of the monitoring target.
Thereby, since the directivity control device connects and displays at least the current designated position and the immediately preceding designated position among the plurality of positions designated by the user's designation operation while the monitoring target projected on the screen of the display unit moves, a partial trajectory of the movement of the monitoring target can be explicitly shown.
Further, one embodiment of the present invention is the directivity control device, further including a display control unit that displays a flow line connecting, for all the designated positions on the image of the display unit designated in accordance with the movement of the monitoring target, the one or two designated positions adjacent to each designated position.
Thereby, since the directivity control device connects and displays, for all of the plurality of positions designated by the user's designation operation while the monitoring target projected on the screen of the display unit moves, the one or two designated positions adjacent to each designated position, the entire trajectory of the movement of the monitoring target can be explicitly shown.
Further, one embodiment of the present invention is the directivity control device, further including: a designation list storage unit that stores data of all the designated positions and designated times on the image of the display unit; and a reproduction time calculation unit that calculates the reproduction start time of the sound at a designated position on the flow line, using the designation list stored in the designation list storage unit, in accordance with designation of an arbitrary position on the flow line connecting all the designated positions displayed by the display control unit, wherein the directivity forming unit forms the directivity of the sound using the data of the designated position corresponding to the designated time closest to the reproduction start time of the sound calculated by the reproduction time calculation unit.
Thereby, when all the positions designated while the monitoring target moves are displayed, the directivity control device calculates, in accordance with the user's designation of an arbitrary position on the flow line, the reproduction start time of the collected sound at that position, and forms the directivity of the sound corresponding to the designated time, among the times designated during the movement of the monitoring target, that is closest to the reproduction start time. Accordingly, in accordance with the position arbitrarily designated by the user on the flow line indicating the trajectory of the movement of the monitoring target, the directivity of the sound can be formed in advance in the direction toward the designated position (tracking position) designated next after the arbitrarily designated position.
Further, one embodiment of the present invention is the directivity control device, further including: a sound storage unit that stores the sound collected by the first sound collection unit over a certain period; and a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit from the reproduction start time of the sound calculated by the reproduction time calculation unit, wherein, when there is a next designated time within a predetermined time from the reproduction start time of the sound, the directivity forming unit forms the directivity of the sound using the data of the designated position corresponding to the next designated time.
Thereby, the directivity control device reproduces the sound from the reproduction start time of the sound at the position designated in accordance with the user's arbitrary designation on the flow line, and, when there is a next designated time within the predetermined time from the reproduction start time, forms the directivity of the sound using the data of the designated position corresponding to that next designated time. Accordingly, the directivity control device can clearly output the collected sound emitted by the monitoring target from the reproduction start time calculated in accordance with the user's arbitrarily designated position, and can form in advance the directivity of the sound at the next designated position within the predetermined time from the reproduction start time.
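With the designation list in hand, the computation is mechanical: find the flow-line segment nearest the clicked point, interpolate a start time along that segment, aim the beam using the designated position whose time is closest, and pre-steer to the next one if it falls within a look-ahead window. A sketch over an assumed list of ((x, y), time) pairs; the linear time interpolation and the look-ahead value are illustrative choices, not details fixed by the disclosure.

```python
import math

def reproduction_start_time(designations, click):
    """Interpolate a playback start time from the designation list by
    projecting the clicked point onto the nearest flow-line segment."""
    best = (float("inf"), designations[0][1])
    for (p1, t1), (p2, t2) in zip(designations, designations[1:]):
        seg = (p2[0] - p1[0], p2[1] - p1[1])
        seg_len2 = seg[0] ** 2 + seg[1] ** 2 or 1e-9
        u = ((click[0] - p1[0]) * seg[0] + (click[1] - p1[1]) * seg[1]) / seg_len2
        u = min(max(u, 0.0), 1.0)
        proj = (p1[0] + u * seg[0], p1[1] + u * seg[1])
        d = math.dist(click, proj)
        if d < best[0]:
            best = (d, t1 + u * (t2 - t1))
    return best[1]

def directivity_plan(designations, start_time, lookahead_s=2.0):
    """Aim at the designated position whose time is closest to start_time;
    also return the next position if it arrives within the look-ahead."""
    aim = min(designations, key=lambda d: abs(d[1] - start_time))[0]
    upcoming = [d for d in designations
                if start_time < d[1] <= start_time + lookahead_s]
    return aim, (upcoming[0][0] if upcoming else None)

# Usage: a four-point flow line; clicking midway along the second segment.
flow = [((0, 0), 0.0), ((4, 0), 4.0), ((4, 4), 8.0), ((8, 4), 12.0)]
t0 = reproduction_start_time(flow, (4.2, 1.9))
print(round(t0, 2), directivity_plan(flow, t0, lookahead_s=3.0))
# -> 5.9, aim (4, 0), next (4, 4)
```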
Further, one embodiment of the present invention is the directivity control device, further including an operation switching control unit that switches the imaging unit used for displaying the image from a first imaging unit to a second imaging unit when the moving monitoring target moves beyond a predetermined switching range corresponding to the first imaging unit used for displaying the image on the display unit.
Thereby, the directivity control device can adaptively switch to an imaging unit that can accurately display the image of the moving monitoring target, and the user can easily designate the image of the monitoring target.
Further, one embodiment of the present invention is the directivity control device, further including an operation switching control unit that switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to a second sound collection unit when the monitoring target moves beyond a predetermined switching range corresponding to the first sound collection unit.
Thereby, when the moving monitoring target moves beyond the predetermined switching range corresponding to the first sound collection unit used for collecting the sound of the monitoring target, the directivity control device switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to the second sound collection unit. Accordingly, the directivity control device can adaptively switch to a sound collection unit that can accurately collect the sound emitted by the moving monitoring target, and the sound emitted by the monitoring target can be collected with high accuracy.
Further, one embodiment of the present invention is the directivity control device, further including: a display control unit that displays, on different screens, a list of images captured by a plurality of imaging units in response to a predetermined input operation; and an operation switching control unit that selects the imaging unit used for displaying the image of the monitoring target on the display unit in response to a selection operation on one predetermined selectable screen among the screens displayed as a list on the display unit.
Thereby, the directivity control device switches the imaging unit used for displaying the image on the display unit to the imaging unit corresponding to the screen designated by the user, in accordance with the movement direction of the monitoring target, from among the plurality of different screens displayed as a list on the display unit. Accordingly, the directivity control device can adaptively switch, by the user's simple operation, to an imaging unit that can accurately display the image of the moving monitoring target, and the user can easily designate the image of the monitoring target.
Further, one embodiment of the present invention is the directivity control device, further including: a display control unit that displays, on the display unit, markers indicating the approximate positions of a plurality of surrounding sound collection units that can be switched to from the first sound collection unit in response to a predetermined input operation; and an operation switching control unit that switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to the other sound collection unit corresponding to a selected marker.
Thereby, the directivity control device displays on the display unit, by, for example, the user's input operation, the markers indicating the approximate positions of the plurality of surrounding sound collection units that can be switched to from the first sound collection unit, and, in accordance with the one marker selected by the user, switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to the other sound collection unit corresponding to the selected marker. Accordingly, the directivity control device can adaptively switch, by the user's simple operation, to a sound collection unit that can accurately collect the sound emitted by the moving monitoring target, and the sound emitted by the monitoring target can be collected with high accuracy.
  • In one embodiment, in response to designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the operation switching control unit selects, from among a plurality of sound collection units including the first sound collection unit, the sound collection unit having the shortest distance to the monitoring target as the sound collection unit used for collecting the sound of the monitoring target.
  • In accordance with the position designated on the image of the monitoring target captured by the selected imaging unit, the directivity control device selects, from among the plurality of sound collection units including the first sound collection unit, the unit having the shortest distance to the monitoring target as the sound collection unit used for collecting the sound of the monitoring target.
  • Thereby, simply by designating a position indicating the moving direction of the monitoring target, the user can select the optimum sound collection unit capable of accurately picking up the sound of the moving monitoring target, and the sound emitted by the monitoring target can be collected with high accuracy, as in the sketch below.
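A minimal sketch of this shortest-distance selection, assuming the real-space installation position of each sound collection unit is known in advance (the names are hypothetical):

    # Sketch: choose the sound collection unit closest to the monitoring target.
    import math

    def nearest_sound_collector(mic_arrays, target_pos):
        """mic_arrays: {id: (x, y, z) installation position};
        target_pos: (x, y, z) position of the monitoring target."""
        return min(mic_arrays, key=lambda i: math.dist(mic_arrays[i], target_pos))

    mics = {"M1": (0.0, 0.0, 3.0), "M2": (8.0, 0.0, 3.0)}
    print(nearest_sound_collector(mics, (6.0, 1.0, 0.0)))  # -> M2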
  • One embodiment of the present invention further includes an image processing unit that detects the face direction of the monitoring target from the image on the display unit; in response to designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the operation switching control unit selects, from among a plurality of sound collection units including the first sound collection unit that lie in the direction corresponding to the face direction detected by the image processing unit, the sound collection unit having the shortest distance to the monitoring target as the sound collection unit used for collecting the sound of the monitoring target.
  • In accordance with the position designated on the image of the monitoring target captured by the selected imaging unit, the directivity control device selects, from among the plurality of sound collection units including the first sound collection unit that exist in the direction indicated by the orientation of the face of the monitoring target on the image, the unit having the shortest distance to the monitoring target as the sound collection unit used for collecting the sound of the monitoring target.
  • Thereby, the directivity control device can select, in accordance with the orientation of the face of the monitoring target on the image and the distance between the monitoring target and each sound collection unit, the optimum sound collection unit capable of accurately collecting the sound emitted by the moving monitoring target, so that the sound can be collected with high accuracy; one possible combination of the two criteria is sketched below.
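One way to combine the two criteria is sketched here: the candidate sound collection units are first restricted to those lying roughly in the direction the detected face points, and the nearest of those is chosen. The angular tolerance and the fallback to all units are illustrative assumptions, not the embodiment's actual rule.

    # Sketch: filter candidate units by the detected face direction, then pick
    # the nearest one. The 45-degree tolerance is an illustrative assumption.
    import math

    def select_by_face_direction(mic_arrays, target_pos, face_dir_deg, tol_deg=45.0):
        """mic_arrays: {id: (x, y)} floor-plan positions; target_pos: (x, y);
        face_dir_deg: detected face direction of the target in degrees."""
        def bearing(frm, to):
            return math.degrees(math.atan2(to[1] - frm[1], to[0] - frm[0]))
        candidates = {
            mic_id: pos for mic_id, pos in mic_arrays.items()
            if abs((bearing(target_pos, pos) - face_dir_deg + 180.0) % 360.0
                   - 180.0) <= tol_deg  # unit lies roughly where the face points
        }
        pool = candidates or mic_arrays  # fall back to all units if none match
        return min(pool, key=lambda i: math.dist(pool[i], target_pos))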
  • One embodiment of the present invention further includes an audio output control unit that causes an audio output unit to output the sound collected by the first sound collection unit. The display control unit displays, on the display unit, markers indicating the approximate positions of a plurality of sound collection units, including the first sound collection unit, associated with the imaging unit selected by the operation switching control unit. In response to designation of a position on the image of the monitoring target captured by the selected imaging unit, the audio output control unit sequentially outputs, each for a predetermined time, the sound whose directivity is formed in the direction from the sound collection unit corresponding to each displayed marker toward the monitoring target, and the operation switching control unit selects the sound collection unit corresponding to the marker chosen on the basis of the output sound as the sound collection unit used for collecting the sound of the monitoring target.
  • The directivity control device displays on the display unit the markers indicating the approximate positions of the plurality of sound collection units, including the first sound collection unit, associated with the selected imaging unit; in accordance with the position designated on the image of the moving monitoring target, it sequentially outputs, each for a predetermined time, the sound whose directivity is formed in the direction from the sound collection unit corresponding to each marker toward the monitoring target; and it selects the sound collection unit corresponding to the marker subsequently chosen as the sound collection unit used for collecting the sound of the monitoring target.
  • Thereby, the directivity control device can output, each for a certain period, collected sounds with different directivities formed by the plurality of sound collection units associated with the selected imaging unit, so that the user can compare them and select the optimum sound collection unit; this audition procedure is sketched below.
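The audition procedure might look like the following sketch, in which the sound of each candidate unit, with directivity formed toward the target, is played for a fixed time before the user chooses a marker. form_directivity() and play() are placeholders for the embodiment's actual signal processing and output, and all names are assumptions.

    # Sketch: let the user compare candidates by playing each unit's sound,
    # beamformed toward the target, for a fixed time each.
    AUDITION_SECONDS = 3.0

    def audition_candidates(markers, target_pos, form_directivity, play):
        """markers: {marker_id: sound collection unit}; plays each in turn."""
        for marker_id, mic in markers.items():
            beamformed = form_directivity(mic, target_pos)  # steer toward target
            play(beamformed, duration=AUDITION_SECONDS)

    def select_marker(markers, chosen_marker_id):
        """The unit behind the marker the user chose becomes the active one."""
        return markers[chosen_marker_id]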
  • One embodiment of the present invention is a directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones.
  • The directivity control device forms the directivity of the sound in the direction from the first sound collection unit including the plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and acquires information on the second designated position designating the moving monitoring target.
  • Using the information on the second designated position on the image of the display unit, the directivity control device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • Thereby, even if the monitoring target shown on the image of the display unit moves, the directivity of the sound formed in the direction toward the position before the movement is re-formed in the direction toward the position after the movement, so that the directivity of the sound can be properly formed following the movement of the monitoring target, and deterioration in the efficiency of the observer's monitoring work can be suppressed.
  • One embodiment of the present invention is a storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the program executing: a step of forming the directivity of the sound in the direction from the first sound collection unit toward the monitoring target corresponding to a first designated position on the image of the display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching, using the acquired information on the second designated position, the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • The directivity control device that executes the program stored in the storage medium forms the directivity of the sound in the direction from the first sound collection unit including a plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and acquires information on the second designated position designating the moving monitoring target.
  • Using the information on the second designated position on the image of the display unit, the directivity control device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • Thereby, even if the monitoring target shown on the image of the display unit moves, the directivity of the sound formed in the direction toward the position before the movement is re-formed in the direction toward the position after the movement, so that the directivity of the sound can be properly formed following the movement of the monitoring target, and deterioration in the efficiency of the observer's monitoring work can be suppressed.
  • One embodiment of the present invention is a directivity control system including: an imaging unit that images a sound collection area; a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and a directivity control device that controls the directivity of the sound collected by the first sound collection unit.
  • The directivity control device includes a directivity forming unit that forms the directivity of the sound in the direction from the first sound collection unit toward the monitoring target corresponding to a first designated position on the image of the display unit, and an information acquisition unit that acquires information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; the directivity forming unit switches, using the information on the second designated position acquired by the information acquisition unit, the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • The directivity control device forms the directivity of the sound in the direction from the first sound collection unit including the plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and acquires information on the second designated position designating the moving monitoring target.
  • Using the information on the second designated position on the image of the display unit, the directivity control device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
  • Thereby, even if the monitoring target shown on the image of the display unit moves, the directivity of the sound formed in the direction toward the position before the movement is re-formed in the direction toward the position after the movement, so that the directivity of the sound can be properly formed following the movement of the monitoring target, and deterioration in the efficiency of the observer's monitoring work can be suppressed.
  • The present invention is useful as a directivity control device, a directivity control method, a storage medium, and a directivity control system that properly form the directivity of sound toward a monitoring target even when the monitoring target on the image moves, thereby suppressing deterioration in the efficiency of the observer's monitoring work.
  • Reference signs:
    3A, 3B  Directivity control device
    4  Recorder device
    31  Communication unit
    32  Operation unit
    33  Memory
    34, 34A  Signal processing unit
    34a  Directivity direction calculation unit
    34b  Output control unit
    34c  Tracking processing unit
    34d  Sound source detection unit
    35  Display device
    36  Speaker device
    37  Image processing unit
    38  Operation switching control unit
    100, 100A, 100B  Directivity control system
    C1, Cn  Camera device
    C1RN, C2RN  Imaging area
    JC1, JM1  Switching determination line
    JDL  Scroll determination line
    LN1, LN2, LNR, LNW  Tracking line
    LST  Tracking list
    NW  Network
    M1, Mm  Omnidirectional microphone array device
    MR1, MR2, MR2W, MR2R, MR3  Point marker
    TP1, TP2  Tracking point
    TRW  Tracking screen

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Studio Devices (AREA)

Abstract

A directivity control apparatus controls the directivity of sounds collected by a first sound collection unit including a plurality of microphones. A directivity formation unit forms the directivity of the sounds in a direction extending from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image displayed on a display unit. An information acquisition unit acquires information related to a second designated position on the image displayed on the display unit, said second designated position being designated in accordance with a movement of the monitoring target. The directivity formation unit changes, by use of the acquired information related to the second designated position, the directivity of the sounds to a direction extending toward the monitoring target corresponding to the second designated position.

Description

Directivity control device, directivity control method, storage medium, and directivity control system

The present invention relates to a directivity control device, a directivity control method, a storage medium, and a directivity control system that control the directivity of sound.
Conventionally, in a surveillance system installed at a predetermined position (for example, a ceiling surface) in a factory, a store (for example, a retail store or a bank), or a public place (for example, a library), one or more camera devices (for example, PTZ (Pan Tilt Zoom) camera devices or omnidirectional camera devices) are connected via a network so as to widen the angle of view of the image data (including still images and moving images; the same applies hereinafter) of the video of the monitoring target range.

Since the amount of information obtained by monitoring using video alone is limited, there is a strong demand for a monitoring system that, in addition to one or more camera devices, uses a microphone array device housing a plurality of microphones to obtain the voice data emitted by a specific monitoring target (for example, a person) present within the angle of view of a camera device. In such a monitoring system, it is also considered necessary to take into account that the person may move while the microphone array device is collecting sound.

Here, as a prior art that simplifies the user's input operation by drawing trajectory points designated from the start point to the end point of a movement on a monitor television screen displaying an image captured by a television camera, for example, the camera platform control device for a television camera disclosed in Patent Document 1 has been proposed.

The camera platform control device for a television camera disclosed in Patent Document 1 displays, on a monitor television, an image captured by a television camera mounted on a camera platform provided with pan and tilt driving means; trajectory points from the movement start point to the end point for automatic shooting are input on the screen of the monitor television; the sequentially input trajectory points are connected one after another to obtain a continuous trajectory line; and the trajectory data from the movement start point to the end point of the trajectory line is read out sequentially so that automatic shooting is executed with the data read-out point positioned at the center of the shooting screen. Thereby, the camera platform control device for a television camera can obtain pan and tilt drive trajectory data with the simple input operation of entering trajectory points on the screen of the monitor television, and can perform accurate drive control.
Patent Document 1: Japanese Unexamined Patent Publication No. 06-133189
However, Patent Document 1 does not disclose a configuration for picking up the voice emitted by a person shown on the monitor television; even if the configuration of Patent Document 1 is applied to the above-described monitoring system, there is the problem that it is difficult to pick up with high accuracy the voice of a person on the trajectory from the movement start point to the end point.

In order to solve the above-described conventional problems, an object of the present invention is to provide a directivity control device, a directivity control method, a storage medium, and a directivity control system that, even if the monitoring target on an image moves, properly form the directivity of sound toward the monitoring target so as to follow the movement, thereby suppressing deterioration in the efficiency of the observer's monitoring work.
The present invention is a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the device including: a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; and an information acquisition unit that acquires information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target, wherein the directivity forming unit switches, using the information on the second designated position acquired by the information acquisition unit, the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position.

The present invention is also a directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the method including: a step of forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching, using the acquired information on the second designated position, the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position.

The present invention is also a storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the program executing: a step of forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit; a step of acquiring information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target; and a step of switching, using the acquired information on the second designated position, the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position.

Furthermore, the present invention is a directivity control system including: an imaging unit that images a sound collection area; a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and a directivity control device that controls the directivity of the sound collected by the first sound collection unit, wherein the directivity control device includes a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image of a display unit, and an information acquisition unit that acquires information on a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target, and the directivity forming unit switches, using the information on the second designated position acquired by the information acquisition unit, the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position.

According to the present invention, even if the monitoring target on the image moves, the directivity of the sound toward the monitoring target can be properly formed so as to follow the movement, and deterioration in the efficiency of the observer's monitoring work can be suppressed.
FIG. 1 is an explanatory diagram showing an outline of the operation of the directivity control system of the first embodiment.
FIG. 2 is a block diagram showing a first system configuration example of the directivity control system of the first embodiment.
FIG. 3 is a block diagram showing a second system configuration example of the directivity control system of the first embodiment.
FIG. 4 is an explanatory diagram showing an operation example of manual tracking processing.
FIG. 5 is an explanatory diagram showing an operation example of changing a tracking point by manual tracking processing when a tracking point automatically designated in automatic tracking processing is wrong.
FIG. 6 is an explanatory diagram showing slow playback processing in the recording playback mode and the slow playback mode.
FIG. 7 is an explanatory diagram showing enlarged display processing in the enlarged display mode.
FIG. 8(A) is an explanatory diagram showing automatic scroll processing after enlarged display processing in the enlarged display mode, FIG. 8(B) shows the tracking screen at time t = t1, and FIG. 8(C) shows the tracking screen at time t = t2.
FIG. 9(A) is a flowchart explaining a first example of the overall flow of manual tracking processing in the directivity control system of the first embodiment, and FIG. 9(B) is a flowchart explaining a second example of that overall flow.
FIG. 10(A) is a flowchart explaining a first example of the overall flow of automatic tracking processing in the directivity control system of the first embodiment, and FIG. 10(B) is a flowchart explaining a first example of the automatic tracking processing shown in FIG. 10(A).
FIG. 11(A) is a flowchart explaining a second example of the automatic tracking processing shown in FIG. 10(A), and FIG. 11(B) is a flowchart explaining an example of the tracking correction processing shown in FIG. 11(A).
FIG. 12 is a flowchart explaining a third example of the automatic tracking processing shown in FIG. 10(A).
FIG. 13(A) is a flowchart explaining an example of the tracking auxiliary processing shown in FIG. 9(A), and FIG. 13(B) is a flowchart explaining an example of the automatic scroll processing shown in FIG. 13(A).
FIG. 14(A) is a flowchart showing an example of the automatic scroll necessity determination processing shown in FIG. 13(B), and FIG. 14(B) is an explanatory diagram of the scroll determination line used in that processing.
FIG. 15(A) is a flowchart explaining an example of the tracking connection processing shown in FIG. 9(A), and FIG. 15(B) is a flowchart explaining an example of the batch connection processing shown in FIG. 15(A).
FIG. 16(A) is an explanatory diagram of the reproduction start time PT of collected sound corresponding to the user's designated position on the flow line between tracking points displayed for one movement of a person, and FIG. 16(B) shows a first example of the tracking list.
FIG. 17(A) is an explanatory diagram of the reproduction start time PT of collected sound corresponding to the user's designated position on the flow line between different tracking points based on multiple simultaneous designation, and FIG. 17(B) shows a second example of the tracking list.
FIG. 18(A) is an explanatory diagram of the reproduction start times PT and PT' of collected sound corresponding to the user's designated positions on the flow line between different tracking points based on multiple designations, and FIG. 18(B) shows a third example of the tracking list.
FIG. 19(A) is a flowchart explaining an example of the overall flow of flow line display reproduction processing using the tracking list in the directivity control system of the first embodiment, and FIG. 19(B) is a flowchart explaining an example of the reproduction start time calculation processing shown in FIG. 19(A).
FIG. 20 is a flowchart explaining an example of the flow line display processing shown in FIG. 19(A).
FIG. 21(A) is a flowchart explaining an example of the audio output processing shown in FIG. 9(A), and FIG. 21(B) is a flowchart explaining an example of the image privacy protection processing shown in FIG. 13(A).
FIG. 22(A) shows an example of the waveform of an audio signal corresponding to the pitch before voice change processing, FIG. 22(B) shows an example of the waveform of an audio signal corresponding to the pitch after voice change processing, and FIG. 22(C) is an explanatory diagram of processing that blurs the inside of the detected outline of a person's face.
FIG. 23 is a block diagram showing a system configuration example of the directivity control system of the second embodiment.
FIG. 24 is an explanatory diagram showing automatic switching processing of the camera device used for capturing the image displayed on the display device.
FIG. 25 is an explanatory diagram showing automatic switching processing of the omnidirectional microphone array device used for collecting the sound of the monitoring target.
FIG. 26 is an explanatory diagram showing manual switching processing of the camera device used for capturing the image displayed on the display device.
FIG. 27 is an explanatory diagram showing manual switching processing of the omnidirectional microphone array device used for collecting the sound of the monitoring target.
FIG. 28 is an explanatory diagram showing selection processing of the optimum omnidirectional microphone array device used for collecting the sound of the monitoring target.
FIG. 29(A) is a flowchart explaining an example of the automatic switching processing of the camera device in the directivity control system of the second embodiment, and FIG. 29(B) is a flowchart showing an example of the camera switching determination processing shown in FIG. 29(A).
FIG. 30(A) is a flowchart explaining an example of the automatic switching processing of the omnidirectional microphone array device in the directivity control system of the second embodiment, and FIG. 30(B) is a flowchart showing an example of the microphone switching determination processing shown in FIG. 30(A).
FIG. 31(A) is a flowchart explaining an example of the manual switching processing of the camera device in the directivity control system of the second embodiment, and FIG. 31(B) is a flowchart explaining an example of the manual switching processing of the omnidirectional microphone array device in the directivity control system of the second embodiment.
FIG. 32(A) is a flowchart explaining a first example of the selection processing of the optimum omnidirectional microphone array device in the directivity control system of the second embodiment, and FIG. 32(B) is a flowchart explaining a second example of that selection processing.
FIG. 33 is a flowchart explaining a third example of the selection processing of the optimum omnidirectional microphone array device in the directivity control system of the second embodiment.
FIG. 34 is a flowchart explaining an example of the overall flow of manual tracking processing based on multiple simultaneous designation in the directivity control system of a modification of the first embodiment.
FIG. 35 is a flowchart explaining an example of automatic tracking processing of a plurality of monitoring targets in the directivity control system of a modification of the first embodiment.
FIGS. 36(A) to (E) are external views of housings of omnidirectional microphone array devices.
FIG. 37 is a simple explanatory diagram of the delay-and-sum method by which the omnidirectional microphone array device forms the directivity of audio data in the direction of an angle θ.
Hereinafter, embodiments of the directivity control device, directivity control method, storage medium, and directivity control system according to the present invention will be described with reference to the drawings. The directivity control system of each embodiment is used, for example, as a monitoring system (including manned monitoring systems and unmanned monitoring systems) installed in a factory, a public facility (for example, a library or an event venue), or a store (for example, a retail store or a bank).

The present invention can also be expressed as a program for causing a directivity control device, which is a computer, to execute the operations defined by the directivity control method, or as a computer-readable recording medium on which a program for causing a computer to execute the operations defined by the directivity control method is recorded.
(First Embodiment)
FIG. 1 is an explanatory diagram showing an outline of the operation of the directivity control systems 100 and 100A of the first embodiment. FIG. 2 is a block diagram showing a first system configuration example of the directivity control system 100 of the first embodiment. FIG. 3 is a block diagram showing a second system configuration example of the directivity control system 100A of the first embodiment.
The specific configurations of the directivity control systems 100 and 100A will be described later; first, an outline of the operation of the directivity control systems 100 and 100A will be briefly described with reference to FIG. 1.

In FIG. 1, the camera device C1 images the monitoring target (for example, a person HM1) of the directivity control system 100 or 100A used, for example, as a monitoring system, and transmits the image data obtained by the imaging to the directivity control device 3 connected via the network NW.

In each embodiment including the present one, the person HM1 may be either stationary or moving, but is described here as moving. The person HM1 moves, for example, from the tracking position A1 (x1, y1, z0) at the tracking time t1 to the tracking position A2 (x2, y2, z0) by the tracking time t2.
Here, a tracking point is the position at which the user designates the person HM1 on the tracking screen TRW (that is, a position on the tracking screen TRW) when an image of the moving person HM1 captured by the camera device C1 is displayed on the tracking screen TRW of the display device 35. Tracking position and tracking time data are associated with each tracking point (see, for example, FIG. 16(B) described later). The tracking position is a three-dimensional coordinate indicating the position in real space corresponding to the position on the tracking screen TRW at which the person HM1 was designated.
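A tracking point can thus be thought of as a small record pairing a screen position with the corresponding real-space tracking position and tracking time, collected into a tracking list. The following sketch shows one possible shape of such records; the field names are assumptions, since the patent specifies only their contents.

    # Sketch: one possible record for a tracking point and the tracking list
    # LST built from such records. Field names are illustrative.
    from dataclasses import dataclass, field
    from typing import List, Tuple

    @dataclass
    class TrackingPoint:
        screen_pos: Tuple[int, int]                # designated position on TRW
        tracking_pos: Tuple[float, float, float]   # real-space 3D coordinates
        tracking_time: float                       # time of the designation

    @dataclass
    class TrackingList:
        points: List[TrackingPoint] = field(default_factory=list)

        def add(self, point: TrackingPoint) -> None:
            self.points.append(point)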
The tracking screen TRW is, among the screens on which an image captured by a camera device (for example, the camera device C1) is displayed on the display device 35 (hereinafter referred to as "camera screens"), a screen on which, for example, the person HM1 is shown as a monitoring target subject to the voice tracking processing described later. In each of the following embodiments, a screen on which no person HM1 or the like is shown as a monitoring target is referred to as a camera screen, and a screen on which a monitoring target is shown is referred to as a tracking screen; unless otherwise stated, the camera screen and the tracking screen are distinguished in the description.

In FIG. 1, for simplicity of description, it is assumed that the same person HM1 moves, so the z coordinates of the tracking positions at the tracking points TP1 and TP2 are the same. Even after the person HM1 moves from the tracking position A1 to the tracking position A2, the person is imaged by the camera device C1; the camera device C1 may continue imaging the person HM1 following the movement, or may stop imaging.

The omnidirectional microphone array device M1 picks up the sound emitted by the person HM1 and transmits the collected sound data to the directivity control device 3 connected via the network NW.

When the person HM1 as the monitoring target is stationary at the tracking position A1, the directivity control device 3 forms the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the tracking position A1. When the person HM1 moves from the tracking position A1 to the tracking position A2, the directivity control device 3 switches the directivity of the collected sound to the direction from the omnidirectional microphone array device M1 toward the tracking position A2.
In other words, as the person HM1 as the monitoring target moves from the tracking position A1 to the tracking position A2, the directivity control device 3 controls the directivity of the collected sound so that it follows the movement, from the direction from the omnidirectional microphone array device M1 toward the tracking position A1 to the direction from the omnidirectional microphone array device M1 toward the tracking position A2; that is, it performs voice tracking processing.
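Conceptually, this voice tracking processing reduces to re-forming the directivity whenever a newly designated position arrives, as in the following sketch; steer() stands in for the directivity forming processing described later, and all names are placeholders.

    # Sketch of the voice tracking loop: whenever the designated position of
    # the monitoring target changes, switch the directivity toward it.
    def voice_tracking(position_updates, steer):
        """position_updates: iterable of (time, (x, y, z)) designated positions."""
        current = None
        for _t, pos in position_updates:
            if pos != current:   # the target moved, e.g. from A1 to A2
                steer(pos)       # re-form directivity toward the new position
                current = pos

    # Example: follow a person moving from A1 to A2.
    updates = [(1.0, (1.0, 2.0, 0.0)), (2.0, (3.0, 2.0, 0.0))]
    voice_tracking(updates, steer=lambda p: print("steering toward", p))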
The directivity control system 100 shown in FIG. 2 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3, and the recorder device 4. n and m are integers of 1 or more and may be equal or different; the same applies to the following embodiments.

The camera devices C1, ..., Cn, the omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3, and the recorder device 4 are connected to one another via the network NW. The network NW may be a wired network (for example, an intranet or the Internet) or a wireless network (for example, a wireless LAN (Local Area Network), WiMAX (registered trademark), or a wireless WAN (Wide Area Network)). In the following description of this embodiment, for simplicity, the configuration is described as having one camera device C1 and one omnidirectional microphone array device M1.

Hereinafter, each device constituting the directivity control system 100 will be described. In each embodiment including the present one, the housing of the camera device C1 and the housing of the omnidirectional microphone array device M1 are attached separately at different positions, but they may be attached integrally at the same position.
The camera device C1, as an example of an imaging unit, is fixedly installed, for example, on the ceiling surface of an event venue, functions as a surveillance camera in the monitoring system, and captures video within a predetermined angle of view of the camera device C1 in a predetermined sound collection area (for example, a predetermined region in the event venue) by remote operation from a monitoring control room (not shown) connected to the network NW. The camera device C1 may be a camera having a PTZ function or a camera capable of imaging in all directions. When the camera device C1 is a camera capable of omnidirectional imaging, it transmits image data representing omnidirectional video of the sound collection area (that is, omnidirectional image data), or planar image data generated by applying predetermined distortion correction to the omnidirectional image data and performing panorama conversion, to the directivity control device 3 or the recorder device 4 via the network NW.

When an arbitrary position is designated with the cursor CSR or the user's finger FG in the image data displayed on the display device 35, the camera device C1 receives the coordinate data of the designated position in the image data from the directivity control device 3, calculates the distance and direction (including horizontal and vertical angles; the same applies hereinafter) from the camera device C1 to the sound position in real space corresponding to the designated position (hereinafter simply abbreviated as "sound position"), and transmits the data to the directivity control device 3. Since the distance and direction data calculation processing in the camera device C1 is a known technique, its description is omitted.
The omnidirectional microphone array device M1, as an example of a sound collection unit, is fixedly installed, for example, on the ceiling surface of an event venue, and includes at least a microphone section in which a plurality of microphone units 22, 23 (see FIGS. 36(A) to (E)) are provided at even intervals, and a CPU (Central Processing Unit) that controls the operation of each microphone unit 22, 23 of the microphone section.

When the power is turned on, the omnidirectional microphone array device M1 applies predetermined audio signal processing (for example, amplification, filtering, and addition) to the audio data of the sound collected by the microphone elements in the microphone units, and transmits the audio data obtained by this predetermined audio signal processing to the directivity control device 3 or the recorder device 4 via the network NW.
Here, the appearance of the housing of the omnidirectional microphone array device M1 will be described with reference to FIGS. 36(A) to (E). FIGS. 36(A) to (E) are external views of housings of omnidirectional microphone array devices. The omnidirectional microphone array devices M1C, M1A, M1B, M1, and M1D shown in FIGS. 36(A) to (E) differ in appearance and in the arrangement positions of the plurality of microphone units, but their functions are equivalent.

The omnidirectional microphone array device M1C shown in FIG. 36(A) has a disk-shaped housing 21. In the housing 21, a plurality of microphone units 22 and 23 are arranged concentrically. Specifically, the plurality of microphone units 22 are arranged concentrically with the same center as the housing 21 along its circumference, and the plurality of microphone units 23 are arranged concentrically with the same center as the housing 21 on its inner side. Each microphone unit 22 has a wide spacing from its neighbors and a large diameter, giving characteristics suited to a low sound range; each microphone unit 23 has a narrow spacing and a small diameter, giving characteristics suited to a high sound range.

The omnidirectional microphone array device M1A shown in FIG. 36(B) has a disk-shaped housing 21. In the housing 21, a plurality of microphone units 22 are arranged in a cross shape at even intervals along two directions, vertical and horizontal, with the vertical and horizontal arrays intersecting at the center of the housing 21. Since the plurality of microphone units 22 are arranged linearly in these two directions, the omnidirectional microphone array device M1A can reduce the amount of computation required to form the directivity of the audio data. In the omnidirectional microphone array device M1A shown in FIG. 36(B), the plurality of microphone units 22 may be arranged in only one row, vertical or horizontal.

The omnidirectional microphone array device M1B shown in FIG. 36(C) has a disk-shaped housing 21B with a smaller diameter than the omnidirectional microphone array device M1C shown in FIG. 36(A). In the housing 21B, a plurality of microphone units 22 are arranged at even intervals along the circumference of the housing 21B. Because the spacing between the microphone units 22 is short, the omnidirectional microphone array device M1B shown in FIG. 36(C) has characteristics suited to a high sound range.

The omnidirectional microphone array device M1 shown in FIG. 36(D) has a donut-shaped or ring-shaped housing 21C in which an opening 21a having a predetermined diameter is formed at the center of the housing 21C. In the directivity control systems 100 and 100A of this embodiment, for example, the omnidirectional microphone array device M1 shown in FIG. 36(D) is used. In the housing 21C, a plurality of microphone units 22 are arranged concentrically at even intervals along the circumferential direction of the housing 21C.

The omnidirectional microphone array device M1D shown in FIG. 36(E) has a rectangular housing 21D. In the housing 21D, a plurality of microphone units 22 are arranged at even intervals along the outer periphery of the housing 21D. In the omnidirectional microphone array device M1D shown in FIG. 36(E), since the housing 21D is rectangular, installation of the omnidirectional microphone array device M1D can be simplified even at, for example, a corner or on a wall surface.

Each of the microphone units 22 and 23 of the omnidirectional microphone array device M1 may be an omnidirectional microphone, a bidirectional microphone, a unidirectional microphone, a sharp directional microphone, a super-directional microphone (for example, a shotgun microphone), or a combination of these.
The directivity control devices 3 and 3A may each be, for example, a stationary PC (Personal Computer) installed in a monitoring control room (not shown), or a data communication terminal that the user can carry, such as a mobile phone, a PDA (Personal Digital Assistant), a tablet terminal, or a smartphone.

The directivity control device 3 includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34, a display device 35, and a speaker device 36. The signal processing unit 34 includes at least a directivity direction calculation unit 34a, an output control unit 34b, and a tracking processing unit 34c.

The communication unit 31 receives the image data transmitted from the camera device C1 or the audio data transmitted from the omnidirectional microphone array device M1 and outputs it to the signal processing unit 34.

The operation unit 32 is a user interface (UI) for notifying the signal processing unit 34 of the user's input operations, and is, for example, a pointing device such as a mouse, or a keyboard. The operation unit 32 may also be configured using a touch panel arranged, for example, in correspondence with the display screen of the display device 35 and capable of detecting an input operation with the user's finger FG or a stylus pen.

The operation unit 32 outputs, to the signal processing unit 34, the coordinate data of the position designated with the cursor CSR by the user's mouse operation or with the user's finger FG in the image data displayed on the display device 35 (that is, the image data captured by the camera device C1).
The memory 33 is configured using, for example, a RAM (Random Access Memory) and functions as a work memory during the operation of each unit of the directivity control device 3. The memory 33, as an example of an image storage unit or an audio storage unit, is configured using, for example, a hard disk or a flash memory, and stores the image data or audio data stored in the recorder device 4, that is, the image data captured by the camera device C1 or the audio data collected by the omnidirectional microphone array device M1 over a certain period.

The memory 33, as an example of a designation list storage unit, also stores the data of the tracking list LST (see, for example, FIG. 16(B)) as an example of a designation list including the data of all designated positions and designated times (described later) on the tracking screen TRW of the image data displayed on the display device 35.

The signal processing unit 34 is configured using, for example, a CPU (Central Processing Unit), an MPU (Micro Processing Unit), or a DSP (Digital Signal Processor), and performs control processing for the overall supervision of the operation of each unit of the directivity control device 3, data input/output processing with the other units, data computation (calculation) processing, and data storage processing.
When calculating the directivity direction coordinates (θMAh, θMAv), the directivity direction calculation unit 34a acquires from the operation unit 32 the coordinate data of the position designated in the image data with the cursor CSR by the user's mouse operation or with the user's finger FG, and causes the communication unit 31 to transmit the coordinate data to the camera device C1. The directivity direction calculation unit 34a then acquires, from the communication unit 31, the data of the distance and direction from the installation position of the camera device C1 to the sound (sound source) position in real space corresponding to the designated position in the image data.

Using the data of the distance and direction from the installation position of the camera device C1 to the sound position, the directivity direction calculation unit 34a calculates the directivity direction coordinates (θMAh, θMAv) from the installation position of the omnidirectional microphone array device M1 toward the sound position.
 また、本実施形態のように、カメラ装置C1の筐体と全方位マイクアレイ装置M1の筐体とが離れて別体として取り付けられている場合には、指向方向算出部34aは、事前に算出された所定のキャリブレーションパラメータのデータと、カメラ装置C1から音声位置(音源位置)までの方向(水平角,垂直角)のデータとを用いて、全方位マイクアレイ装置M1から音声位置(音源位置)までの指向方向座標(θMAh,θMAv)を算出する。なお、キャリブレーションとは、指向性制御装置3の指向方向算出部34aが指向方向座標(θMAh,θMAv)を算出するために必要となる所定のキャリブレーションパラメータを算出又は取得する動作であり、具体的なキャリブレーション方法及びキャリブレーションパラメータの内容は特に限定されず、例えば公知技術の範囲で実現可能である。 In addition, as in the present embodiment, when the housing of the camera device C1 and the housing of the omnidirectional microphone array device M1 are separated and attached separately, the directivity calculation unit 34a calculates in advance. The sound position (sound source position) from the omnidirectional microphone array apparatus M1 is obtained using the predetermined calibration parameter data and the data in the direction (horizontal angle, vertical angle) from the camera device C1 to the sound position (sound source position). ) Directivity direction coordinates (θ MAh , θ MAv ) are calculated. The calibration is an operation for calculating or acquiring a predetermined calibration parameter necessary for the directivity direction calculation unit 34a of the directivity control device 3 to calculate the directivity direction coordinates (θ MAh , θ MAv ). Specific contents of the calibration method and calibration parameters are not particularly limited, and can be realized, for example, within the scope of known techniques.
 また、カメラ装置C1の筐体を囲むように全方位マイクアレイ装置M1の筐体が一体的に取り付けられている場合には、カメラ装置C1から音声位置(音源位置)までの方向(水平角,垂直角)を、全方位マイクアレイ装置2から音声位置までの指向方向座標(θMAh,θMAv)として用いることができる。 Further, when the omnidirectional microphone array device M1 is integrally attached so as to surround the camera device C1, the direction from the camera device C1 to the sound position (sound source position) (horizontal angle, (Vertical angle) can be used as the directivity direction coordinates (θ MAh , θ MAv ) from the omnidirectional microphone array device 2 to the sound position.
 ここで、指向方向座標(θMAh,θMAv)のうち、θMAhは全方位マイクアレイ装置2の設置位置から音声位置に向かう指向方向の水平角を示し、θMAvは全方位マイクアレイ装置2の設置位置から音声位置に向かう指向方向の垂直角を示す。以下の説明では、説明を簡単にするために、カメラ装置C1及び全方位マイクアレイ装置M1の各水平角の基準方向(0度方向)が一致するとする。 Here, of the directivity direction coordinates (θ MAh , θ MAv ), θ MAh indicates a horizontal angle in the directivity direction from the installation position of the omnidirectional microphone array device 2 to the voice position, and θ MAv is the omnidirectional microphone array device 2. The vertical angle of the pointing direction from the installation position to the voice position is shown. In the following description, to simplify the description, it is assumed that the reference directions (0 degree directions) of the horizontal angles of the camera device C1 and the omnidirectional microphone array device M1 coincide.
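As a rough illustration of this geometry, the following sketch converts the camera-reported distance and direction into a sound position in a shared world coordinate frame and then derives (θMAh, θMAv) as seen from the array. The function names, the coordinate convention, and the use of a common world frame are assumptions made for illustration; the patent does not prescribe them.

    import math

    def sound_pos_from_camera(cam_pos, horiz_deg, vert_deg, dist):
        # Sound-source position in world coordinates, derived from the
        # camera's installation position plus the distance/direction data
        # the camera device reports for the designated image position.
        h, v = math.radians(horiz_deg), math.radians(vert_deg)
        return (cam_pos[0] + dist * math.cos(v) * math.cos(h),
                cam_pos[1] + dist * math.cos(v) * math.sin(h),
                cam_pos[2] + dist * math.sin(v))

    def directivity_angles(mic_pos, sound_pos):
        # Horizontal angle (theta_MAh) and vertical angle (theta_MAv), in
        # degrees, from the array installation position toward the sound position.
        dx = sound_pos[0] - mic_pos[0]
        dy = sound_pos[1] - mic_pos[1]
        dz = sound_pos[2] - mic_pos[2]
        theta_mah = math.degrees(math.atan2(dy, dx))
        theta_mav = math.degrees(math.atan2(dz, math.hypot(dx, dy)))
        return theta_mah, theta_mav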
The output control unit 34b controls the operations of the display device 35 and the speaker device 36. For example, the output control unit 34b, as an example of a display control unit, causes the display device 35 to display the image data transmitted from the camera device C1 in response to an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG. The output control unit 34b, as an example of a sound output control unit, causes the speaker device 36 to output sound data in response to such an input operation when it has acquired the sound data transmitted from the omnidirectional microphone array device M1, or the sound data picked up by the omnidirectional microphone array device M1 over a certain period, from the recorder device 4.
The output control unit 34b, as an example of an image reproduction unit, causes the display device 35 to reproduce image data in response to an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG when it has acquired, from the recorder device 4, the image data captured by the camera device C1 over a certain period.
The output control unit 34b, as an example of a directivity forming unit, uses the sound data transmitted from the omnidirectional microphone array device M1 or the sound data acquired from the recorder device 4 to form the directivity (beam) of the sound picked up by the omnidirectional microphone array device M1 in the directivity direction indicated by the directivity direction coordinates (θMAh, θMAv) calculated by the directivity direction calculation unit 34a.
The directivity control device 3 can thereby relatively increase the volume level of the sound emitted by a monitoring target (for example, the person HM1) present in the directivity direction in which the directivity is formed, and can suppress sound from directions in which no directivity is formed, relatively reducing its volume level.
The tracking processing unit 34c, as an example of an information acquisition unit, acquires information on the above-described sound tracking processing. For example, when a new position is designated by an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG on the tracking screen TRW of the display device 35 on which the image data captured by the camera device C1 is displayed, the tracking processing unit 34c acquires information on the newly designated position.
Here, the information on the newly designated position includes, in addition to the coordinate information indicating the position on the image data designated on the tracking screen TRW, the time of the new designation (designated time) and either the coordinate information of the sound position (sound source position) where the monitoring target (for example, the person HM1) exists in real space corresponding to the position on the image data designated at the designated time, or the distance information from the omnidirectional microphone array device M1 to that sound position (sound source position).
The tracking processing unit 34c, as an example of a reproduction time calculation unit, uses the data of the tracking list LST stored in the memory 33 to calculate the reproduction time of the sound at a designated position on a movement line in response to an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG (described later).
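As one way of picturing the tracking list LST and the reproduction time calculation, the sketch below pairs each designated position with its designated time and linearly interpolates the reproduction time of a point lying on the movement line between two adjacent tracking points. The data layout and helper names are hypothetical; the patent does not fix them.

    from dataclasses import dataclass

    @dataclass
    class TrackingPoint:
        x: float      # designated position on the tracking screen TRW
        y: float
        time: float   # designated (tracking) time, in seconds

    def playback_time(lst, p, q, pos):
        # Reproduction time of a point `pos` on the movement line between
        # tracking points lst[p] and lst[q], by linear interpolation.
        a, b = lst[p], lst[q]
        seg = ((b.x - a.x) ** 2 + (b.y - a.y) ** 2) ** 0.5
        part = ((pos[0] - a.x) ** 2 + (pos[1] - a.y) ** 2) ** 0.5
        ratio = part / seg if seg else 0.0
        return a.time + ratio * (b.time - a.time)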
The display device 35, as an example of a display unit, is configured using, for example, an LCD (Liquid Crystal Display) or an organic EL (Electroluminescence) display, and displays the image data captured by the camera device C1 under the control of the output control unit 34b.
The speaker device 36, as an example of a sound output unit, outputs the sound data of the sound picked up by the omnidirectional microphone array device M1, or sound data in which directivity has been formed in the directivity direction indicated by the directivity direction coordinates (θMAh, θMAv). Note that the display device 35 and the speaker device 36 may be configured separately from the directivity control device 3.
The recorder device 4 stores the image data captured by the camera device C1 and the sound data of the sound picked up by the omnidirectional microphone array device M1 in association with each other.
The directivity control system 100A shown in FIG. 3 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, a directivity control device 3A, and the recorder device 4. In FIG. 3, components having the same configuration and operation as those in FIG. 2 are given the same reference numerals, their description is simplified or omitted, and only the differences are described.
The directivity control device 3A includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34A, a display device 35, a speaker device 36, and an image processing unit 37. The signal processing unit 34A includes at least a directivity direction calculation unit 34a, an output control unit 34b, a tracking processing unit 34c, and a sound source detection unit 34d.
The sound source detection unit 34d detects, from the image data displayed on the display device 35, the sound position (sound source position) in real space corresponding to the sound uttered by the person HM1 as the monitoring target. For example, the sound source detection unit 34d divides the sound pickup area of the omnidirectional microphone array device M1 into a plurality of grid-like areas and measures the strength or volume level of the sound obtained when directivity is formed from the omnidirectional microphone array device M1 toward the center position of each grid-like area. The sound source detection unit 34d estimates that the sound source exists in the grid-like area with the highest sound strength or volume level among all the grid-like areas. The detection result of the sound source detection unit 34d includes, for example, distance information from the omnidirectional microphone array device M1 to the center position of the grid-like area with the highest sound strength or volume level.
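A minimal sketch of this grid scan follows. It assumes a helper beam_level(center) that returns the strength or volume level of the sound when directivity is formed toward a given grid-area center; that helper and the names are illustrative, not part of the patent.

    def locate_sound_source(grid_centers, beam_level):
        # Steer the array at each grid-area center and return the center
        # whose beamformed output is strongest, i.e., the estimated source.
        best_center, best_level = None, float("-inf")
        for center in grid_centers:      # e.g., (x, y, z) of each grid-like area
            level = beam_level(center)   # level with directivity formed there
            if level > best_level:
                best_center, best_level = center, level
        return best_center, best_level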
In response to an instruction from the signal processing unit 34A, the image processing unit 37 performs predetermined image processing on the image data displayed on the display device 35 (for example, VMD (Video Motion Detector) processing for detecting the motion of the person HM1, detection processing of a person's face and face orientation, and person detection processing), and outputs the image processing result to the signal processing unit 34A.
The image processing unit 37 also detects the outline DTL of the face of the monitoring target (for example, the person HM1) displayed on the display device 35 in response to an input operation with the cursor CSR (via the user's mouse operation) or with the user's finger FG, and applies masking processing to the face. Specifically, the image processing unit 37 calculates a rectangular region encompassing the detected face outline DTL and performs processing that applies a predetermined blur within the rectangular region (see FIG. 22(C)). FIG. 22(C) is an explanatory diagram of the processing that blurs the area within the detected outline DTL of the person's face. The image processing unit 37 outputs the image data generated by the blurring processing to the signal processing unit 34A.
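As a rough illustration of such face masking, the sketch below uses OpenCV's stock frontal-face cascade to find a rectangular face region and blurs it. OpenCV is merely one possible implementation chosen for illustration; the patent does not mandate any particular detector or blur.

    import cv2

    def mask_faces(frame):
        # Blur a rectangular region around each detected face, roughly
        # corresponding to the region encompassing the outline DTL.
        cascade = cv2.CascadeClassifier(
            cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in cascade.detectMultiScale(gray, 1.1, 5):
            roi = frame[y:y + h, x:x + w]
            frame[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (51, 51), 0)
        return frame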
FIG. 37 is a simple explanatory diagram of the delay-and-sum method in which the omnidirectional microphone array device M1 forms the directivity of sound data in the direction of an angle θ. For ease of explanation, the microphone elements 221 to 22n are assumed to be arranged on a straight line. In this case, the directivity is a two-dimensional region in a plane; to form directivity in three-dimensional space, the microphones may be arranged in a two-dimensional array and the same processing method applied.
A sound wave emitted from a sound source 80 is incident on each of the microphone elements 221, 222, 223, ..., 22(n-1), 22n built into the microphone units 22 and 23 of the omnidirectional microphone array device M1 at a certain angle (incident angle = (90 - θ) [degrees]).
The sound source 80 is, for example, a monitoring target (for example, the person HM1) present in the directivity direction of the omnidirectional microphone array device M1, and lies in the direction of a predetermined angle θ with respect to the surface of the housing 21 of the omnidirectional microphone array device M1. The spacing d between the microphone elements 221, 222, 223, ..., 22(n-1), 22n is constant.
The sound wave emitted from the sound source 80 first reaches and is picked up by the microphone element 221, then reaches and is picked up by the microphone element 222, is picked up successively in the same way, and finally reaches and is picked up by the microphone element 22n.
Note that when the sound source 80 is, for example, the voice emitted by a monitoring target (for example, the person HM1), the direction from the position of each microphone element 221, 222, 223, ..., 22(n-1), 22n of the omnidirectional microphone array device M1 toward the sound source 80 is the same as the direction from each microphone (microphone element) of the omnidirectional microphone array device M1 toward the sound position (sound source position) corresponding to the position designated by the user on the display device 35.
Here, arrival time differences τ1, τ2, τ3, ..., τ(n-1) arise between the times at which the sound wave reaches the microphone elements 221, 222, 223, ..., 22(n-1) and the time at which it reaches the last microphone element 22n. If the sound data picked up by the individual microphone elements 221, 222, 223, ..., 22(n-1), 22n were simply added as-is, they would be added out of phase, and the overall volume level of the sound wave would be weakened.
Note that τ1 is the time difference between the time at which the sound wave reaches the microphone element 221 and the time at which it reaches the microphone element 22n, τ2 is the time difference between the time at which the sound wave reaches the microphone element 222 and the time at which it reaches the microphone element 22n, and likewise τ(n-1) is the time difference between the time at which the sound wave reaches the microphone element 22(n-1) and the time at which it reaches the microphone element 22n.
In the present embodiment, the omnidirectional microphone array device M1 includes A/D converters 241, 242, 243, ..., 24(n-1), 24n and delay units 251, 252, 253, ..., 25(n-1), 25n provided for the respective microphone elements 221, 222, 223, ..., 22(n-1), 22n, and an adder 26 (see FIG. 37).
That is, the omnidirectional microphone array device M1 A/D-converts the analog sound data picked up by the microphone elements 221, 222, 223, ..., 22(n-1), 22n into digital sound data in the A/D converters 241, 242, 243, ..., 24(n-1), 24n.
Further, in the delay units 251, 252, 253, ..., 25(n-1), 25n, the omnidirectional microphone array device M1 applies delay times corresponding to the arrival time differences at the respective microphone elements 221, 222, 223, ..., 22(n-1), 22n to align the phases of all the sound waves, and then adds the delayed sound data in the adder 26. The omnidirectional microphone array device M1 can thereby form the directivity of the sound data in the direction of the predetermined angle θ with respect to the microphone elements 221, 222, 223, ..., 22(n-1), 22n.
For example, in FIG. 37, the delay times D1, D2, D3, ..., D(n-1), Dn set in the delay units 251, 252, 253, ..., 25(n-1), 25n correspond to the arrival time differences τ1, τ2, τ3, ..., τ(n-1), respectively, and are given by Equation (1).
D1 = L1/Vs,  D2 = L2/Vs,  D3 = L3/Vs,  ...,  D(n-1) = L(n-1)/Vs,  Dn = 0    (1)
Here, L1 is the difference in the sound wave travel distance between the microphone element 221 and the microphone element 22n. L2 is the difference in the sound wave travel distance between the microphone element 222 and the microphone element 22n. L3 is the difference in the sound wave travel distance between the microphone element 223 and the microphone element 22n, and likewise L(n-1) is the difference in the sound wave travel distance between the microphone element 22(n-1) and the microphone element 22n. Vs is the speed of the sound wave (speed of sound). L1, L2, L3, ..., L(n-1), and Vs are known values. In FIG. 37, the delay time Dn set in the delay unit 25n is 0 (zero).
In this way, by changing the delay times D1, D2, D3, ..., D(n-1), Dn set in the delay units 251, 252, 253, ..., 25(n-1), 25n, the omnidirectional microphone array device M1 can easily form the directivity of the sound data of the sound picked up by the respective microphone elements 221, 222, 223, ..., 22(n-1), 22n built into the microphone units 22 and 23.
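A minimal sketch of this delay-and-sum processing for a linear array follows, with element spacing d, steering angle θ, and sampling rate fs as assumed parameters. Delays are rounded to whole samples, and sample wrap-around from the shift is ignored for simplicity; this is an illustration of the technique, not the device's implementation.

    import numpy as np

    SOUND_SPEED = 343.0  # Vs [m/s], approximate speed of sound in air

    def delay_and_sum(signals, d, theta_deg, fs):
        # signals: (n_mics, n_samples) array; row k holds the digital sound
        # data picked up by microphone element k (k = 0 is reached first).
        # Dk = Lk / Vs, with Lk = (n - 1 - k) * d * cos(theta) being the
        # travel-distance difference between element k and the last element.
        n_mics, n_samples = signals.shape
        cos_t = np.cos(np.radians(theta_deg))
        out = np.zeros(n_samples)
        for k in range(n_mics):
            L_k = (n_mics - 1 - k) * d * cos_t
            delay = int(round(L_k / SOUND_SPEED * fs))  # Dk in samples; Dn = 0
            out += np.roll(signals[k], delay)           # shift row k by Dk, then add
        return out / n_mics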
Note that the description of the directivity forming processing shown in FIG. 37 assumes, for simplicity, that the omnidirectional microphone array device M1 performs it, but the same processing is equally applicable to the other omnidirectional microphone array devices (for example, the omnidirectional microphone array device Mm). However, if the output control unit 34b of the signal processing unit 34 or 34A of the directivity control device 3 or 3A is configured with the same number of A/D converters 241 to 24n and delay units 251 to 25n as the number of microphones of the omnidirectional microphone array device M1 and one adder 26, the output control unit 34b of the signal processing unit 34 or 34A of the directivity control device 3 or 3A may perform the directivity forming processing shown in FIG. 37 using the sound data of the sound picked up by each microphone element of the omnidirectional microphone array device M1.
(Description of various modes and methods)
Here, the various modes and methods common to each of the embodiments, including the present embodiment, will be described in detail.
In each embodiment, including the present embodiment, the following modes and methods exist; each is briefly described below.
(1) Recording/playback mode: on/off
(2) Tracking mode: on/off
(3) Tracking processing method: manual/automatic
(4) Number of tracking targets: single/multiple
(5) Manual designation method: click operation/drag operation
(6) Slow playback mode: on/off
(7) Enlarged display mode: on/off
(8) Sound privacy protection mode: on/off
(9) Image privacy protection mode: on/off
(10) Connection mode: each time/batch
(11) Correction mode: on/off
(12) Multiple-camera switching method: automatic/manual
(13) Multiple-microphone switching method: automatic/manual
(14) Tracking point upper-limit setting mode: on/off
(1) The recording/playback mode is used, for example, when a user (for example, an observer; the same applies hereinafter) plays back, at some point after imaging, the image data of video captured by the camera device C1 over a certain period in order to check its contents. When the recording/playback mode is off, the image data of the video being captured in real time by the camera device C1 is displayed on the display device 35.
(2) The tracking mode is used when performing follow-up control of the directivity of the sound picked up by the omnidirectional microphone array device M1 (sound tracking processing) as the monitoring target (for example, the person HM1) moves.
(3) The tracking processing method is the method of setting the position of the monitoring target (for example, a designated position on the tracking screen TRW of the display device 35, or a position in real space) when performing follow-up control of the directivity of the sound picked up by the omnidirectional microphone array device M1 (sound tracking processing) as the monitoring target (for example, the person HM1) moves, and is divided into manual tracking processing and automatic tracking processing. The details of each are described later.
(4) The number of tracking targets indicates the number of monitoring targets subject to follow-up control of the directivity of the sound picked up by the omnidirectional microphone array device M1 (sound tracking processing); for persons, for example, it is one person or multiple persons.
(5) The manual designation method is the method by which the user designates a tracking point on the tracking screen TRW in manual tracking processing (described later); for example, a click operation or drag operation of the cursor CSR by mouse operation, or a touch operation or touch-slide operation with the user's finger FG.
(6) The slow playback mode is used when, on the premise that the recording/playback mode is on, the image data reproduced on the display device 35 is played back at a speed value smaller than the initial value (for example, the normal value).
(7) The enlarged display mode is used when the monitoring target (for example, the person HM1) displayed on the tracking screen TRW of the display device 35 is displayed enlarged.
(8) The sound privacy protection mode is used when, as the sound data picked up by the omnidirectional microphone array device M1 is output from the speaker device 36, sound processing (for example, voice-change processing) is applied to make it difficult to identify whose voice is being output.
(9) The image privacy protection mode is used when, with the enlarged display mode on, image processing is applied to make it difficult to identify who the monitoring target (for example, the person HM1) displayed on the tracking screen TRW of the display device 35 is.
(10) The connection mode is used when connecting designated positions (see, for example, the point marker MR1 described later) designated on the tracking screen TRW by manual or automatic designation during the movement of the monitoring target. If the connection mode is "each time", adjacent point markers are connected each time a designated position is designated during the movement of the monitoring target. If the connection mode is "batch", the point markers corresponding to all the designated positions obtained during the movement of the monitoring target are connected to their adjacent point markers at once.
(11) The correction mode is used when switching from automatic tracking processing to manual tracking processing, for example when a designated position automatically designated in automatic tracking processing deviates from the movement path of the monitoring target.
(12) The multiple-camera switching method is used when switching, among the multiple camera devices C1 to Cn, the camera device used to capture the image of the monitoring target. Details of the multiple-camera switching method are described in the second embodiment.
(13) The multiple-microphone switching method is used when switching, among the multiple omnidirectional microphone array devices M1 to Mm, the omnidirectional microphone array device used to pick up the sound emitted by the monitoring target. Details of the multiple-microphone switching method are described in the second embodiment.
(14) The tracking point upper-limit setting mode is used when an upper limit on the number of tracking points is set. For example, when the tracking point upper-limit setting mode is on and the number of tracking points reaches the upper limit, the tracking processing unit 34c may reset (erase) all tracking points, or may display on the tracking screen TRW that the number of tracking points has reached the upper limit. Multiple rounds of sound tracking processing can also be executed as long as the number of tracking points has not reached the upper limit.
Note that the various modes and methods (1) to (14) described above are specified, for example, by a click operation of the cursor CSR via the user's mouse operation or a touch operation with the user's finger FG on a predetermined setting button or setting menu in an application for the monitoring system (not shown), or on a setting button or setting menu displayed on the tracking screen TRW.
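For illustration only, the modes and methods (1) to (14) could be held together in a single settings object such as the following; the field names and default values are assumptions, since the patent does not define a data structure.

    from dataclasses import dataclass

    @dataclass
    class TrackingSettings:
        # On/off modes: (1), (2), (6)-(9), (11), (14)
        record_playback: bool = False
        tracking: bool = False
        slow_playback: bool = False
        enlarged_display: bool = False
        sound_privacy: bool = False
        image_privacy: bool = False
        correction: bool = False
        tracking_point_limit: bool = False
        # Choice-valued settings: (3)-(5), (10), (12), (13)
        tracking_method: str = "manual"      # "manual" / "automatic"
        target_count: str = "single"         # "single" / "multiple"
        manual_designation: str = "click"    # "click" / "drag"
        connection_mode: str = "each_time"   # "each_time" / "batch"
        camera_switching: str = "automatic"  # "automatic" / "manual"
        mic_switching: str = "automatic"     # "automatic" / "manual"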
Next, an operation example of manual tracking processing in the directivity control devices 3 and 3A will be described with reference to FIG. 4. FIG. 4 is an explanatory diagram showing an operation example of manual tracking processing.
In FIG. 4, the movement of the person HM1 as the monitoring target is shown on the tracking screen TRW displayed on the display device 35, and three tracking points b1, b2, and b3 have been designated, for example by click operations or drag operations of the cursor CSR via the user's mouse operation.
The tracking processing unit 34c acquires the information of tracking time t1 at which the cursor CSR designated the tracking point b1, tracking time t2 at which it designated the tracking point b2, and tracking time t3 at which it designated the tracking point b3. The tracking processing unit 34c also stores in the memory 33 the coordinate information of the tracking point b1 on the tracking screen TRW, or the three-dimensional coordinates indicating the position in real space corresponding to that coordinate information, in association with the information of tracking time t1. Likewise, it stores in the memory 33 the coordinate information of the tracking point b2 on the tracking screen TRW, or the corresponding three-dimensional coordinates in real space, in association with the information of tracking time t2, and the coordinate information of the tracking point b3 on the tracking screen TRW, or the corresponding three-dimensional coordinates in real space, in association with the information of tracking time t3.
The output control unit 34b displays the point marker MR1 at the tracking point b1 on the tracking screen TRW, the point marker MR2 at the tracking point b2, and the point marker MR3 at the tracking point b3. The output control unit 34b can thereby explicitly show, on the tracking screen TRW, the tracking points that the moving person HM1 has passed through as a trajectory.
The output control unit 34b also connects the point markers MR1 and MR2 to display the movement line LN1, and further connects the point markers MR2 and MR3 to display the movement line LN2.
Next, an operation example of the correction mode in the directivity control devices 3 and 3A will be described with reference to FIG. 5. FIG. 5 is an explanatory diagram showing an operation example of changing a tracking point by manual tracking processing when a tracking point automatically designated in automatic tracking processing is wrong.
On the tracking screen TRW on the left side of FIG. 5, a tracking point automatically designated by the image processing unit 37 or the sound source detection unit 34d differs from the actual point on the movement path of the person HM1, and the wrong movement line LNW is displayed by the connection between the point markers MR1 and MR2W.
When the correction mode is on, automatic tracking processing is switched to manual tracking processing, as shown on the tracking screen TRW on the right side of FIG. 5. When the correct tracking point is then designated, for example by a click operation with the cursor CSR, the output control unit 34b connects the point markers MR1 and MR2R and displays the correct movement line LNR on the tracking screen TRW.
Next, slow playback processing in the recording/playback mode and slow playback mode in the directivity control devices 3 and 3A will be described with reference to FIG. 6. FIG. 6 is an explanatory diagram showing slow playback processing in the recording/playback mode and slow playback mode.
On the upper tracking screen TRW in FIG. 6, it is assumed that the person HM1 moves quickly, making it difficult to designate the person HM1 in either manual tracking processing or automatic tracking processing. When the recording/playback mode and the slow playback mode are on and, for example, the slow playback button displayed on the display device 35 is touched with the user's finger FG, the output control unit 34b plays back the image data of the video showing the movement of the person HM1 on the tracking screen TRW at a speed value smaller than the initial value (normal value) of the playback speed (see the lower tracking screen TRW in FIG. 6).
The output control unit 34b can thereby slow down the movement of the person HM1 on the tracking screen TRW, so that tracking points can be designated easily in manual or automatic tracking processing. Note that the output control unit 34b may perform slow playback processing without waiting for a touch operation of the user's finger FG when the moving speed of the person HM1 is equal to or higher than a predetermined value. The playback speed during slow playback may be a fixed value, or may be changed as appropriate in response to an input operation with the cursor CSR via the user's mouse operation or the user's finger FG.
Next, enlarged display processing in the enlarged display mode in the directivity control devices 3 and 3A will be described with reference to FIG. 7. FIG. 7 is an explanatory diagram showing enlarged display processing in the enlarged display mode.
On the upper tracking screen TRW in FIG. 7, it is assumed that the person HM1 appears small, making it difficult to designate the person HM1 in manual or automatic tracking processing. When, for example, after the enlarged display mode is turned on by a click operation of the cursor CSR via the user's mouse operation, a click operation is performed at the position (display position) of the person HM1, the output control unit 34b displays the tracking screen TRW enlarged at a predetermined magnification, centered on the clicked position (see the lower tracking screen TRW in FIG. 7). The output control unit 34b can thereby display the person HM1 on the tracking screen TRW enlarged, so that tracking points can be designated easily in manual or automatic tracking processing.
The output control unit 34b may also display the contents of the tracking screen TRW enlarged in a separate pop-up screen (not shown), centered on the clicked position. This allows the output control unit 34b to let the user easily designate the monitoring target (the person HM1) via a simple designation operation, by comparing the unenlarged tracking screen TRW with the enlarged pop-up screen.
Further, when no tracking point has been designated yet, for example, the output control unit 34b may display the contents of the displayed camera screen enlarged with reference to the center of the display device 35. This allows the output control unit 34b to let the user easily designate the monitoring target via a simple designation operation when, for example, the monitoring target (the person HM1) appears near the center of the display device 35.
Further, when multiple monitoring targets are designated, the output control unit 34b may display the screen enlarged around the position corresponding to the geometric mean of the multiple designated positions on the tracking screen TRW. This allows the output control unit 34b to let the user easily select among the multiple monitoring targets shown on the tracking screen TRW.
Next, automatic scroll processing after enlarged display processing in the enlarged display mode in the directivity control devices 3 and 3A will be described with reference to FIGS. 8(A), (B), and (C). FIG. 8(A) is an explanatory diagram showing automatic scroll processing after enlarged display processing in the enlarged display mode. FIG. 8(B) is a diagram showing the tracking screen TRW at time t = t1. FIG. 8(C) is a diagram showing the tracking screen TRW at time t = t2.
FIG. 8(A) shows the movement path of the person HM1 as the monitoring target from the position at time t = t1 to the position at time t = t2 within the imaging area C1RN of the camera device C1. For example, as a result of the tracking screen TRW being displayed enlarged, the image of the entire imaging area C1RN may no longer fit on the tracking screen TRW.
In response to an input operation with the cursor CSR via the user's mouse operation or the user's finger FG, the output control unit 34b automatically scrolls the tracking screen TRW so that the person HM1 is always displayed at the center of the tracking screen TRW, for example along the movement path of the person HM1 from time t = t1 to time t = t2. Because the output control unit 34b automatically scrolls the tracking screen TRW so that the user's designated position always remains at the center as the person HM1 shown on the enlarged tracking screen TRW moves, the designated position of the person HM1 can be prevented from going off the tracking screen TRW even when the screen is displayed enlarged, and the continuously moving person HM1 on the tracking screen TRW can be designated easily.
FIG. 8(B) shows the tracking screen TRW at time t = t1, with the person HM1 displayed at the center. TP1 in the figure indicates the tracking point designated for the person HM1 at time t = t1 by an input operation with the cursor CSR via the user's mouse operation or the user's finger FG.
Similarly, FIG. 8(C) shows the tracking screen TRW at time t = t2, with the person HM1 displayed at the center. TP2 in the figure indicates the tracking point designated for the person HM1 at time t = t2 by an input operation with the cursor CSR via the user's mouse operation or the user's finger FG. In both FIG. 8(B) and FIG. 8(C), the person HM1 as the monitoring target is displayed at the center of the tracking screen TRW during automatic scroll processing, making the user's selection easy.
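A minimal sketch of the scroll computation: assuming the enlarged view is a window over the full camera image, the view origin is chosen so that the designated position maps to the window center, clamped so the view stays within the image. The names are illustrative, not taken from the patent.

    def scroll_origin(target, view_size, image_size):
        # Top-left corner of the enlarged view so that `target` (x, y) sits
        # at the window center, clamped to keep the view inside the image.
        ox = min(max(target[0] - view_size[0] / 2, 0), image_size[0] - view_size[0])
        oy = min(max(target[1] - view_size[1] / 2, 0), image_size[1] - view_size[1])
        return ox, oy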
Next, the overall flow of manual tracking processing in the directivity control system 100 of the present embodiment will be described with reference to FIGS. 9(A) and (B). FIG. 9(A) is a flowchart explaining a first example of the overall flow of manual tracking processing in the directivity control system 100 of the first embodiment. FIG. 9(B) is a flowchart explaining a second example of the overall flow of manual tracking processing in the directivity control system 100 of the first embodiment.
To avoid complicating the description, the overall flow of manual tracking processing in the directivity control system 100 of the present embodiment is described first with reference to FIGS. 9(A) and 9(B), and the detailed contents of the individual processes are described as they arise with reference to the drawings described later. Among the operations shown in FIG. 9(B), those identical to the operations shown in FIG. 9(A) are given the same step numbers and their description is simplified or omitted; only the differences are described. FIGS. 9(A) and (B) show the operation of the directivity control device 3.
As a premise of the description of FIG. 9(A), it is assumed that, on the tracking screen TRW of the display device 35 on which the image of the person HM1 as the monitoring target captured by the camera device C1 is shown, the output control unit 34b has formed the directivity of the picked-up sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the position designated by an input operation with the cursor CSR via the user's mouse operation or the user's finger FG. The same premise applies to the description of FIG. 9(B).
In FIG. 9(A), if the tracking mode is off (S1, NO), the manual tracking processing shown in FIG. 9(A) ends; if the tracking mode is on (S1, YES), tracking assistance processing is started (S2). Details of the tracking assistance processing are described later with reference to FIG. 13(A).
After step S2, on the tracking screen TRW of the display device 35, the tracking position on the movement path of the person HM1, that is, a tracking point, is designated by a click operation of the cursor CSR via the user's mouse operation or a touch operation with the user's finger FG (S3).
The tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the position designated on the tracking screen TRW in step S3 and the designated time, in association with each other as the tracking position and tracking time of the tracking point, and further displays, via the output control unit 34b, a point marker at the tracking point on the tracking screen TRW (S4). Note that the point marker may also be displayed by the tracking processing unit 34c; the same applies to the following embodiments.
The output control unit 34b forms the directivity of the picked-up sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the tracking point designated in step S3 (S5). Note that if it suffices for the tracking processing unit 34c to acquire the tracking position and tracking time data of the tracking point through the designation of the movement path of the person HM1 in response to an input operation with the cursor CSR via the user's mouse operation or the user's finger FG, the operation of step S5 may be omitted. In other words, the output control unit 34b need not switch the directivity to the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the tracking point designated in step S3; the same applies to the following embodiments.
After step S5, the output control unit 34b performs tracking connection processing (S6). Details of the tracking connection processing are described later with reference to FIG. 15(A). After step S6, the output control unit 34b outputs, from the speaker device 36, the picked-up sound in which the directivity was formed in step S5 (S7). Details of the sound output processing are described later with reference to FIG. 21(A). After step S7, the operation of the directivity control device 3 returns to step S1, and the processing of steps S1 to S7 is repeated until the tracking mode is turned off.
In FIG. 9(B), after step S1, tracking assistance processing is started (S2). Details of the tracking assistance processing are described later with reference to FIG. 13(A). After step S2, it is assumed that, on the tracking screen TRW of the display device 35, designation of a position (that is, a tracking point) on the movement path of the person HM1 has been started by a drag operation of the cursor CSR via the user's mouse operation or a touch-slide operation with the user's finger FG (S3A).
After step S3A, if the predetermined time (for example, about several seconds) has not yet elapsed since the tracking position and tracking time data corresponding to the previous tracking point were stored (S8, NO), the drag operation or touch-slide operation started in step S3A is considered not to have ended, and the operation of the directivity control device 3 proceeds to step S7.
On the other hand, if, after step S3A, the predetermined time (for example, about several seconds) has elapsed since the tracking position and tracking time data corresponding to the previous tracking point were stored (S8, YES), the drag operation or touch-slide operation started in step S3A is considered to have ended, and a new tracking point has been designated. That is, the tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the position designated at the end of the drag operation or touch-slide operation and the designated time, in association with each other as the tracking position and tracking time of the new tracking point, and further displays, via the output control unit 34b, a point marker at the tracking point on the tracking screen TRW (S4). The operations from step S4 onward are the same as the operations from step S4 onward shown in FIG. 9(A), and their description is omitted.
Next, the overall flow of automatic tracking processing in the directivity control system 100A of the present embodiment will be described with reference to FIGS. 10(A) and (B), FIGS. 11(A) and (B), and FIG. 12. FIG. 10(A) is a flowchart explaining a first example of the overall flow of automatic tracking processing in the directivity control system 100A of the first embodiment. FIG. 10(B) is a flowchart explaining a first example of the automatic tracking processing shown in FIG. 10(A). FIG. 11(A) is a flowchart explaining a second example of the automatic tracking processing shown in FIG. 10(A). FIG. 11(B) is a flowchart explaining an example of the tracking correction processing shown in FIG. 11(A). FIG. 12 is a flowchart explaining a third example of the automatic tracking processing shown in FIG. 10(A).
Also for FIG. 10(A), as with FIGS. 9(A) and (B), to avoid complicating the description, the overall flow of automatic tracking processing in the directivity control system 100A of the present embodiment is described first with reference to FIG. 10(A), and the detailed contents of the individual processes are described as they arise with reference to the drawings described later.
Among the operations shown in FIG. 10(A), those identical to the operations shown in FIG. 9(A) or (B) are given the same step numbers and their description is simplified or omitted; only the differences are described. FIG. 10(A) also shows the operation of the directivity control device 3A.
As a premise of the description of FIG. 10(A), it is assumed that, on the tracking screen TRW of the display device 35 on which the image of the person HM1 as the monitoring target captured by the camera device C1 is shown, the output control unit 34b has formed the directivity of the picked-up sound in the direction from the omnidirectional microphone array device M1 toward the position (sound position, sound source position) of the person HM1 corresponding to the position automatically designated using the detection processing result of the sound source detection unit 34d or the image processing unit 37.
 In FIG. 10(A), after step S1, the tracking assist process is started (S2). Details of the tracking assist process are described later with reference to FIG. 13(A). After step S2, the automatic tracking process is performed (S3B). Details of the automatic tracking process are described later with reference to FIG. 10(B), FIG. 11(A), and FIG. 12. After step S3B, the output control unit 34b forms the directivity of the collected sound in the direction from the omnidirectional microphone array device M1 toward the position (speech position, sound source position) of the person HM1 corresponding to the tracking point automatically designated in step S3B (S5). Since the operations from step S5 onward are the same as those from step S4 shown in FIG. 9(A), their description is omitted.
 In FIG. 10(B), the image processing unit 37 performs known image processing to determine whether the person HM1 as the monitoring target has been detected on the tracking screen TRW of the display device 35, and when it determines that the person HM1 has been detected, outputs the determination result (including data on the detection position of the person HM1 (for example, a known representative point) and the detection time) to the tracking processing unit 34c of the signal processing unit 34 (S3B-1).
 Alternatively, the sound source detection unit 34d performs known sound source detection processing to determine whether the position of the sound (sound source) emitted by the person HM1 as the monitoring target has been detected on the tracking screen TRW of the display device 35, and when it determines that the position of the sound source has been detected, outputs the determination result (including data on the detection position and detection time of the sound source) to the tracking processing unit 34c (S3B-1). To simplify the description of step S3B-1, it is assumed that no monitoring target other than the person HM1 exists on the tracking screen TRW.
 The tracking processing unit 34c automatically sets the designated position of the person HM1 in the automatic tracking process, that is, the tracking point, using the determination result of the image processing unit 37 or the sound source detection unit 34d (S3B-1). The tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the detection position on the tracking screen TRW automatically designated in step S3B-1 and the detection time, associated with each other as the tracking position and tracking time of the tracking point, and further causes a point marker to be displayed at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-2). After step S3B-2, the automatic tracking process shown in FIG. 10(B) ends, and the process proceeds to step S5 shown in FIG. 10(A).
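 The bookkeeping in steps S3B-1 and S3B-2 amounts to appending a (position, time) record to a list and redrawing a marker. A minimal sketch in Python, assuming real-space coordinates and a simple marker callback (the names TrackingPoint, tracking_list, and draw_point_marker are illustrative, not from the patent):

    from dataclasses import dataclass

    @dataclass
    class TrackingPoint:
        x: float  # real-space coordinates corresponding to the
        y: float  # detection position on the tracking screen TRW
        z: float
        time: float  # detection time, stored as the tracking time

    tracking_list = []  # stands in for the tracking list LST held in memory 33

    def save_tracking_point(x, y, z, detection_time, draw_point_marker):
        """Store one tracking position/time pair and show its marker (S3B-2)."""
        point = TrackingPoint(x, y, z, detection_time)
        tracking_list.append(point)
        draw_point_marker(point)  # marker display delegated to the output control unit
        return point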
 In FIG. 11(A), when the first tracking point (initial position) has already been designated (S3B-3, YES), the operation of step S3B-4 is omitted. On the other hand, when the first tracking point has not been designated (S3B-3, NO), the position (that is, the tracking point) on the movement course (movement path) of the person HM1 is designated on the tracking screen TRW of the display device 35 by an input operation (for example, a click operation or a touch operation) with the cursor CSR via the user's mouse operation or with the user's finger FG (S3B-4).
 When the first tracking point has already been designated, or after the first tracking point has been designated in step S3B-4, the tracking processing unit 34c automatically designates the next tracking point using the determination result of the image processing unit 37 or the sound source detection unit 34d centered on the first tracking point (S3B-5). Thus, when the user designates the first tracking point, for example, the tracking processing unit 34c starts the detection processing of the information on the position of the sound (sound source) emitted by the person HM1 or the information on the position of the person HM1 centered on the first tracking point (initial position) on the tracking screen TRW, so that each detection process can be performed at high speed.
 The tracking processing unit 34c stores in the memory 33 the three-dimensional coordinates indicating the position in real space corresponding to the detection position on the tracking screen TRW automatically designated in step S3B-5 and the detection time, associated with each other as the tracking position and tracking time of the tracking point, and further causes a point marker to be displayed at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-2).
 If an operation for correcting the tracking point is not performed after step S3B-2 (S3B-6, NO), the automatic tracking process shown in FIG. 11(A) ends, and the process proceeds to step S5 shown in FIG. 10(A).
 On the other hand, when an operation for correcting the tracking position corresponding to the tracking point is performed after step S3B-2, for example because the determination result of the image processing unit 37 or the sound source detection unit 34d was wrong (S3B-6, YES), the tracking correction process shown in FIG. 11(B) is performed (S3B-7).
 In FIG. 11(B), when the sound emitted by the person HM1 moving on the tracking screen TRW is being output, the output of the sound is temporarily suspended by an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG (S3B-7-1). After step S3B-7-1, the correction mode is turned on by an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, whereby the process temporarily shifts from the automatic tracking process to the manual tracking process, and it is further assumed that the correct tracking point is then designated (S3B-7-2).
 The output control unit 34b erases the wrong point marker that was displayed on the tracking screen TRW immediately before the designation in step S3B-7-2 (S3B-7-3), causes a point marker to be displayed at the changed tracking point, that is, the tracking point designated in step S3B-7-2, and resumes the output of the sound that was temporarily suspended in step S3B-7-1 (S3B-7-3). Further, the tracking processing unit 34c overwrites and saves the position designated in step S3B-7-2 as the tracking point (S3B-7-3). After step S3B-7-3, the tracking correction process shown in FIG. 11(B) ends, and the process proceeds to step S5 shown in FIG. 10(A).
 In FIG. 12, the image processing unit 37 performs known image processing to determine whether the person HM1 as the monitoring target has been detected on the tracking screen TRW of the display device 35 (S3B-8). When the image processing unit 37 determines that the person HM1 has been detected (S3B-9, YES), it calculates the detection position of the person HM1 (for example, a known representative point) and outputs the detection time and detection position data to the tracking processing unit 34c of the signal processing unit 34 as the determination result (S3B-10).
 The sound source detection unit 34d performs known sound source detection processing to determine whether the position of the sound (sound source) emitted by the person HM1 as the monitoring target has been detected on the tracking screen TRW of the display device 35, and when it determines that the position of the sound source has been detected, it calculates the detection position and outputs the detection time and detection position data to the tracking processing unit 34c as the determination result (S3B-11).
 The tracking processing unit 34c stores in the memory 33 the detection position and detection time of the sound source on the tracking screen TRW calculated in step S3B-11, associated with each other as the tracking position and tracking time of the tracking point, and further causes a point marker to be displayed at the tracking point on the tracking screen TRW via the output control unit 34b (S3B-12).
 After step S3B-12, the tracking processing unit 34c determines whether the distance between the detection position of the person HM1 calculated in step S3B-10 and the detection position of the sound source calculated in step S3B-11 is within a predetermined value (S3B-13). When the distance between the detection position of the person HM1 and the detection position of the sound source is within the predetermined value (S3B-13, YES), the automatic tracking process shown in FIG. 12 ends, and the process proceeds to step S5 shown in FIG. 10(A).
 On the other hand, when the distance between the detection position of the person HM1 and the detection position of the sound source is not within the predetermined value (S3B-13, NO), the tracking correction process shown in FIG. 11(B) is performed (S3B-7). Since the tracking correction process has already been described with reference to FIG. 11(B), its description is omitted here. After step S3B-7, the automatic tracking process shown in FIG. 12 ends, and the process proceeds to step S5 shown in FIG. 10(A).
 Thus, when the distance between the position of the sound source detected by the sound source position detection process and the position of the person HM1 detected by the person position detection process is equal to or greater than the predetermined value, the tracking processing unit 34c can easily correct and acquire, as the information on the position of the person HM1, the information on the position designated by the user's position changing operation in, for example, the tracking correction process (see FIG. 11(B)). Further, when that distance is not equal to or greater than the predetermined value, the tracking processing unit 34c can easily acquire the position of the sound source or the position of the person HM1 as the information on the position of the person HM1 after movement, without requiring, for example, a position changing operation by the user.
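 The consistency check of step S3B-13 reduces to a Euclidean distance comparison between the two independently detected positions. A minimal sketch, assuming both detectors report real-space coordinates; the threshold value is an assumption:

    import math

    DISTANCE_THRESHOLD = 0.5  # the "predetermined value"; units and value are assumed

    def detections_consistent(person_pos, source_pos):
        """Return True when the image-based and sound-based detections agree (S3B-13)."""
        return math.dist(person_pos, source_pos) <= DISTANCE_THRESHOLD

 When this returns False, the flow falls back to the tracking correction process of FIG. 11(B) (S3B-7), in which the user designates the correct position manually.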
 Next, details of the tracking assist process in the directivity control devices 3 and 3A are described with reference to FIG. 13(A). FIG. 13(A) is a flowchart explaining an example of the tracking assist process shown in FIG. 9(A).
 In FIG. 13(A), when the enlarged display mode of the directivity control devices 3 and 3A is off (S2-1, NO), the operation of the directivity control devices 3 and 3A proceeds to step S2-5. On the other hand, when the enlarged display mode of the directivity control devices 3 and 3A is on (S2-1, YES), the directivity control devices 3 and 3A perform the image privacy protection process (S2-2) and further perform the automatic scroll process (S2-3). Details of the image privacy protection process are described later with reference to FIG. 21(B). Details of the automatic scroll process are described later with reference to FIGS. 13(B), 14(A), and (B).
 After step S2-3, the output control unit 34b enlarges and displays the contents of the tracking screen TRW at a predetermined magnification centered on the tracking position corresponding to the most recent tracking point on the tracking screen TRW (S2-4). After step S2-4, when both the recording/playback mode and the slow playback mode of the directivity control devices 3 and 3A are on (S2-5, YES), the output control unit 34b plays back the image data of the video showing the movement course of the person HM1 on the tracking screen TRW in slow motion, at a speed value smaller than the initial value (normal value) of the playback speed (S2-6).
 After step S2-6, or when the recording/playback mode and the slow playback mode of the directivity control devices 3 and 3A are not both on (S2-5, NO), the tracking assist process shown in FIG. 13(A) ends, and the process proceeds to step S3 shown in FIG. 9(A), step S3A shown in FIG. 9(B), or step S3B shown in FIG. 10(A).
 Next, details of the automatic scroll process in the directivity control devices 3 and 3A are described with reference to FIGS. 13(B), 14(A), and (B). FIG. 13(B) is a flowchart explaining an example of the automatic scroll process shown in FIG. 13(A). FIG. 14(A) is a flowchart showing an example of the automatic scroll process necessity determination process shown in FIG. 13(B). FIG. 14(B) is an explanatory diagram of the scroll necessity determination line in the automatic scroll process necessity determination process.
 In FIG. 13(B), the tracking processing unit 34c performs the automatic scroll process necessity determination process (S2-3-1). Details of the automatic scroll process necessity determination process are described later with reference to FIG. 14(A).
 After step S2-3-1, when it is determined as the result of the automatic scroll process necessity determination process that the automatic scroll process is necessary (S2-3-2, YES), the output control unit 34b performs a predetermined automatic scroll process on the tracking screen TRW (S2-3-3). For example, the output control unit 34b automatically scrolls the tracking screen TRW along the movement path of the person HM1 on the tracking screen TRW, in accordance with an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, so that the person HM1 is always displayed at the center of the tracking screen TRW. Thus, even when the tracking screen TRW is enlarged, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from going off the tracking screen TRW, and further allows the user to easily designate the person HM1 who keeps moving on the tracking screen TRW.
 If no tracking point has yet been designated at the time of step S2-3-1-1, the output control unit 34b automatically scrolls the tracking screen TRW so that the person HM1 is always displayed at the center of the tracking screen TRW; in this case, the automatic scroll process necessity determination process shown in step S2-3-1 may be omitted.
 Also, when the person HM1 moves beyond a scroll determination line JDL described later, the output control unit 34b performs the automatic scroll process by a predetermined amount in the movement direction of the person HM1 (for example, the direction crossing the scroll determination line JDL described later). Thus, even when the tracking screen TRW is enlarged, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from going off the tracking screen TRW.
 Alternatively, when the person HM1 moves beyond a scroll determination line JDL described later, the output control unit 34b automatically scrolls the tracking screen TRW so that the position designated by the input operation with the cursor CSR via the user's mouse operation or with the user's finger FG (for example, the next tracking point) comes to the center of the tracking screen TRW. Thus, even when the tracking screen TRW is enlarged, the output control unit 34b can prevent the designated position of the person HM1 as the user's monitoring target from going off the tracking screen TRW, and further allows the user to easily designate the person HM1 who keeps moving on the tracking screen TRW.
 After step S2-3-3, or when it is determined as the result of the automatic scroll process necessity determination process that the automatic scroll process is not necessary (S2-3-2, NO), the automatic scroll process shown in FIG. 13(B) ends, and the process proceeds to step S2-4 shown in FIG. 13(A).
 In FIG. 14(A), the tracking processing unit 34c determines whether the tracking position corresponding to the designated tracking point TP1 exceeds any of the upper, lower, left, or right scroll determination lines JDL of the enlarged tracking screen XTRW (S2-3-1-1).
 When the tracking processing unit 34c determines that the tracking position does not exceed any scroll determination line JDL (S2-3-1-1, NO), it determines that the automatic scroll process is unnecessary (S2-3-1-2). On the other hand, when the tracking processing unit 34c determines that the tracking position exceeds one of the scroll determination lines JDL (S2-3-1-1, YES), it determines that the automatic scroll process is necessary, and further stores the type of the relevant scroll determination line JDL (for example, information indicating one of the four scroll determination lines JDL shown in FIG. 14(B)) in the memory 33 (S2-3-1-3). After step S2-3-1-2 or S2-3-1-3, the automatic scroll process necessity determination process shown in FIG. 14(A) ends, and the process proceeds to step S2-3-2 shown in FIG. 13(B).
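 The determination of FIG. 14(A) can be pictured as comparing the tracking position against four boundary lines inset from the edges of the enlarged screen. A minimal sketch in screen coordinates; the screen size and line inset are assumed values:

    SCREEN_W, SCREEN_H = 1280, 720  # size of the enlarged tracking screen XTRW (assumed)
    JDL_INSET = 100                 # distance of each determination line JDL from the edge (assumed)

    def crossed_judgment_line(x, y):
        """Return which scroll determination line JDL was crossed, or None
        when automatic scrolling is unnecessary (S2-3-1-1, S2-3-1-2)."""
        if x < JDL_INSET:
            return "left"
        if x > SCREEN_W - JDL_INSET:
            return "right"
        if y < JDL_INSET:
            return "top"
        if y > SCREEN_H - JDL_INSET:
            return "bottom"
        return None

 The returned line type corresponds to the information saved to the memory 33 in step S2-3-1-3.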
 Next, details of the tracking connection process in the directivity control devices 3 and 3A are described with reference to FIGS. 15(A) and (B). FIG. 15(A) is a flowchart explaining an example of the tracking connection process shown in FIG. 9(A). FIG. 15(B) is a flowchart explaining an example of the batch connection process shown in FIG. 15(A).
 In FIG. 15(A), when a tracking point has already been designated (S6-1, YES), the tracking processing unit 34c determines whether the connection mode is the per-designation ("each time") mode (S6-2). When it is determined that the connection mode is the per-designation mode (S6-2, YES), the output control unit 34b connects and displays the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S6-3). Thus, when the person HM1 displayed on the tracking screen TRW of the display device 35 moves, the output control unit 34b connects and displays at least the current designated position and the immediately preceding designated position among the plurality of designated positions designated by the user's designation operation, so that part of the trajectory of the movement of the person HM1 can be shown explicitly.
 Note that step S6-3 is not limited to the operation in the case of single designation in which tracking points are designated one at a time, and also includes the operation in the case where a plurality of tracking points are designated simultaneously; the same applies to step S6-4-3 described later.
 After step S6-3, or when no tracking point has yet been designated (S6-1, NO), the tracking connection process shown in FIG. 15(A) ends, and the process proceeds to step S7 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
 When it is determined that the connection mode is not the per-designation mode (S6-2, NO), the batch connection process is performed (S6-4). The batch connection process is described with reference to FIG. 15(B).
 In FIG. 15(B), the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16(B)) stored in the memory 33 (S6-4-1). When the read data is determined to be the start point of the tracking points (S6-4-2, YES), the tracking processing unit 34c reads the data of the tracking list LST (see, for example, FIG. 16(B)) again (S6-4-1).
 On the other hand, when the read data is determined not to be the start point of the tracking points (S6-4-2, NO), the output control unit 34b uses the read tracking list data to connect and display the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S6-4-3).
 After step S6-4-3, when the connection has been made up to the end point of the tracking points (S6-4-4, YES), the batch connection process shown in FIG. 15(B) ends, and the process proceeds to step S7 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
 On the other hand, when the connection has not been made up to the end point of the tracking points after step S6-4-3 (S6-4-4, NO), the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16(B)) stored in the memory 33, and the operations from step S6-4-1 to step S6-4-4 are repeated until the point markers corresponding to all the tracking points in the tracking list LST are connected and displayed. Thus, when the person HM1 displayed on the tracking screen TRW of the display device 35 moves, the output control unit 34b connects and displays, for every one of the plurality of designated positions designated by the user's designation operation, the one or two designated positions adjacent to it, so that the entire trajectory of the movement of the person HM1 can be shown explicitly.
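 The batch connection of FIG. 15(B) is in effect a single traversal of the stored tracking list that draws a segment between each consecutive pair of markers and restarts at every start-point entry. A minimal sketch, assuming each list entry carries an is_start flag and a rendering callback is supplied (illustrative names):

    def batch_connect(tracking_list, draw_segment):
        """Connect consecutive point markers into flow lines (S6-4-1 to S6-4-4)."""
        previous = None
        for point in tracking_list:
            if point.is_start or previous is None:  # S6-4-2 YES: begin a new flow line
                previous = point
                continue
            draw_segment(previous, point)  # S6-4-3: connect adjacent markers
            previous = point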
 FIG. 16(A) is an explanatory diagram of the playback start time PT of the collected sound corresponding to the user's designated position P0 on the flow line between the tracking points displayed for one movement of the person HM1. FIG. 16(B) is a diagram showing a first example of the tracking list. In FIG. 16(A), TP1, TP2, TP3, and TP4 are tracking points designated during one movement of the person HM1, as also shown in the tracking list LST shown in FIG. 16(B).
 In FIG. 16(B), for each of the tracking points TP1 (start point), TP2, TP3, and TP4 (end point), the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. To simplify the description, the z-coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
 When the designated position P0 is designated on the flow line between the tracking points shown in FIG. 16(A) in accordance with an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, the tracking processing unit 34c extracts the two tracking points TP1 and TP2 before and after the designated position P0, and calculates the playback start time PT at the designated position P0 according to equation (2), using the coordinates indicating the tracking positions of the tracking points TP1 and TP2 and the tracking time data.
 Equation (2) [rendered in the source only as an image; reconstructed here from the surrounding description as a linear interpolation, with T1 and T2 denoting the tracking times of TP1 and TP2 and |A - B| the distance between tracking positions]:

    PT = T1 + (T2 - T1) * |P0 - TP1| / |TP2 - TP1|    ...(2)
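 Read this way, equation (2) has a direct implementation: the fraction of the TP1-TP2 distance covered by P0 scales the time difference between the two tracking times. A short sketch under that linear-interpolation reading (positions as 3D tuples, times in seconds; all names illustrative):

    import math

    def playback_start_time(tp1_pos, tp1_time, tp2_pos, tp2_time, p0):
        """Interpolate the playback start time PT for a designated position P0
        on the flow line between tracking points TP1 and TP2 (equation (2))."""
        ratio = math.dist(tp1_pos, p0) / math.dist(tp1_pos, tp2_pos)
        return tp1_time + (tp2_time - tp1_time) * ratio

    # Example: P0 halfway between TP1 (t = 10 s) and TP2 (t = 14 s) yields PT = 12 s.
    pt = playback_start_time((0, 0, 0), 10.0, (4, 0, 0), 14.0, (2, 0, 0))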
 When outputting (playing back) the sound to the speaker device 36, the output control unit 34b forms the directivity in the pointing direction corresponding to each relevant tracking position, in the order of the tracking times including the designated position P0 designated by the input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, and then outputs (plays back) the sound with the directivity formed.
 FIG. 17(A) is an explanatory diagram of the playback start time PT of the collected sound corresponding to the user's designated position P0 on the flow lines between different tracking points based on simultaneous designation of a plurality of points. FIG. 17(B) is a diagram showing a second example of the tracking list LST. In FIG. 17(A), (TP11, TP21), (TP12, TP22), (TP13, TP23), and (TP14, TP24) are tracking points designated simultaneously during the movement of, for example, different persons as a plurality of monitoring targets, as also shown in the tracking list LST shown in FIG. 17(B).
 In FIG. 17(B), for each of the tracking points (TP11, TP21), (TP12, TP22), (TP13, TP23), and (TP14, TP24), the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. The tracking points (TP11, TP21) are start points, and the tracking points (TP14, TP24) are end points. To simplify the description, the z-coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
 When the designated position P0 is designated at any position on the different flow lines between the tracking points shown in FIG. 17(A) in accordance with an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, the tracking processing unit 34c extracts the two tracking points TP11 and TP12 before and after the designated position P0, and calculates the playback start time PT at the designated position P0 according to equation (3), using the coordinates indicating the tracking positions of the tracking points TP11 and TP12 and the tracking time data.
 Equation (3) [likewise rendered in the source only as an image and reconstructed by analogy with equation (2), with T11 and T12 denoting the tracking times of TP11 and TP12]:

    PT = T11 + (T12 - T11) * |P0 - TP11| / |TP12 - TP11|    ...(3)
 When outputting (playing back) the sound to the speaker device 36, the output control unit 34b likewise forms the directivity in the pointing direction corresponding to each relevant tracking position, in the order of the tracking times including the designated position P0 designated by the input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, and then outputs (plays back) the sound with the directivity formed.
 FIG. 18(A) is an explanatory diagram of the playback start times PT and PT' of the collected sound corresponding to the user's designated positions P0 and P0' on the flow lines between different tracking points based on designation performed a plurality of times. FIG. 18(B) is a diagram showing a third example of the tracking list LST. In FIG. 18(A), (TP11, TP12, TP13, TP14) are tracking points designated, for example, during the movement of a person as the monitoring target in the first round, as also shown in the tracking list LST shown in FIG. 18(B). Similarly, in FIG. 18(A), (TP21, TP22, TP23) are tracking points designated, for example, during the movement of a person as the monitoring target in the second round. The person serving as the monitoring target in the second round may be the same person as, or a different person from, the person in the first round.
 In FIG. 18(B), for each of the tracking points TP11, TP12, TP13, TP14, TP21, TP22, and TP23, the coordinates (x, y, z) indicating the tracking position and the tracking time are stored in association with each other. The tracking points TP11 and TP21 are start points, and the tracking points TP14 and TP23 are end points. To simplify the description, the z-coordinate value z0 of the coordinates indicating the tracking position is assumed to be constant.
 When the designated positions P0 and P0' are designated at any positions on the respective flow lines between the tracking points shown in FIG. 18(A) in accordance with input operations with the cursor CSR via the user's mouse operation or with the user's finger FG, the tracking processing unit 34c extracts the two tracking points (TP11, TP12) and (TP21, TP22) before and after the designated positions P0 and P0', and calculates the playback start times PT and PT' at the designated positions P0 and P0' according to equations (4) and (5), respectively, using the coordinates indicating the tracking positions of the tracking points (TP11, TP12) and (TP21, TP22) and the tracking time data. In equations (4) and (5), the coordinates of the designated position P0 are (x0, y0, z0), and the coordinates of the designated position P0' are (x0', y0', z0).
 Equations (4) and (5) [likewise rendered in the source only as images and reconstructed by analogy with equation (2), with T11, T12, T21, and T22 denoting the tracking times of TP11, TP12, TP21, and TP22]:

    PT = T11 + (T12 - T11) * |P0 - TP11| / |TP12 - TP11|    ...(4)

    PT' = T21 + (T22 - T21) * |P0' - TP21| / |TP22 - TP21|    ...(5)
 In FIG. 18(A), the number of tracking points and the tracking times designated during the movement of each person in the first and second rounds need not match. When outputting (playing back) the sound to the speaker device 36, the output control unit 34b forms the directivity in the pointing direction corresponding to each relevant tracking position, in the order of the tracking times including the designated position P0 or P0' designated by the input operation with the cursor CSR via the user's mouse operation or with the user's finger FG, and then outputs (plays back) the sound with the directivity formed.
 Next, the overall flow of the flow line display playback process in the directivity control devices 3 and 3A, performed mainly while the recording/playback mode is on, is described with reference to FIG. 19(A). FIG. 19(A) is a flowchart explaining an example of the overall flow of the flow line display playback process using the tracking list LST in the directivity control systems 100 and 100A of the first embodiment.
 In FIG. 19(A), the flow line display process is performed first (S11). Details of the flow line display process are described later with reference to FIG. 20. After step S11, when the designated position P0 is designated on the flow line between the tracking points displayed in step S11 in accordance with an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG (S12), the playback start time calculation process is performed (S13). Details of the playback start time calculation process are described later with reference to FIG. 19(B).
 The tracking processing unit 34c refers to the tracking list LST stored in the memory 33 and reads the coordinates of all the tracking positions (even a single one) corresponding to the tracking time closest to the playback start time PT of the designated position P0 calculated in the playback start time calculation process of step S13 (S14). The output control unit 34b then uses the coordinate data of the tracking positions read by the tracking processing unit 34c to form the directivity of the collected sound in the directions from the omnidirectional microphone array device M1 toward all (even a single one) of those tracking positions (S14). Thus, in accordance with the position arbitrarily designated by the user (arbitrary designated position) on the flow line indicating the movement trajectory of the person HM1, the output control unit 34b can form the directivity of the sound in advance in the direction toward the tracking position that was designated next after the arbitrary designated position.
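 Step S14 is a nearest-time lookup over the tracking list. A minimal sketch, assuming each entry exposes time and coordinate attributes (illustrative names):

    def positions_at_nearest_time(tracking_list, pt):
        """Return the coordinates of every entry whose tracking time is
        closest to the playback start time PT (S14)."""
        best = min(abs(entry.time - pt) for entry in tracking_list)
        return [(e.x, e.y, e.z) for e in tracking_list
                if abs(e.time - pt) == best]

 The directivity of the collected sound is then formed toward each returned position before playback begins at PT.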
 After step S14, the output control unit 34b starts the playback of the collected sound data stored in the recorder device 4 or the memory 33 from the playback start time PT calculated in step S13 (S15).
 After step S15, when there is a next tracking time within a predetermined time from the playback start time PT (S16, YES), the output control unit 34b uses the coordinate data of all the tracking positions (even a single one) corresponding to the next tracking time to form the directivity of the collected sound in the directions from the omnidirectional microphone array device M1 toward all (even a single one) of those tracking positions (S17).
 After step S17, or when there is no next tracking time within the predetermined time from the playback start time PT (S16, NO), the sound output process is performed (S7). Details of the sound output process are described later with reference to FIG. 21(A). After step S7, when the sound output process for the tracking time corresponding to the end point of the tracking points has finished (S18, YES), the flow line display playback process shown in FIG. 19(A) ends. Thus, the output control unit 34b can clearly output the collected sound emitted by the monitoring target at the playback start time calculated in accordance with the user's arbitrarily designated position, and when there is a next designated position within the predetermined time from the playback start time, the directivity of the sound at the next designated position can be formed in advance.
 On the other hand, when the sound output process for the tracking time corresponding to the end point of the tracking points has not finished after step S7 (S18, NO), the operations from step S16 to step S18 are repeated until the sound output process for the tracking time corresponding to the end point of the tracking points finishes.
 Next, details of the playback start time calculation process in the directivity control devices 3 and 3A are described with reference to FIG. 19(B). FIG. 19(B) is a flowchart explaining an example of the playback start time calculation process shown in FIG. 19(A).
 In FIG. 19(B), the tracking processing unit 34c reads the tracking list LST (see, for example, FIG. 16(B)) stored in the memory 33 (S13-1). The tracking processing unit 34c extracts, from the data of the tracking list LST read in step S13-1, the two tracking points TP1 and TP2 before and after the designated position P0 designated in step S12 (S13-2). The tracking processing unit 34c then calculates the playback start time PT at the designated position P0 using the coordinates indicating the tracking positions of the tracking points TP1 and TP2 and the tracking time data (S13-3; see, for example, equation (2)). After step S13-3, the playback start time calculation process shown in FIG. 19(B) ends, and the process proceeds to step S14 shown in FIG. 19(A).
 Next, details of the flow line display process in the directivity control devices 3 and 3A are described with reference to FIG. 20. FIG. 20 is a flowchart explaining an example of the flow line display process shown in FIG. 19(A).
 In FIG. 20, the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16(B)) stored in the memory 33 (S11-1). When the connection between the point markers has been completed for all the tracking points read in step S11-1 (S11-2, YES), the flow line display process shown in FIG. 20 ends, and the process proceeds to step S12 shown in FIG. 19(A).
 On the other hand, when the connection between the point markers has not been completed for all the tracking points read in step S11-1 (S11-2, NO), the tracking processing unit 34c sequentially reads the data of the tracking list LST (see, for example, FIG. 16(B)). The output control unit 34b displays a point marker at each of the one or more tracking points read by the tracking processing unit 34c, distinguished per monitoring target (S11-3).
 In step S11-3, although not specifically illustrated, the output control unit 34b displays the point markers distinguished per monitoring target in a manner in which the same monitoring target can be identified (for example, by the same symbol, the same identification number, a combination of symbol and identification number, a frame of a predetermined shape, or the like), in accordance with, for example, an input operation with the cursor CSR via the user's mouse operation or with the user's finger FG (for example, right-click and left-click mouse operations, simultaneous pressing of a plurality of keyboard keys, a mouse click operation combined with pressing a numeric keyboard key, simultaneous designation on a touch panel, or the like). The frame of a predetermined shape here is, for example, a rectangle, a circle, or a triangle. Besides identification by the shape of the frame, the markers may be displayed so as to be identifiable by the line type of the frame (for example, solid line or dotted line), the color of the frame, a number appended above the frame, or the like.
 After step S11-3, when the tracking point data read in step S11-3 is determined to be the start point of the tracking points (S11-4, YES), the tracking processing unit 34c reads the data of the tracking list LST (see, for example, FIG. 16(B)) again (S11-3).
 On the other hand, when the data read in step S11-3 is determined not to be the start point of the tracking points (S11-4, NO), the output control unit 34b uses the read tracking list data to connect and display the point markers of the one or more tracking points designated immediately before and the corresponding latest one or more tracking points (S11-5).
 After step S11-5, when the connection has been made up to the end point of the tracking points in the tracking list LST read in step S11-1 (S11-6, YES), the process proceeds to the operation of step S11-2.
 On the other hand, when the connection has not been made up to the end point of the tracking points in the tracking list LST read in step S11-1 after step S11-5 (S11-6, NO), the operations from step S11-3 to step S11-6 are repeated until the connection is made up to the end point of the tracking points in the tracking list LST read in step S11-1.
 Next, the sound output process and the image privacy protection process in the directivity control devices 3 and 3A are described with reference to FIGS. 21(A) and (B) and FIGS. 22(A) to (C), respectively. FIG. 21(A) is a flowchart explaining an example of the sound output process shown in FIG. 9(A). FIG. 21(B) is a flowchart explaining an example of the image privacy protection process shown in FIG. 13(A). FIG. 22(A) is a diagram showing an example of the waveform of an audio signal corresponding to the pitch before the voice change process. FIG. 22(B) is a diagram showing an example of the waveform of an audio signal corresponding to the pitch after the voice change process. FIG. 22(C) is an explanatory diagram of the process of blurring the inside of the contour of a detected person's face.
 In FIG. 21(A), the output control unit 34b determines whether the voice privacy protection mode is on (S7-1). When the output control unit 34b determines that the voice privacy protection mode is on (S7-1, YES), it applies the voice change process to the collected sound data to be output from the speaker device 36 (S7-2).
 After step S7-2, or when it is determined that the voice privacy protection mode is off (S7-1, NO), the output control unit 34b causes the speaker device 36 to output the collected sound as it stands (S7-3). After step S7-3, the sound output process shown in FIG. 21(A) ends, and the process returns to step S1 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
 As an example of the voice change process, the output control unit 34b increases or decreases the pitch of the waveform of, for example, the sound data collected by the omnidirectional microphone array device M1 or the sound data for which the output control unit 34b itself has formed directivity (see, for example, FIGS. 22(A) and (B)). Thus, in response to, for example, a simple input operation by the user, the output control unit 34b applies the voice change process to the sound collected in real time by the omnidirectional microphone array device M1 before outputting it, so that by making it hard to tell whose voice the sound emitted by the person HM1 is, the acoustic privacy of the currently imaged person HM1 can be effectively protected. Likewise, when outputting sound collected by the omnidirectional microphone array device M1 over a certain period, the output control unit 34b applies the voice change process to the sound before outputting it, in response to, for example, a simple input operation by the user, so that the acoustic privacy of the person HM1 can be effectively protected in the same way.
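 One naive way to realize such a pitch change is to resample the waveform: reading the samples at a different rate raises or lowers the pitch (and, in this crude form, also shortens or stretches the duration). A minimal sketch with NumPy; the factor value is an assumption, and a production implementation would more likely use a duration-preserving pitch shifter:

    import numpy as np

    def change_pitch(samples: np.ndarray, factor: float = 1.3) -> np.ndarray:
        """Naive voice change by resampling: factor > 1 raises the pitch,
        factor < 1 lowers it (signal duration changes as a side effect)."""
        src_positions = np.arange(0, len(samples), factor)
        return np.interp(src_positions, np.arange(len(samples)), samples)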
 In FIG. 21(B), the tracking processing unit 34c determines whether the image privacy protection mode is on (S2-2-1). When it is determined that the image privacy protection mode is on (S2-2-1, YES), the image processing unit 37 detects (extracts) the contour DTL of the face of the person HM1 displayed on the tracking screen TRW of the display device 35 (S2-2-2) and applies a masking process to the face contour DTL (S2-2-3). Specifically, the image processing unit 37 calculates a rectangular region containing the detected face contour DTL and performs a process of applying a predetermined blur within the rectangular region (see FIG. 22(C)). The image processing unit 37 outputs the image data generated by the blurring process to the output control unit 34b.
 After step S2-2-3, or when it is determined that the image privacy protection mode is off (S2-2-1, NO), the output control unit 34b displays the image data obtained from the image processing unit 37 on the display device 35 (S2-2-4).
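 Such a masking process can be sketched with OpenCV: detect a face region, then blur its bounding rectangle in place. The Haar-cascade detector below stands in for the contour extraction DTL described above, and all parameter values are assumptions:

    import cv2

    face_cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")

    def mask_faces(frame):
        """Blur the bounding rectangle of each detected face (S2-2-2, S2-2-3)."""
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for (x, y, w, h) in face_cascade.detectMultiScale(gray, 1.1, 5):
            roi = frame[y:y + h, x:x + w]
            frame[y:y + h, x:x + w] = cv2.GaussianBlur(roi, (51, 51), 0)
        return frame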
 Thus, in response to, for example, a simple input operation by the user, the image processing unit 37 applies the masking process to a part (for example, the face) of the person HM1 as the monitoring target displayed on the tracking screen TRW of the display device 35, so that privacy can be effectively protected by making it hard to tell who the monitored person HM1 is.
 The image privacy protection process shown in FIG. 21(B) may be performed even when the enlarged display mode is not on, as long as the image privacy protection mode of the directivity control devices 3 and 3A is on at the time the monitoring target (for example, the person HM1) appears on the camera screen.
 As described above, in the directivity control systems 100 and 100A of the present embodiment, the directivity control devices 3 and 3A form the directivity of the sound from the omnidirectional microphone array device M1 including a plurality of microphones in the direction toward the monitoring target (for example, the person HM1) corresponding to the designated position with respect to the image data on the tracking screen TRW of the display device 35, and further acquire the information on the designated position designating the moving monitoring target (for example, the person HM1) (for example, the tracking position and tracking time corresponding to the tracking point). The directivity control devices 3 and 3A then use the information on the designated position with respect to the image data on the tracking screen TRW of the display device 35 to switch the directivity of the sound so that it follows the direction toward the monitoring target (for example, the person HM1) corresponding to the designated position.
As a result, even when the monitoring target (for example, the person HM1) shown in the image data on the tracking screen TRW of the display device 35 moves, the directivity control device 3, 3A re-forms the sound directivity that was formed toward the position before the movement so that it points toward the position after the movement. The sound directivity is thus properly formed so as to follow the movement of the monitoring target, and degradation of the efficiency of the observer's monitoring work can be suppressed.
Moreover, through a simple manual operation of designating the moving monitoring target (for example, the person HM1) in the image data shown on the tracking screen TRW of the display device 35, the directivity control device 3, 3A can easily acquire accurate information on the position of the monitoring target after its movement.
Furthermore, since the directivity control device 3A can easily detect, from the image data shown on the tracking screen TRW of the display device 35, the source of the sound emitted by the monitoring target (for example, the person HM1) as well as the monitoring target itself, it can easily acquire information on the position of the sound source or of the monitoring target as information on the position of the monitoring target after its movement.
(Second Embodiment)
In the second embodiment, when the monitoring target (for example, a person) is about to move beyond the imaging area of the camera device or the sound collection area of the omnidirectional microphone array device, the directivity control device 3B switches the camera device used for capturing images of the monitoring target to another camera device, or switches the omnidirectional microphone array device used for picking up the sound emitted by the monitoring target to another omnidirectional microphone array device.
In this embodiment, it is assumed that the camera device used for capturing images of the monitoring target of the sound tracking process (for example, the person HM1) and the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 are associated with each other in advance, and that information on this association is stored in advance in the memory 33 of the directivity control device 3B.
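Such an association could be held as a simple lookup structure; the sketch below is a hypothetical illustration of what the memory 33 might store (the identifiers and the helper function are assumptions, not defined by the patent).

```python
# Hypothetical camera-to-microphone-array association, as might be kept in
# the memory 33 of the directivity control device 3B.
CAMERA_MIC_MAP = {
    "C1": ["M1", "M2", "M3", "M4"],  # arrays usable while camera C1 is in use
    "C2": ["M2"],
}

def mic_arrays_for_camera(camera_id):
    """Return the omnidirectional mic arrays associated with a camera."""
    return CAMERA_MIC_MAP[camera_id]
```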
FIG. 23 is a block diagram showing an example of the system configuration of the directivity control system 100B of the second embodiment. The directivity control system 100B shown in FIG. 23 includes one or more camera devices C1, ..., Cn, one or more omnidirectional microphone array devices M1, ..., Mm, the directivity control device 3B, and the recorder device 4. In the description of the units shown in FIG. 23, units having the same configuration and operation as those of the directivity control systems 100, 100A shown in FIGS. 2 and 3 are given the same reference numerals and their description is simplified or omitted; only the differences are described.
The directivity control device 3B may be, for example, a stationary PC installed in a monitoring control room (not shown), or a data communication terminal that the user can carry, such as a mobile phone, PDA, tablet terminal, or smartphone.
The directivity control device 3B includes at least a communication unit 31, an operation unit 32, a memory 33, a signal processing unit 34A, a display device 35, a speaker device 36, an image processing unit 37, and an operation switching control unit 38. The signal processing unit 34A includes at least a directivity direction calculation unit 34a, an output control unit 34b, a tracking processing unit 34c, and a sound source detection unit 34d.
Based on various information or data on the movement status of the monitoring target (for example, a person) acquired by the tracking processing unit 34c, the operation switching control unit 38 performs various operations for switching, among the plurality of camera devices C1 to Cn or the plurality of omnidirectional microphone array devices M1 to Mm, the camera device used for capturing images of the monitoring target of the directivity control system 100B or the omnidirectional microphone array device used for picking up the sound emitted by the monitoring target.
Next, the automatic camera switching process in the directivity control device 3B is described with reference to FIG. 24. FIG. 24 is an explanatory diagram showing the automatic switching process of the camera device used for capturing the images displayed on the display device 35. For simplicity, FIG. 24 illustrates an example in which the camera device used for capturing images of the person HM1 as the monitoring target is switched from the camera device C1 to the camera device C2 as the person HM1 moves from the tracking position A1 to the tracking position A2.
The tracking position A1 is within the imaging area C1RN of the camera device C1 and within the predetermined switching determination line JC1 of the camera device C1. The tracking position A2 is within the imaging area C2RN of the camera device C2 and outside the switching determination line JC1 of the camera device C1. Although not shown, the tracking positions A1 and A2 are both within the sound collection area of the omnidirectional microphone array device M1.
When the person HM1 is about to move beyond the imaging area C1RN of the camera device C1, the operation switching control unit 38 notifies the camera device C2, via the communication unit 31 and the network NW, of information indicating that the camera device used for capturing images of the person HM1 is to be switched from the camera device C1 to the camera device C2. In other words, the operation switching control unit 38 instructs the camera device C2 to prepare to capture images of the range within the angle of view of the camera device C2. At this point, however, the image data of the video captured by the camera device C1 is still displayed on the tracking screen TRW of the display device 35.
For example, when the person HM1 crosses the switching determination line JC1 of the camera device C1, the operation switching control unit 38 notifies the camera device C2, via the communication unit 31 and the network NW, of the information indicating that the camera device used for capturing images of the person HM1 is to be switched from the camera device C1 to the camera device C2.
The operation switching control unit 38 uses the distance information between the camera device C1 and the person HM1 measured by the camera device C1 to determine whether or not the person HM1 has crossed the switching determination line JC1. More specifically, the operation switching control unit 38 determines that the person HM1 has crossed the switching determination line JC1 when the person HM1 is within the angle of view of the camera device C1 and the distance from the camera device C1 to the person HM1 has become larger than the (known) distance from the camera device C1 to the switching determination line JC1. It is assumed that the operation switching control unit 38 knows in advance which camera devices can be switched to from the camera device C1 (for example, the camera device C2), and likewise which camera devices can be switched to from each of the other camera devices.
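The crossing test just described reduces to a single distance comparison. The sketch below is a minimal illustration (the type and field names are assumptions); the same pattern applies to the microphone-array switching determination line JM1 described later, with the array-to-person distance in place of the camera-to-person distance.

```python
from dataclasses import dataclass

@dataclass
class SwitchLine:
    dist_from_device: float  # known distance from device to its line (e.g. JC1)

def crossed_switch_line(line: SwitchLine, dist_to_person: float,
                        person_in_view: bool = True) -> bool:
    """True when the person is in view but farther away than the line."""
    return person_in_view and dist_to_person > line.dist_from_device
```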
When the operation switching control unit 38 determines that the person HM1, having crossed the switching determination line JC1, has moved beyond the imaging area C1RN of the camera device C1, it switches the camera device used for capturing images of the person HM1 from the camera device C1 to the camera device C2. Thereafter, the image data of the video captured by the camera device C2 (for example, image data of the moving person HM1) is displayed on the tracking screen TRW of the display device 35.
In this way, the operation switching control unit 38 can adaptively switch to a camera device capable of properly showing images of the moving monitoring target (for example, the person HM1), allowing the user to designate the image of the monitoring target easily.
Next, the automatic switching process of the omnidirectional microphone array device in the directivity control device 3B is described with reference to FIG. 25. FIG. 25 is an explanatory diagram showing the automatic switching process of the omnidirectional microphone array device used for picking up the sound of the monitoring target (for example, the person HM1). For simplicity, FIG. 25 illustrates an example in which the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 as the monitoring target is switched from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2 as the person HM1 moves from the tracking position A1 to the tracking position A2.
The tracking position A1 is within the sound collection area M1RN of the omnidirectional microphone array device M1 and within the predetermined switching determination line JM1 of the omnidirectional microphone array device M1. The tracking position A2 is within the sound collection area M2RN of the omnidirectional microphone array device M2 and outside the switching determination line JM1 of the omnidirectional microphone array device M1. Although not shown, the tracking positions A1 and A2 are both within the imaging area of the camera device C1.
When the person HM1 is about to move beyond the sound collection area M1RN of the omnidirectional microphone array device M1, the operation switching control unit 38 notifies the omnidirectional microphone array device M2, via the communication unit 31 and the network NW, of information indicating that the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 is to be switched from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2. In other words, the operation switching control unit 38 instructs the omnidirectional microphone array device M2 to prepare to pick up sound within the sound collection area of the omnidirectional microphone array device M2.
For example, when the person HM1 crosses the switching determination line JM1 of the omnidirectional microphone array device M1, the operation switching control unit 38 notifies the omnidirectional microphone array device M2, via the communication unit 31 and the network NW, of the information indicating that the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 is to be switched from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2.
The operation switching control unit 38 uses the distance information between the omnidirectional microphone array device M1 and the person HM1 to determine whether or not the person HM1 has crossed the switching determination line JM1. More specifically, the operation switching control unit 38 determines that the person HM1 has crossed the switching determination line JM1 when the distance from the omnidirectional microphone array device M1 to the person HM1 has become larger than the (known) distance from the omnidirectional microphone array device M1 to the switching determination line JM1. It is assumed that the operation switching control unit 38 knows in advance which omnidirectional microphone array devices can be switched to from the omnidirectional microphone array device M1 (for example, the omnidirectional microphone array device M2), and likewise which omnidirectional microphone array devices can be switched to from each of the other omnidirectional microphone array devices.
When the operation switching control unit 38 determines that the person HM1, having crossed the switching determination line JM1, has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1, it switches the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2.
In this way, the operation switching control unit 38 can adaptively switch to an omnidirectional microphone array device capable of properly picking up the sound emitted by the moving monitoring target (for example, the person HM1), so that the sound emitted by the monitoring target can be picked up with high accuracy.
Next, the manual camera switching process in the directivity control device 3B is described with reference to FIG. 26. FIG. 26 is an explanatory diagram showing the manual switching process of the camera device used for capturing the images displayed on the display device 35. In FIG. 26, in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, the display device 35 switches from the tracking screen TRW of the image captured by the camera device C1 currently used for capturing images of the person HM1 to a multi-camera screen including the camera screen C1W of the camera device C1 and the camera screens of the camera devices around the camera device C1 (for example, eight camera devices).
As in FIG. 24, the camera devices that can be switched to from the camera device C1 currently in use are determined in advance; here they are, for example, the camera devices C2, C3, and C4. On the multi-camera screen shown in FIG. 26, the camera screens C2W, C3W, and C4W captured by the camera devices C2, C3, and C4 are displayed (see the hatching in FIG. 26). The person HM1 is assumed to be moving in the movement direction MV1.
Suppose that the user, taking into account the movement direction MV1 of the person HM1 as the monitoring target, touches one of the three camera screens C2W, C3W, and C4W (for example, the camera screen C3W) with the finger FG on the multi-camera screen shown in FIG. 26.
In response to the touch operation of the user's finger FG, the operation switching control unit 38 switches the camera device used for capturing images of the person HM1 from the camera device C1 currently in use to the camera device C3 corresponding to the touched camera screen C3W.
In this way, with a simple user operation, the operation switching control unit 38 can adaptively switch to a camera device capable of properly showing images of the moving monitoring target (for example, the person HM1), allowing the user to designate the image of the monitoring target easily.
Next, the manual switching process of the omnidirectional microphone array device in the directivity control device 3B is described with reference to FIG. 27. FIG. 27 is an explanatory diagram showing the manual switching process of the omnidirectional microphone array device used for picking up the sound of the monitoring target (for example, the person HM1). In FIG. 27, the person HM1 as the monitoring target is displayed at the center of the tracking screen TRW. The omnidirectional microphone array devices that can be switched to from the omnidirectional microphone array device M1 currently in use are the three omnidirectional microphone array devices M2, M3, and M4 installed around the omnidirectional microphone array device M1.
In FIG. 27, in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 that can be switched to from the omnidirectional microphone array device M1 currently in use are displayed on the tracking screen TRW (see (1) in FIG. 27).
The user, taking into account the movement direction MV1 from the tracking position A1 corresponding to the tracking point of the person HM1 as the monitoring target, selects one of the three markers (for example, the marker M3R) by a touch operation with the finger FG (see (2) in FIG. 27). The operation switching control unit 38 instructs the omnidirectional microphone array device M3 corresponding to the selected marker M3R, via the communication unit 31 and the network NW, to start sound collection in place of the omnidirectional microphone array device M1 currently in use (see (3) in FIG. 27).
The output control unit 34b then switches the directivity so that it is formed from the omnidirectional microphone array device M3 corresponding to the selected marker M3R toward the current tracking position of the person HM1 (see (4) in FIG. 27). Thereafter, the output control unit 34b erases the markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 from the tracking screen TRW.
In this way, with a simple user operation on the markers M2R, M3R, and M4R displayed on the tracking screen TRW, the operation switching control unit 38 can adaptively switch to the omnidirectional microphone array device M3 capable of properly picking up the sound emitted by the moving monitoring target (for example, the person HM1), so that the sound emitted by the person HM1 can be picked up with high accuracy in accordance with the movement direction MV1 of the person HM1.
Next, the process of selecting the optimum omnidirectional microphone array device in the directivity control device 3B is described with reference to FIG. 28. FIG. 28 is an explanatory diagram showing the process of selecting the optimum omnidirectional microphone array device used for picking up the sound of the monitoring target. On the display device 35 shown at the upper left of FIG. 28, in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, the camera screens of all the camera devices under the jurisdiction of the directivity control system 100B (for example, nine camera devices) are displayed as a list.
Among the camera screens listed on the display device 35 at the upper left of FIG. 28, the camera screens showing the monitoring target of the sound tracking process (for example, the person HM1) are the camera screens C1W, C2W, and C3W. Suppose that, among these camera screens C1W, C2W, and C3W, the camera screen C1W showing the person HM1 best is selected in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG.
In accordance with the user's selection of the camera screen C1W, the operation switching control unit 38 selects the camera device C1 corresponding to the camera screen C1W as the camera device used for capturing images of the person HM1 and switches to it. The output control unit 34b then enlarges the image data captured by the camera device corresponding to the camera screen C1W and displays it on the tracking screen TRW1 of the display device 35 (see the lower left of FIG. 28).
The output control unit 34b also displays markers M1R, M2R, M3R, and M4R indicating the approximate positions of all the omnidirectional microphone array devices associated with the camera device C1 selected by the operation switching control unit 38 at the four corners of the tracking screen TRW1. The display positions of the markers M1R, M2R, M3R, and M4R are not limited to the four corners of the tracking screen TRW1.
Further, when the markers M1R, M2R, M3R, and M4R are designated one after another by an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG, the output control unit 34b highlights the markers one at a time (for example, with a blink Br) and, for each marker, forms directivity in the direction from the omnidirectional microphone array device corresponding to that marker toward the position of the person HM1 and outputs the sound picked up for a fixed period of time.
When the marker indicating the approximate position of the omnidirectional microphone array device that the user judges to be optimum from among the sounds output for the fixed period (for example, the marker M3R) is selected, the operation switching control unit 38 selects the omnidirectional microphone array device M3 corresponding to the selected marker M3R as the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 and switches to it.
In this way, the operation switching control unit 38 can output, for a fixed period each, the picked-up sounds for which different directivities are formed by the plurality of omnidirectional microphone array devices M1, M2, M3, and M4 associated with the selected camera device C1. By the simple operation of selecting the picked-up sound that the user judges to be optimum, the optimum omnidirectional microphone array device M3 capable of properly picking up the sound emitted by the moving monitoring target (for example, the person HM1) can thus be selected, and the sound emitted by the monitoring target can be picked up with high accuracy.
Next, the automatic camera switching process in the directivity control system 100B of this embodiment is described with reference to FIG. 29(A). FIG. 29(A) is a flowchart explaining an example of the automatic switching process of the camera device in the directivity control system 100B of the second embodiment. The automatic camera switching process shown in FIG. 29(A) describes in detail the automatic camera switching process shown in FIG. 24, and is performed, for example, following step S3B-1 shown in FIG. 10(B).
In FIG. 29(A), the image processing unit 37 detects the position (that is, the tracking point) of the monitoring target (for example, the person HM1) by performing predetermined image processing on the image data shown on the tracking screen TRW of the display device 35 (S21). After step S21, the camera switching determination process is performed (S22). The details of the camera switching determination process are described later with reference to FIG. 29(B).
After step S22, when the camera switching mode has been set to on by the operation switching control unit 38 (S23, YES), the operation switching control unit 38 instructs, via the communication unit 31 and the network NW, all the switchable camera devices associated with the camera device currently in use (for example, the camera device C1) to capture images (S24). All the camera devices that have received the image capture instruction start capturing images. The camera switching mode is a flag used to control whether or not to switch the camera device when the multi-camera switching method is automatic.
The operation switching control unit 38 uses the distance information between the camera device C1 and the person HM1 measured by the camera device C1 currently in use to determine whether or not the person HM1 at the real-space tracking position A1 detected in step S21 has moved beyond the imaging area C1RN of the camera device C1 (S25). When the operation switching control unit 38 determines that the person HM1 has moved beyond the imaging area C1RN of the camera device C1 (S25, YES), it outputs to the image processing unit 37 the image data captured, in response to the instruction of step S24, by all the switchable camera devices associated with the camera device C1 currently in use. The image processing unit 37 performs predetermined image processing on all the image data output from the operation switching control unit 38 to determine whether or not the person HM1 as the monitoring target is detected in each (S26). The image processing unit 37 outputs the image processing results to the operation switching control unit 38.
Using the image processing results of the image processing unit 37, the operation switching control unit 38 selects one camera device (for example, the camera device C2) that has detected the person HM1 as the monitoring target and is closest to the real-space tracking position A1 detected in step S21, and switches the camera device used for capturing images of the person HM1 from the camera device C1 to the camera device C2 (S27). The output control unit 34b accordingly switches the tracking screen TRW displayed on the display device 35 to the camera screen of the camera device C2 selected by the operation switching control unit 38 (S27).
On the other hand, when the camera switching mode has been set to off by the operation switching control unit 38 (S23, NO), or when it is determined that the person HM1 has not moved beyond the imaging area C1RN of the camera device C1 (S25, NO), the automatic camera switching process shown in FIG. 29(A) ends and the process proceeds to the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A).
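As a rough picture of steps S23 to S27, the following sketch strings the checks together. It is an illustration only: the Camera type, its circular imaging area, and the use of that area as a stand-in for the image-based detection of S26 are all assumptions, not an interface defined by the patent.

```python
from dataclasses import dataclass
import math

@dataclass
class Camera:
    cam_id: str
    pos: tuple          # (x, y) camera position in the room plane
    area_radius: float  # radius standing in for its imaging area (e.g. C1RN)

    def distance_to(self, p):
        return math.dist(self.pos, p)

    def sees(self, p):
        return self.distance_to(p) <= self.area_radius

def auto_switch_camera(current, candidates, tracking_pos, switch_mode_on):
    """Return the camera to use next (cf. S23-S27 of FIG. 29(A))."""
    if not switch_mode_on or current.sees(tracking_pos):      # S23 / S25
        return current
    seeing = [c for c in candidates if c.sees(tracking_pos)]  # stand-in for S26
    # S27: among cameras that still see the target, take the closest one.
    return min(seeing, key=lambda c: c.distance_to(tracking_pos),
               default=current)
```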
Next, the camera switching determination process in the directivity control device 3B is described with reference to FIG. 29(B). FIG. 29(B) is a flowchart showing an example of the camera switching determination process shown in FIG. 29(A).
In FIG. 29(B), the operation switching control unit 38 sets the camera switching mode of the directivity control device 3B to off (S22-1). The operation switching control unit 38 uses the distance information between the camera device C1 and the person HM1 measured by the camera device C1 currently in use to determine whether or not the real-space tracking position A1 corresponding to the tracking point detected in step S21 has crossed the predetermined switching determination line JC1 of the camera device C1 currently in use (S22-2).
When the operation switching control unit 38 determines that the real-space tracking position A1 corresponding to the tracking point detected in step S21 has crossed the predetermined switching determination line JC1 of the camera device C1 currently in use (S22-2, YES), it sets the camera switching mode to on (automatic) (S22-3).
After step S22-3, or when it is determined that the tracking position A1 has not crossed the predetermined switching determination line JC1 of the camera device C1 currently in use (S22-2, NO), the camera switching determination process shown in FIG. 29(B) ends and the process proceeds to step S23 shown in FIG. 29(A).
Next, the automatic switching process of the omnidirectional microphone array device in the directivity control system 100B of this embodiment is described with reference to FIG. 30(A). FIG. 30(A) is a flowchart explaining an example of the automatic switching process of the omnidirectional microphone array device in the directivity control system 100B of the second embodiment. The automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) describes in detail the automatic switching process of the omnidirectional microphone array device shown in FIG. 25; it may be performed following step S27 shown in FIG. 29(A), or the automatic camera switching process shown in FIG. 29(A) may be performed after the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A).
In FIG. 30(A), the sound source detection unit 34d performs a predetermined sound source detection process to calculate the position (the position of the sound source) of the monitoring target (for example, the person HM1) in real space, or to calculate the coordinates indicating the position on the image data corresponding to the calculated position of the sound source (that is, the coordinates of the tracking position A1 corresponding to the tracking point) (S31). After step S31, the microphone switching determination process is performed (S32). The details of the microphone switching determination process are described later with reference to FIG. 30(B).
After step S32, when the microphone switching mode has been set to on by the operation switching control unit 38 (S33, YES), the operation switching control unit 38 instructs, via the communication unit 31 and the network NW, all the switchable omnidirectional microphone array devices associated with the omnidirectional microphone array device currently in use (for example, the omnidirectional microphone array device M1) to pick up the sound emitted by the person HM1 (S34). All the omnidirectional microphone array devices that have received the sound pickup instruction start picking up sound. The microphone switching mode is a flag used to control whether or not to switch the omnidirectional microphone array device when the multi-microphone switching method is automatic.
The operation switching control unit 38 uses the distance information between the omnidirectional microphone array device M1 currently in use and the person HM1 calculated by the sound source detection unit 34d to determine whether or not the person HM1 has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35). When it is determined that the person HM1 has moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35, YES), the sound source detection unit 34d calculates the position (the position of the sound source) of the person HM1 as the monitoring target based on the strength or volume level of the sound picked up, in response to the instruction of step S34, by all the switchable omnidirectional microphone array devices associated with the omnidirectional microphone array device M1 currently in use (S36).
Using the sound source detection result of the sound source detection unit 34d, the operation switching control unit 38 selects, from among all the switchable omnidirectional microphone array devices associated with the omnidirectional microphone array device M1 currently in use, the one omnidirectional microphone array device (for example, the omnidirectional microphone array device M2) whose distance to the position of the person HM1 as the monitoring target (the position of the sound source) is smallest, and switches the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 from the omnidirectional microphone array device M1 to the omnidirectional microphone array device M2 (S37). The output control unit 34b accordingly switches the sound directivity so that it is formed from the omnidirectional microphone array device M2 after the switching toward the position of the sound source calculated in step S36 (S37).
On the other hand, when the microphone switching mode has been set to off by the operation switching control unit 38 (S33, NO), or when it is determined that the person HM1 has not moved beyond the sound collection area M1RN of the omnidirectional microphone array device M1 (S35, NO), the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) ends and the process proceeds, for example, to step S3B-2 shown in FIG. 10(B). The automatic camera switching process shown in FIG. 29(A) may also be started after the automatic switching process of the omnidirectional microphone array device shown in FIG. 30(A) ends.
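The patent does not specify how S36 derives a position from the per-array sound levels; one crude stand-in, shown below purely for illustration, is an energy-weighted centroid of the array positions (the coordinates and levels in the usage example are assumptions).

```python
import numpy as np

def estimate_source_position(array_positions, rms_levels):
    """Crude source-position estimate from per-array sound levels:
    an energy-weighted centroid of the array positions (stand-in for S36)."""
    pos = np.asarray(array_positions, dtype=float)   # shape (num_arrays, 2)
    weights = np.asarray(rms_levels, dtype=float) ** 2
    return (weights[:, None] * pos).sum(axis=0) / weights.sum()

# Illustrative usage with assumed positions and measured levels:
positions = [(0.0, 0.0), (8.0, 0.0), (0.0, 8.0)]
levels = [0.2, 0.9, 0.3]
print(estimate_source_position(positions, levels))
```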
Next, the microphone switching determination process in the directivity control device 3B is described with reference to FIG. 30(B). FIG. 30(B) is a flowchart showing an example of the microphone switching determination process shown in FIG. 30(A).
In FIG. 30(B), the operation switching control unit 38 sets the microphone switching mode to off (S32-1). The operation switching control unit 38 uses the distance information between the omnidirectional microphone array device M1 currently in use and the person HM1 to determine whether or not the tracking position A1 calculated in step S31 has crossed the predetermined switching determination line JM1 of the omnidirectional microphone array device M1 currently in use (S32-2).
When the operation switching control unit 38 determines that the tracking position A1 has crossed the predetermined switching determination line JM1 of the omnidirectional microphone array device M1 currently in use (S32-2, YES), it sets the microphone switching mode to on (S32-3).
After step S32-3, or when it is determined that the tracking position A1 has not crossed the predetermined switching determination line JM1 of the omnidirectional microphone array device M1 currently in use (S32-2, NO), the microphone switching determination process shown in FIG. 30(B) ends and the process proceeds to step S33 shown in FIG. 30(A).
Next, the manual camera switching process in the directivity control system 100B of this embodiment is described with reference to FIG. 31(A). FIG. 31(A) is a flowchart explaining an example of the manual switching process of the camera device in the directivity control system 100B of the second embodiment. The manual camera switching process in the directivity control system 100B shown in FIG. 31(A) is performed following step S1 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A).
In FIG. 31(A), when an instruction for switching the camera device is input to the display device 35 in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S41), the output control unit 34b switches from the tracking screen TRW of the image captured by the camera device C1 currently used for capturing images of the person HM1 to a multi-camera screen including the camera screen C1W of the camera device C1 and the camera screens of the camera devices around the camera device C1 (for example, eight camera devices) (S42).
Suppose that, on the multi-camera screen displayed on the display device 35 in step S42, the user selects one of the camera screens by a touch operation, for example with the finger FG, taking into account the movement direction MV1 of the person HM1 as the monitoring target (see FIG. 26) (S43).
In response to the touch operation of the user's finger FG, the operation switching control unit 38 switches the camera device used for capturing images of the person HM1 from the camera device C1 currently in use to the camera device C3 corresponding to the camera screen C3W touched in step S43 (S44). The manual camera switching process shown in FIG. 31(A) then ends, and the process proceeds to one of steps S45, S51, S61, and S71 shown in FIG. 31(B), FIG. 32(A), FIG. 32(B), or FIG. 33.
Next, the manual switching process of the omnidirectional microphone array device in the directivity control system 100B of this embodiment is described with reference to FIG. 31(B). FIG. 31(B) is a flowchart explaining an example of the manual switching process of the omnidirectional microphone array device in the directivity control system 100B of the second embodiment.
In FIG. 31(B), when an instruction for switching the omnidirectional microphone array device is input in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S45), the output control unit 34b displays, on the tracking screen TRW, markers (for example, the markers M2R, M3R, and M4R) indicating the approximate positions of the omnidirectional microphone array devices that can be switched to from the omnidirectional microphone array device M1 currently in use (for example, the omnidirectional microphone array devices M2, M3, and M4) (S46).
The user, taking into account the movement direction MV1 of the person HM1 as the monitoring target from the tracking position A1, selects one of the three markers (for example, the marker M3R) by a touch operation with the finger FG (S47; see FIG. 27). The operation switching control unit 38 instructs the omnidirectional microphone array device M3 corresponding to the selected marker M3R, via the communication unit 31 and the network NW, to start sound collection in place of the omnidirectional microphone array device M1 currently in use (S47).
The output control unit 34b switches the directivity so that it is formed from the omnidirectional microphone array device M3 corresponding to the marker M3R selected in step S47 toward the current tracking position of the person HM1 (S48). The output control unit 34b also erases the markers M2R, M3R, and M4R indicating the approximate positions of the omnidirectional microphone array devices M2, M3, and M4 from the tracking screen TRW (S48).
After step S48, the manual switching process of the omnidirectional microphone array device shown in FIG. 31(B) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). The manual camera switching process shown in FIG. 31(A) may also be performed after the manual switching process of the omnidirectional microphone array device shown in FIG. 31(B).
Next, the process of selecting the optimum omnidirectional microphone array device in the directivity control system 100B of this embodiment is described with reference to FIGS. 32(A), 32(B), and 33. FIG. 32(A) is a flowchart explaining a first example of the process of selecting the optimum omnidirectional microphone array device in the directivity control system 100B of the second embodiment. FIG. 32(B) is a flowchart explaining a second example of the process. FIG. 33 is a flowchart explaining a third example of the process.
In FIG. 32(A), when a position in the movement direction of the person HM1 as the monitoring target (a tracking position corresponding to a tracking point) is designated on the tracking screen TRW displayed on the display device 35 in response to an input operation with the cursor CSR by the user's mouse operation or with the user's finger FG (S51), information on this designated position (for example, its coordinates) is input to the operation switching control unit 38 (S52).
The operation switching control unit 38 calculates the distance from each omnidirectional microphone array device to the position in real space corresponding to the position designated in step S51, that is, the distance from each omnidirectional microphone array device to the person HM1 as the monitoring target (S53).
The operation switching control unit 38 selects the omnidirectional microphone array device for which the smallest of the distances calculated in step S53 is obtained, and instructs the signal processing unit 34A to form directivity for the sound data of the sound picked up by the selected omnidirectional microphone array device (S54).
In response to the instruction in step S54, the output control unit 34b of the signal processing unit 34A forms sound directivity from the omnidirectional microphone array device selected by the operation switching control unit 38 in step S54 toward the position of the person HM1 as the monitoring target, and outputs the sound with the formed directivity from the speaker device 36 (S55).
In this way, when the user simply designates a position indicating the movement direction of the monitoring target (for example, the person HM1), the operation switching control unit 38 can select the optimum omnidirectional microphone array device capable of properly picking up the sound emitted by the moving monitoring target, so that the sound emitted by the monitoring target can be picked up with high accuracy.
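Steps S53 and S54 amount to a nearest-array search over the candidates; a minimal sketch under assumed inputs (array identifiers mapped to coordinates, plus the real-space position corresponding to the designated position) follows.

```python
import math

def select_nearest_array(array_positions, target_pos):
    """Return the id of the array closest to the designated position
    (cf. S53-S54 of FIG. 32(A)).

    array_positions: dict mapping array id to (x, y) coordinates
    target_pos: (x, y) position designated on the tracking screen
    """
    return min(array_positions,
               key=lambda mid: math.dist(array_positions[mid], target_pos))

# Illustrative usage with assumed coordinates:
arrays = {"M1": (0.0, 0.0), "M2": (8.0, 0.0), "M3": (0.0, 8.0)}
print(select_nearest_array(arrays, (6.5, 1.0)))  # -> M2
```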
 なお、ステップS55の後、図32(A)に示す最適な全方位マイクアレイ装置の選択処理が終了し、図9(A)、図9(B)又は図10(A)に示すステップS2に進む。なお、図32(A)に示す最適な全方位マイクアレイ装置の選択処理の後に、図31(A)に示すカメラ装置の手動切替処理が行われても良い。 After step S55, the optimum omnidirectional microphone array apparatus selection process shown in FIG. 32A is completed, and the process proceeds to step S2 shown in FIG. 9A, FIG. 9B, or FIG. move on. Note that the manual switching process of the camera apparatus shown in FIG. 31A may be performed after the selection process of the optimum omnidirectional microphone array apparatus shown in FIG.
In FIG. 32(B), when a position in the moving direction of the person HM1 as the monitoring target (a tracking position corresponding to a tracking point) is designated on the tracking screen TRW displayed on the display device 35 by an input operation with the cursor CSR via the user's mouse or with the user's finger FG (S61), information about the designated position (for example, its coordinates) is input to the operation switching control unit 38.
The image processing unit 37 detects the orientation of the face of the person HM1 as the monitoring target by performing predetermined image processing on the image data captured by the camera device currently in use (for example, the camera device C1) (S62). The image processing unit 37 outputs the detection result of the face orientation of the person HM1 to the operation switching control unit 38.
Using the information about the designated position specified in step S61 (for example, coordinates indicating a position on the image data) and the detection result of the face orientation of the person HM1 obtained from the image processing unit 37 in step S62, the operation switching control unit 38 calculates the relationship between the face orientation of the person HM1, the designated position, and each omnidirectional microphone array device (S63). For example, the operation switching control unit 38 calculates the distance between each omnidirectional microphone array device and the position of the monitoring target (for example, the person HM1) corresponding to the designated position on the image data designated in step S61.
The operation switching control unit 38 selects the omnidirectional microphone array device that lies in the direction along the face orientation of the monitoring target (for example, within 45 degrees horizontally) and has the minimum distance to the position of the monitoring target corresponding to the designated position specified in step S61 (S64). Furthermore, the operation switching control unit 38 instructs the signal processing unit 34 to form directivity for the audio data of the sound picked up by the omnidirectional microphone array device selected in step S64 (S64).
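Steps S63 and S64 amount to a constrained nearest-neighbor search: among the microphone array devices lying within the face-direction cone (for example, 45 degrees horizontally), pick the closest one. A hedged sketch of that rule, assuming a flat floor-coordinate layout and illustrative names:

```python
# Sketch of the S63/S64 selection rule; names and data layout are assumptions.
import math

def select_array(target_xy, face_angle_deg, arrays, cone_deg=45.0):
    """arrays: list of (array_id, (x, y)) positions in floor coordinates."""
    best_id, best_dist = None, float("inf")
    for array_id, (ax, ay) in arrays:
        # Bearing from the monitored person toward this array.
        bearing = math.degrees(math.atan2(ay - target_xy[1], ax - target_xy[0]))
        diff = abs((bearing - face_angle_deg + 180.0) % 360.0 - 180.0)
        if diff > cone_deg:          # not along the face direction
            continue
        dist = math.hypot(ax - target_xy[0], ay - target_xy[1])
        if dist < best_dist:         # keep the nearest qualifying array
            best_id, best_dist = array_id, dist
    return best_id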
In response to the instruction in step S64, the output control unit 34b of the signal processing unit 34 forms sound directivity from the omnidirectional microphone array device selected in step S64 toward the position of the person HM1 as the monitoring target, and outputs the directivity-formed sound from the speaker device 36 (S65).
In this way, based on the face orientation of the monitoring target (for example, the person HM1) on the image data and the distance between the monitoring target and each omnidirectional microphone array device, the operation switching control unit 38 can select the optimum omnidirectional microphone array device capable of accurately picking up the sound emitted by the moving monitoring target, so the sound emitted by the monitoring target can be picked up with high accuracy.
After step S65, the selection process for the optimum omnidirectional microphone array device shown in FIG. 32(B) ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). The manual camera switching process shown in FIG. 31(A) may be performed after the selection process shown in FIG. 32(B).
In FIG. 33, in response to an input operation with the cursor CSR via the user's mouse or with the user's finger FG, the output control unit 34b displays a list of the camera screens of all the camera devices managed by the directivity control system 100B on the display device 35 (S71). Among the camera screens listed on the display device 35 that show the monitoring target (for example, the person HM1) subject to the voice tracking process, suppose that the camera screen C1W showing the person HM1 most clearly is selected by an input operation with the cursor CSR or the finger FG (S72).
In accordance with the user's selection of a camera screen in step S72, the operation switching control unit 38 selects and switches to the camera device corresponding to that camera screen as the camera device used for capturing the image of the person HM1. The output control unit 34b then enlarges the image data captured by the corresponding camera device and displays it on the tracking screen TRW1 of the display device 35 (S73; see the lower left of FIG. 28).
The output control unit 34b displays markers indicating the approximate positions of all the omnidirectional microphone array devices associated with the camera device selected by the operation switching control unit 38 (for example, the markers M1R, M2R, M3R, and M4R shown in FIG. 28) at the four corners of the tracking screen TRW1 (S74).
When the markers M1R, M2R, M3R, and M4R are designated in sequence by an input operation with the cursor CSR via the user's mouse or with the user's finger FG (S75), the output control unit 34b highlights the markers one at a time (for example, with a blink Br) and, for each marker, forms directivity in the direction from the corresponding omnidirectional microphone array device to the position of the person HM1 and outputs the picked-up sound for a fixed time (S76).
When the marker indicating the approximate position of the omnidirectional microphone array device that the user judges to be optimal among the sounds output for the fixed time (for example, the marker M3R) is selected, the operation switching control unit 38 selects and switches to the omnidirectional microphone array device M3 corresponding to the selected marker M3R as the omnidirectional microphone array device used for picking up the sound emitted by the person HM1 (S77).
After step S77, the selection process for the optimum omnidirectional microphone array device shown in FIG. 33 ends, and the process proceeds to step S2 shown in FIG. 9(A), FIG. 9(B), or FIG. 10(A). The manual camera switching process shown in FIG. 31(A) may be performed after the selection process shown in FIG. 33.
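The audition procedure of steps S75 to S77 can be summarized as the loop below. The ui and audio objects and every method on them are hypothetical stand-ins for the display and signal processing units; this is a sketch of the control flow only, not the patent's API.

```python
# Illustrative audition loop for S75-S77 with hypothetical ui/audio objects.
def audition_and_select(markers, target_pos, ui, audio, listen_sec=3.0):
    for marker in markers:
        ui.blink(marker)                       # emphasize the marker (Br)
        # Form directivity from this marker's array toward the person.
        beam = audio.form_directivity(marker.array_id, target_pos)
        audio.play(beam, duration=listen_sec)  # output for a fixed time
        ui.unblink(marker)
    chosen = ui.wait_for_marker_selection()    # e.g. the user picks M3R
    return chosen.array_id                     # array used from now on
```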
(Modification of the first embodiment)
In each of the embodiments described above, the voice tracking process that follows the movement of a single monitoring target (for example, the person HM1) was explained mainly for the case where that single target appears in the image data.
In this modification of the first embodiment (hereinafter, "this modification"), an operation example of the directivity control system 100 is described for the case where, in the first or second embodiment, a plurality of monitoring targets (for example, a plurality of persons) appear on the tracking screen TRW and are designated at the same timing or at different timings. Since the system configuration of the directivity control system of this modification is the same as that of the directivity control systems 100, 100A, and 100B of the first and second embodiments, the description of the system configuration is simplified or omitted, and only the differences are described. For simplicity, the following description refers to the system configuration of the directivity control system 100.
An operation example of the directivity control system 100 of this modification is described with reference to FIGS. 34 and 35. FIG. 34 is a flowchart explaining an example of the overall flow of manual tracking processing based on multiple simultaneous designations in the directivity control system 100 of the modification of the first embodiment. FIG. 35 is a flowchart explaining an example of automatic tracking processing for a plurality of monitoring targets in the same system. In FIG. 35, the directivity control devices 3A and 3B are used.
In FIG. 34, the tracking mode determination process in step S1, the tracking assist process in step S2, the tracking connection process in step S6, and the audio output process in step S7 are, for example, the same as the corresponding processes shown in FIG. 9(A), so their description is omitted.
In FIG. 34, if the tracking mode is off (S1, NO), the manual tracking process based on multiple simultaneous designations shown in FIG. 34 ends. If the tracking mode is on (S1, YES), the sound currently being output (reproduced) from the speaker device 36 is paused on the tracking screen TRW of the display device 35 by a click operation of the cursor CSR via the user's mouse or a touch operation of the user's finger FG (S81). After step S81, the tracking assist process is performed (S2).
After step S2, suppose that a plurality of tracking points corresponding to the tracking positions along the movement paths of a plurality of persons as monitoring targets are designated simultaneously by an input operation with the cursor CSR via the user's mouse or with the user's finger FG (S82).
For each person designated as a monitoring target in step S82, the tracking processing unit 34c distinguishes the positions in real space corresponding to the plurality of designated positions on the tracking screen TRW and the designated times, and stores them in the memory 33 in association with each other as the tracking position and tracking time of each tracking point (S83). Furthermore, via the output control unit 34b, the tracking processing unit 34c displays a point marker, distinguished per person, at each tracking point on the tracking screen TRW (S83).
The output control unit 34b forms directivity of the picked-up sound from the omnidirectional microphone array device currently in use (for example, the omnidirectional microphone array device M1) toward each person's position in real space (sound position, sound source position) corresponding to the tracking position of each of the plurality of monitoring targets designated simultaneously in step S82 (S84). After step S84, the tracking connection process is performed (S6).
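One way to picture the bookkeeping of steps S83 and S84: each designated person keeps an independent list of (tracking position, tracking time) entries, and one beam is formed per person. A minimal sketch, with assumed data structures and a hypothetical beamformer object:

```python
# Sketch of the per-person S83/S84 bookkeeping; structures are assumptions.
import time
from collections import defaultdict

tracking_log = defaultdict(list)  # person_id -> [(x, y, t), ...]

def on_simultaneous_designation(designations, beamformer):
    """designations: list of (person_id, (x, y)) positions already
    converted from screen coordinates to real-space coordinates."""
    now = time.time()
    beams = []
    for person_id, pos in designations:
        tracking_log[person_id].append((pos[0], pos[1], now))  # S83
        beams.append(beamformer.form_beam(pos))                # S84
    return beams  # one directed pickup per monitored person
```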
After step S6, the output control unit 34b resumes the output (reproduction) from the speaker device 36 of the sound paused in step S81 (S85). After step S85, the audio output process is performed (S7). After step S7, the operations from step S81 to step S7 (steps S81, S2, S82, S83, S84, S6, S85, and S7) are repeated until the tracking mode of the directivity control device 3B is turned off.
In FIG. 35, after step S3, the image processing unit 37 of the directivity control devices 3A and 3B performs known image processing to determine whether persons as monitoring targets are detected on the tracking screen TRW of the display device 35; when it determines that a plurality of persons have been detected, it outputs the determination result (including the detection position of each person (for example, a known representative point) and the detection time) to the tracking processing unit 34c of the signal processing unit 34 as an automatic designation result (S91). Likewise, the sound source detection unit 34d performs known sound source detection processing to determine whether the positions of the sounds (sound sources) emitted by the persons as monitoring targets are detected on the tracking screen TRW; when it determines that the positions of a plurality of sound sources have been detected, it outputs the determination result (including the detection positions and detection times of the sound sources) to the tracking processing unit 34c as an automatic designation result (S91).
Using the transition of the one or more immediately preceding automatic designation results in step S91, the tracking processing unit 34c calculates the movement vector of each person as a monitoring target and estimates the movement direction of each person (S91).
Using the estimation results of the movement directions of the persons as the plurality of monitoring targets in step S91, the tracking processing unit 34c associates the tracking positions corresponding to the plurality of automatically designated tracking points with the respective previous automatic designation results, and stores them in the memory 33 as pairs of tracking positions (S92). For each person as a monitoring target, the tracking processing unit 34c distinguishes the designated position and designated time of each person on the tracking screen TRW and stores them in the memory 33 in association with each other as the tracking position and tracking time of the tracking point (S92). Furthermore, via the output control unit 34b, the tracking processing unit 34c displays a point marker, distinguished per person, at each tracking position on the tracking screen TRW (S92).
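The association of steps S91 and S92 needs each new automatic detection to be matched with the right existing track. The sketch below uses a plain movement-vector prediction plus nearest-neighbor matching; the embodiment does not spell out the matching rule, so this particular rule is an assumption.

```python
# Sketch of movement-vector-based track association for S91/S92.
import math

def associate(tracks, detections):
    """tracks: {person_id: [(x, y), ...] history}; detections: [(x, y)]."""
    pairs, free = {}, list(detections)
    for person_id, history in tracks.items():
        if len(history) >= 2:
            # Extrapolate the last movement vector one step ahead.
            (x0, y0), (x1, y1) = history[-2], history[-1]
            predicted = (x1 + (x1 - x0), y1 + (y1 - y0))
        else:
            predicted = history[-1]
        if not free:
            break
        best = min(free, key=lambda d: math.hypot(d[0] - predicted[0],
                                                  d[1] - predicted[1]))
        pairs[person_id] = best   # saved as a tracking-position pair
        free.remove(best)
    return pairs
```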
In this way, no matter how the plurality of monitoring targets (for example, persons) shown in the image data on the tracking screen TRW of the display device 35 move, the directivity control devices 3, 3A, and 3B of this modification re-form the sound directivity, previously formed toward each person's position before the movement, in the direction toward each person's position after the movement. The sound directivity can thus be formed appropriately, following the movement of each person, and degradation of the efficiency of the supervisor's monitoring work can be suppressed.
The configurations, operations, and effects of the directivity control device, directivity control method, storage medium, and directivity control system according to the present invention described above are explained below.
One embodiment of the present invention is a directivity control device that controls the directivity of sound picked up by a first sound pickup unit including a plurality of microphones, comprising: a directivity forming unit that forms the directivity of the sound in a direction from the first sound pickup unit toward a monitoring target corresponding to a first designated position on an image of a display unit; and an information acquisition unit that acquires information about a second designated position on the image of the display unit, designated in accordance with the movement of the monitoring target, wherein the directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information about the second designated position acquired by the information acquisition unit.
In this configuration, the directivity control device forms sound directivity in the direction from the first sound pickup unit including a plurality of microphones toward the monitoring target corresponding to the first designated position on the image of the display unit, and further acquires information about a second designated position that designates the moving monitoring target. The directivity control device then switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position, using the information about the second designated position on the image of the display unit.
In this way, even if the monitoring target shown on the image of the display unit moves, the directivity control device re-forms the sound directivity, previously formed toward the position before the movement, in the direction toward the position after the movement. The sound directivity can thus be formed appropriately, following the movement of the monitoring target, and degradation of the efficiency of the supervisor's monitoring work can be suppressed.
In one embodiment of the present invention, the information acquisition unit acquires the information about the second designated position in response to a designation operation on the monitoring target moving on the image of the display unit.
According to this configuration, the directivity control device can easily acquire accurate information about the position of the monitoring target after its movement by a simple operation of designating the monitoring target moving on the image shown on the display unit.
One embodiment of the present invention further comprises: a sound source detection unit that detects, from the image of the display unit, a sound source position corresponding to the monitoring target; and an image processing unit that detects the monitoring target from the image of the display unit, wherein the information acquisition unit acquires, as the information about the second designated position, information about the sound source position detected by the sound source detection unit or information about the position of the monitoring target detected by the image processing unit.
According to this configuration, the directivity control device can easily detect, from the image shown on the display unit, the sound source of the sound emitted by the monitoring target and the monitoring target itself, so information about the position of the sound source or information about the position of the monitoring target can easily be acquired as information about the position of the monitoring target after its movement.
In one embodiment of the present invention, the sound source detection unit starts the detection process for the sound source position corresponding to the monitoring target around an initial position designated on the image of the display unit, and the image processing unit starts the detection process for the monitoring target around the initial position.
According to this configuration, the directivity control device starts the detection process for the sound source position or the monitoring target position around the initial position designated on the image shown on the display unit (for example, at the position of the monitoring target) by, for example, a user's designation operation, so the detection process for the sound source position or the monitoring target position can be performed at high speed.
In one embodiment of the present invention, in response to an operation for changing the information about the sound source position detected by the sound source detection unit or the information about the position of the monitoring target detected by the image processing unit, the information acquisition unit acquires information about the position on the image of the display unit designated by the change operation as the information about the second designated position.
According to this configuration, even if the sound source position or the monitoring target position detected by the sound source position detection process or the monitoring target position detection process is wrong, the directivity control device can easily correct and acquire the information about the position designated on the image by, for example, the user's position change operation as information about the position of the monitoring target after its movement.
In one embodiment of the present invention, when the distance between the sound source position detected by the sound source detection unit and the position of the monitoring target detected by the image processing unit is equal to or greater than a predetermined value, the information acquisition unit acquires, in response to an operation for changing the information about the sound source position or the information about the position of the monitoring target, information about the position on the image of the display unit designated by the change operation as the information about the second designated position.
According to this configuration, if the distance between the detected sound source position and the detected monitoring target position is equal to or greater than the predetermined value, the directivity control device can easily correct and acquire the information about the position designated on the image by, for example, the user's position change operation as information about the position of the monitoring target after its movement. Furthermore, if that distance is less than the predetermined value, the directivity control device can easily acquire the sound source position or the monitoring target position as information about the position of the monitoring target after its movement without requiring, for example, a user's position change operation.
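As a rough illustration of this threshold rule, under assumed helper names:

```python
# Sketch: reconcile the sound-source and image detections; fall back to a
# user-designated position only when they disagree by more than a preset
# distance. All names here are illustrative.
import math

def resolve_position(sound_pos, image_pos, threshold, ask_user):
    gap = math.hypot(sound_pos[0] - image_pos[0], sound_pos[1] - image_pos[1])
    if gap >= threshold:
        # Detections disagree: use the position the user designates.
        return ask_user()
    # Detections agree: either detected position can serve as the new
    # designated position without a user correction.
    return image_pos
```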
One embodiment of the present invention further comprises: an image storage unit that stores images captured over a certain period; and an image reproduction unit that reproduces the images stored in the image storage unit on the display unit, wherein the image reproduction unit reproduces the images at a speed value smaller than the initial value of the reproduction speed in response to a predetermined input operation.
According to this configuration, when reproducing images captured over a certain period on the display unit as video, the directivity control device can perform slow reproduction at a speed value smaller than the initial value of the reproduction speed (for example, the normal value used for video reproduction) in response to a user's predetermined input operation (for example, a slow-reproduction instruction operation).
One embodiment of the present invention further comprises a display control unit that displays captured images on the display unit, wherein the display control unit, in response to designation of a designated position on the image of the display unit, enlarges and displays the image on the same screen at a predetermined magnification centered on the designated position.
According to this configuration, the directivity control device enlarges and displays the image at a predetermined magnification within the same screen, centered on the designated position on the image shown on the display unit, by, for example, a user's simple designation operation, so the user's operation of designating the monitoring target on the same screen can be simplified.
One embodiment of the present invention further comprises a display control unit that displays captured images on the display unit, wherein the display control unit, in response to designation of a designated position on the image of the display unit, enlarges and displays the image on another screen at a predetermined magnification centered on the designated position.
According to this configuration, the directivity control device enlarges and displays the image at a predetermined magnification in a different screen, centered on the designated position on the image shown on the display unit, by, for example, a user's simple designation operation, so the user can easily designate the monitoring target by comparing the non-enlarged screen with the enlarged screen.
One embodiment of the present invention further comprises a display control unit that displays captured images on the display unit, wherein the display control unit enlarges and displays the image at a predetermined magnification relative to the center of the display unit in response to a predetermined input operation.
According to this configuration, the directivity control device enlarges and displays the image at a predetermined magnification relative to the center of the display unit by, for example, a user's simple input operation, so the user can easily designate the monitoring target when, for example, it is shown near the center of the display unit.
In one embodiment of the present invention, when the designated position exceeds a predetermined scroll determination line on the screen on which the image is enlarged and displayed in accordance with the movement of the monitoring target, the display control unit scrolls the screen by a predetermined amount in the direction beyond the scroll determination line.
According to this configuration, when the user's designated position exceeds the scroll determination line because the monitoring target shown on the enlarged screen has moved, the directivity control device automatically scrolls the screen by a predetermined amount in the direction beyond the scroll determination line, so even when the screen is enlarged, the user's designated position for the monitoring target can be prevented from going off the screen.
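A minimal sketch of such a scroll determination, assuming an axis-aligned viewport and illustrative margin and step values:

```python
# Sketch of the scroll-determination-line rule: when the designated point
# crosses a judgment line near the viewport edge, shift the viewport a
# fixed amount in that direction. Margin and step sizes are assumptions.
def maybe_scroll(viewport, point, margin=50, step=100):
    """viewport: dict with x, y, width, height; point: (px, py) on screen."""
    px, py = point
    if px > viewport["x"] + viewport["width"] - margin:
        viewport["x"] += step   # crossed the right-hand judgment line
    elif px < viewport["x"] + margin:
        viewport["x"] -= step   # crossed the left-hand judgment line
    if py > viewport["y"] + viewport["height"] - margin:
        viewport["y"] += step
    elif py < viewport["y"] + margin:
        viewport["y"] -= step
    return viewport
```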
In one embodiment of the present invention, when the designated position exceeds a predetermined scroll determination line on the screen on which the image is enlarged and displayed in accordance with the movement of the monitoring target, the display control unit scrolls the screen so that the designated position comes to the center.
According to this configuration, when the user's designated position exceeds the scroll determination line because the monitoring target shown on the enlarged screen has moved, the directivity control device automatically scrolls the screen so that the user's designated position comes to the center of the screen. Even when the screen is enlarged, the user's designated position for the monitoring target can thus be prevented from going off the screen, and the monitoring target that keeps moving on the screen can be designated easily.
In one embodiment of the present invention, on the screen on which the image is enlarged and displayed, the display control unit scrolls the screen so that the designated position comes to the center of the screen.
According to this configuration, the directivity control device automatically scrolls the screen so that the user's designated position is always at the center of the screen as the monitoring target shown on the enlarged screen moves. Even when the screen is enlarged, the user's designated position for the monitoring target can thus be prevented from going off the screen, and the monitoring target that keeps moving on the screen can be designated easily.
In one embodiment of the present invention, the image processing unit performs masking processing on a part of the monitoring target on the image of the display unit in response to a predetermined input operation.
According to this configuration, the directivity control device masks a part (for example, the face) of the monitoring target (for example, a person) shown on the screen of the display unit by, for example, a user's simple input operation, so privacy can be effectively protected by making it difficult to identify the person who is the monitoring target.
One embodiment of the present invention further comprises a sound output control unit that causes a sound output unit to output the sound picked up by the first sound pickup unit, wherein the sound output control unit applies voice change processing to the sound picked up by the first sound pickup unit and causes the sound output unit to output it in response to a predetermined input operation.
According to this configuration, the directivity control device applies voice change processing to the sound being picked up in real time by the first sound pickup unit and outputs it by, for example, a user's simple input operation, so the privacy of the voice of the person currently being imaged as the monitoring target can be effectively protected by making it difficult to identify whose voice it is.
One embodiment of the present invention further comprises: a sound storage unit that stores sound picked up by the first sound pickup unit over a certain period; and a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit, wherein the sound output control unit applies voice change processing to the sound picked up by the first sound pickup unit and causes the sound output unit to output it in response to a predetermined input operation.
According to this configuration, when outputting the sound picked up by the first sound pickup unit over a certain period, the directivity control device applies voice change processing to the sound before outputting it by, for example, a user's simple input operation, so the privacy of the voice of the person as the monitoring target can be effectively protected by making it difficult to identify whose voice it is.
One embodiment of the present invention further comprises a display control unit that displays a predetermined marker at one or more designated positions on the image of the display unit, designated in accordance with the movement of the monitoring target.
According to this configuration, when, for example, the user performs a designation operation on the monitoring target shown on the display unit, the directivity control device displays a predetermined marker at the designated position on the screen of the display unit, so the positions through which the moving monitoring target has passed can be shown explicitly as a trajectory.
One embodiment of the present invention further comprises a display control unit that connects and displays at least the current designated position and the immediately preceding designated position among two or more designated positions on the image of the display unit, designated in accordance with the movement of the monitoring target.
According to this configuration, when the monitoring target shown on the screen of the display unit moves, the directivity control device connects and displays at least the current designated position and the immediately preceding designated position among the plurality of positions designated by the user's designation operations, so part of the trajectory of the monitoring target's movement can be shown explicitly.
One embodiment of the present invention further comprises a display control unit that displays a flow line connecting, for all the designated positions on the image of the display unit designated in accordance with the movement of the monitoring target, the one or two designated positions adjacent to each designated position.
According to this configuration, when the monitoring target shown on the screen of the display unit moves, the directivity control device connects and displays, for all of the plurality of positions designated by the user's designation operations, the one or two designated positions adjacent to each designated position, so the entire trajectory of the monitoring target's movement can be shown explicitly.
One embodiment of the present invention further comprises: a designation list storage unit that stores a designation list including data on all the designated positions and designated times on the image of the display unit; and a reproduction time calculation unit that, in response to designation of an arbitrary position on the flow line connecting all the designated positions displayed by the display control unit, calculates the reproduction start time of the sound at the designated position on the flow line using the designation list stored in the designation list storage unit, wherein the directivity forming unit forms the directivity of the sound using the data of the designated position corresponding to the designated time closest to the reproduction start time of the sound calculated by the reproduction time calculation unit.
In this configuration, when all the positions designated by the user during the movement of the monitoring target are displayed connected, the directivity control device calculates the reproduction start time of the picked-up sound at a position designated arbitrarily by the user on the flow line, and forms the directivity of the sound corresponding to whichever designated time specified during the movement of the monitoring target is closest to that reproduction time.
In this way, the directivity control device can form the sound directivity in advance in the direction toward the designated position (tracking position) that was designated next after the arbitrarily designated position, in accordance with the position arbitrarily designated by the user on the flow line indicating the trajectory of the monitoring target's movement.
One embodiment of the present invention further comprises: a sound storage unit that stores sound picked up by the first sound pickup unit over a certain period; and a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit, wherein the sound output control unit causes the sound output unit to output the sound at the reproduction start time calculated by the reproduction time calculation unit, and the directivity forming unit, when there is a next designated time within a predetermined time from the reproduction start time of the sound, forms the directivity of the sound using the data of the designated position corresponding to the next designated time.
In this configuration, the directivity control device reproduces the sound from the reproduction start time at the position designated arbitrarily by the user on the flow line, and when there is a next designated time specified by the user during the movement of the monitoring target within a predetermined time from that reproduction time, forms the directivity of the sound using the data of the designated position corresponding to the next designated time.
In this way, the directivity control device can clearly output the picked-up sound emitted by the monitoring target at the reproduction start time calculated in accordance with the user's arbitrarily designated position, and when there is a next designated position within the predetermined time from the reproduction start time, can form the directivity of the sound at the next designated position in advance.
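A sketch of this reproduction-time logic, assuming the designation list is a time-ordered sequence of (x, y, t) tracking points and using straight-line interpolation along the flow line; the interpolation rule and the lookahead window are assumptions, not taken from the embodiment.

```python
# Sketch: from a point clicked on the flow line, interpolate a playback
# start time, pick the tracking point with the closest time for steering,
# and pre-steer to the next point if it falls within a lookahead window.
import math

def playback_plan(click_pos, points, lookahead=2.0):
    """points: designation list of (x, y, t), ordered by time t."""
    def seg_dist(p, a, b):
        # Distance from p to segment a-b, plus the parameter u along it.
        ax, ay, _ = a; bx, by, _ = b
        vx, vy = bx - ax, by - ay
        u = max(0.0, min(1.0, ((p[0] - ax) * vx + (p[1] - ay) * vy) /
                               ((vx * vx + vy * vy) or 1.0)))
        return math.hypot(p[0] - (ax + u * vx), p[1] - (ay + u * vy)), u

    best = min(range(len(points) - 1),
               key=lambda i: seg_dist(click_pos, points[i], points[i + 1])[0])
    _, u = seg_dist(click_pos, points[best], points[best + 1])
    t0, t1 = points[best][2], points[best + 1][2]
    start = t0 + u * (t1 - t0)                            # playback start time
    steer = min(points, key=lambda p: abs(p[2] - start))  # closest time
    nxt = next((p for p in points if start < p[2] <= start + lookahead), None)
    return start, steer, nxt                              # nxt: pre-steer target
```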
 また、本発明の一実施形態は、前記表示部への画像の表示に用いる第1の撮像部に対応する所定の切替範囲を前記監視対象物が超えた場合に、前記表示部への画像の表示に用いる撮像部を、前記第1の撮像部から第2の撮像部に切り替える動作切替制御部、を更に備える、指向性制御装置である。 In addition, according to an embodiment of the present invention, when the monitoring object exceeds a predetermined switching range corresponding to the first imaging unit used for displaying the image on the display unit, the image on the display unit is displayed. The directivity control device further includes an operation switching control unit that switches an imaging unit used for display from the first imaging unit to the second imaging unit.
 この構成では、指向性制御装置は、移動中の監視対象物が、表示部への画像の表示に用いる第1の撮像部に対応する所定の切替範囲を超えた場合には、表示部への画像の表示に用いる撮像部を、第1の撮像部から第2の撮像部に切り替える。 In this configuration, the directivity control device, when the moving monitoring object exceeds a predetermined switching range corresponding to the first imaging unit used for displaying the image on the display unit, The imaging unit used for image display is switched from the first imaging unit to the second imaging unit.
 これにより、指向性制御装置は、移動中の監視対象物の画像を的確に映し出すことが可能な撮像部に適応的に切り替えることができ、ユーザの監視対象物の画像を簡易に指定させることができる。 As a result, the directivity control device can adaptively switch to an imaging unit capable of accurately displaying an image of the moving monitoring object, and can easily specify the image of the user's monitoring object. it can.
 また、本発明の一実施形態は、前記第1の収音部に対応する所定の切替範囲を前記監視対象物が超えた場合に、前記監視対象物の音声の収音に用いる収音部を、前記第1の収音部から第2の収音部に切り替える動作切替制御部、を更に備える、指向性制御装置である。 Further, according to an embodiment of the present invention, when the monitoring object exceeds a predetermined switching range corresponding to the first sound collection unit, a sound collection unit used for collecting sound of the monitoring object is provided. The directivity control device further includes an operation switching control unit that switches from the first sound collecting unit to the second sound collecting unit.
 この構成では、指向性制御装置は、移動中の監視対象物が、監視対象物の音声の収音に用いる第1の収音部に対応する所定の切替範囲を超えた場合には、監視対象物の音声の収音に用いる収音部を、第1の収音部から第2の収音部に切り替える。 In this configuration, the directivity control device, when the moving monitoring object exceeds the predetermined switching range corresponding to the first sound collection unit used for collecting the sound of the monitoring object, The sound collection unit used for collecting the sound of the object is switched from the first sound collection unit to the second sound collection unit.
 これにより、指向性制御装置は、移動中の監視対象物の発する音声を的確に収音することが可能な収音部に適応的に切り替えることができ、監視対象物の発する音声を高精度に収音することができる。 As a result, the directivity control device can adaptively switch to a sound collection unit capable of accurately collecting the sound emitted by the moving monitoring object, and the sound emitted by the monitoring object can be accurately obtained. Sound can be collected.
 また、本発明の一実施形態は、所定の入力操作に応じて、複数の撮像部により撮像された各画像を異なる画面で前記表示部に一覧表示させる表示制御部と、前記表示制御部により前記表示部に一覧表示された各画面のうち、所定の選択可能な画面のうちいずれかの画面の選択操作に応じて、前記表示部への前記監視対象物の画像の表示に用いる撮像部を選択する動作切替制御部と、を更に備える、指向性制御装置である。 In addition, according to one embodiment of the present invention, a display control unit that displays a list of images captured by a plurality of imaging units on different screens according to a predetermined input operation, and the display control unit Select an imaging unit to be used for displaying the image of the monitoring object on the display unit in response to a selection operation on one of the predetermined selectable screens among the screens displayed in a list on the display unit A directivity control device further comprising an operation switching control unit.
 この構成では、指向性制御装置は、表示部への画像の表示に用いる撮像部を、表示部に一覧表示された複数の異なる画面から監視対象物の移動方向に合わせてユーザが指定した画面に対応する撮像部に切り替える。 In this configuration, the directivity control device changes the imaging unit used for displaying the image on the display unit from a plurality of different screens displayed in a list on the display unit to a screen specified by the user according to the moving direction of the monitoring target. Switch to the corresponding imaging unit.
 これにより、指向性制御装置は、ユーザの簡易な操作によって、移動中の監視対象物の画像を的確に映し出すことが可能な撮像部に適応的に切り替えることができ、ユーザの監視対象物の画像を簡易に指定させることができる。 Thus, the directivity control device can adaptively switch to an imaging unit capable of accurately displaying an image of the moving monitoring target object by a simple operation of the user, and the user's monitoring target image Can be specified easily.
 また、本発明の一実施形態は、所定の入力操作に応じて、前記第1の収音部から切替可能な周囲の複数の収音部の概略位置を示すマーカを前記表示部に表示させる表示制御部と、前記表示制御部により前記表示部に表示された複数の前記マーカのうち、いずれかのマーカの選択操作に応じて、前記監視対象物の音声の収音に用いる収音部を、前記第1の収音部から、選択された前記マーカに対応する他の収音部に切り替える動作切替制御部、を更に備える、指向性制御装置である。 Further, according to one embodiment of the present invention, a display that displays a marker indicating the approximate positions of a plurality of surrounding sound collection units that can be switched from the first sound collection unit in accordance with a predetermined input operation is displayed on the display unit. In accordance with a selection operation of any one of the plurality of markers displayed on the display unit by the control unit and the display control unit, a sound collection unit used for collecting sound of the monitoring target object, The directivity control device further includes an operation switching control unit that switches from the first sound collection unit to another sound collection unit corresponding to the selected marker.
 この構成では、指向性制御装置は、例えばユーザの入力操作によって、第1の収音部から切り替え可能な周囲の複数の収音部の概略位置を示すマーカを表示部に表示させ、ユーザにより選択されたいずれかのマーカに応じて、監視対象物の音声の収音に用いる収音部を、第1の収音部から、選択されたマーカに対応する他の収音部に切り替える。 In this configuration, the directivity control device causes the display unit to display markers indicating the approximate positions of a plurality of surrounding sound collection units that can be switched from the first sound collection unit, for example, by a user input operation, and is selected by the user In accordance with one of the markers, the sound collection unit used to collect the sound of the monitoring target is switched from the first sound collection unit to another sound collection unit corresponding to the selected marker.
 これにより、指向性制御装置は、ユーザの簡易な操作によって、移動中の監視対象物の発する音声を的確に収音することが可能な収音部に適応的に切り替えることができ、監視対象物の発する音声を高精度に収音することができる。 As a result, the directivity control device can adaptively switch to the sound collection unit capable of accurately collecting the sound emitted from the moving monitoring object by a simple operation of the user. Can be collected with high accuracy.
 また、本発明の一実施形態は、前記動作切替制御部は、前記動作切替制御部により選択された前記撮像部により撮像された前記監視対象物の画像上の位置の指定に応じて、前記第1の収音部を含む複数の収音部から前記監視対象物までの距離が最も近い収音部を、前記監視対象物の音声の収音に用いる収音部として選択する、指向性制御装置である。 Further, in one embodiment of the present invention, the operation switching control unit is configured to perform the operation according to designation of a position on the image of the monitoring object captured by the imaging unit selected by the operation switching control unit. A directivity control device that selects a sound collecting unit having the shortest distance from a plurality of sound collecting units including one sound collecting unit to the monitoring target as a sound collecting unit used for collecting sound of the monitoring target. It is.
 この構成では、指向性制御装置は、選択された撮像部により撮像された監視対象物の画像上の位置指定に応じて、第1の収音部を含む複数の収音部から監視対象物までの距離が最も近い収音部を、監視対象物の音声の収音に用いる収音部として選択する。 In this configuration, the directivity control device includes a plurality of sound collection units including the first sound collection unit to the monitoring target according to the position designation on the image of the monitoring target captured by the selected imaging unit. Is selected as the sound collecting unit used for collecting the sound of the object to be monitored.
 これにより、指向性制御装置は、ユーザが監視対象物の移動方向を示す位置を簡易に指定することにより、移動中の監視対象物の発する音声を的確に収音することが可能な最適な収音部を選択することができ、監視対象物の発する音声を高精度に収音することができる。 As a result, the directivity control device allows the user to easily specify the position indicating the moving direction of the monitored object, so that the optimum sound collecting sound that can be accurately picked up by the moving monitored object can be obtained. The sound part can be selected, and the sound emitted from the monitoring object can be collected with high accuracy.
 また、本発明の一実施形態は、前記表示部の画像から前記監視対象物の顔の向きを検出する画像処理部、を更に備え、前記動作切替制御部は、前記動作切替制御部により選択された前記撮像部により撮像された前記監視対象物の画像上の位置の指定に応じて、前記画像処理部により検出された前記監視対象物の顔の向きに対応する方向で、前記第1の収音部を含む複数の収音部から前記監視対象物までの距離が最も近い収音部を、前記監視対象物の音声の収音に用いる収音部として選択する、指向性制御装置である。 In addition, an embodiment of the present invention further includes an image processing unit that detects a face direction of the monitoring target object from the image of the display unit, and the operation switching control unit is selected by the operation switching control unit. In response to designation of a position on the image of the monitoring object imaged by the imaging unit, the first convergence is performed in a direction corresponding to the face direction of the monitoring object detected by the image processing unit. A directivity control apparatus that selects a sound collecting unit having a shortest distance from a plurality of sound collecting units including a sound unit to the monitoring target as a sound collecting unit used for collecting sound of the monitoring target.
 この構成では、指向性制御装置は、選択された撮像部により撮像された監視対象物の画像上の位置指定に応じて、この画像上の監視対象物の顔の向きが示す方向に存在し、かつ、第1の収音部を含む複数の収音部から監視対象物までの距離が最も近い収音部を、監視対象物の音声の収音に用いる収音部として選択する。 In this configuration, the directivity control device exists in the direction indicated by the orientation of the face of the monitoring object on the image according to the position designation on the image of the monitoring object imaged by the selected imaging unit, And the sound collection part with the shortest distance from the some sound collection part containing a 1st sound collection part to the monitoring target object is selected as a sound collection part used for the sound collection of the sound of the monitoring target object.
 これにより、指向性制御装置は、監視対象物の画像上の顔の向きと監視対象物と収音部との距離とによって、移動中の監視対象物の発する音声を的確に収音することが可能な最適な収音部を選択することができ、監視対象物の発する音声を高精度に収音することができる。 As a result, the directivity control device can accurately collect the sound emitted by the moving monitoring object according to the orientation of the face on the image of the monitoring object and the distance between the monitoring object and the sound collection unit. It is possible to select an optimal sound pickup unit that is possible, and it is possible to pick up sound generated by the monitoring object with high accuracy.
In one embodiment, the present invention is a directivity control device further including a sound output control unit that causes a sound output unit to output the sound collected by the first sound collection unit, wherein the display control unit causes the display unit to display markers indicating the approximate positions of a plurality of sound collection units, including the first sound collection unit, associated with the imaging unit selected by the operation switching control unit; in response to the designation of a position on the image of the monitoring target captured by the selected imaging unit, the sound output control unit sequentially outputs, each for a predetermined time, sound whose directivity is formed in the direction from the sound collection unit corresponding to each displayed marker toward the monitoring target; and, in response to an operation of selecting one of the markers based on the output sound, the operation switching control unit selects the sound collection unit corresponding to the selected marker as the sound collection unit used to collect the sound of the monitoring target.
 With this configuration, the directivity control device displays on the display unit markers indicating the approximate positions of the plurality of sound collection units, including the first sound collection unit, associated with the selected imaging unit; in response to the designation of a position on the image of the moving monitoring target, it sequentially outputs, each for a predetermined time, sound whose directivity is formed from the sound collection unit corresponding to each marker toward the monitoring target; and it then selects the sound collection unit corresponding to whichever marker is chosen as the sound collection unit used to collect the sound of the monitoring target.
 As a result, the directivity control device can play back, for a fixed time each, the sound collected with the different directivities formed by the plurality of sound collection units associated with the selected imaging unit. Through the simple operation of choosing the collected sound the user judges best, the optimum sound collection unit for accurately capturing the sound of the moving monitoring target can be selected, and that sound can be collected with high accuracy.
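 One conceivable realization of this audition-and-select flow is sketched below with hypothetical callables, since the patent does not prescribe an API; the three-second audition time and every function name are assumptions.

```python
def audition_and_select(unit_ids, beamform, play, wait_for_marker_click,
                        audition_seconds=3.0):
    """Play each candidate array's output steered at the target, one after
    another for a fixed time, then return the array whose on-screen marker
    the operator clicks.

    unit_ids              -- candidate sound collection units, in marker order
    beamform(unit_id)     -- returns audio from unit_id steered at the target
    play(audio, seconds)  -- plays audio for the given duration (blocking)
    wait_for_marker_click -- blocks until the operator selects a marker
    """
    for unit_id in unit_ids:
        play(beamform(unit_id), audition_seconds)
    return wait_for_marker_click()
```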
In one embodiment, the present invention is a directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the method comprising: forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit; acquiring information on a second designated position on the image on the display unit, designated in accordance with movement of the monitoring target; and switching the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
 In this method, the directivity control device forms the directivity of the sound from the first sound collection unit, which includes a plurality of microphones, toward the monitoring target corresponding to the first designated position on the image on the display unit, and then acquires information on a second designated position that designates the moving monitoring target. Using the information on the second designated position on the image on the display unit, the directivity control device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
 As a result, even when the monitoring target shown in the image on the display unit moves, the directivity control device re-forms the sound directivity that was aimed at the target's position before the movement so that it is aimed at the target's position after the movement. The directivity of the sound thus follows the movement of the monitoring target and remains properly formed, which suppresses any loss of efficiency in the observer's monitoring work.
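 The directivity formation and switching in these steps can be pictured as delay-and-sum beamforming that is simply re-steered at each newly designated position. Below is a minimal NumPy sketch under that assumption; it presumes the designated image position has already been converted into 3-D coordinates relative to the array (that camera-to-array calibration is outside the sketch), and all names are illustrative rather than taken from the patent.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s in air

def delay_and_sum(frames, mic_xyz, target_xyz, fs):
    """Steer a microphone array toward target_xyz by delay-and-sum.

    frames     -- ndarray (n_mics, n_samples), one row per microphone
    mic_xyz    -- ndarray (n_mics, 3), microphone positions in metres
    target_xyz -- (3,) point the designated image position maps to
    fs         -- sampling rate in Hz
    """
    dists = np.linalg.norm(mic_xyz - np.asarray(target_xyz, float), axis=1)
    # advance each channel so wavefronts from the target add in phase
    delays = (dists - dists.min()) / SPEED_OF_SOUND   # seconds
    shifts = np.round(delays * fs).astype(int)        # whole samples
    n = frames.shape[1] - int(shifts.max())
    aligned = np.stack([frames[i, s:s + n] for i, s in enumerate(shifts)])
    return aligned.mean(axis=0)

# Switching the directivity to the second designated position is then just
# another call with the new target:
#   enhanced = delay_and_sum(frames, mic_xyz, second_target_xyz, fs)
```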
In one embodiment, the present invention is a storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the program executing the steps of: forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit; acquiring information on a second designated position on the image on the display unit, designated in accordance with movement of the monitoring target; and switching the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
 A directivity control device capable of executing the program stored in this storage medium forms the directivity of the sound from the first sound collection unit, which includes a plurality of microphones, toward the monitoring target corresponding to the first designated position on the image on the display unit, and then acquires information on a second designated position that designates the moving monitoring target. Using that information, the device switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position.
 As a result, even when the monitoring target shown in the image on the display unit moves, the directivity control device re-forms the sound directivity that was aimed at the target's position before the movement so that it is aimed at the target's position after the movement; the directivity thus follows the target's movement and remains properly formed, suppressing any loss of efficiency in the observer's monitoring work.
In one embodiment, the present invention is a directivity control system including: an imaging unit that images a sound collection area; a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and a directivity control device that controls the directivity of the sound collected by the first sound collection unit. The directivity control device includes a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit, and an information acquisition unit that acquires information on a second designated position on the image on the display unit, designated in accordance with movement of the monitoring target. The directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information on the second designated position acquired by the information acquisition unit.
 In this system, the directivity control device forms the directivity of the sound from the first sound collection unit, which includes a plurality of microphones, toward the monitoring target corresponding to the first designated position on the image on the display unit, then acquires information on a second designated position that designates the moving monitoring target, and switches the directivity of the sound to the direction toward the monitoring target corresponding to the second designated position using that information.
 As a result, in the directivity control system, even when the monitoring target shown in the image on the display unit moves, the directivity control device re-forms the sound directivity that was aimed at the target's position before the movement so that it is aimed at the target's position after the movement; the directivity thus follows the target's movement and remains properly formed, suppressing any loss of efficiency in the observer's monitoring work.
Various embodiments have been described above with reference to the drawings, but it goes without saying that the present invention is not limited to these examples. It will be apparent to those skilled in the art that various changes and modifications can be conceived within the scope of the claims, and it is understood that these naturally belong to the technical scope of the present invention.
 The present invention is useful as a directivity control device, directivity control method, storage medium, and directivity control system that, even when a monitoring target on an image moves, properly form the directivity of sound toward the monitoring target so as to follow its movement, and thereby suppress any loss of efficiency in the observer's monitoring work.
3, 3A, 3B Directivity control device
4 Recorder device
31 Communication unit
32 Operation unit
33 Memory
34, 34A Signal processing unit
34a Directivity direction calculation unit
34b Output control unit
34c Tracking processing unit
34d Sound source detection unit
35 Display device
36 Speaker device
37 Image processing unit
38 Operation switching control unit
100, 100A, 100B Directivity control system
C1, Cn Camera device
C1RN, C2RN Imaging area
JC1, JM1 Switching determination line
JDL Scroll determination line
LN1, LN2, LNR, LNW Tracking line
LST Tracking list
NW Network
M1, Mm Omnidirectional microphone array device
MR1, MR2, MR2W, MR2R, MR3 Point marker
TP1, TP2 Tracking point
TRW Tracking screen

Claims (31)

  1.  A directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the device comprising:
     a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit; and
     an information acquisition unit that acquires information on a second designated position on the image on the display unit, the second designated position being designated in accordance with movement of the monitoring target,
     wherein the directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information on the second designated position acquired by the information acquisition unit.
  2.  The directivity control device according to claim 1, wherein the information acquisition unit acquires the information on the second designated position in response to a designation operation performed on the monitoring target moving on the image on the display unit.
  3.  The directivity control device according to claim 1, further comprising:
     a sound source detection unit that detects, from the image on the display unit, a sound source position corresponding to the monitoring target; and
     an image processing unit that detects the monitoring target from the image on the display unit,
     wherein the information acquisition unit acquires, as the information on the second designated position, information on the sound source position detected by the sound source detection unit or information on the position of the monitoring target detected by the image processing unit.
  4.  The directivity control device according to claim 3, wherein the sound source detection unit starts detection of the sound source position corresponding to the monitoring target, centered on an initial position designated on the image on the display unit, and the image processing unit starts detection of the monitoring target, centered on the initial position.
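     One way such detection "centered on the initial position" could be realized for the sound source is a steered-response-power search over a small grid around the designated point, reusing the delay_and_sum sketch shown earlier in this document; the grid radius and step here are arbitrary assumptions, not values from the patent.

```python
import numpy as np

def locate_source_near(frames, mic_xyz, init_xyz, fs, radius=1.0, step=0.25):
    """Estimate the sound source position by maximising beamformed output
    power over a horizontal grid centred on the initially designated point.
    Relies on the delay_and_sum() sketch defined earlier."""
    best_xyz, best_pow = tuple(init_xyz), -np.inf
    offsets = np.arange(-radius, radius + step, step)
    for dx in offsets:
        for dy in offsets:
            cand = (init_xyz[0] + dx, init_xyz[1] + dy, init_xyz[2])
            out = delay_and_sum(frames, mic_xyz, cand, fs)
            power = float(np.mean(out ** 2))
            if power > best_pow:
                best_xyz, best_pow = cand, power
    return best_xyz
```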
  5.  The directivity control device according to claim 3, wherein, in response to an operation of changing the information on the sound source position detected by the sound source detection unit or the information on the position of the monitoring target detected by the image processing unit, the information acquisition unit acquires, as the information on the second designated position, information on the position on the image on the display unit designated by the change operation.
  6.  The directivity control device according to claim 3, wherein, when the distance between the sound source position detected by the sound source detection unit and the position of the monitoring target detected by the image processing unit is equal to or greater than a predetermined value, the information acquisition unit acquires, in response to an operation of changing the information on the sound source position or the information on the position of the monitoring target, information on the position on the image on the display unit designated by the change operation, as the information on the second designated position.
  7.  The directivity control device according to claim 1, further comprising:
     an image storage unit that stores images captured over a certain period; and
     an image reproduction unit that reproduces the images stored in the image storage unit on the display unit,
     wherein the image reproduction unit reproduces the images at a speed value smaller than an initial value of the reproduction speed in response to a predetermined input operation.
  8.  The directivity control device according to claim 1, further comprising a display control unit that displays a captured image on the display unit, wherein, in response to the designation of a designated position on the image on the display unit, the display control unit enlarges and displays the image on the same screen at a predetermined magnification centered on the designated position.
  9.  The directivity control device according to claim 1, further comprising a display control unit that displays a captured image on the display unit, wherein, in response to the designation of a designated position on the image on the display unit, the display control unit enlarges and displays the image on another screen at a predetermined magnification centered on the designated position.
  10.  The directivity control device according to claim 1, further comprising a display control unit that displays a captured image on the display unit, wherein, in response to a predetermined input operation, the display control unit enlarges and displays the image at a predetermined magnification with reference to the center of the display unit.
  11.  The directivity control device according to claim 8, wherein, when the designated position crosses a predetermined scroll determination line on the screen on which the image is enlarged, in accordance with the movement of the monitoring target, the display control unit scrolls the screen by a predetermined amount in the direction in which the scroll determination line was crossed.
  12.  The directivity control device according to claim 8, wherein, when the designated position crosses a predetermined scroll determination line on the screen on which the image is enlarged, in accordance with the movement of the monitoring target, the display control unit scrolls the screen so that the designated position becomes the center.
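     Claims 11 and 12 differ only in what happens once the tracked position crosses the scroll determination line. A small sketch of both policies follows, assuming normalised view coordinates and a determination line set 10% inside each edge; these parameters and names are illustrative, not from the patent.

```python
def maybe_scroll(view, pos, margin=0.1, mode="fixed", amount=0.2):
    """Scroll the magnified view when the tracked position crosses the
    scroll-determination band near the edge of the screen.

    view   -- dict with top-left 'x', 'y' and size 'w', 'h' (normalised)
    pos    -- (x, y) designated position in the same coordinates
    margin -- fraction of the view between each edge and its determination line
    mode   -- "fixed": scroll by a set amount (claim 11 style);
              "center": recenter on pos (claim 12 style)
    """
    px, py = pos
    left = view["x"] + margin * view["w"]
    right = view["x"] + (1 - margin) * view["w"]
    top = view["y"] + margin * view["h"]
    bottom = view["y"] + (1 - margin) * view["h"]
    if mode == "center":
        if px < left or px > right or py < top or py > bottom:
            view["x"] = px - view["w"] / 2  # recenter on the designated position
            view["y"] = py - view["h"] / 2
        return view
    # fixed-amount scroll in the direction of the crossed line
    if px > right:
        view["x"] += amount * view["w"]
    elif px < left:
        view["x"] -= amount * view["w"]
    if py > bottom:
        view["y"] += amount * view["h"]
    elif py < top:
        view["y"] -= amount * view["h"]
    return view
```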
  13.  The directivity control device according to claim 8, wherein, on the screen on which the image is enlarged, the display control unit scrolls the screen so that the designated position becomes the center of the screen.
  14.  The directivity control device according to claim 3, wherein the image processing unit performs masking processing on a part of the monitoring target on the image on the display unit in response to a predetermined input operation.
  15.  The directivity control device according to claim 1, further comprising a sound output control unit that causes a sound output unit to output the sound collected by the first sound collection unit, wherein, in response to a predetermined input operation, the sound output control unit applies voice-change processing to the sound collected by the first sound collection unit and causes the sound output unit to output it.
  16.  The directivity control device according to claim 1, further comprising:
     a sound storage unit that stores sound collected by the first sound collection unit over a certain period; and
     a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit,
     wherein, in response to a predetermined input operation, the sound output control unit applies voice-change processing to the sound collected by the first sound collection unit and causes the sound output unit to output it.
  17.  The directivity control device according to claim 1, further comprising a display control unit that displays a predetermined marker at each of one or more designated positions on the image on the display unit, the positions being designated in accordance with the movement of the monitoring target.
  18.  The directivity control device according to claim 1, further comprising a display control unit that, of two or more designated positions on the image on the display unit designated in accordance with the movement of the monitoring target, connects and displays at least the current designated position and the immediately preceding designated position.
  19.  The directivity control device according to claim 1, further comprising a display control unit that displays, for all designated positions on the image on the display unit designated in accordance with the movement of the monitoring target, a flow line connecting each designated position to the one or two designated positions adjacent to it.
  20.  The directivity control device according to claim 19, further comprising:
     a designation list storage unit that stores a designation list containing data on all the designated positions and designation times on the image on the display unit; and
     a reproduction time calculation unit that, in response to the designation of an arbitrary position on the flow line connecting all the designated positions displayed by the display control unit, calculates a reproduction start time of the sound at the designated position on the flow line, using the designation list stored in the designation list storage unit,
     wherein the directivity forming unit forms the directivity of the sound using the data of the designated position corresponding to the designation time closest to the reproduction start time calculated by the reproduction time calculation unit.
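     The reproduction-start-time calculation of claim 20 can be pictured as projecting the clicked point onto the nearest segment of the flow line and interpolating linearly between the designation times at that segment's endpoints. The sketch below works under that assumption; the patent does not fix the interpolation, so this is one plausible reading, with illustrative names.

```python
import math

def playback_start_time(track, click):
    """Estimate the reproduction start time for a point clicked on the
    flow line.

    track -- designation list as [(t, (x, y)), ...] in time order
    click -- (x, y) position clicked on the flow line
    """
    def project(p, a, b):
        # foot of the perpendicular from p onto segment a-b, clamped to it
        ax, ay = a; bx, by = b; px, py = p
        seg2 = (bx - ax) ** 2 + (by - ay) ** 2
        if seg2 == 0:
            return a, 0.0
        u = max(0.0, min(1.0, ((px - ax) * (bx - ax) + (py - ay) * (by - ay)) / seg2))
        return (ax + u * (bx - ax), ay + u * (by - ay)), u

    best = None
    for (t0, p0), (t1, p1) in zip(track, track[1:]):
        foot, u = project(click, p0, p1)
        d = math.dist(click, foot)
        if best is None or d < best[0]:
            best = (d, t0 + u * (t1 - t0))
    return best[1]
```

     Per the claim, the directivity itself is then formed from the stored designated position whose designation time is closest to the returned start time.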
  21.  The directivity control device according to claim 20, further comprising:
     a sound storage unit that stores sound collected by the first sound collection unit over a certain period; and
     a sound output control unit that causes a sound output unit to output the sound stored in the sound storage unit,
     wherein the sound output control unit causes the sound output unit to output the sound at the reproduction start time calculated by the reproduction time calculation unit, and
     the directivity forming unit, when there is a next designation time within a predetermined time from the reproduction start time of the sound, forms the directivity of the sound using the data of the designated position corresponding to the next designation time.
  22.  The directivity control device according to claim 1, further comprising an operation switching control unit that, when the monitoring target moves beyond a predetermined switching range corresponding to a first imaging unit used for displaying an image on the display unit, switches the imaging unit used for displaying the image on the display unit from the first imaging unit to a second imaging unit.
  23.  The directivity control device according to claim 1, further comprising an operation switching control unit that, when the monitoring target moves beyond a predetermined switching range corresponding to the first sound collection unit, switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to a second sound collection unit.
  24.  The directivity control device according to claim 1, further comprising:
     a display control unit that, in response to a predetermined input operation, displays on the display unit a list of the images captured by a plurality of imaging units, each image on a different screen; and
     an operation switching control unit that, in response to an operation of selecting one of the selectable screens listed on the display unit by the display control unit, selects the imaging unit used for displaying the image of the monitoring target on the display unit.
  25.  The directivity control device according to claim 1, further comprising:
     a display control unit that, in response to a predetermined input operation, displays on the display unit markers indicating the approximate positions of a plurality of surrounding sound collection units switchable from the first sound collection unit; and
     an operation switching control unit that, in response to an operation of selecting one of the markers displayed on the display unit by the display control unit, switches the sound collection unit used for collecting the sound of the monitoring target from the first sound collection unit to the other sound collection unit corresponding to the selected marker.
  26.  The directivity control device according to claim 24, wherein, in response to the designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the operation switching control unit selects, from among a plurality of sound collection units including the first sound collection unit, the sound collection unit closest to the monitoring target, as the sound collection unit used for collecting the sound of the monitoring target.
  27.  The directivity control device according to claim 24, further comprising an image processing unit that detects the orientation of the face of the monitoring target from the image on the display unit, wherein, in response to the designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the operation switching control unit selects, as the sound collection unit used for collecting the sound of the monitoring target, the sound collection unit closest to the monitoring target, from among a plurality of sound collection units including the first sound collection unit, in the direction corresponding to the face orientation detected by the image processing unit.
  28.  The directivity control device according to claim 24, further comprising a sound output control unit that causes a sound output unit to output the sound collected by the first sound collection unit, wherein:
     the display control unit displays on the display unit markers indicating the approximate positions of a plurality of sound collection units, including the first sound collection unit, associated with the imaging unit selected by the operation switching control unit;
     in response to the designation of a position on the image of the monitoring target captured by the imaging unit selected by the operation switching control unit, the sound output control unit sequentially outputs, each for a predetermined time, sound whose directivity is formed in the direction from the sound collection unit corresponding to each displayed marker toward the monitoring target; and
     in response to an operation of selecting one of the markers based on the sound output by the sound output control unit, the operation switching control unit selects the sound collection unit corresponding to the selected marker as the sound collection unit used for collecting the sound of the monitoring target.
  29.  A directivity control method in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the method comprising:
     forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit;
     acquiring information on a second designated position on the image on the display unit, the second designated position being designated in accordance with movement of the monitoring target; and
     switching the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
  30.  A storage medium storing a program for executing processing in a directivity control device that controls the directivity of sound collected by a first sound collection unit including a plurality of microphones, the program executing the steps of:
     forming the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit;
     acquiring information on a second designated position on the image on the display unit, the second designated position being designated in accordance with movement of the monitoring target; and
     switching the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the acquired information on the second designated position.
  31.  A directivity control system comprising:
     an imaging unit that images a sound collection area;
     a first sound collection unit that includes a plurality of microphones and collects sound in the sound collection area; and
     a directivity control device that controls the directivity of the sound collected by the first sound collection unit,
     wherein the directivity control device includes:
     a directivity forming unit that forms the directivity of the sound in a direction from the first sound collection unit toward a monitoring target corresponding to a first designated position on an image on a display unit; and
     an information acquisition unit that acquires information on a second designated position on the image on the display unit, the second designated position being designated in accordance with movement of the monitoring target, and
     the directivity forming unit switches the directivity of the sound to a direction toward the monitoring target corresponding to the second designated position, using the information on the second designated position acquired by the information acquisition unit.
PCT/JP2014/002473 2014-05-09 2014-05-09 Directivity control apparatus, directivity control method, storage medium, and directivity control system WO2015170368A1 (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201480045464.2A CN105474667B (en) 2014-05-09 2014-05-09 Directivity control method and directive property control system
JP2015526795A JP6218090B2 (en) 2014-05-09 2014-05-09 Directivity control method
PCT/JP2014/002473 WO2015170368A1 (en) 2014-05-09 2014-05-09 Directivity control apparatus, directivity control method, storage medium, and directivity control system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2014/002473 WO2015170368A1 (en) 2014-05-09 2014-05-09 Directivity control apparatus, directivity control method, storage medium, and directivity control system

Publications (1)

Publication Number Publication Date
WO2015170368A1 true WO2015170368A1 (en) 2015-11-12

Family

ID=54392238

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2014/002473 WO2015170368A1 (en) 2014-05-09 2014-05-09 Directivity control apparatus, directivity control method, storage medium, and directivity control system

Country Status (3)

Country Link
JP (1) JP6218090B2 (en)
CN (1) CN105474667B (en)
WO (1) WO2015170368A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016178652A (en) * 2013-07-09 2016-10-06 ノキア テクノロジーズ オーユー Audio processing apparatus
WO2023054047A1 * 2021-10-01 2023-04-06 Sony Group Corporation Information processing device, information processing method, and program

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106098075B (en) * 2016-08-08 2018-02-02 腾讯科技(深圳)有限公司 Audio collection method and apparatus based on microphone array
CN107491101A (en) * 2017-09-14 2017-12-19 歌尔科技有限公司 A kind of adjusting method, device and the electronic equipment of microphone array pickup angle
JP2019062448A (en) * 2017-09-27 2019-04-18 カシオ計算機株式会社 Image processing apparatus, image processing method, and program
US11209306B2 (en) 2017-11-02 2021-12-28 Fluke Corporation Portable acoustic imaging tool with scanning and analysis capability
WO2020023633A1 (en) 2018-07-24 2020-01-30 Fluke Corporation Systems and methods for tagging and linking acoustic images
CN110189764B (en) * 2019-05-29 2021-07-06 深圳壹秘科技有限公司 System and method for displaying separated roles and recording equipment
CN110493690B (en) * 2019-08-29 2021-08-13 北京搜狗科技发展有限公司 Sound collection method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0918849A (en) * 1995-07-04 1997-01-17 Matsushita Electric Ind Co Ltd Photographing device
JP2009182437A (en) * 2008-01-29 2009-08-13 Mitsubishi Electric Corp Monitoring camera apparatus, and focus aid device
JP2013168757A (en) * 2012-02-15 2013-08-29 Hitachi Ltd Video monitoring apparatus, monitoring system and monitoring system construction method
WO2013179335A1 * 2012-05-30 2013-12-05 Hitachi, Ltd. Monitoring camera control device and visual monitoring system

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH10191498A (en) * 1996-12-27 1998-07-21 Matsushita Electric Ind Co Ltd Sound signal processor
JP3575437B2 (en) * 2001-05-10 2004-10-13 NEC Corporation Directivity control device
JP4153208B2 (en) * 2002-01-22 2008-09-24 SoftBank Telecom Corp. Base station antenna directivity control apparatus in CDMA system and base station antenna directivity control apparatus in CDMA cellular system
JP2008271157A (en) * 2007-04-19 2008-11-06 Fuji Xerox Co Ltd Sound enhancement device and control program
JP2010187363A (en) * 2009-01-16 2010-08-26 Sanyo Electric Co Ltd Acoustic signal processing apparatus and reproducing device

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0918849A (en) * 1995-07-04 1997-01-17 Matsushita Electric Ind Co Ltd Photographing device
JP2009182437A (en) * 2008-01-29 2009-08-13 Mitsubishi Electric Corp Monitoring camera apparatus, and focus aid device
JP2013168757A (en) * 2012-02-15 2013-08-29 Hitachi Ltd Video monitoring apparatus, monitoring system and monitoring system construction method
WO2013179335A1 * 2012-05-30 2013-12-05 Hitachi, Ltd. Monitoring camera control device and visual monitoring system

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2016178652A (en) * 2013-07-09 2016-10-06 ノキア テクノロジーズ オーユー Audio processing apparatus
WO2023054047A1 * 2021-10-01 2023-04-06 Sony Group Corporation Information processing device, information processing method, and program

Also Published As

Publication number Publication date
CN105474667A (en) 2016-04-06
JP6218090B2 (en) 2017-10-25
JPWO2015170368A1 (en) 2017-04-20
CN105474667B (en) 2018-11-27

Similar Documents

Publication Publication Date Title
JP6218090B2 (en) Directivity control method
US10142727B2 (en) Directivity control apparatus, directivity control method, storage medium and directivity control system
EP2942975A1 (en) Directivity control apparatus, directivity control method, storage medium and directivity control system
JP6202277B2 (en) Voice processing system and voice processing method
JP5338498B2 (en) Control device, camera system and program used in surveillance camera system
JP5958717B2 (en) Directivity control system, directivity control method, sound collection system, and sound collection control method
CN102202168B (en) control device, camera system and program
JP4378636B2 (en) Information processing system, information processing apparatus, information processing method, program, and recording medium
JP2007295335A (en) Camera device and image recording and reproducing method
JP6145736B2 (en) Directivity control method, storage medium, and directivity control system
KR20110093040A (en) Apparatus and method for monitoring an object
KR102474729B1 (en) The Apparatus For Mornitoring
JP6388144B2 (en) Directivity control device, directivity control method, storage medium, and directivity control system
US9426408B2 (en) Method and apparatus for recording video sequences
WO2014064878A1 (en) Information-processing device, information-processing method, program, and information-processng system
JP2016181770A (en) Sound collection system
KR20120125037A (en) Method for controlling surveillance system
JP5229141B2 (en) Display control apparatus and display control method
JP5464290B2 (en) Control device, control method, and camera system
JP4595322B2 (en) Image processing system, remote controller and method, image processing apparatus and method, recording medium, and program
WO2023122511A1 (en) Apparatus and method for controlling an online meeting
KR20080045319A (en) Method for controlling cursor according to input device except mouse and system therefor

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201480045464.2

Country of ref document: CN

ENP Entry into the national phase

Ref document number: 2015526795

Country of ref document: JP

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14891490

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14891490

Country of ref document: EP

Kind code of ref document: A1