WO2022153359A1 - Microphone position presentation method, device therefor, and program - Google Patents

Microphone position presentation method, device therefor, and program

Info

Publication number
WO2022153359A1
Authority
WO
WIPO (PCT)
Prior art keywords
space
microphone
shape
wall surface
estimated
Application number
PCT/JP2021/000667
Other languages
French (fr)
Japanese (ja)
Inventor
Tatsuya Kako
Kenichi Noguchi
Original Assignee
Nippon Telegraph and Telephone Corporation
Application filed by Nippon Telegraph and Telephone Corporation
Priority to PCT/JP2021/000667
Publication of WO2022153359A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K15/00 Acoustics not otherwise provided for

Definitions

  • The present invention relates to a technique for presenting an appropriate microphone installation position in a space where sound is collected.
  • Non-Patent Document 1 discloses an example of the relationship between the installation position of a microphone and a sound source when recording musical instruments and voice.
  • As one method for determining the position where the microphone is installed, there is a method in which an impulse response at an arbitrary position in an indoor space is acquired and the installation position is determined based on an index value suited to the purpose, such as the SN ratio.
  • An object of the present invention is to provide a means for determining the installation position of a microphone without a skilled engineer visiting the site.
  • The present invention is applicable both indoors and outdoors, as long as the space has a structure that reflects sound.
  • As an outdoor space having a structure that reflects sound, a stadium or an outdoor live venue, for example, is assumed.
  • According to one aspect of the present invention, a microphone position presentation method includes an acquisition step of acquiring point cloud data of a space where sound is collected, a microphone position calculation step of calculating where in the shape of the space estimated from the point cloud data to install the microphone so as to satisfy a condition relating to the desired sound, and a presentation step of presenting the calculated microphone installation position.
  • According to another aspect, a microphone position presentation method includes an input step of accepting input of information based on point cloud data of a space where sound is collected, a microphone position calculation step of calculating where in the shape of the space estimated from the information based on the point cloud data to install the microphone so as to satisfy a condition relating to the desired sound, and an output step of outputting the calculated microphone installation position to a presentation means.
  • According to another aspect, a microphone position presentation method includes a space shape estimation step of estimating the shape of the space from point cloud data of a space where sound is collected; a wall surface extraction step of extracting, using the estimated shape of the space, the wall surfaces forming the space; a defect completion step of, when an object other than a wall exists in the space and the shape of part of a wall surface cannot be estimated, estimating that part from the shape of the parts of the wall surface that could be estimated and complementing the shape of the space with the estimated part; a spatial transfer function calculation step of calculating spatial transfer functions in the space using the complemented shape of the space; and a sound pressure probability distribution estimation step of estimating, using the calculated spatial transfer functions, a sound pressure probability distribution with respect to a virtual sound source position and obtaining, using the estimated sound pressure probability distribution, a microphone position that satisfies the condition relating to the desired sound.
  • FIG. 1 is a functional block diagram of the microphone position presentation system according to the first embodiment; FIG. 2 shows an example of its processing flow.
  • FIG. 5 is a functional block diagram of the microphone position presentation device; FIG. 6 shows an example of its processing flow.
  • FIGS. 7 and 8 are diagrams for explaining the processing of the defect completion unit.
  • FIG. 9A is a diagram showing an example of the sound pressure power distribution of sound source S1; FIG. 9B shows an example of the sound pressure power distribution of sound source S2.
  • FIG. 10A is a diagram showing the simulated SNR distribution; FIG. 10B shows the sum of the logs of the probability distributions.
  • The acoustic characteristics are influenced by the positional relationship between the objects in the space that reflect or absorb sound, the sound source, and the microphone.
  • Point cloud data are used to construct the objects that reflect or absorb sound in the virtual space.
  • In other words, a virtual space may be constructed using the point cloud data, the acoustic characteristics may be simulated in that virtual space, and the microphone installation position may be determined there.
  • The space may also be constructed by interpolating points at the coordinates where the ceiling, walls, and the like are assumed to exist.
  • The installation positions of a plurality of microphones may be determined so that they operate as a microphone array instead of a single microphone.
  • FIG. 1 shows a functional block diagram of the microphone position presentation system according to the first embodiment, and FIG. 2 shows its processing flow.
  • The microphone position presentation system includes an acquisition unit 110, a microphone position presentation device 120, and a presentation unit 150.
  • The microphone position presentation system acquires point cloud data of the space where sound is collected via the acquisition unit 110, uses the point cloud data to calculate an appropriate microphone installation position for acquiring the desired sound pickup signal, and presents it to the user via the presentation unit 150.
  • The microphone position presentation device 120 is a special device configured by loading a special program into a known or dedicated computer having, for example, a central processing unit (CPU) and a main storage device (RAM: Random Access Memory).
  • The microphone position presentation device 120 executes each process under the control of the central processing unit, for example.
  • The data input to the microphone position presentation device 120 and the data obtained in each process are stored in, for example, the main storage device, read out to the central processing unit as needed, and used for other processing.
  • At least a part of each processing unit of the microphone position presentation device 120 may be configured by hardware such as an integrated circuit.
  • Each storage unit included in the microphone position presentation device 120 can be configured by, for example, a main storage device such as RAM, or by middleware such as a relational database or a key-value store.
  • However, each storage unit does not necessarily have to be provided inside the microphone position presentation device 120; it may be configured as an auxiliary storage device composed of a hard disk, an optical disk, or a semiconductor memory element such as a flash memory, provided outside the microphone position presentation device 120.
  • The acquisition unit 110 acquires the point cloud data of the space where sound is collected (S110) and outputs it.
  • For example, the acquisition unit 110 includes a spatial sensing unit 111, a noise removal unit 113, and a spatial model coupling unit 115.
  • The spatial sensing unit 111 acquires point cloud data of the space where sound is collected (S111) and outputs it.
  • For example, the spatial sensing unit 111 consists of a LiDAR (Light Detection and Ranging, or Laser Imaging Detection and Ranging) installed in the space where sound is collected; it emits light toward an object and receives the reflected light.
  • The distance to the object is obtained from the time difference between emitting the light and receiving the reflection, and the direction of the object is obtained from the emission direction. This distance and direction are represented as point cloud data.
  • As the spatial sensing technique, various conventional techniques can be used.
  • For example, Reference 1 is known as an existing spatial sensing technique. (Reference 1) Toshio Ito, "Principles and Utilization of LiDAR Technology for Automatic Driving", Scientific Information Publishing Co., Ltd., 2020.
  • The noise removal unit 113 receives the point cloud data, removes the noise included in the point cloud data (S113), and outputs the denoised point cloud data.
  • As the noise removal technique, various conventional techniques can be used.
  • For example, Reference 2 is known as an existing noise removal technique. (Reference 2) Rusu, R. B., Z. C. Marton, N. Blodow, M. Dolha, and M. Beetz, "Towards 3D Point Cloud Based Object Maps for Household Environments", Robotics and Autonomous Systems Journal, 2008.
  • FIGS. 3 and 4 show examples of point cloud data after noise removal.
  • FIG. 3 shows point cloud data acquired with the spatial sensing unit 111 (for example, LiDAR) at an elevation angle of 0 degrees, and FIG. 4 at an elevation angle of -90 degrees.
  • The spatial model coupling unit 115 receives a plurality of point cloud data sets, combines them to restore the shape of the space (S115), and outputs point cloud data R indicating the shape of the space.
  • As the spatial model coupling technique, various conventional techniques can be used.
  • For example, Reference 3 is known as an existing spatial model coupling technique. (Reference 3) S. Rusinkiewicz and M. Levoy, "Efficient Variants of the ICP Algorithm", Proceedings Third International Conference on 3-D Digital Imaging and Modeling, 2001, pp. 145-152.
  • FIG. 5 is a functional block diagram of the microphone position presentation device 120, and FIG. 6 shows an example of its processing flow.
  • The microphone position presentation device 120 receives the virtual sound source position S and the point cloud data R indicating the shape of the space, calculates where in the shape of the space estimated from the point cloud data R to install the microphone so as to satisfy the condition relating to the desired sound (S120), and outputs the calculated microphone installation position N.
  • For example, the microphone position presentation device 120 includes an input unit 121, a microphone position calculation unit 123, and an output unit 125.
  • The input unit 121 receives the point cloud data R indicating the shape of the space (S121) and outputs it to the microphone position calculation unit 123.
  • For example, the input unit 121 is an input interface for inputting the point cloud data indicating the shape of the space to the computer constituting the microphone position presentation device 120.
  • When the acquisition unit 110 is included in the computer constituting the microphone position presentation device 120, the input unit 121 is included in the microphone position calculation unit 123.
  • Alternatively, processing S113 and S115, or only S115, may be performed in the computer constituting the microphone position presentation device 120.
  • In that case, the input unit 121 is provided in front of the part corresponding to the processing performed in the computer, and receives information based on the point cloud data of the space where sound is collected (the point cloud data itself output by the spatial sensing unit 111, or the denoised point cloud data output by the noise removal unit 113).
  • The microphone position calculation unit 123 receives the point cloud data R indicating the shape of the space, calculates where in the shape of the space to install the microphone so as to satisfy the condition relating to the desired sound (S123), and outputs the calculated microphone installation position N.
  • For example, the microphone position calculation unit 123 includes a wall surface extraction unit 123A, a defect completion unit 123B, a sound source information input unit 123C, a spatial transfer function calculation unit 123D, and a sound pressure probability distribution estimation unit 123E.
  • The wall surface extraction unit 123A receives the point cloud data R indicating the shape of the space, extracts the wall surfaces forming the space using the shape of the space (S123A), and outputs them.
  • For example, the wall surface extraction unit 123A extracts the wall surfaces by obtaining the parameters of the wall surfaces forming the space, using an algorithm that adapts RANSAC (Random Sample Consensus) to wall surface extraction.
  • The point cloud data R indicating the shape of the space is divided based on the observation angle; for example, in the horizontal plane it is divided into M directions.
  • The distance between a candidate plane and each point p of the point cloud belonging to the ROI is computed as d = n · (p - a), where n is the normal vector of the plane containing the three sampled points a, b, and c.
  • The normal vector can be obtained by SVD (Singular Value Decomposition) or by PCA (Principal Component Analysis) of the covariance matrix.
  • When the point cloud data belonging to the largest wall surface accounts for no more than a certain fraction of the total point cloud data R, it is assumed that there is no wall surface there. That is, in a given direction, when the fraction of points within the distance threshold of step (5) of the extraction procedure (given in full in the description below) is at or below a predetermined value, no wall surface is assumed there. Furthermore, since step (10) removes the point cloud data belonging to the largest wall surface from the data, the wall surfaces that can be extracted gradually decrease and the procedure converges automatically. With this configuration, not only walls and ceilings but also the planar shapes of point clusters that are likely to reflect sound are extracted, and the result simulates a space that also contains furniture and equipment.
  • The defect completion unit 123B receives the wall surfaces forming the space; when an object other than a wall exists in the space and the shape of part of a wall surface cannot be estimated, it estimates that part from the shape of the parts of the wall surface that could be estimated, complements the shape of the space with the estimated part (S123B), and outputs the complemented shape of the space.
  • Since LiDAR measures distance using the reflection of a laser, data loss occurs in regions that the laser cannot reach because of an occluding object. Therefore, the missing data are complemented from the information of the wall surfaces that could be estimated, and the vertices (corners of the room) are obtained.
  • FIG. 7 shows point cloud data of a horizontal plane at an elevation angle of 0 degrees; point O is the installation position of the LiDAR.
  • For example, the parameters (a, b, c, d) of each wall surface are extracted by plane approximation, and the vertex coordinates of the walls are obtained from the resulting straight lines. That is, as shown in FIG. 8, the intersection P of the straight lines is obtained as the vertex coordinates of the walls, and the broken-line portion of FIG. 8 is treated as if a wall surface were there.
  • The sound source information input unit 123C is, for example, an input device such as a keyboard or a mouse, and the virtual sound source position is input by the user via the sound source information input unit 123C.
  • For example, the sound source information input unit 123C may receive the shape of the complemented space output by the defect completion unit 123B (indicated by a broken line in FIG. 5) and present it to the user via a display device such as a display.
  • Alternatively, the virtual sound source position may be input automatically from the point cloud data R acquired from the LiDAR by using object recognition as in Reference 4.
  • (Reference 4) D. Maturana and S. Scherer, "VoxNet: A 3D Convolutional Neural Network for real-time object recognition", 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, 2015, pp. 922-928, doi: 10.1109/IROS.2015.7353481.
  • Alternatively, the sound source information input unit 123C may be a storage medium reading device, and the microphone position presentation system may accept the virtual sound source position as input by reading, via the sound source information input unit 123C, a storage medium that stores the virtual sound source position.
  • The spatial transfer function calculation unit 123D uses the virtual sound source position S and the complemented shape of the space to calculate the spatial transfer function from the virtual sound source position S to each position in the space (each assumed listening position) (S123D), and outputs it.
  • As the spatial transfer function calculation technique, various conventional techniques can be used. For example, sound wave propagation is simulated from the shape of the space (the room model) using the FDTD method (finite-difference time-domain method) to predict the sound arriving at an assumed listening position, and the spatial transfer function is calculated from it.
  • As for the reflection coefficient, for example, objects may be recognized from a camera image or the like and the reflection coefficient corresponding to each recognized object may be used, or the reflection coefficient may be given directly from the outside.
  • The reflection coefficient may also be obtained by other methods.
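As an illustration of S123D, the following is a minimal 2-D acoustic FDTD sketch in Python. The text names only the FDTD method; the grid scheme, the impulse excitation, and the pressure-release boundary below are assumptions made for this example.

```python
import numpy as np

def fdtd_response(room_mask, src, rec, n_steps=2000, c=343.0, dx=0.05):
    """Minimal 2-D acoustic FDTD sketch. room_mask is True inside the
    complemented space; src and rec are (row, col) grid indices of the
    virtual source and an assumed listening position. Returns the pressure
    time series at rec, whose spectrum gives the transfer function."""
    dt = 0.99 * dx / (c * np.sqrt(2.0))       # CFL-stable time step
    coef = (c * dt / dx) ** 2
    p = np.zeros(room_mask.shape)             # pressure at time step n
    pm = np.zeros_like(p)                     # pressure at time step n - 1
    p[src] = 1.0                              # impulse at the virtual source
    out = np.empty(n_steps)
    for i in range(n_steps):
        lap = (np.roll(p, 1, 0) + np.roll(p, -1, 0) +
               np.roll(p, 1, 1) + np.roll(p, -1, 1) - 4.0 * p)
        pn = 2.0 * p - pm + coef * lap        # second-order update in time
        pn[~room_mask] = 0.0                  # crude pressure-release walls;
                                              # the reflection coefficients
                                              # above would refine this
        pm, p = p, pn
        out[i] = p[rec]
    return out, dt
```

The spatial transfer function to any assumed listening position can then be read off, for example, as the spectrum of the returned time series.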
  • The sound pressure probability distribution estimation unit 123E takes the virtual sound source position S and the calculated spatial transfer functions as inputs, uses the spatial transfer functions to estimate the sound pressure probability distribution at each position (each assumed listening position) with respect to the virtual sound source position (S123E), and, using the sound pressure probability distribution, obtains and outputs a microphone position that satisfies the condition relating to the desired sound.
  • As the sound pressure probability distribution estimation technique, various conventional techniques can be used. For example, the SNR with respect to each sound source is obtained from the sound pressure maps obtained by the FDTD method: the space is divided by a grid, the power from each sound source at each grid point is calculated, and the calculated power is normalized so that the sum of the powers over the entire space is 1.
  • FIG. 9A shows an example of the sound pressure power distribution of sound source S1, and FIG. 9B shows an example of the sound pressure power distribution of sound source S2.
  • The parts surrounded by the white broken lines in the figures are sound source S1 and sound source S2, respectively; there the sound pressure power is high.
  • FIG. 10A shows the simulated SNR distribution, and FIG. 10B shows the sum of the logs of the probability distributions, that is, the joint power distribution visualized as a probability distribution. It can be seen that the values in the parts surrounded by the broken lines are large.
  • When the SNR of sound source S1 is to be increased, sound source S2 becomes noise. Therefore, the sum of the logs of the probability distributions is calculated with sound source S2 treated as noise.
  • The sound pressure probability distribution estimation unit 123E may obtain a predetermined number of microphone installation positions, or may receive the number of microphones to be installed as input and obtain installation positions according to that number.
  • The sound pressure probability distribution estimation unit 123E obtains the microphone installation positions so as to satisfy the condition relating to the desired sound. For example, a position where the sum of the logs of the probability distributions is large is taken as the microphone installation position. When a plurality of microphones are to be installed, the positions where the sum of the logs of the probability distributions is largest may be taken as the installation positions of the plurality of microphones, or the installation positions may be obtained by other methods.
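A Python sketch of this step follows, assuming one simulated power map per sound source. Scoring each grid position as the log-probability of the target source minus the log-probabilities of the other sources (a log-SNR) is one assumed reading of the "sum of the logs" criterion above; the exact formula does not survive in this excerpt.

```python
import numpy as np

def mic_position_from_power(power_maps, target=0):
    """power_maps: list of 2-D grids of sound pressure power, one per sound
    source (e.g. |FDTD response|^2 summed over time). Each map is normalized
    so that its total power over the space is 1, giving a per-source
    probability distribution; the best-scoring grid cell is returned."""
    eps = 1e-12
    probs = [pm / pm.sum() for pm in power_maps]     # total power = 1
    score = np.log(probs[target] + eps)
    for k, pk in enumerate(probs):
        if k != target:
            score -= np.log(pk + eps)                # other sources act as noise
    best = np.unravel_index(np.argmax(score), score.shape)
    return best, score                               # candidate position + score map
```

For example, `pos, score = mic_position_from_power([P_S1, P_S2], target=0)` returns the grid cell where the score for emphasizing S1 over S2 is largest.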
  • The output unit 125 receives the complemented shape of the space and the microphone position N, generates information indicating the microphone position N within the complemented shape of the space, for example image data, and outputs it to the presentation unit 150 (S125).
  • For example, the output unit 125 includes an output interface for outputting the microphone position from the computer constituting the microphone position presentation device 120; depending on the configuration, the output unit 125 need not include the output interface.
  • The presentation unit 150 receives the information indicating the microphone position N within the complemented shape of the space and presents it to the user (S150).
  • For example, the presentation unit 150 consists of a display means such as a display.
  • The estimated spatial transfer functions can also be used to design a filter that forms a desired beamformer and improves processing performance.
  • Since the acoustic characteristics differ for each indoor space, a filter must be designed for each indoor space, and conventionally an engineer needs to go to the site.
  • An object of the present invention is therefore also to provide a means for designing a desired filter without a skilled engineer visiting the site.
  • FIG. 11 shows a functional block diagram of the acoustic system according to the second embodiment, and FIG. 12 shows its processing flow.
  • The acoustic system includes an acquisition unit 110, a filter design device 220, and a filtering unit 250.
  • The acoustic system acquires point cloud data of the space where sound is collected via the acquisition unit 110, uses the point cloud data to generate a filter that differentiates the sound collection characteristics of the acoustic signals emitted from the sound sources, filters the picked-up acoustic signal with the generated filter, and obtains the filtered acoustic signal.
  • The filter may differentiate the sound collection characteristics of the acoustic signals emitted from the sound source positions in the space.
  • "Differentiating the sound collection characteristics" means, for example, locally picking up the acoustic signal emitted at a specific position so that acoustic signals emitted at other positions are picked up as little as possible, or conversely, suppressing (silencing) the acoustic signal emitted at a specific position and picking up only the acoustic signals emitted at other positions.
  • The filter design device 220 is a special device configured by loading a special program into a known or dedicated computer having, for example, a central processing unit (CPU) and a main storage device (RAM: Random Access Memory).
  • The filter design device 220 executes each process under the control of the central processing unit, for example.
  • The data input to the filter design device 220 and the data obtained in each process are stored in, for example, the main storage device, read out to the central processing unit as needed, and used for other processing.
  • At least a part of each processing unit of the filter design device 220 may be configured by hardware such as an integrated circuit.
  • Each storage unit included in the filter design device 220 can be configured by, for example, a main storage device such as RAM, or by middleware such as a relational database or a key-value store.
  • However, each storage unit does not necessarily have to be provided inside the filter design device 220; it may be configured as an auxiliary storage device composed of a hard disk, an optical disk, or a semiconductor memory element such as a flash memory, provided outside the filter design device 220.
  • FIG. 13 is a functional block diagram of the filter design device 220, and FIG. 14 shows an example of its processing flow.
  • The filter design device 220 receives the point cloud data indicating the shape of the space, the virtual sound source position S, and the virtual microphone position M, designs a filter that differentiates the sound collection characteristics (S220), and outputs the designed filter.
  • For example, the filter design device 220 includes an input unit 121, a filter calculation unit 223, and an output unit 225.
  • The filter calculation unit 223 receives the point cloud data R indicating the shape of the space, the virtual sound source position S, and the virtual microphone position M, calculates the filter W that differentiates the sound collection characteristics (S223), and outputs the filter W.
  • For example, the filter calculation unit 223 includes a wall surface extraction unit 123A, a defect completion unit 123B, a sound source and microphone information input unit 223C, a spatial transfer function calculation unit 223D, and a filter calculation unit 223E.
  • The sound source and microphone information input unit 223C is an input device such as a keyboard or a mouse, or a storage medium reading device.
  • For example, the sound source and microphone information input unit 223C may receive the shape of the complemented space output by the defect completion unit 123B (indicated by a broken line in FIG. 13), present it to the user via a display device such as a display, and let the user specify, with a mouse or the like, where in the complemented space to place the sound source and the microphone.
  • The spatial transfer function calculation unit 223D uses the virtual sound source position S, the virtual microphone position M, and the complemented shape of the space to calculate the spatial transfer function of each microphone position with respect to the virtual sound source position (S223D), and outputs it.
  • In the first embodiment, the spatial transfer functions between the virtual sound source position and all assumed positions in the space are calculated in order to determine the position of the microphone; here the microphone position is already determined, so only the spatial transfer function between the virtual sound source position and the virtual microphone position needs to be calculated.
  • The spatial transfer function calculation technique is the same as in the first embodiment.
  • The filter calculation unit 223E takes the virtual sound source position S, the virtual microphone position M, and the calculated spatial transfer functions as inputs, calculates the filter W that differentiates the sound collection characteristics of the acoustic signal emitted from the virtual sound source position and picked up at the virtual microphone position (S223E), and outputs it.
  • As the filter calculation technique, various conventional techniques can be used. For example, when the target sound is to be emphasized by differentiating the sound collection characteristics, the filter can be designed based on the minimum variance method.
  • Let ω be the frequency, τ be the frame number, A_{k,m}(ω) be the transfer coefficient between the k-th sound source and the m-th microphone, and S_k(ω, τ) be the acoustic signal emitted by the k-th sound source.
  • Let X(ω, τ) = [X_1(ω, τ), X_2(ω, τ), ..., X_M(ω, τ)]^T be the observed signals and W(ω) = [W_1(ω), W_2(ω), ..., W_M(ω)]^T be the filter, where M represents the total number of microphones.
  • The filter W(ω) is obtained as the filter that minimizes the output power P under the constraint conditions below.
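The constraint conditions themselves do not survive in this excerpt; a standard minimum variance formulation consistent with the notation above, given here as an assumption, is

```latex
\min_{W(\omega)} \; P = \mathbb{E}\!\left[\,\bigl|\,W(\omega)^{\mathsf H}\, X(\omega,\tau)\,\bigr|^{2}\right]
\quad \text{subject to} \quad W(\omega)^{\mathsf H} A_{k}(\omega) = 1 ,
```

where A_k(ω) = [A_{k,1}(ω), ..., A_{k,M}(ω)]^T is the transfer-coefficient vector of the target source k. Its well-known closed-form solution is W(ω) = R(ω)^{-1} A_k(ω) / (A_k(ω)^H R(ω)^{-1} A_k(ω)), with R(ω) = E[X(ω, τ) X(ω, τ)^H].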
  • It is assumed that the original signals are uncorrelated. Echo cancellation, noise cancellation, dereverberation, speech enhancement, and the like can be considered as examples of differentiating the sound collection characteristics, and a filter suited to each process can be designed using conventional techniques.
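A minimal Python sketch of this minimum variance (MVDR) design for one frequency bin follows; the example covariance values are assumptions made for illustration.

```python
import numpy as np

def mvdr_filter(a_target, R):
    """Minimum-variance (MVDR) filter for one frequency bin.
    a_target: (M,) complex transfer coefficients A_{k,m}(w) from the target
    source to the M microphones (e.g. taken from the simulated spatial
    transfer functions); R: (M, M) spatial covariance E[X X^H].
    Minimizes output power while passing the target source undistorted."""
    Rinv_a = np.linalg.solve(R, a_target)
    return Rinv_a / (a_target.conj() @ Rinv_a)

# Example with M = 2 microphones at one frequency bin (assumed values)
a = np.array([1.0 + 0.0j, 0.8 - 0.3j])
R = np.array([[1.0, 0.2 + 0.1j],
              [0.2 - 0.1j, 1.0]])
W = mvdr_filter(a, R)        # W satisfies W^H a = 1 (distortionless response)
```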
  • The output unit 225 receives the filter W and outputs it to the filtering unit 250 (S225).
  • For example, the output unit 225 is an output interface for outputting the filter W from the computer constituting the filter design device 220; depending on the configuration, the output unit 225 is included in the filter calculation unit 223.
  • The filtering unit 250 receives the filter W prior to the filtering process.
  • The filtering unit 250 receives the acoustic signal X picked up by a microphone actually placed at the virtual microphone position described above, and multiplies the acoustic signal X by the filter W to obtain the output signal Y.
  • The output signal Y may be stored in a storage unit (not shown), reproduced from a speaker (not shown) or the like, or transmitted to a device installed in another space. With this configuration, an output signal Y in which the sound collection characteristics of the acoustic signal X are differentiated can be obtained.
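As a usage sketch in Python, taking Y(ω, τ) = W(ω)^H X(ω, τ) as the multiplication (a common convention; the STFT layout is an assumption for illustration):

```python
import numpy as np

def apply_filter(W, X):
    """Per-frequency filtering of the picked-up signals, as in the
    filtering unit 250. W: (F, M) array of filters W(w); X: (F, T, M)
    STFT of the M microphone signals. Returns Y(w, t), shape (F, T)."""
    return np.einsum('fm,ftm->ft', W.conj(), X)
```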
  • A program describing these processing contents can be recorded on a computer-readable recording medium.
  • The computer-readable recording medium may be, for example, a magnetic recording device, an optical disk, a magneto-optical recording medium, or a semiconductor memory.
  • This program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded.
  • Alternatively, the program may be stored in the storage device of a server computer and distributed by transferring it from the server computer to another computer via a network.
  • A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or transferred from the server computer in its own storage device. When executing a process, the computer reads the program stored in its own recording medium and executes the process according to the read program. As another form of execution, the computer may read the program directly from the portable recording medium and execute processing according to it, or may execute processing according to the received program each time a program is transferred from the server computer to the computer. The above processing may also be executed by a so-called ASP (Application Service Provider) type service, which realizes the processing function only through execution instructions and result acquisition, without transferring the program from the server computer to the computer.
  • The program in this embodiment includes information that is used for processing by a computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the processing of the computer).
  • Although the present device is configured by executing a predetermined program on a computer, at least a part of the processing contents may be realized by hardware.

Abstract

The present invention provides a means for determining the installation position of a microphone without a skilled engineer going to the actual site. This microphone position presentation method comprises: an acquisition step for acquiring point cloud data of a space where sound pickup is performed; a microphone position calculation step for calculating a microphone installation position within a space shape estimated from the point cloud data so as to satisfy conditions related to a desired sound; and a presentation step for presenting the calculated microphone installation position.

Description

Microphone position presentation method, device therefor, and program
 The present invention relates to a technique for presenting an appropriate microphone installation position in a space where sound is collected.
 When recording in an indoor space, the appropriate microphone installation position changes depending on the purpose. For example, Non-Patent Document 1 discloses an example of the relationship between the installation position of a microphone and a sound source when recording musical instruments and voice. As one method for determining the position where the microphone is installed, there is a method in which an impulse response at an arbitrary position in an indoor space is acquired and the installation position is determined based on an index value suited to the purpose, such as the SN ratio.
 However, in the conventional technique, a skilled engineer must spend time determining the installation position. In addition, since the acoustic characteristics differ for each indoor space, the installation position must be determined for each indoor space, and an engineer needs to go to the site.
 An object of the present invention is to provide a means for determining the installation position of a microphone without a skilled engineer visiting the site. The present invention is applicable both indoors and outdoors, as long as the space has a structure that reflects sound. As an outdoor space having a structure that reflects sound, a stadium or an outdoor live venue, for example, is assumed.
 To solve the above problems, according to one aspect of the present invention, a microphone position presentation method includes an acquisition step of acquiring point cloud data of a space where sound is collected, a microphone position calculation step of calculating where in the shape of the space estimated from the point cloud data to install the microphone so as to satisfy a condition relating to the desired sound, and a presentation step of presenting the calculated microphone installation position.
 To solve the above problems, according to another aspect of the present invention, a microphone position presentation method includes an input step of accepting input of information based on point cloud data of a space where sound is collected, a microphone position calculation step of calculating where in the shape of the space estimated from the information based on the point cloud data to install the microphone so as to satisfy a condition relating to the desired sound, and an output step of outputting the calculated microphone installation position to a presentation means.
 To solve the above problems, according to another aspect of the present invention, a microphone position presentation method includes a space shape estimation step of estimating the shape of the space from point cloud data of a space where sound is collected; a wall surface extraction step of extracting, using the estimated shape of the space, the wall surfaces forming the space; a defect completion step of, when an object other than a wall exists in the space and the shape of part of a wall surface cannot be estimated, estimating that part from the shape of the parts of the wall surface that could be estimated and complementing the shape of the space with the estimated part; a spatial transfer function calculation step of calculating spatial transfer functions in the space using the complemented shape of the space; and a sound pressure probability distribution estimation step of estimating, using the calculated spatial transfer functions, a sound pressure probability distribution with respect to a virtual sound source position and obtaining, using the estimated sound pressure probability distribution, a microphone position that satisfies the condition relating to the desired sound.
 According to the present invention, a skilled engineer can determine the installation position of the microphone without going to the site.
FIG. 1 is a functional block diagram of the microphone position presentation system according to the first embodiment.
FIG. 2 shows an example of the processing flow of the microphone position presentation system according to the first embodiment.
FIG. 3 shows an example of point cloud data after noise removal.
FIG. 4 shows an example of point cloud data after noise removal.
FIG. 5 is a functional block diagram of the microphone position presentation device.
FIG. 6 shows an example of the processing flow of the microphone position presentation device.
FIG. 7 is a diagram for explaining the processing of the defect completion unit.
FIG. 8 is a diagram for explaining the processing of the defect completion unit.
FIG. 9A shows an example of the sound pressure power distribution of sound source S1, and FIG. 9B shows an example of the sound pressure power distribution of sound source S2.
FIG. 10A shows the simulated SNR distribution, and FIG. 10B shows the sum of the logs of the probability distributions.
FIG. 11 is a functional block diagram of the acoustic system according to the second embodiment.
FIG. 12 shows an example of the processing flow of the acoustic system according to the second embodiment.
FIG. 13 is a functional block diagram of the filter design device.
FIG. 14 shows an example of the processing flow of the filter design device.
FIG. 15 shows a configuration example of a computer to which the present method is applied.
 Hereinafter, embodiments of the present invention will be described. In the drawings used in the following description, components having the same function and steps performing the same processing are given the same reference numerals, and duplicate description is omitted. In the following description, processing performed element-wise on a vector or matrix applies to all elements of that vector or matrix unless otherwise specified.
 Before describing specific embodiments, the main points of the present invention are described. As mentioned above, because the acoustic characteristics differ from space to space, engineers have traditionally visited the site to decide where to install the microphone. Instead, consider deciding the microphone installation position in a virtual space in which the acoustic characteristics can be simulated. The acoustic characteristics are influenced by the positional relationship between the objects in the space that reflect or absorb sound, the sound source, and the microphone. Point cloud data are used to construct the sound-reflecting and sound-absorbing objects in the virtual space. In other words, a virtual space is constructed using the point cloud data, the acoustic characteristics are simulated in that virtual space, and the microphone installation position is determined. When an indoor space is assumed, the edges of the space are assumed to consist of a ceiling, walls, and the like, so the space may be constructed by interpolating points at the coordinates where the ceiling, walls, and the like are assumed to exist. The installation positions of a plurality of microphones may also be determined so that they operate as a microphone array instead of a single microphone.
<Microphone position presentation system according to the first embodiment>
 FIG. 1 shows a functional block diagram of the microphone position presentation system according to the first embodiment, and FIG. 2 shows its processing flow.
 The microphone position presentation system includes an acquisition unit 110, a microphone position presentation device 120, and a presentation unit 150.
 The microphone position presentation system acquires point cloud data of the space where sound is collected via the acquisition unit 110, uses the point cloud data to calculate an appropriate microphone installation position for acquiring the desired sound pickup signal, and presents it to the user via the presentation unit 150.
 The microphone position presentation device 120 is a special device configured by loading a special program into a known or dedicated computer having, for example, a central processing unit (CPU) and a main storage device (RAM: Random Access Memory). The microphone position presentation device 120 executes each process under the control of the central processing unit, for example. The data input to the microphone position presentation device 120 and the data obtained in each process are stored in, for example, the main storage device, read out to the central processing unit as needed, and used for other processing. At least a part of each processing unit of the microphone position presentation device 120 may be configured by hardware such as an integrated circuit. Each storage unit included in the microphone position presentation device 120 can be configured by, for example, a main storage device such as RAM, or by middleware such as a relational database or a key-value store. However, each storage unit does not necessarily have to be provided inside the microphone position presentation device 120; it may be configured as an auxiliary storage device composed of a hard disk, an optical disk, or a semiconductor memory element such as a flash memory, provided outside the microphone position presentation device 120.
 Each part will be described below.
<Acquisition unit 110>
 The acquisition unit 110 acquires the point cloud data of the space where sound is collected (S110) and outputs it. For example, the acquisition unit 110 includes a spatial sensing unit 111, a noise removal unit 113, and a spatial model coupling unit 115.
<Spatial sensing unit 111>
 The spatial sensing unit 111 acquires point cloud data of the space where sound is collected (S111) and outputs it. For example, the spatial sensing unit 111 consists of a LiDAR (Light Detection and Ranging, or Laser Imaging Detection and Ranging) installed in the space where sound is collected; it emits light toward an object, obtains the distance to the object from the time difference between emitting the light and receiving the reflected light, and obtains the direction of the object from the emission direction. This distance and direction are represented as point cloud data. As the spatial sensing technique, various conventional techniques can be used. For example, Reference 1 is known as an existing spatial sensing technique.
(Reference 1) Toshio Ito, "Principles and Utilization of LiDAR Technology for Automatic Driving", Scientific Information Publishing Co., Ltd., 2020
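For illustration, a minimal Python sketch of turning one time-of-flight measurement into a Cartesian point of the cloud; the variable layout is an assumption, not part of the patent.

```python
import numpy as np

C = 299_792_458.0  # speed of light [m/s], approximately valid in air

def lidar_point(t_emit, t_receive, azimuth, elevation):
    """One LiDAR measurement -> one point of the point cloud.
    The range is half the round-trip distance of the light pulse;
    the direction comes from the emission angles (radians)."""
    r = C * (t_receive - t_emit) / 2.0
    x = r * np.cos(elevation) * np.cos(azimuth)
    y = r * np.cos(elevation) * np.sin(azimuth)
    z = r * np.sin(elevation)
    return np.array([x, y, z])

# Example: a reflection received 66.7 ns after emission at azimuth 30 degrees,
# elevation 0 degrees, lies roughly 10 m away in the horizontal plane.
p = lidar_point(0.0, 66.7e-9, np.deg2rad(30.0), 0.0)
```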
<Noise removal unit 113>
 The noise removal unit 113 receives the point cloud data, removes the noise included in the point cloud data (S113), and outputs the denoised point cloud data. As the noise removal technique, various conventional techniques can be used. For example, Reference 2 is known as an existing noise removal technique.
(Reference 2) Rusu, R. B., Z. C. Marton, N. Blodow, M. Dolha, and M. Beetz, "Towards 3D Point Cloud Based Object Maps for Household Environments", Robotics and Autonomous Systems Journal, 2008.
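For illustration, a sketch of statistical outlier removal in the spirit of Reference 2; the k-nearest-neighbour mean-distance criterion below is an assumption, since this text does not fix the method.

```python
import numpy as np
from scipy.spatial import cKDTree

def remove_statistical_outliers(points, k=20, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbours is
    unusually large (a common point cloud denoising heuristic).
    points: (N, 3) array; returns the retained subset."""
    tree = cKDTree(points)
    # distances to the k nearest neighbours (column 0 is the point itself)
    dists, _ = tree.query(points, k=k + 1)
    mean_d = dists[:, 1:].mean(axis=1)
    threshold = mean_d.mean() + std_ratio * mean_d.std()
    return points[mean_d < threshold]
```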
 FIGS. 3 and 4 show examples of point cloud data after noise removal. FIG. 3 shows point cloud data acquired with the spatial sensing unit 111 (for example, LiDAR) at an elevation angle of 0 degrees, and FIG. 4 at an elevation angle of -90 degrees.
<Spatial model coupling unit 115>
 The spatial model coupling unit 115 receives a plurality of point cloud data sets, combines them to restore the shape of the space (S115), and outputs point cloud data R indicating the shape of the space. For example, while rotating the LiDAR, multiple point cloud data sets with different elevation angles are obtained, and a three-dimensional scene is reconstructed by combining the multiple point cloud data sets using the Iterative Closest Point (ICP) algorithm. As the spatial model coupling technique, various conventional techniques can be used. For example, Reference 3 is known as an existing spatial model coupling technique.
(Reference 3) S. Rusinkiewicz and M. Levoy, "Efficient Variants of the ICP Algorithm", Proceedings Third International Conference on 3-D Digital Imaging and Modeling, 2001, pp. 145-152.
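A minimal ICP sketch in Python, assuming point-to-point matching and the SVD (Kabsch) solution for the rigid transform; Reference 3 discusses more refined variants.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_step(source, target):
    """One ICP iteration: match each source point to its nearest target
    point, then solve for the rigid transform by the Kabsch/SVD method."""
    tree = cKDTree(target)
    _, idx = tree.query(source)
    matched = target[idx]
    mu_s, mu_t = source.mean(axis=0), matched.mean(axis=0)
    H = (source - mu_s).T @ (matched - mu_t)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_t - R @ mu_s
    return R, t

def icp(source, target, n_iter=30):
    """Align source (N, 3) onto target (M, 3); returns the moved source."""
    moved = source.copy()
    for _ in range(n_iter):
        R, t = icp_step(moved, target)
        moved = moved @ R.T + t
    return moved
```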
<Microphone position presentation device 120>
 FIG. 5 is a functional block diagram of the microphone position presentation device 120, and FIG. 6 shows an example of its processing flow.
 The microphone position presentation device 120 receives the virtual sound source position S and the point cloud data R indicating the shape of the space, calculates where in the shape of the space estimated from the point cloud data R to install the microphone so as to satisfy the condition relating to the desired sound (S120), and outputs the calculated microphone installation position N.
 For example, the microphone position presentation device 120 includes an input unit 121, a microphone position calculation unit 123, and an output unit 125.
<Input unit 121>
 The input unit 121 receives the point cloud data R indicating the shape of the space (S121) and outputs it to the microphone position calculation unit 123. For example, the input unit 121 is an input interface for inputting the point cloud data indicating the shape of the space to the computer constituting the microphone position presentation device 120. When the acquisition unit 110 is included in the computer constituting the microphone position presentation device 120, the input unit 121 is included in the microphone position calculation unit 123. Alternatively, processing S113 and S115, or only S115, may be performed in the computer constituting the microphone position presentation device 120. In that case, the input unit 121 is provided in front of the part corresponding to the processing performed in the computer, and receives information based on the point cloud data of the space where sound is collected (the point cloud data itself output by the spatial sensing unit 111, or the denoised point cloud data output by the noise removal unit 113).
<Microphone position calculation unit 123>
 The microphone position calculation unit 123 receives the point cloud data R indicating the shape of the space, calculates where in the shape of the space to install the microphone so as to satisfy the condition relating to the desired sound (S123), and outputs the calculated microphone installation position N.
 For example, the microphone position calculation unit 123 includes a wall surface extraction unit 123A, a defect completion unit 123B, a sound source information input unit 123C, a spatial transfer function calculation unit 123D, and a sound pressure probability distribution estimation unit 123E.
<Wall surface extraction unit 123A>
 The wall surface extraction unit 123A receives the point cloud data R indicating the shape of the space, extracts the wall surfaces forming the space using the shape of the space (S123A), and outputs them.
 For example, the wall surface extraction unit 123A extracts the wall surfaces by obtaining the parameters of the wall surfaces forming the space, using an algorithm that adapts RANSAC (Random Sample Consensus) to wall surface extraction. An example is shown below.
 (1) Divide the point cloud data R indicating the shape of the space based on the observation angle; for example, in the horizontal plane, divide it into M directions.
 (2) Specify the m-th direction (the ROI, region of interest) among the M divided directions, where m = 1, 2, ..., M.
 (3) Randomly sample three points a, b, and c from the point cloud belonging to the ROI.
 (4) From the three points a, b, and c, compute the coefficients of the equation ax + by + cz + d = 0 of the plane containing them; for example, compute the coefficients from the cross product of the vectors ab and ac.
 (5) Compute the distance between each point of the point cloud belonging to the ROI and the plane, and count the number n_q of points whose distance is at or below a predetermined threshold, where q = 1, 2, ..., Q. For example, the distance is computed as d = n · (p - a), where n is the normal vector of the plane containing the three points a, b, and c, and p is each point of the point cloud belonging to the ROI.
 (6) Repeat (3) to (5) above Q times, and take the maximum point count n_m among the Q counts n_q, together with the coefficients corresponding to it, as the parameters of direction m.
 (7) Repeat (2) to (6) above M times, and select the direction with the maximum point count among the M counts n_m.
 (8) Collect the points whose distance to the plane given by the coefficients of the direction selected in (7) is at or below the threshold, and obtain the normal vector from the collected points; for example, the normal vector can be obtained by SVD (Singular Value Decomposition) or by PCA (Principal Component Analysis) of the covariance matrix.
 (9) Record the normal vector as the largest wall surface in the point cloud data.
 (10) Remove the point cloud data belonging to the largest wall surface from the data and return to (1). The point cloud data belonging to the largest wall surface represents the extracted wall surface.
 When the point cloud data belonging to the largest wall surface accounts for no more than a certain fraction of the total point cloud data R, it is assumed that there is no wall surface there. That is, in a given direction, when the fraction of points within the threshold of (5) above is at or below a predetermined value, no wall surface is assumed there. Furthermore, since (10) above removes the point cloud data belonging to the largest wall surface from the data, the wall surfaces that can be extracted gradually decrease and the procedure converges automatically. With this configuration, not only walls and ceilings but also the planar shapes of point clusters that are likely to reflect sound are extracted, and the result simulates a space that also contains furniture and equipment.
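A Python sketch of steps (1) to (10) follows; the thresholds and the number of trials are assumed values for illustration.

```python
import numpy as np

def ransac_plane(roi_points, n_trials=200, dist_thresh=0.02, rng=None):
    """Steps (3)-(6): repeatedly fit a plane through three random ROI points
    and keep the plane supported by the most inliers.
    Returns (unit normal n, anchor point a, inlier count)."""
    rng = np.random.default_rng(0) if rng is None else rng
    best_n, best_a, best_count = None, None, -1
    for _ in range(n_trials):
        a, b, c = roi_points[rng.choice(len(roi_points), 3, replace=False)]
        n = np.cross(b - a, c - a)             # step (4): normal via cross product
        norm = np.linalg.norm(n)
        if norm < 1e-9:                        # degenerate (collinear) sample
            continue
        n /= norm
        d = np.abs((roi_points - a) @ n)       # step (5): d = n . (p - a)
        count = int((d <= dist_thresh).sum())
        if count > best_count:
            best_n, best_a, best_count = n, a, count
    return best_n, best_a, best_count

def extract_walls(points, n_directions=8, min_ratio=0.05, dist_thresh=0.02):
    """Steps (1)-(10): split the cloud into M horizontal directions, pick the
    best-supported plane over all directions, record it as a wall, drop its
    inliers, and repeat until no direction has enough support."""
    walls, remaining = [], points.copy()
    while len(remaining) > 3:
        angles = np.arctan2(remaining[:, 1], remaining[:, 0])       # step (1)
        bins = ((angles + np.pi) / (2 * np.pi) * n_directions).astype(int)
        bins = np.clip(bins, 0, n_directions - 1)
        candidates = [ransac_plane(remaining[bins == m], dist_thresh=dist_thresh)
                      for m in range(n_directions) if (bins == m).sum() >= 3]
        if not candidates:
            break
        n, a, count = max(candidates, key=lambda t: t[2])           # step (7)
        if n is None or count / len(points) <= min_ratio:           # "no wall" test
            break
        inliers = np.abs((remaining - a) @ n) <= dist_thresh
        walls.append((n, a, remaining[inliers]))   # steps (8)-(9); the normal
                                                   # could be refined by SVD/PCA
        remaining = remaining[~inliers]            # step (10): data shrink -> converges
    return walls
```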
<Defect complementing unit 123B>
The defect complementing unit 123B receives the wall surfaces forming the space. When an object other than a wall is present in the space and the shape of part of a wall surface cannot be estimated, the unit estimates that part from the shapes of the wall surfaces that could be estimated, complements the shape of the space using the estimated part (S123B), and outputs the complemented shape of the space.
As mentioned above, LiDAR measures distance using laser reflections, so data are missing in regions that the laser cannot reach because of an obstruction. The missing data are therefore complemented from the information of the wall surfaces that could be estimated, and the vertices (the corners of the room) are obtained.
In the example of FIG. 7, the display D acts as an obstruction, and the wall surface in the portion enclosed by the broken line in the figure cannot be observed. FIG. 7 shows the point cloud data of the horizontal plane at an elevation angle of 0 degrees, and point O is the installation position of the LiDAR.
For example, the parameters (a, b, c, d) of each wall plane are extracted by plane approximation, and the vertex coordinates of the walls are obtained from the resulting lines. That is, as shown in FIG. 8, the intersection P of the lines is taken as the vertex coordinate of the wall, and the broken-line portion of FIG. 8 is treated as if a wall surface were present there.
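As a small worked example of this intersection, the following sketch (a hypothetical helper, not from the specification) treats each wall in the horizontal slice of FIG. 8 as a line ax + by + d = 0 and solves for the corner P.

```python
import numpy as np

def wall_corner_2d(wall1, wall2):
    """Corner P as the intersection of two wall lines ax + by + d = 0
    in a horizontal slice; each wall is given as (a, b, d)."""
    A = np.array([wall1[:2], wall2[:2]], dtype=float)
    b = -np.array([wall1[2], wall2[2]], dtype=float)
    if abs(np.linalg.det(A)) < 1e-12:     # parallel walls: no corner
        return None
    return np.linalg.solve(A, b)

# A wall along x = 3 m and a wall along y = 5 m meet at P = (3, 5).
print(wall_corner_2d((1.0, 0.0, -3.0), (0.0, 1.0, -5.0)))   # [3. 5.]
```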
<Sound source information input unit 123C>
The sound source information input unit 123C accepts one or more virtual sound source positions S = (S_1, S_2, …) as input (S123C) and outputs them to the spatial transfer function calculation unit 123D. The sound source information input unit 123C is, for example, an input device such as a keyboard or mouse, and the virtual sound source positions are entered by the user through it. For example, the sound source information input unit 123C may receive the complemented shape of the space output by the defect complementing unit 123B (shown by the broken line in FIG. 5), present it to the user on a display device, and let the user specify with a mouse or the like where in the complemented space the sound sources are to be placed. Alternatively, the positions may be entered automatically from the point cloud data R acquired by the LiDAR, using object recognition as in Reference 4.
(Reference 4) D. Maturana and S. Scherer, "VoxNet: A 3D Convolutional Neural Network for real-time object recognition", 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, 2015, pp. 922-928, doi: 10.1109/IROS.2015.7353481.
The sound source information input unit 123C may also be a reading device for storage media, in which case the microphone position presentation system accepts the virtual sound source positions as input by reading, through the unit, a storage medium on which the positions are stored.
<Spatial transfer function calculation unit 123D>
The spatial transfer function calculation unit 123D uses the virtual sound source positions S and the complemented shape of the space to calculate the spatial transfer function from each virtual sound source position to each position in the space (each assumed listening position) (S123D), and outputs the result. Various conventional techniques can be used to calculate the spatial transfer function. For example, sound wave propagation is simulated from the shape of the space (the room model), the incoming sound at a virtual listening position is predicted by a simulation based on the FDTD method (finite-difference time-domain method), and the spatial transfer function is calculated from it. The description so far has obtained the acoustic characteristics from the simulated space and the shapes of the objects it contains; in addition, reflection coefficients that account for the materials of the objects constituting the space may be taken into account. The reflection coefficient may, for example, be obtained by estimating the relevant object from a camera image and using a coefficient appropriate to the estimated object, it may be supplied directly from outside, or it may be obtained by other methods.
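As one sketch of the FDTD propagation simulation mentioned above, the following advances a two-dimensional acoustic FDTD grid with rigid boundaries; the grid size, spatial step, air density, and source signal are assumptions chosen for illustration and are not prescribed by the specification.

```python
import numpy as np

c, dx = 343.0, 0.05                    # speed of sound [m/s], grid step [m]
dt = dx / (c * np.sqrt(2.0))           # CFL-stable time step for 2-D
nx, ny, n_steps = 200, 200, 400        # grid cells and simulated steps
rho = 1.2                              # air density [kg/m^3]

p  = np.zeros((nx, ny))                # pressure field
vx = np.zeros((nx + 1, ny))            # staggered particle velocities
vy = np.zeros((nx, ny + 1))
src = (nx // 2, ny // 2)               # virtual sound source cell

for t in range(n_steps):
    # velocity update from the pressure gradient; edge velocities stay
    # zero, which models rigid (perfectly reflecting) walls
    vx[1:-1, :] -= dt / (rho * dx) * (p[1:, :] - p[:-1, :])
    vy[:, 1:-1] -= dt / (rho * dx) * (p[:, 1:] - p[:, :-1])
    # pressure update from the velocity divergence
    p -= dt * rho * c**2 / dx * (vx[1:, :] - vx[:-1, :]
                                 + vy[:, 1:] - vy[:, :-1])
    p[src] += np.sin(2.0 * np.pi * 1000.0 * t * dt)   # 1 kHz source tone
```

In practice the grid would be masked by the complemented room geometry, and recording the pressure time series at each grid point yields the transfer characteristics from the virtual source to that point.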
<Sound pressure probability distribution estimation unit 123E>
The sound pressure probability distribution estimation unit 123E takes as input the virtual sound source positions S and the calculated spatial transfer functions, uses the spatial transfer functions to estimate the sound pressure probability distribution over the positions in the space (the assumed listening positions) for each virtual sound source position (S123E), and then uses the sound pressure probability distributions to find and output microphone positions that satisfy the conditions relating to the desired sound. Various conventional techniques can be used to estimate the sound pressure probability distribution. For example, the SNR from each sound source is obtained from the sound pressure map produced by the FDTD method. The space is divided into a grid, and the power arriving from each sound source is computed at each grid point. The computed power is normalized so that it sums to 1 over the whole space. The normalized power is treated as a probability distribution, and the SNR corresponding to the ratio of the desired source to the other noise sources is computed. FIG. 9A shows an example of the sound pressure power distribution of sound source S_1 and FIG. 9B shows an example of that of sound source S_2; the regions enclosed by white broken lines in the figures are the positions of S_1 and S_2, respectively, where the sound pressure power is high. For example, when the condition relating to the desired sound is "a high SNR with respect to sound source S_1", FIG. 10A shows the simulated SNR distribution, and FIG. 10B shows the sum of the logs of the probability distributions, that is, the joint distribution visualized by treating the power distributions as probability distributions. The values are large in the regions enclosed by the broken lines. When the SNR with respect to S_1 is to be made high, S_2 acts as noise, so the sum of the logs of the probability distributions is computed as follows.
log p(x|SNR_1) = log p(x|S_1) - log p(x|S_2)
Conditions relating to the desired sound other than the one above are also possible, for example the removal of unnecessary sounds or the emphasis of necessary sounds. When, of N sound sources S_1, S_2, …, S_N, the L sources S_1, S_2, …, S_L are to be emphasized as target sounds and the N - L sources S_(L+1), S_(L+2), …, S_N are treated as noise, the sum of the logs of the probability distributions is computed as follows.
log p(x|SNR) = log p(x|S_1) + … + log p(x|S_L) - log p(x|S_(L+1)) - … - log p(x|S_N)
The sound pressure probability distribution estimation unit 123E may find installation positions for a predetermined number of microphones, or it may receive the number of microphones to install as input and find installation positions according to that number. The unit finds the microphone installation positions so as to satisfy the conditions relating to the desired sound; for example, a position where the sum of the logs of the probability distributions is large is taken as a microphone installation position. When several microphones are to be installed, the positions at which the sum of the logs of the probability distributions attains local maxima may be used as their installation positions, or the installation positions may be found by other methods.
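A minimal sketch of this selection, assuming the per-source power maps from the simulation are already available, might look as follows; the helper name is hypothetical, and it simply takes the highest-scoring grid cells rather than detecting local maxima.

```python
import numpy as np

def microphone_positions(power_maps, target_ids, n_mics=1):
    """Score each grid cell by the sum of log probabilities: positive
    terms for target sources, negative terms for noise sources, then
    return the n_mics highest-scoring cells.
    power_maps: dict mapping source id -> 2-D array of simulated power."""
    score = np.zeros_like(next(iter(power_maps.values())), dtype=float)
    eps = 1e-12                            # avoid log(0)
    for sid, pmap in power_maps.items():
        prob = pmap / pmap.sum()           # normalize power over the space
        term = np.log(prob + eps)
        score += term if sid in target_ids else -term
    flat = np.argsort(score, axis=None)[::-1][:n_mics]
    return [np.unravel_index(i, score.shape) for i in flat]
```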
<Output unit 125>
The output unit 125 receives the complemented shape of the space and the microphone positions N, generates information indicating the microphone positions N in the complemented shape of the space, for example image data, and outputs it to the presentation unit 150 (S125). For example, the output unit 125 includes an output interface for outputting the microphone positions from the computer constituting the microphone position presentation device 120. When the presentation unit 150, described below, is included in that computer, the output unit 125 need not include the output interface.
<Presentation unit 150>
The presentation unit 150 receives the information indicating the microphone positions N in the complemented shape of the space and presents it to the user (S150). The presentation unit 150 comprises display means such as a display.
<Effect>
With the above configuration, microphone installation positions can be determined without a skilled engineer visiting the site.
<Second embodiment>
The description centers on the parts that differ from the first embodiment.
In this embodiment, the estimated spatial transfer functions are used to design a filter that forms a desired beamformer, improving the processing performance.
In the conventional technique, acoustic characteristics differ from one indoor space to another, so a filter must be designed for each indoor space and an engineer must visit each site.
The invention according to this embodiment aims to provide a means by which a skilled engineer can design a desired filter without visiting the site.
<Acoustic system according to the second embodiment>
FIG. 11 shows a functional block diagram of the acoustic system according to the second embodiment, and FIG. 12 shows its processing flow.
The acoustic system includes an acquisition unit 110, a filter design device 220, and a filtering unit 250.
The acoustic system acquires, via the acquisition unit 110, the point cloud data of the space in which sound is collected, uses the point cloud data to generate a filter that differentiates the collection characteristics of the acoustic signals emitted from the sound sources, filters the collected acoustic signal with the generated filter, and obtains the filtered acoustic signal. Any filter that differentiates the collection characteristics of acoustic signals emitted from sound source positions in the space may be used. "Differentiating the sound collection characteristics" means, for example, locally collecting the acoustic signal emitted at a specific position while collecting as little as possible of the acoustic signals emitted at other positions, or conversely suppressing (muting) the acoustic signal emitted at a specific position and collecting only the acoustic signals emitted at other positions.
The filter design device 220 is a special device configured by loading a special program into a known or dedicated computer having, for example, a central processing unit (CPU) and a main storage device (RAM: Random Access Memory). The filter design device 220 executes each process under the control of the central processing unit, for example. The data input to the filter design device 220 and the data obtained in each process are stored, for example, in the main storage device, and the data stored there are read out to the central processing unit as needed and used for other processing. At least part of each processing unit of the filter design device 220 may be implemented by hardware such as an integrated circuit. Each storage unit of the filter design device 220 can be implemented, for example, by a main storage device such as RAM, or by middleware such as a relational database or a key-value store. However, each storage unit need not be provided inside the filter design device 220; it may be implemented by an auxiliary storage device such as a hard disk, an optical disc, or a semiconductor memory element such as flash memory, and provided outside the filter design device 220.
Each part is described below.
The acquisition unit 110 is as described in the first embodiment, so its description is omitted.
<Filter design device 220>
FIG. 13 shows a functional block diagram of the filter design device 220, and FIG. 14 shows an example of its processing flow.
The filter design device 220 receives the point cloud data indicating the shape of the space, the virtual sound source positions S, and the virtual microphone positions M, designs a filter that differentiates the sound collection characteristics (S220), and outputs the designed filter.
For example, the filter design device 220 includes an input unit 121, a filter calculation unit 223, and an output unit 225.
The input unit 121 is as described in the first embodiment, so its description is omitted.
<Filter calculation unit 223>
The filter calculation unit 223 receives the point cloud data R indicating the shape of the space, the virtual sound source positions S, and the virtual microphone positions M, calculates a filter W that differentiates the sound collection characteristics (S223), and outputs it.
For example, the filter calculation unit 223 includes the wall surface extraction unit 123A, the defect complementing unit 123B, a sound source and microphone information input unit 223C, a spatial transfer function calculation unit 223D, and a filter computation unit 223E.
The wall surface extraction unit 123A and the defect complementing unit 123B are as described in the first embodiment, so their descriptions are omitted.
<Sound source and microphone information input unit 223C>
The sound source and microphone information input unit 223C accepts one or more virtual sound source positions S = (S_1, S_2, …) and one or more virtual microphone positions M = (M_1, M_2, …) as input (S223C) and outputs them to the spatial transfer function calculation unit 223D. Like the sound source information input unit 123C, the sound source and microphone information input unit 223C is, for example, an input device such as a keyboard or mouse, or a reading device for storage media. For example, the sound source and microphone information input unit 223C may receive the complemented shape of the space output by the defect complementing unit 123B (shown by the broken line in FIG. 13), present it to the user on a display device, and let the user specify with a mouse or the like where in the complemented space the sound sources and microphones are to be placed.
<Spatial transfer function calculation unit 223D>
The spatial transfer function calculation unit 223D uses the virtual sound source positions S, the virtual microphone positions M, and the complemented shape of the space to calculate the spatial transfer function between each virtual sound source position and each microphone position in the space (S223D), and outputs the result. In the first embodiment, the spatial transfer functions between the virtual sound source positions and every assumed position in the space were calculated in order to determine the microphone positions; in this embodiment the microphone positions are already determined, so only the spatial transfer functions between the virtual sound source positions and the virtual microphone positions need be calculated. The spatial transfer function calculation technique is the same as in the first embodiment.
<Filter computation unit 223E>
The filter computation unit 223E takes as input the virtual sound source positions S, the virtual microphone positions M, and the calculated spatial transfer functions, computes a filter W that differentiates the collection characteristics of the acoustic signals emitted from the virtual sound source positions and collected at the virtual microphone positions (S223E), and outputs it. Various conventional techniques can be used to compute the filter. For example, when the sound collection characteristics are to be differentiated so as to emphasize a target sound, the filter can be designed based on the minimum variance method. Let ω be the frequency, τ the frame number, A_(k,m)(ω) the transfer coefficient between the k-th sound source and the m-th microphone, and S_k(ω,τ) the acoustic signal emitted by the k-th sound source. The signal from the k-th sound source observed at the m-th microphone is then expressed as
X_m(ω,τ) = A_(k,m)(ω) S_k(ω,τ).
The output signal obtained by applying the filter coefficients W is expressed as
Y(ω,τ) = W^H(ω) X(ω,τ),
where X(ω,τ) = [X_1(ω,τ), X_2(ω,τ), …, X_M(ω,τ)], W(ω) = [W_1(ω), W_2(ω), …, W_M(ω)], and M is the total number of microphones.
Let a_m(ω) = [A_(1,m)(ω), A_(2,m)(ω), …, A_(K,m)(ω)] be the transfer coefficients between the K sound sources and the m-th microphone, and design a filter W(ω) that minimizes P under the following constraints.
[Equation M000001]
[Equation M000002]
[Equation M000003]
[Equation M000004]
Here the original signals are assumed to be uncorrelated.
[Equation M000005]
Examples of differentiating the sound collection characteristics include echo cancellation, noise cancellation, dereverberation, and speech enhancement; a filter suited to each kind of processing can be set using conventional techniques.
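The five equations above are preserved only as image placeholders in this text extraction, so their exact form is not reproduced here. As a hedged illustration, a standard minimum-variance (MVDR) solution of this kind of constrained minimization can be sketched as follows; it assumes unit-power, uncorrelated sources and adds diagonal loading, and may differ in detail from the patented formulation.

```python
import numpy as np

def mvdr_filter(A, target_k, diag_load=1e-3):
    """Minimum-variance weights for one frequency bin.
    A[k, m]: transfer coefficient from source k to microphone m.
    Returns W minimizing the output power under W^H a = 1, where a is
    the steering vector of the target source."""
    K, M = A.shape
    # observation x = A.T @ s, so R = E[x x^H] = A.T @ conj(A) for
    # unit-power, uncorrelated sources; loading keeps R invertible
    R = A.T @ A.conj() + diag_load * np.eye(M)
    a = A[target_k]                        # steering vector, length M
    Rinv_a = np.linalg.solve(R, a)
    return Rinv_a / (a.conj() @ Rinv_a)    # W = R^{-1}a / (a^H R^{-1} a)
```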
<Output unit 225>
The output unit 225 receives the filter W and outputs it to the filtering unit 250 (S225). For example, the output unit 225 is an output interface for outputting the filter W from the computer constituting the filter design device 220. When that computer includes the filtering unit 250, described below, the output unit 225 is included in the filter calculation unit 223.
<Filtering unit 250>
The filtering unit 250 receives the filter W prior to the conversion processing. The filtering unit 250 receives the acoustic signal X collected by microphones actually placed at the virtual microphone positions described above, and multiplies the acoustic signal X by the filter W to obtain the output signal Y. The output signal Y may be stored in a storage unit (not shown), reproduced from a speaker or the like (not shown), or transmitted to a device installed in another space. With this configuration, an output signal Y in which the collection characteristics of the acoustic signal X are differentiated can be obtained.
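For example, with a short-time Fourier transform of the microphone signals, applying the designed per-bin coefficients realizes Y(ω,τ) = W^H(ω)X(ω,τ); a minimal sketch, with array shapes assumed for illustration:

```python
import numpy as np

def apply_filter(W, X_stft):
    """Y(ω, τ) = W^H(ω) X(ω, τ) for every frequency bin and frame.
    W: (n_bins, M) filter coefficients; X_stft: (n_bins, M, n_frames)."""
    return np.einsum('fm,fmt->ft', W.conj(), X_stft)
```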
<Effect>
With this configuration, a desired filter can be designed without a skilled engineer visiting the site.
<Other modifications>
The present invention is not limited to the embodiments and modifications described above. For example, the various kinds of processing described above may be executed not only in time series following the description but also in parallel or individually, according to the processing capability of the device executing them or as needed. Other changes may be made as appropriate without departing from the spirit of the present invention.
<Program and recording medium>
The various kinds of processing described above can be carried out by loading a program that executes each step of the above methods into the storage unit 2020 of the computer shown in FIG. 15 and causing the control unit 2010, the input unit 2030, the output unit 2040, and so on to operate.
The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.
The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which it is recorded. The program may also be distributed by storing it in the storage device of a server computer and transferring it from the server computer to other computers via a network.
A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or transferred from the server computer in its own storage device. When executing the processing, the computer reads the program stored in its own recording medium and executes the processing according to the read program. As other ways of executing the program, the computer may read the program directly from the portable recording medium and execute the processing according to it, or it may execute the processing according to the received program successively each time the program is transferred to it from the server computer. The above processing may also be executed by a so-called ASP (Application Service Provider) type service, which realizes the processing functions only through execution instructions and result acquisition, without transferring the program from the server computer to the computer. The program in this embodiment includes information that is used for processing by an electronic computer and is equivalent to a program (such as data that are not direct commands to the computer but have the property of defining the computer's processing).
In this embodiment the device is configured by executing a predetermined program on a computer, but at least part of the processing content may instead be realized in hardware.

Claims (8)

1. A microphone position presentation method comprising:
an acquisition step of acquiring point cloud data of a space in which sound is collected;
a microphone position calculation step of calculating where in the shape of the space estimated from the point cloud data to install a microphone so as to satisfy a condition relating to a desired sound; and
a presentation step of presenting the calculated microphone installation position.
2. The microphone position presentation method according to claim 1, wherein
the microphone position calculation step calculates a spatial transfer function in the space using the shape of the space, and calculates the installation position of the microphone using the calculated spatial transfer function.
3. The microphone position presentation method according to claim 1, wherein the microphone position calculation step includes:
a wall surface extraction step of extracting, using the shape of the space, the wall surfaces forming the space; and
a defect complementing step of, when an object other than a wall surface is present in the space and the shape of part of a wall surface cannot be estimated, estimating the shape of that part from the estimated shapes of the other parts of the wall surfaces, and complementing the shape of the space using the estimated part of the wall surfaces.
4. A microphone position presentation method comprising:
an input step of accepting input of information based on point cloud data of a space in which sound is collected;
a microphone position calculation step of calculating where in the shape of the space estimated from the information based on the point cloud data to install a microphone so as to satisfy a condition relating to a desired sound; and
an output step of outputting the calculated microphone installation position to presentation means.
5. A microphone position presentation method comprising:
a space shape estimation step of estimating the shape of a space in which sound is collected from point cloud data of the space;
a wall surface extraction step of extracting, using the estimated shape of the space, the wall surfaces forming the space;
a defect complementing step of, when an object other than a wall surface is present in the space and the shape of part of a wall surface cannot be estimated, estimating the shape of that part from the estimated shapes of the other parts of the wall surfaces, and complementing the shape of the space using the estimated part of the wall surfaces;
a spatial transfer function calculation step of calculating a spatial transfer function in the space using the complemented shape of the space; and
a sound pressure probability distribution estimation step of estimating a sound pressure probability distribution with respect to a virtual sound source position using the calculated spatial transfer function, and finding a microphone position satisfying a condition relating to a desired sound using the estimated sound pressure probability distribution.
6. A microphone position presentation device comprising:
an acquisition unit that acquires point cloud data of a space in which sound is collected;
a microphone position calculation unit that calculates where in the shape of the space estimated from the point cloud data to install a microphone so as to satisfy a condition relating to a desired sound; and
a presentation unit that presents the calculated microphone installation position.
7. A microphone position presentation device comprising:
an input unit that accepts input of information based on point cloud data of a space in which sound is collected;
a microphone position calculation unit that calculates where in the shape of the space estimated from the information based on the point cloud data to install a microphone so as to satisfy a condition relating to a desired sound; and
an output unit that outputs the calculated microphone installation position to presentation means.
8. A program for causing a computer to execute the microphone position presentation method according to claim 4.
PCT/JP2021/000667 2021-01-12 2021-01-12 Microphone position presentation method, device therefor, and program WO2022153359A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/000667 WO2022153359A1 (en) 2021-01-12 2021-01-12 Microphone position presentation method, device therefor, and program

Publications (1)

Publication Number Publication Date
WO2022153359A1 true WO2022153359A1 (en) 2022-07-21

Family

ID=82447015

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/000667 WO2022153359A1 (en) 2021-01-12 2021-01-12 Microphone position presentation method, device therefor, and program

Country Status (1)

Country Link
WO (1) WO2022153359A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005115291A (en) * 2003-10-10 2005-04-28 Yamaha Corp Audio equipment layout support apparatus, program, and acoustic system
JP2010505153A (en) * 2005-10-07 2010-02-18 アノクシス・アーゲー Method for monitoring space and apparatus for performing the method
CN106255031A (en) * 2016-07-26 2016-12-21 北京地平线信息技术有限公司 Virtual sound field generator and virtual sound field production method
US20190176982A1 (en) * 2017-12-07 2019-06-13 Harman International Industries, Incorporated Drone deployed speaker system

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21919256; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 21919256; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)