WO2022153359A1 - Microphone position presentation method, device therefor, and program - Google Patents

Microphone position presentation method, device therefor, and program

Info

Publication number
WO2022153359A1
Authority
WO
WIPO (PCT)
Prior art keywords
space
microphone
shape
wall surface
estimated
Application number
PCT/JP2021/000667
Other languages
French (fr)
Japanese (ja)
Inventor
Tatsuya Kako
Kenichi Noguchi
Original Assignee
Nippon Telegraph and Telephone Corporation
Application filed by Nippon Telegraph and Telephone Corporation
Priority to PCT/JP2021/000667
Publication of WO2022153359A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K15/00 Acoustics not otherwise provided for

Definitions

  • The present invention relates to a technique for presenting an appropriate microphone installation position in a space where sound is collected.
  • Non-Patent Document 1 discloses an example of the relationship between the installation position of a microphone and a sound source when recording musical instruments and voice.
  • As one method for determining the position where the microphone is installed, there is a method in which an impulse response at an arbitrary position in an indoor space is acquired and the installation position is determined based on an index value suited to the purpose, such as the SN ratio.
  • An object of the present invention is to provide a means for determining the installation position of a microphone without a skilled engineer visiting the site.
  • The present invention is applicable both indoors and outdoors, as long as the space has a structure that reflects sound.
  • As an outdoor space having a structure that reflects sound, a stadium or an outdoor live venue, for example, is assumed.
  • According to one aspect of the present invention, a microphone position presentation method includes an acquisition step of acquiring point cloud data of a space where sound is collected, a microphone position calculation step of calculating where in the shape of the space estimated from the point cloud data to install the microphone so as to satisfy a condition relating to the desired sound, and a presentation step of presenting the calculated microphone installation position.
  • According to another aspect, a microphone position presentation method includes an input step of accepting input of information based on point cloud data of a space where sound is collected, a microphone position calculation step of calculating where in the shape of the space estimated from the information based on the point cloud data to install the microphone so as to satisfy a condition relating to the desired sound, and an output step of outputting the calculated microphone installation position to a presentation means.
  • According to another aspect, a microphone position presentation method includes a space shape estimation step of estimating the shape of the space from point cloud data of a space where sound is collected; a wall surface extraction step of extracting, using the estimated shape of the space, the wall surfaces forming the space; a defect completion step of, when an object other than a wall exists in the space and the shape of part of a wall surface cannot be estimated, estimating that part from the shape of the parts of the wall surface that could be estimated and complementing the shape of the space with the estimated part; a spatial transfer function calculation step of calculating spatial transfer functions in the space using the complemented shape of the space; and a sound pressure probability distribution estimation step of estimating, using the calculated spatial transfer functions, a sound pressure probability distribution with respect to a virtual sound source position and obtaining, using the estimated sound pressure probability distribution, a microphone position that satisfies the condition relating to the desired sound.
  • FIG. 1 is a functional block diagram of the microphone position presentation system according to the first embodiment; FIG. 2 shows an example of its processing flow.
  • FIG. 5 is a functional block diagram of the microphone position presentation device; FIG. 6 shows an example of its processing flow.
  • FIGS. 7 and 8 are diagrams for explaining the processing of the defect completion unit.
  • FIG. 9A is a diagram showing an example of the sound pressure power distribution of sound source S1; FIG. 9B shows an example of the sound pressure power distribution of sound source S2.
  • FIG. 10A is a diagram showing the simulated SNR distribution; FIG. 10B shows the sum of the logs of the probability distributions.
  • The acoustic characteristics are influenced by the positional relationship between the objects in the space that reflect or absorb sound, the sound source, and the microphone.
  • Point cloud data are used to construct the objects that reflect or absorb sound in the virtual space.
  • In other words, a virtual space may be constructed using the point cloud data, the acoustic characteristics may be simulated in that virtual space, and the microphone installation position may be determined there.
  • The space may also be constructed by interpolating points at the coordinates where the ceiling, walls, and the like are assumed to exist.
  • The installation positions of a plurality of microphones may be determined so that they operate as a microphone array instead of a single microphone.
  • FIG. 1 shows a functional block diagram of the microphone position presentation system according to the first embodiment, and FIG. 2 shows its processing flow.
  • The microphone position presentation system includes an acquisition unit 110, a microphone position presentation device 120, and a presentation unit 150.
  • The microphone position presentation system acquires point cloud data of the space where sound is collected via the acquisition unit 110, uses the point cloud data to calculate an appropriate microphone installation position for acquiring the desired sound pickup signal, and presents it to the user via the presentation unit 150.
  • The microphone position presentation device 120 is a special device configured by loading a special program into a known or dedicated computer having, for example, a central processing unit (CPU) and a main storage device (RAM: Random Access Memory).
  • The microphone position presentation device 120 executes each process under the control of the central processing unit, for example.
  • The data input to the microphone position presentation device 120 and the data obtained in each process are stored in, for example, the main storage device, read out to the central processing unit as needed, and used for other processing.
  • At least a part of each processing unit of the microphone position presentation device 120 may be configured by hardware such as an integrated circuit.
  • Each storage unit included in the microphone position presentation device 120 can be configured by, for example, a main storage device such as RAM, or by middleware such as a relational database or a key-value store.
  • However, each storage unit does not necessarily have to be provided inside the microphone position presentation device 120; it may be configured as an auxiliary storage device composed of a hard disk, an optical disk, or a semiconductor memory element such as a flash memory, provided outside the microphone position presentation device 120.
  • The acquisition unit 110 acquires the point cloud data of the space where sound is collected (S110) and outputs it.
  • For example, the acquisition unit 110 includes a spatial sensing unit 111, a noise removal unit 113, and a spatial model coupling unit 115.
  • The spatial sensing unit 111 acquires point cloud data of the space where sound is collected (S111) and outputs it.
  • For example, the spatial sensing unit 111 consists of a LiDAR (Light Detection and Ranging, or Laser Imaging Detection and Ranging) installed in the space where sound is collected; it emits light toward an object and receives the reflected light.
  • The distance to the object is obtained from the time difference between emitting the light and receiving the reflection, and the direction of the object is obtained from the emission direction. This distance and direction are represented as point cloud data.
  • As the spatial sensing technique, various conventional techniques can be used.
  • For example, Reference 1 is known as an existing spatial sensing technique. (Reference 1) Toshio Ito, "Principles and Utilization of LiDAR Technology for Automatic Driving", Scientific Information Publishing Co., Ltd., 2020.
  • The noise removal unit 113 receives the point cloud data, removes the noise included in the point cloud data (S113), and outputs the denoised point cloud data.
  • As the noise removal technique, various conventional techniques can be used.
  • For example, Reference 2 is known as an existing noise removal technique. (Reference 2) Rusu, R. B., Z. C. Marton, N. Blodow, M. Dolha, and M. Beetz, "Towards 3D Point Cloud Based Object Maps for Household Environments", Robotics and Autonomous Systems Journal, 2008.
  • FIGS. 3 and 4 show examples of point cloud data after noise removal.
  • FIG. 3 shows point cloud data acquired with the spatial sensing unit 111 (for example, LiDAR) at an elevation angle of 0 degrees, and FIG. 4 at an elevation angle of -90 degrees.
  • The spatial model coupling unit 115 receives a plurality of point cloud data sets, combines them to restore the shape of the space (S115), and outputs point cloud data R indicating the shape of the space.
  • As the spatial model coupling technique, various conventional techniques can be used.
  • For example, Reference 3 is known as an existing spatial model coupling technique. (Reference 3) S. Rusinkiewicz and M. Levoy, "Efficient Variants of the ICP Algorithm", Proceedings Third International Conference on 3-D Digital Imaging and Modeling, 2001, pp. 145-152.
  • FIG. 5 is a functional block diagram of the microphone position presentation device 120, and FIG. 6 shows an example of its processing flow.
  • The microphone position presentation device 120 receives the virtual sound source position S and the point cloud data R indicating the shape of the space, calculates where in the shape of the space estimated from the point cloud data R to install the microphone so as to satisfy the condition relating to the desired sound (S120), and outputs the calculated microphone installation position N.
  • For example, the microphone position presentation device 120 includes an input unit 121, a microphone position calculation unit 123, and an output unit 125.
  • The input unit 121 receives the point cloud data R indicating the shape of the space (S121) and outputs it to the microphone position calculation unit 123.
  • For example, the input unit 121 is an input interface for inputting the point cloud data indicating the shape of the space to the computer constituting the microphone position presentation device 120.
  • When the acquisition unit 110 is included in the computer constituting the microphone position presentation device 120, the input unit 121 is included in the microphone position calculation unit 123.
  • Alternatively, processing S113 and S115, or only S115, may be performed in the computer constituting the microphone position presentation device 120.
  • In that case, the input unit 121 is provided in front of the part corresponding to the processing performed in the computer, and receives information based on the point cloud data of the space where sound is collected (the point cloud data itself output by the spatial sensing unit 111, or the denoised point cloud data output by the noise removal unit 113).
  • The microphone position calculation unit 123 receives the point cloud data R indicating the shape of the space, calculates where in the shape of the space to install the microphone so as to satisfy the condition relating to the desired sound (S123), and outputs the calculated microphone installation position N.
  • For example, the microphone position calculation unit 123 includes a wall surface extraction unit 123A, a defect completion unit 123B, a sound source information input unit 123C, a spatial transfer function calculation unit 123D, and a sound pressure probability distribution estimation unit 123E.
  • The wall surface extraction unit 123A receives the point cloud data R indicating the shape of the space, extracts the wall surfaces forming the space using the shape of the space (S123A), and outputs them.
  • For example, the wall surface extraction unit 123A extracts the wall surfaces by obtaining the parameters of the wall surfaces forming the space, using an algorithm that adapts RANSAC (Random Sample Consensus) to wall surface extraction.
  • The point cloud data R indicating the shape of the space is divided based on the observation angle; for example, in the horizontal plane it is divided into M directions.
  • The distance between a candidate plane and each point p of the point cloud belonging to the ROI is computed as d = n · (p - a), where n is the normal vector of the plane containing the three sampled points a, b, and c.
  • The normal vector can be obtained by SVD (Singular Value Decomposition) or by PCA (Principal Component Analysis) of the covariance matrix.
  • When the point cloud data belonging to the largest wall surface accounts for no more than a certain fraction of the total point cloud data R, it is assumed that there is no wall surface there. That is, in a given direction, when the fraction of points within the distance threshold of step (5) of the extraction procedure (given in full in the description below) is at or below a predetermined value, no wall surface is assumed there. Furthermore, since step (10) removes the point cloud data belonging to the largest wall surface from the data, the wall surfaces that can be extracted gradually decrease and the procedure converges automatically. With this configuration, not only walls and ceilings but also the planar shapes of point clusters that are likely to reflect sound are extracted, and the result simulates a space that also contains furniture and equipment.
  • The defect completion unit 123B receives the wall surfaces forming the space; when an object other than a wall exists in the space and the shape of part of a wall surface cannot be estimated, it estimates that part from the shape of the parts of the wall surface that could be estimated, complements the shape of the space with the estimated part (S123B), and outputs the complemented shape of the space.
  • Since LiDAR measures distance using the reflection of a laser, data loss occurs in regions that the laser cannot reach because of an occluding object. Therefore, the missing data are complemented from the information of the wall surfaces that could be estimated, and the vertices (corners of the room) are obtained.
  • FIG. 7 shows point cloud data of a horizontal plane at an elevation angle of 0 degrees; point O is the installation position of the LiDAR.
  • For example, the parameters (a, b, c, d) of each wall surface are extracted by plane approximation, and the vertex coordinates of the walls are obtained from the resulting straight lines. That is, as shown in FIG. 8, the intersection P of the straight lines is obtained as the vertex coordinates of the walls, and the broken-line portion of FIG. 8 is treated as if a wall surface were there.
  • The sound source information input unit 123C is, for example, an input device such as a keyboard or a mouse, and the virtual sound source position is input by the user via the sound source information input unit 123C.
  • For example, the sound source information input unit 123C may receive the shape of the complemented space output by the defect completion unit 123B (indicated by a broken line in FIG. 5) and present it to the user via a display device such as a display.
  • Alternatively, the virtual sound source position may be input automatically from the point cloud data R acquired from the LiDAR by using object recognition as in Reference 4.
  • (Reference 4) D. Maturana and S. Scherer, "VoxNet: A 3D Convolutional Neural Network for real-time object recognition", 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, 2015, pp. 922-928, doi: 10.1109/IROS.2015.7353481.
  • Alternatively, the sound source information input unit 123C may be a storage medium reading device, and the microphone position presentation system may accept the virtual sound source position as input by reading, via the sound source information input unit 123C, a storage medium that stores the virtual sound source position.
  • The spatial transfer function calculation unit 123D uses the virtual sound source position S and the complemented shape of the space to calculate the spatial transfer function from the virtual sound source position S to each position in the space (each assumed listening position) (S123D), and outputs it.
  • As the spatial transfer function calculation technique, various conventional techniques can be used. For example, sound wave propagation is simulated from the shape of the space (the room model) using the FDTD method (finite-difference time-domain method) to predict the sound arriving at an assumed listening position, and the spatial transfer function is calculated from it.
  • As for the reflection coefficient, for example, objects may be recognized from a camera image or the like and the reflection coefficient corresponding to each recognized object may be used, or the reflection coefficient may be given directly from the outside.
  • The reflection coefficient may also be obtained by other methods.
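As an illustration of S123D, the following is a minimal 2-D acoustic FDTD sketch in Python. The text names only the FDTD method; the grid scheme, the impulse excitation, and the pressure-release boundary below are assumptions made for this example.

```python
import numpy as np

def fdtd_response(room_mask, src, rec, n_steps=2000, c=343.0, dx=0.05):
    """Minimal 2-D acoustic FDTD sketch. room_mask is True inside the
    complemented space; src and rec are (row, col) grid indices of the
    virtual source and an assumed listening position. Returns the pressure
    time series at rec, whose spectrum gives the transfer function."""
    dt = 0.99 * dx / (c * np.sqrt(2.0))       # CFL-stable time step
    coef = (c * dt / dx) ** 2
    p = np.zeros(room_mask.shape)             # pressure at time step n
    pm = np.zeros_like(p)                     # pressure at time step n - 1
    p[src] = 1.0                              # impulse at the virtual source
    out = np.empty(n_steps)
    for i in range(n_steps):
        lap = (np.roll(p, 1, 0) + np.roll(p, -1, 0) +
               np.roll(p, 1, 1) + np.roll(p, -1, 1) - 4.0 * p)
        pn = 2.0 * p - pm + coef * lap        # second-order update in time
        pn[~room_mask] = 0.0                  # crude pressure-release walls;
                                              # the reflection coefficients
                                              # above would refine this
        pm, p = p, pn
        out[i] = p[rec]
    return out, dt
```

The spatial transfer function to any assumed listening position can then be read off, for example, as the spectrum of the returned time series.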
  • The sound pressure probability distribution estimation unit 123E takes the virtual sound source position S and the calculated spatial transfer functions as inputs, uses the spatial transfer functions to estimate the sound pressure probability distribution at each position (each assumed listening position) with respect to the virtual sound source position (S123E), and, using the sound pressure probability distribution, obtains and outputs a microphone position that satisfies the condition relating to the desired sound.
  • As the sound pressure probability distribution estimation technique, various conventional techniques can be used. For example, the SNR with respect to each sound source is obtained from the sound pressure maps obtained by the FDTD method: the space is divided by a grid, the power from each sound source at each grid point is calculated, and the calculated power is normalized so that the sum of the powers over the entire space is 1.
  • FIG. 9A shows an example of the sound pressure power distribution of sound source S1, and FIG. 9B shows an example of the sound pressure power distribution of sound source S2.
  • The parts surrounded by the white broken lines in the figures are sound source S1 and sound source S2, respectively; there the sound pressure power is high.
  • FIG. 10A shows the simulated SNR distribution, and FIG. 10B shows the sum of the logs of the probability distributions, that is, the joint power distribution visualized as a probability distribution. It can be seen that the values in the parts surrounded by the broken lines are large.
  • When the SNR of sound source S1 is to be increased, sound source S2 becomes noise. Therefore, the sum of the logs of the probability distributions is calculated with sound source S2 treated as noise.
  • The sound pressure probability distribution estimation unit 123E may obtain a predetermined number of microphone installation positions, or may receive the number of microphones to be installed as input and obtain installation positions according to that number.
  • The sound pressure probability distribution estimation unit 123E obtains the microphone installation positions so as to satisfy the condition relating to the desired sound. For example, a position where the sum of the logs of the probability distributions is large is taken as the microphone installation position. When a plurality of microphones are to be installed, the positions where the sum of the logs of the probability distributions is largest may be taken as the installation positions of the plurality of microphones, or the installation positions may be obtained by other methods.
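A Python sketch of this step follows, assuming one simulated power map per sound source. Scoring each grid position as the log-probability of the target source minus the log-probabilities of the other sources (a log-SNR) is one assumed reading of the "sum of the logs" criterion above; the exact formula does not survive in this excerpt.

```python
import numpy as np

def mic_position_from_power(power_maps, target=0):
    """power_maps: list of 2-D grids of sound pressure power, one per sound
    source (e.g. |FDTD response|^2 summed over time). Each map is normalized
    so that its total power over the space is 1, giving a per-source
    probability distribution; the best-scoring grid cell is returned."""
    eps = 1e-12
    probs = [pm / pm.sum() for pm in power_maps]     # total power = 1
    score = np.log(probs[target] + eps)
    for k, pk in enumerate(probs):
        if k != target:
            score -= np.log(pk + eps)                # other sources act as noise
    best = np.unravel_index(np.argmax(score), score.shape)
    return best, score                               # candidate position + score map
```

For example, `pos, score = mic_position_from_power([P_S1, P_S2], target=0)` returns the grid cell where the score for emphasizing S1 over S2 is largest.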
  • The output unit 125 receives the complemented shape of the space and the microphone position N, generates information indicating the microphone position N within the complemented shape of the space, for example image data, and outputs it to the presentation unit 150 (S125).
  • For example, the output unit 125 includes an output interface for outputting the microphone position from the computer constituting the microphone position presentation device 120; depending on the configuration, the output unit 125 need not include the output interface.
  • The presentation unit 150 receives the information indicating the microphone position N within the complemented shape of the space and presents it to the user (S150).
  • For example, the presentation unit 150 consists of a display means such as a display.
  • The estimated spatial transfer functions can also be used to design a filter that forms a desired beamformer and improves processing performance.
  • Since the acoustic characteristics differ for each indoor space, a filter must be designed for each indoor space, and conventionally an engineer needs to go to the site.
  • An object of the present invention is therefore also to provide a means for designing a desired filter without a skilled engineer visiting the site.
  • FIG. 11 shows a functional block diagram of the acoustic system according to the second embodiment, and FIG. 12 shows its processing flow.
  • The acoustic system includes an acquisition unit 110, a filter design device 220, and a filtering unit 250.
  • The acoustic system acquires point cloud data of the space where sound is collected via the acquisition unit 110, uses the point cloud data to generate a filter that differentiates the sound collection characteristics of the acoustic signals emitted from the sound sources, filters the picked-up acoustic signal with the generated filter, and obtains the filtered acoustic signal.
  • The filter may differentiate the sound collection characteristics of the acoustic signals emitted from the sound source positions in the space.
  • "Differentiating the sound collection characteristics" means, for example, locally picking up the acoustic signal emitted at a specific position so that acoustic signals emitted at other positions are picked up as little as possible, or conversely, suppressing (silencing) the acoustic signal emitted at a specific position and picking up only the acoustic signals emitted at other positions.
  • The filter design device 220 is a special device configured by loading a special program into a known or dedicated computer having, for example, a central processing unit (CPU) and a main storage device (RAM: Random Access Memory).
  • The filter design device 220 executes each process under the control of the central processing unit, for example.
  • The data input to the filter design device 220 and the data obtained in each process are stored in, for example, the main storage device, read out to the central processing unit as needed, and used for other processing.
  • At least a part of each processing unit of the filter design device 220 may be configured by hardware such as an integrated circuit.
  • Each storage unit included in the filter design device 220 can be configured by, for example, a main storage device such as RAM, or by middleware such as a relational database or a key-value store.
  • However, each storage unit does not necessarily have to be provided inside the filter design device 220; it may be configured as an auxiliary storage device composed of a hard disk, an optical disk, or a semiconductor memory element such as a flash memory, provided outside the filter design device 220.
  • FIG. 13 is a functional block diagram of the filter design device 220, and FIG. 14 shows an example of its processing flow.
  • The filter design device 220 receives the point cloud data indicating the shape of the space, the virtual sound source position S, and the virtual microphone position M, designs a filter that differentiates the sound collection characteristics (S220), and outputs the designed filter.
  • For example, the filter design device 220 includes an input unit 121, a filter calculation unit 223, and an output unit 225.
  • The filter calculation unit 223 receives the point cloud data R indicating the shape of the space, the virtual sound source position S, and the virtual microphone position M, calculates the filter W that differentiates the sound collection characteristics (S223), and outputs the filter W.
  • For example, the filter calculation unit 223 includes a wall surface extraction unit 123A, a defect completion unit 123B, a sound source and microphone information input unit 223C, a spatial transfer function calculation unit 223D, and a filter calculation unit 223E.
  • The sound source and microphone information input unit 223C is an input device such as a keyboard or a mouse, or a storage medium reading device.
  • For example, the sound source and microphone information input unit 223C may receive the shape of the complemented space output by the defect completion unit 123B (indicated by a broken line in FIG. 13), present it to the user via a display device such as a display, and let the user specify, with a mouse or the like, where in the complemented space to place the sound source and the microphone.
  • The spatial transfer function calculation unit 223D uses the virtual sound source position S, the virtual microphone position M, and the complemented shape of the space to calculate the spatial transfer function of each microphone position with respect to the virtual sound source position (S223D), and outputs it.
  • In the first embodiment, the spatial transfer functions between the virtual sound source position and all assumed positions in the space are calculated in order to determine the position of the microphone; here the microphone position is already determined, so only the spatial transfer function between the virtual sound source position and the virtual microphone position needs to be calculated.
  • The spatial transfer function calculation technique is the same as in the first embodiment.
  • The filter calculation unit 223E takes the virtual sound source position S, the virtual microphone position M, and the calculated spatial transfer functions as inputs, calculates the filter W that differentiates the sound collection characteristics of the acoustic signal emitted from the virtual sound source position and picked up at the virtual microphone position (S223E), and outputs it.
  • As the filter calculation technique, various conventional techniques can be used. For example, when the target sound is to be emphasized by differentiating the sound collection characteristics, the filter can be designed based on the minimum variance method.
  • Let ω be the frequency, τ be the frame number, A_{k,m}(ω) be the transfer coefficient between the k-th sound source and the m-th microphone, and S_k(ω, τ) be the acoustic signal emitted by the k-th sound source.
  • Let X(ω, τ) = [X_1(ω, τ), X_2(ω, τ), ..., X_M(ω, τ)]^T be the observed signals and W(ω) = [W_1(ω), W_2(ω), ..., W_M(ω)]^T be the filter, where M represents the total number of microphones.
  • The filter W(ω) is obtained as the filter that minimizes the output power P under the constraint conditions below.
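The constraint conditions themselves do not survive in this excerpt; a standard minimum variance formulation consistent with the notation above, given here as an assumption, is

```latex
\min_{W(\omega)} \; P = \mathbb{E}\!\left[\,\bigl|\,W(\omega)^{\mathsf H}\, X(\omega,\tau)\,\bigr|^{2}\right]
\quad \text{subject to} \quad W(\omega)^{\mathsf H} A_{k}(\omega) = 1 ,
```

where A_k(ω) = [A_{k,1}(ω), ..., A_{k,M}(ω)]^T is the transfer-coefficient vector of the target source k. Its well-known closed-form solution is W(ω) = R(ω)^{-1} A_k(ω) / (A_k(ω)^H R(ω)^{-1} A_k(ω)), with R(ω) = E[X(ω, τ) X(ω, τ)^H].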
  • It is assumed that the original signals are uncorrelated. Echo cancellation, noise cancellation, dereverberation, speech enhancement, and the like can be considered as examples of differentiating the sound collection characteristics, and a filter suited to each process can be designed using conventional techniques.
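A minimal Python sketch of this minimum variance (MVDR) design for one frequency bin follows; the example covariance values are assumptions made for illustration.

```python
import numpy as np

def mvdr_filter(a_target, R):
    """Minimum-variance (MVDR) filter for one frequency bin.
    a_target: (M,) complex transfer coefficients A_{k,m}(w) from the target
    source to the M microphones (e.g. taken from the simulated spatial
    transfer functions); R: (M, M) spatial covariance E[X X^H].
    Minimizes output power while passing the target source undistorted."""
    Rinv_a = np.linalg.solve(R, a_target)
    return Rinv_a / (a_target.conj() @ Rinv_a)

# Example with M = 2 microphones at one frequency bin (assumed values)
a = np.array([1.0 + 0.0j, 0.8 - 0.3j])
R = np.array([[1.0, 0.2 + 0.1j],
              [0.2 - 0.1j, 1.0]])
W = mvdr_filter(a, R)        # W satisfies W^H a = 1 (distortionless response)
```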
  • The output unit 225 receives the filter W and outputs it to the filtering unit 250 (S225).
  • For example, the output unit 225 is an output interface for outputting the filter W from the computer constituting the filter design device 220; depending on the configuration, the output unit 225 is included in the filter calculation unit 223.
  • The filtering unit 250 receives the filter W prior to the filtering process.
  • The filtering unit 250 receives the acoustic signal X picked up by a microphone actually placed at the virtual microphone position described above, and multiplies the acoustic signal X by the filter W to obtain the output signal Y.
  • The output signal Y may be stored in a storage unit (not shown), reproduced from a speaker (not shown) or the like, or transmitted to a device installed in another space. With this configuration, an output signal Y in which the sound collection characteristics of the acoustic signal X are differentiated can be obtained.
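As a usage sketch in Python, taking Y(ω, τ) = W(ω)^H X(ω, τ) as the multiplication (a common convention; the STFT layout is an assumption for illustration):

```python
import numpy as np

def apply_filter(W, X):
    """Per-frequency filtering of the picked-up signals, as in the
    filtering unit 250. W: (F, M) array of filters W(w); X: (F, T, M)
    STFT of the M microphone signals. Returns Y(w, t), shape (F, T)."""
    return np.einsum('fm,ftm->ft', W.conj(), X)
```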
  • A program describing these processing contents can be recorded on a computer-readable recording medium.
  • The computer-readable recording medium may be, for example, a magnetic recording device, an optical disk, a magneto-optical recording medium, or a semiconductor memory.
  • This program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which the program is recorded.
  • Alternatively, the program may be stored in the storage device of a server computer and distributed by transferring it from the server computer to another computer via a network.
  • A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or transferred from the server computer in its own storage device. When executing a process, the computer reads the program stored in its own recording medium and executes the process according to the read program. As another form of execution, the computer may read the program directly from the portable recording medium and execute processing according to it, or may execute processing according to the received program each time a program is transferred from the server computer to the computer. The above processing may also be executed by a so-called ASP (Application Service Provider) type service, which realizes the processing function only through execution instructions and result acquisition, without transferring the program from the server computer to the computer.
  • The program in this embodiment includes information that is used for processing by a computer and is equivalent to a program (such as data that is not a direct command to the computer but has the property of defining the processing of the computer).
  • Although the present device is configured by executing a predetermined program on a computer, at least a part of the processing contents may be realized by hardware.

Abstract

The present invention provides a means for determining the installation position of a microphone without a skilled engineer going to the actual site. This microphone position presentation method comprises: an acquisition step for acquiring point cloud data of a space where sound pickup is performed; a microphone position calculation step for calculating a microphone installation position within a space shape estimated from the point cloud data so as to satisfy conditions related to a desired sound; and a presentation step for presenting the calculated microphone installation position.

Description

Microphone position presentation method, device therefor, and program
 The present invention relates to a technique for presenting an appropriate microphone installation position in a space where sound is collected.
 When recording in an indoor space, the appropriate microphone installation position changes depending on the purpose. For example, Non-Patent Document 1 discloses an example of the relationship between the installation position of a microphone and a sound source when recording musical instruments and voice. As one method for determining the position where the microphone is installed, there is a method in which an impulse response at an arbitrary position in an indoor space is acquired and the installation position is determined based on an index value suited to the purpose, such as the SN ratio.
 However, in the conventional technique, a skilled engineer must spend time determining the installation position. In addition, since the acoustic characteristics differ for each indoor space, the installation position must be determined for each indoor space, and an engineer needs to go to the site.
 An object of the present invention is to provide a means for determining the installation position of a microphone without a skilled engineer visiting the site. The present invention is applicable both indoors and outdoors, as long as the space has a structure that reflects sound. As an outdoor space having a structure that reflects sound, a stadium or an outdoor live venue, for example, is assumed.
 To solve the above problems, according to one aspect of the present invention, a microphone position presentation method includes an acquisition step of acquiring point cloud data of a space where sound is collected, a microphone position calculation step of calculating where in the shape of the space estimated from the point cloud data to install the microphone so as to satisfy a condition relating to the desired sound, and a presentation step of presenting the calculated microphone installation position.
 To solve the above problems, according to another aspect of the present invention, a microphone position presentation method includes an input step of accepting input of information based on point cloud data of a space where sound is collected, a microphone position calculation step of calculating where in the shape of the space estimated from the information based on the point cloud data to install the microphone so as to satisfy a condition relating to the desired sound, and an output step of outputting the calculated microphone installation position to a presentation means.
 To solve the above problems, according to another aspect of the present invention, a microphone position presentation method includes a space shape estimation step of estimating the shape of the space from point cloud data of a space where sound is collected; a wall surface extraction step of extracting, using the estimated shape of the space, the wall surfaces forming the space; a defect completion step of, when an object other than a wall exists in the space and the shape of part of a wall surface cannot be estimated, estimating that part from the shape of the parts of the wall surface that could be estimated and complementing the shape of the space with the estimated part; a spatial transfer function calculation step of calculating spatial transfer functions in the space using the complemented shape of the space; and a sound pressure probability distribution estimation step of estimating, using the calculated spatial transfer functions, a sound pressure probability distribution with respect to a virtual sound source position and obtaining, using the estimated sound pressure probability distribution, a microphone position that satisfies the condition relating to the desired sound.
 According to the present invention, a skilled engineer can determine the installation position of the microphone without going to the site.
FIG. 1 is a functional block diagram of the microphone position presentation system according to the first embodiment.
FIG. 2 shows an example of the processing flow of the microphone position presentation system according to the first embodiment.
FIG. 3 shows an example of point cloud data after noise removal.
FIG. 4 shows an example of point cloud data after noise removal.
FIG. 5 is a functional block diagram of the microphone position presentation device.
FIG. 6 shows an example of the processing flow of the microphone position presentation device.
FIG. 7 is a diagram for explaining the processing of the defect completion unit.
FIG. 8 is a diagram for explaining the processing of the defect completion unit.
FIG. 9A shows an example of the sound pressure power distribution of sound source S1, and FIG. 9B shows an example of the sound pressure power distribution of sound source S2.
FIG. 10A shows the simulated SNR distribution, and FIG. 10B shows the sum of the logs of the probability distributions.
FIG. 11 is a functional block diagram of the acoustic system according to the second embodiment.
FIG. 12 shows an example of the processing flow of the acoustic system according to the second embodiment.
FIG. 13 is a functional block diagram of the filter design device.
FIG. 14 shows an example of the processing flow of the filter design device.
FIG. 15 shows a configuration example of a computer to which the present method is applied.
 Hereinafter, embodiments of the present invention will be described. In the drawings used in the following description, components having the same function and steps performing the same processing are given the same reference numerals, and duplicate description is omitted. In the following description, processing performed element-wise on a vector or matrix applies to all elements of that vector or matrix unless otherwise specified.
 Before describing specific embodiments, the main points of the present invention are described. As mentioned above, because the acoustic characteristics differ from space to space, engineers have traditionally visited the site to decide where to install the microphone. Instead, consider deciding the microphone installation position in a virtual space in which the acoustic characteristics can be simulated. The acoustic characteristics are influenced by the positional relationship between the objects in the space that reflect or absorb sound, the sound source, and the microphone. Point cloud data are used to construct the sound-reflecting and sound-absorbing objects in the virtual space. In other words, a virtual space is constructed using the point cloud data, the acoustic characteristics are simulated in that virtual space, and the microphone installation position is determined. When an indoor space is assumed, the edges of the space are assumed to consist of a ceiling, walls, and the like, so the space may be constructed by interpolating points at the coordinates where the ceiling, walls, and the like are assumed to exist. The installation positions of a plurality of microphones may also be determined so that they operate as a microphone array instead of a single microphone.
<Microphone position presentation system according to the first embodiment>
 FIG. 1 shows a functional block diagram of the microphone position presentation system according to the first embodiment, and FIG. 2 shows its processing flow.
 The microphone position presentation system includes an acquisition unit 110, a microphone position presentation device 120, and a presentation unit 150.
 The microphone position presentation system acquires point cloud data of the space where sound is collected via the acquisition unit 110, uses the point cloud data to calculate an appropriate microphone installation position for acquiring the desired sound pickup signal, and presents it to the user via the presentation unit 150.
 The microphone position presentation device 120 is a special device configured by loading a special program into a known or dedicated computer having, for example, a central processing unit (CPU) and a main storage device (RAM: Random Access Memory). The microphone position presentation device 120 executes each process under the control of the central processing unit, for example. The data input to the microphone position presentation device 120 and the data obtained in each process are stored in, for example, the main storage device, read out to the central processing unit as needed, and used for other processing. At least a part of each processing unit of the microphone position presentation device 120 may be configured by hardware such as an integrated circuit. Each storage unit included in the microphone position presentation device 120 can be configured by, for example, a main storage device such as RAM, or by middleware such as a relational database or a key-value store. However, each storage unit does not necessarily have to be provided inside the microphone position presentation device 120; it may be configured as an auxiliary storage device composed of a hard disk, an optical disk, or a semiconductor memory element such as a flash memory, provided outside the microphone position presentation device 120.
 Each part will be described below.
<Acquisition unit 110>
 The acquisition unit 110 acquires the point cloud data of the space where sound is collected (S110) and outputs it. For example, the acquisition unit 110 includes a spatial sensing unit 111, a noise removal unit 113, and a spatial model coupling unit 115.
<Spatial sensing unit 111>
 The spatial sensing unit 111 acquires point cloud data of the space where sound is collected (S111) and outputs it. For example, the spatial sensing unit 111 consists of a LiDAR (Light Detection and Ranging, or Laser Imaging Detection and Ranging) installed in the space where sound is collected; it emits light toward an object, obtains the distance to the object from the time difference between emitting the light and receiving the reflected light, and obtains the direction of the object from the emission direction. This distance and direction are represented as point cloud data. As the spatial sensing technique, various conventional techniques can be used. For example, Reference 1 is known as an existing spatial sensing technique.
(Reference 1) Toshio Ito, "Principles and Utilization of LiDAR Technology for Automatic Driving", Scientific Information Publishing Co., Ltd., 2020
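For illustration, a minimal Python sketch of turning one time-of-flight measurement into a Cartesian point of the cloud; the variable layout is an assumption, not part of the patent.

```python
import numpy as np

C = 299_792_458.0  # speed of light [m/s], approximately valid in air

def lidar_point(t_emit, t_receive, azimuth, elevation):
    """One LiDAR measurement -> one point of the point cloud.
    The range is half the round-trip distance of the light pulse;
    the direction comes from the emission angles (radians)."""
    r = C * (t_receive - t_emit) / 2.0
    x = r * np.cos(elevation) * np.cos(azimuth)
    y = r * np.cos(elevation) * np.sin(azimuth)
    z = r * np.sin(elevation)
    return np.array([x, y, z])

# Example: a reflection received 66.7 ns after emission at azimuth 30 degrees,
# elevation 0 degrees, lies roughly 10 m away in the horizontal plane.
p = lidar_point(0.0, 66.7e-9, np.deg2rad(30.0), 0.0)
```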
<Noise removal unit 113>
 The noise removal unit 113 receives the point cloud data, removes the noise included in the point cloud data (S113), and outputs the denoised point cloud data. As the noise removal technique, various conventional techniques can be used. For example, Reference 2 is known as an existing noise removal technique.
(Reference 2) Rusu, R. B., Z. C. Marton, N. Blodow, M. Dolha, and M. Beetz, "Towards 3D Point Cloud Based Object Maps for Household Environments", Robotics and Autonomous Systems Journal, 2008.
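For illustration, a sketch of statistical outlier removal in the spirit of Reference 2; the k-nearest-neighbour mean-distance criterion below is an assumption, since this text does not fix the method.

```python
import numpy as np
from scipy.spatial import cKDTree

def remove_statistical_outliers(points, k=20, std_ratio=2.0):
    """Drop points whose mean distance to their k nearest neighbours is
    unusually large (a common point cloud denoising heuristic).
    points: (N, 3) array; returns the retained subset."""
    tree = cKDTree(points)
    # distances to the k nearest neighbours (column 0 is the point itself)
    dists, _ = tree.query(points, k=k + 1)
    mean_d = dists[:, 1:].mean(axis=1)
    threshold = mean_d.mean() + std_ratio * mean_d.std()
    return points[mean_d < threshold]
```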
 FIGS. 3 and 4 show examples of point cloud data after noise removal. FIG. 3 shows point cloud data acquired with the spatial sensing unit 111 (for example, LiDAR) at an elevation angle of 0 degrees, and FIG. 4 at an elevation angle of -90 degrees.
<Spatial model coupling unit 115>
 The spatial model coupling unit 115 receives a plurality of point cloud data sets, combines them to restore the shape of the space (S115), and outputs point cloud data R indicating the shape of the space. For example, while rotating the LiDAR, multiple point cloud data sets with different elevation angles are obtained, and a three-dimensional scene is reconstructed by combining the multiple point cloud data sets using the Iterative Closest Point (ICP) algorithm. As the spatial model coupling technique, various conventional techniques can be used. For example, Reference 3 is known as an existing spatial model coupling technique.
(Reference 3) S. Rusinkiewicz and M. Levoy, "Efficient Variants of the ICP Algorithm", Proceedings Third International Conference on 3-D Digital Imaging and Modeling, 2001, pp. 145-152.
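A minimal ICP sketch in Python, assuming point-to-point matching and the SVD (Kabsch) solution for the rigid transform; Reference 3 discusses more refined variants.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp_step(source, target):
    """One ICP iteration: match each source point to its nearest target
    point, then solve for the rigid transform by the Kabsch/SVD method."""
    tree = cKDTree(target)
    _, idx = tree.query(source)
    matched = target[idx]
    mu_s, mu_t = source.mean(axis=0), matched.mean(axis=0)
    H = (source - mu_s).T @ (matched - mu_t)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # avoid reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_t - R @ mu_s
    return R, t

def icp(source, target, n_iter=30):
    """Align source (N, 3) onto target (M, 3); returns the moved source."""
    moved = source.copy()
    for _ in range(n_iter):
        R, t = icp_step(moved, target)
        moved = moved @ R.T + t
    return moved
```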
<Microphone position presentation device 120>
 FIG. 5 is a functional block diagram of the microphone position presentation device 120, and FIG. 6 shows an example of its processing flow.
 The microphone position presentation device 120 receives the virtual sound source position S and the point cloud data R indicating the shape of the space, calculates where in the shape of the space estimated from the point cloud data R to install the microphone so as to satisfy the condition relating to the desired sound (S120), and outputs the calculated microphone installation position N.
 For example, the microphone position presentation device 120 includes an input unit 121, a microphone position calculation unit 123, and an output unit 125.
<Input unit 121>
 The input unit 121 receives the point cloud data R indicating the shape of the space (S121) and outputs it to the microphone position calculation unit 123. For example, the input unit 121 is an input interface for inputting the point cloud data indicating the shape of the space to the computer constituting the microphone position presentation device 120. When the acquisition unit 110 is included in the computer constituting the microphone position presentation device 120, the input unit 121 is included in the microphone position calculation unit 123. Alternatively, processing S113 and S115, or only S115, may be performed in the computer constituting the microphone position presentation device 120. In that case, the input unit 121 is provided in front of the part corresponding to the processing performed in the computer, and receives information based on the point cloud data of the space where sound is collected (the point cloud data itself output by the spatial sensing unit 111, or the denoised point cloud data output by the noise removal unit 113).
<Microphone position calculation unit 123>
 The microphone position calculation unit 123 receives the point cloud data R indicating the shape of the space, calculates where in the shape of the space to install the microphone so as to satisfy the condition relating to the desired sound (S123), and outputs the calculated microphone installation position N.
 For example, the microphone position calculation unit 123 includes a wall surface extraction unit 123A, a defect completion unit 123B, a sound source information input unit 123C, a spatial transfer function calculation unit 123D, and a sound pressure probability distribution estimation unit 123E.
<Wall surface extraction unit 123A>
 The wall surface extraction unit 123A receives the point cloud data R indicating the shape of the space, extracts the wall surfaces forming the space using the shape of the space (S123A), and outputs them.
 For example, the wall surface extraction unit 123A extracts the wall surfaces by obtaining the parameters of the wall surfaces forming the space, using an algorithm that adapts RANSAC (Random Sample Consensus) to wall surface extraction. An example is shown below.
 (1) Divide the point cloud data R indicating the shape of the space based on the observation angle; for example, in the horizontal plane, divide it into M directions.
 (2) Specify the m-th direction (the ROI, region of interest) among the M divided directions, where m = 1, 2, ..., M.
 (3) Randomly sample three points a, b, and c from the point cloud belonging to the ROI.
 (4) From the three points a, b, and c, compute the coefficients of the equation ax + by + cz + d = 0 of the plane containing them; for example, compute the coefficients from the cross product of the vectors ab and ac.
 (5) Compute the distance between each point of the point cloud belonging to the ROI and the plane, and count the number n_q of points whose distance is at or below a predetermined threshold, where q = 1, 2, ..., Q. For example, the distance is computed as d = n · (p - a), where n is the normal vector of the plane containing the three points a, b, and c, and p is each point of the point cloud belonging to the ROI.
 (6) Repeat (3) to (5) above Q times, and take the maximum point count n_m among the Q counts n_q, together with the coefficients corresponding to it, as the parameters of direction m.
 (7) Repeat (2) to (6) above M times, and select the direction with the maximum point count among the M counts n_m.
 (8) Collect the points whose distance to the plane given by the coefficients of the direction selected in (7) is at or below the threshold, and obtain the normal vector from the collected points; for example, the normal vector can be obtained by SVD (Singular Value Decomposition) or by PCA (Principal Component Analysis) of the covariance matrix.
 (9) Record the normal vector as the largest wall surface in the point cloud data.
 (10) Remove the point cloud data belonging to the largest wall surface from the data and return to (1). The point cloud data belonging to the largest wall surface represents the extracted wall surface.
 When the point cloud data belonging to the largest wall surface accounts for no more than a certain fraction of the total point cloud data R, it is assumed that there is no wall surface there. That is, in a given direction, when the fraction of points within the threshold of (5) above is at or below a predetermined value, no wall surface is assumed there. Furthermore, since (10) above removes the point cloud data belonging to the largest wall surface from the data, the wall surfaces that can be extracted gradually decrease and the procedure converges automatically. With this configuration, not only walls and ceilings but also the planar shapes of point clusters that are likely to reflect sound are extracted, and the result simulates a space that also contains furniture and equipment.
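A Python sketch of steps (1) to (10) follows; the thresholds and the number of trials are assumed values for illustration.

```python
import numpy as np

def ransac_plane(roi_points, n_trials=200, dist_thresh=0.02, rng=None):
    """Steps (3)-(6): repeatedly fit a plane through three random ROI points
    and keep the plane supported by the most inliers.
    Returns (unit normal n, anchor point a, inlier count)."""
    rng = np.random.default_rng(0) if rng is None else rng
    best_n, best_a, best_count = None, None, -1
    for _ in range(n_trials):
        a, b, c = roi_points[rng.choice(len(roi_points), 3, replace=False)]
        n = np.cross(b - a, c - a)             # step (4): normal via cross product
        norm = np.linalg.norm(n)
        if norm < 1e-9:                        # degenerate (collinear) sample
            continue
        n /= norm
        d = np.abs((roi_points - a) @ n)       # step (5): d = n . (p - a)
        count = int((d <= dist_thresh).sum())
        if count > best_count:
            best_n, best_a, best_count = n, a, count
    return best_n, best_a, best_count

def extract_walls(points, n_directions=8, min_ratio=0.05, dist_thresh=0.02):
    """Steps (1)-(10): split the cloud into M horizontal directions, pick the
    best-supported plane over all directions, record it as a wall, drop its
    inliers, and repeat until no direction has enough support."""
    walls, remaining = [], points.copy()
    while len(remaining) > 3:
        angles = np.arctan2(remaining[:, 1], remaining[:, 0])       # step (1)
        bins = ((angles + np.pi) / (2 * np.pi) * n_directions).astype(int)
        bins = np.clip(bins, 0, n_directions - 1)
        candidates = [ransac_plane(remaining[bins == m], dist_thresh=dist_thresh)
                      for m in range(n_directions) if (bins == m).sum() >= 3]
        if not candidates:
            break
        n, a, count = max(candidates, key=lambda t: t[2])           # step (7)
        if n is None or count / len(points) <= min_ratio:           # "no wall" test
            break
        inliers = np.abs((remaining - a) @ n) <= dist_thresh
        walls.append((n, a, remaining[inliers]))   # steps (8)-(9); the normal
                                                   # could be refined by SVD/PCA
        remaining = remaining[~inliers]            # step (10): data shrink -> converges
    return walls
```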
<Defect complementing unit 123B>
The defect complementing unit 123B receives the wall surfaces forming the space. When an object other than a wall is present in the space and the shape of part of a wall surface cannot be estimated, the unit estimates that part from the shapes of the wall surfaces that could be estimated, complements the shape of the space using the estimated part (S123B), and outputs the complemented shape of the space.
As mentioned above, LiDAR measures distance using laser reflections, so data are missing in regions that the laser cannot reach because of an obstruction. The missing data are therefore complemented from the information of the wall surfaces that could be estimated, and the vertices (the corners of the room) are obtained.
In the example of FIG. 7, the display D acts as an obstruction, and the wall surface in the portion enclosed by the broken line in the figure cannot be observed. FIG. 7 shows the point cloud data of the horizontal plane at an elevation angle of 0 degrees, and point O is the installation position of the LiDAR.
For example, the parameters (a, b, c, d) of each wall plane are extracted by plane approximation, and the vertex coordinates of the walls are obtained from the resulting lines. That is, as shown in FIG. 8, the intersection P of the lines is taken as the vertex coordinate of the wall, and the broken-line portion of FIG. 8 is treated as if a wall surface were present there.
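As a small worked example of this intersection, the following sketch (a hypothetical helper, not from the specification) treats each wall in the horizontal slice of FIG. 8 as a line ax + by + d = 0 and solves for the corner P.

```python
import numpy as np

def wall_corner_2d(wall1, wall2):
    """Corner P as the intersection of two wall lines ax + by + d = 0
    in a horizontal slice; each wall is given as (a, b, d)."""
    A = np.array([wall1[:2], wall2[:2]], dtype=float)
    b = -np.array([wall1[2], wall2[2]], dtype=float)
    if abs(np.linalg.det(A)) < 1e-12:     # parallel walls: no corner
        return None
    return np.linalg.solve(A, b)

# A wall along x = 3 m and a wall along y = 5 m meet at P = (3, 5).
print(wall_corner_2d((1.0, 0.0, -3.0), (0.0, 1.0, -5.0)))   # [3. 5.]
```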
<Sound source information input unit 123C>
The sound source information input unit 123C accepts one or more virtual sound source positions S = (S_1, S_2, …) as input (S123C) and outputs them to the spatial transfer function calculation unit 123D. The sound source information input unit 123C is, for example, an input device such as a keyboard or mouse, and the virtual sound source positions are entered by the user through it. For example, the sound source information input unit 123C may receive the complemented shape of the space output by the defect complementing unit 123B (shown by the broken line in FIG. 5), present it to the user on a display device, and let the user specify with a mouse or the like where in the complemented space the sound sources are to be placed. Alternatively, the positions may be entered automatically from the point cloud data R acquired by the LiDAR, using object recognition as in Reference 4.
(Reference 4) D. Maturana and S. Scherer, "VoxNet: A 3D Convolutional Neural Network for real-time object recognition", 2015 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Hamburg, 2015, pp. 922-928, doi: 10.1109/IROS.2015.7353481.
The sound source information input unit 123C may also be a reading device for storage media, in which case the microphone position presentation system accepts the virtual sound source positions as input by reading, through the unit, a storage medium on which the positions are stored.
<Spatial transfer function calculation unit 123D>
The spatial transfer function calculation unit 123D uses the virtual sound source positions S and the complemented shape of the space to calculate the spatial transfer function from each virtual sound source position to each position in the space (each assumed listening position) (S123D), and outputs the result. Various conventional techniques can be used to calculate the spatial transfer function. For example, sound wave propagation is simulated from the shape of the space (the room model), the incoming sound at a virtual listening position is predicted by a simulation based on the FDTD method (finite-difference time-domain method), and the spatial transfer function is calculated from it. The description so far has obtained the acoustic characteristics from the simulated space and the shapes of the objects it contains; in addition, reflection coefficients that account for the materials of the objects constituting the space may be taken into account. The reflection coefficient may, for example, be obtained by estimating the relevant object from a camera image and using a coefficient appropriate to the estimated object, it may be supplied directly from outside, or it may be obtained by other methods.
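As one sketch of the FDTD propagation simulation mentioned above, the following advances a two-dimensional acoustic FDTD grid with rigid boundaries; the grid size, spatial step, air density, and source signal are assumptions chosen for illustration and are not prescribed by the specification.

```python
import numpy as np

c, dx = 343.0, 0.05                    # speed of sound [m/s], grid step [m]
dt = dx / (c * np.sqrt(2.0))           # CFL-stable time step for 2-D
nx, ny, n_steps = 200, 200, 400        # grid cells and simulated steps
rho = 1.2                              # air density [kg/m^3]

p  = np.zeros((nx, ny))                # pressure field
vx = np.zeros((nx + 1, ny))            # staggered particle velocities
vy = np.zeros((nx, ny + 1))
src = (nx // 2, ny // 2)               # virtual sound source cell

for t in range(n_steps):
    # velocity update from the pressure gradient; edge velocities stay
    # zero, which models rigid (perfectly reflecting) walls
    vx[1:-1, :] -= dt / (rho * dx) * (p[1:, :] - p[:-1, :])
    vy[:, 1:-1] -= dt / (rho * dx) * (p[:, 1:] - p[:, :-1])
    # pressure update from the velocity divergence
    p -= dt * rho * c**2 / dx * (vx[1:, :] - vx[:-1, :]
                                 + vy[:, 1:] - vy[:, :-1])
    p[src] += np.sin(2.0 * np.pi * 1000.0 * t * dt)   # 1 kHz source tone
```

In practice the grid would be masked by the complemented room geometry, and recording the pressure time series at each grid point yields the transfer characteristics from the virtual source to that point.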
<Sound pressure probability distribution estimation unit 123E>
The sound pressure probability distribution estimation unit 123E takes as input the virtual sound source positions S and the calculated spatial transfer functions, uses the spatial transfer functions to estimate the sound pressure probability distribution over the positions in the space (the assumed listening positions) for each virtual sound source position (S123E), and then uses the sound pressure probability distributions to find and output microphone positions that satisfy the conditions relating to the desired sound. Various conventional techniques can be used to estimate the sound pressure probability distribution. For example, the SNR from each sound source is obtained from the sound pressure map produced by the FDTD method. The space is divided into a grid, and the power arriving from each sound source is computed at each grid point. The computed power is normalized so that it sums to 1 over the whole space. The normalized power is treated as a probability distribution, and the SNR corresponding to the ratio of the desired source to the other noise sources is computed. FIG. 9A shows an example of the sound pressure power distribution of sound source S_1 and FIG. 9B shows an example of that of sound source S_2; the regions enclosed by white broken lines in the figures are the positions of S_1 and S_2, respectively, where the sound pressure power is high. For example, when the condition relating to the desired sound is "a high SNR with respect to sound source S_1", FIG. 10A shows the simulated SNR distribution, and FIG. 10B shows the sum of the logs of the probability distributions, that is, the joint distribution visualized by treating the power distributions as probability distributions. The values are large in the regions enclosed by the broken lines. When the SNR with respect to S_1 is to be made high, S_2 acts as noise, so the sum of the logs of the probability distributions is computed as follows.
log p(x|SNR_1) = log p(x|S_1) - log p(x|S_2)
Conditions relating to the desired sound other than the one above are also possible, for example the removal of unnecessary sounds or the emphasis of necessary sounds. When, of N sound sources S_1, S_2, …, S_N, the L sources S_1, S_2, …, S_L are to be emphasized as target sounds and the N - L sources S_(L+1), S_(L+2), …, S_N are treated as noise, the sum of the logs of the probability distributions is computed as follows.
log p(x|SNR) = log p(x|S_1) + … + log p(x|S_L) - log p(x|S_(L+1)) - … - log p(x|S_N)
The sound pressure probability distribution estimation unit 123E may find installation positions for a predetermined number of microphones, or it may receive the number of microphones to install as input and find installation positions according to that number. The unit finds the microphone installation positions so as to satisfy the conditions relating to the desired sound; for example, a position where the sum of the logs of the probability distributions is large is taken as a microphone installation position. When several microphones are to be installed, the positions at which the sum of the logs of the probability distributions attains local maxima may be used as their installation positions, or the installation positions may be found by other methods.
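A minimal sketch of this selection, assuming the per-source power maps from the simulation are already available, might look as follows; the helper name is hypothetical, and it simply takes the highest-scoring grid cells rather than detecting local maxima.

```python
import numpy as np

def microphone_positions(power_maps, target_ids, n_mics=1):
    """Score each grid cell by the sum of log probabilities: positive
    terms for target sources, negative terms for noise sources, then
    return the n_mics highest-scoring cells.
    power_maps: dict mapping source id -> 2-D array of simulated power."""
    score = np.zeros_like(next(iter(power_maps.values())), dtype=float)
    eps = 1e-12                            # avoid log(0)
    for sid, pmap in power_maps.items():
        prob = pmap / pmap.sum()           # normalize power over the space
        term = np.log(prob + eps)
        score += term if sid in target_ids else -term
    flat = np.argsort(score, axis=None)[::-1][:n_mics]
    return [np.unravel_index(i, score.shape) for i in flat]
```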
<Output unit 125>
The output unit 125 receives the complemented shape of the space and the microphone positions N, generates information indicating the microphone positions N in the complemented shape of the space, for example image data, and outputs it to the presentation unit 150 (S125). For example, the output unit 125 includes an output interface for outputting the microphone positions from the computer constituting the microphone position presentation device 120. When the presentation unit 150, described below, is included in that computer, the output unit 125 need not include the output interface.
<Presentation unit 150>
The presentation unit 150 receives the information indicating the microphone positions N in the complemented shape of the space and presents it to the user (S150). The presentation unit 150 comprises display means such as a display.
<Effect>
With the above configuration, microphone installation positions can be determined without a skilled engineer visiting the site.
<Second embodiment>
The description centers on the parts that differ from the first embodiment.
In this embodiment, the estimated spatial transfer functions are used to design a filter that forms a desired beamformer, improving the processing performance.
In the conventional technique, acoustic characteristics differ from one indoor space to another, so a filter must be designed for each indoor space and an engineer must visit each site.
The invention according to this embodiment aims to provide a means by which a skilled engineer can design a desired filter without visiting the site.
<Acoustic system according to the second embodiment>
FIG. 11 shows a functional block diagram of the acoustic system according to the second embodiment, and FIG. 12 shows its processing flow.
The acoustic system includes an acquisition unit 110, a filter design device 220, and a filtering unit 250.
The acoustic system acquires, via the acquisition unit 110, the point cloud data of the space in which sound is collected, uses the point cloud data to generate a filter that differentiates the collection characteristics of the acoustic signals emitted from the sound sources, filters the collected acoustic signal with the generated filter, and obtains the filtered acoustic signal. Any filter that differentiates the collection characteristics of acoustic signals emitted from sound source positions in the space may be used. "Differentiating the sound collection characteristics" means, for example, locally collecting the acoustic signal emitted at a specific position while collecting as little as possible of the acoustic signals emitted at other positions, or conversely suppressing (muting) the acoustic signal emitted at a specific position and collecting only the acoustic signals emitted at other positions.
The filter design device 220 is a special device configured by loading a special program into a known or dedicated computer having, for example, a central processing unit (CPU) and a main storage device (RAM: Random Access Memory). The filter design device 220 executes each process under the control of the central processing unit, for example. The data input to the filter design device 220 and the data obtained in each process are stored, for example, in the main storage device, and the data stored there are read out to the central processing unit as needed and used for other processing. At least part of each processing unit of the filter design device 220 may be implemented by hardware such as an integrated circuit. Each storage unit of the filter design device 220 can be implemented, for example, by a main storage device such as RAM, or by middleware such as a relational database or a key-value store. However, each storage unit need not be provided inside the filter design device 220; it may be implemented by an auxiliary storage device such as a hard disk, an optical disc, or a semiconductor memory element such as flash memory, and provided outside the filter design device 220.
Each part is described below.
The acquisition unit 110 is as described in the first embodiment, so its description is omitted.
<Filter design device 220>
FIG. 13 shows a functional block diagram of the filter design device 220, and FIG. 14 shows an example of its processing flow.
The filter design device 220 receives the point cloud data indicating the shape of the space, the virtual sound source positions S, and the virtual microphone positions M, designs a filter that differentiates the sound collection characteristics (S220), and outputs the designed filter.
For example, the filter design device 220 includes an input unit 121, a filter calculation unit 223, and an output unit 225.
The input unit 121 is as described in the first embodiment, so its description is omitted.
<Filter calculation unit 223>
The filter calculation unit 223 receives the point cloud data R indicating the shape of the space, the virtual sound source positions S, and the virtual microphone positions M, calculates a filter W that differentiates the sound collection characteristics (S223), and outputs it.
For example, the filter calculation unit 223 includes the wall surface extraction unit 123A, the defect complementing unit 123B, a sound source and microphone information input unit 223C, a spatial transfer function calculation unit 223D, and a filter computation unit 223E.
The wall surface extraction unit 123A and the defect complementing unit 123B are as described in the first embodiment, so their descriptions are omitted.
<Sound source and microphone information input unit 223C>
The sound source and microphone information input unit 223C accepts one or more virtual sound source positions S = (S_1, S_2, …) and one or more virtual microphone positions M = (M_1, M_2, …) as input (S223C) and outputs them to the spatial transfer function calculation unit 223D. Like the sound source information input unit 123C, the sound source and microphone information input unit 223C is, for example, an input device such as a keyboard or mouse, or a reading device for storage media. For example, the sound source and microphone information input unit 223C may receive the complemented shape of the space output by the defect complementing unit 123B (shown by the broken line in FIG. 13), present it to the user on a display device, and let the user specify with a mouse or the like where in the complemented space the sound sources and microphones are to be placed.
<Spatial transfer function calculation unit 223D>
The spatial transfer function calculation unit 223D uses the virtual sound source positions S, the virtual microphone positions M, and the complemented shape of the space to calculate the spatial transfer function between each virtual sound source position and each microphone position in the space (S223D), and outputs the result. In the first embodiment, the spatial transfer functions between the virtual sound source positions and every assumed position in the space were calculated in order to determine the microphone positions; in this embodiment the microphone positions are already determined, so only the spatial transfer functions between the virtual sound source positions and the virtual microphone positions need be calculated. The spatial transfer function calculation technique is the same as in the first embodiment.
<Filter computation unit 223E>
The filter computation unit 223E takes as input the virtual sound source positions S, the virtual microphone positions M, and the calculated spatial transfer functions, computes a filter W that differentiates the collection characteristics of the acoustic signals emitted from the virtual sound source positions and collected at the virtual microphone positions (S223E), and outputs it. Various conventional techniques can be used to compute the filter. For example, when the sound collection characteristics are to be differentiated so as to emphasize a target sound, the filter can be designed based on the minimum variance method. Let ω be the frequency, τ the frame number, A_(k,m)(ω) the transfer coefficient between the k-th sound source and the m-th microphone, and S_k(ω,τ) the acoustic signal emitted by the k-th sound source. The signal from the k-th sound source observed at the m-th microphone is then expressed as
X_m(ω,τ) = A_(k,m)(ω) S_k(ω,τ).
The output signal obtained by applying the filter coefficients W is expressed as
Y(ω,τ) = W^H(ω) X(ω,τ),
where X(ω,τ) = [X_1(ω,τ), X_2(ω,τ), …, X_M(ω,τ)], W(ω) = [W_1(ω), W_2(ω), …, W_M(ω)], and M is the total number of microphones.
Let a_m(ω) = [A_(1,m)(ω), A_(2,m)(ω), …, A_(K,m)(ω)] be the transfer coefficients between the K sound sources and the m-th microphone, and design a filter W(ω) that minimizes P under the following constraints.
[Equation M000001]
[Equation M000002]
[Equation M000003]
[Equation M000004]
Here the original signals are assumed to be uncorrelated.
[Equation M000005]
Examples of differentiating the sound collection characteristics include echo cancellation, noise cancellation, dereverberation, and speech enhancement; a filter suited to each kind of processing can be set using conventional techniques.
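The five equations above are preserved only as image placeholders in this text extraction, so their exact form is not reproduced here. As a hedged illustration, a standard minimum-variance (MVDR) solution of this kind of constrained minimization can be sketched as follows; it assumes unit-power, uncorrelated sources and adds diagonal loading, and may differ in detail from the patented formulation.

```python
import numpy as np

def mvdr_filter(A, target_k, diag_load=1e-3):
    """Minimum-variance weights for one frequency bin.
    A[k, m]: transfer coefficient from source k to microphone m.
    Returns W minimizing the output power under W^H a = 1, where a is
    the steering vector of the target source."""
    K, M = A.shape
    # observation x = A.T @ s, so R = E[x x^H] = A.T @ conj(A) for
    # unit-power, uncorrelated sources; loading keeps R invertible
    R = A.T @ A.conj() + diag_load * np.eye(M)
    a = A[target_k]                        # steering vector, length M
    Rinv_a = np.linalg.solve(R, a)
    return Rinv_a / (a.conj() @ Rinv_a)    # W = R^{-1}a / (a^H R^{-1} a)
```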
<Output unit 225>
The output unit 225 receives the filter W and outputs it to the filtering unit 250 (S225). For example, the output unit 225 is an output interface for outputting the filter W from the computer constituting the filter design device 220. When that computer includes the filtering unit 250, described below, the output unit 225 is included in the filter calculation unit 223.
<Filtering unit 250>
The filtering unit 250 receives the filter W prior to the conversion processing. The filtering unit 250 receives the acoustic signal X collected by microphones actually placed at the virtual microphone positions described above, and multiplies the acoustic signal X by the filter W to obtain the output signal Y. The output signal Y may be stored in a storage unit (not shown), reproduced from a speaker or the like (not shown), or transmitted to a device installed in another space. With this configuration, an output signal Y in which the collection characteristics of the acoustic signal X are differentiated can be obtained.
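For example, with a short-time Fourier transform of the microphone signals, applying the designed per-bin coefficients realizes Y(ω,τ) = W^H(ω)X(ω,τ); a minimal sketch, with array shapes assumed for illustration:

```python
import numpy as np

def apply_filter(W, X_stft):
    """Y(ω, τ) = W^H(ω) X(ω, τ) for every frequency bin and frame.
    W: (n_bins, M) filter coefficients; X_stft: (n_bins, M, n_frames)."""
    return np.einsum('fm,fmt->ft', W.conj(), X_stft)
```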
<Effect>
With this configuration, a desired filter can be designed without a skilled engineer visiting the site.
<Other modifications>
The present invention is not limited to the embodiments and modifications described above. For example, the various kinds of processing described above may be executed not only in time series following the description but also in parallel or individually, according to the processing capability of the device executing them or as needed. Other changes may be made as appropriate without departing from the spirit of the present invention.
<Program and recording medium>
The various kinds of processing described above can be carried out by loading a program that executes each step of the above methods into the storage unit 2020 of the computer shown in FIG. 15 and causing the control unit 2010, the input unit 2030, the output unit 2040, and so on to operate.
The program describing this processing content can be recorded on a computer-readable recording medium. The computer-readable recording medium may be of any kind, for example a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory.
The program is distributed, for example, by selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM on which it is recorded. The program may also be distributed by storing it in the storage device of a server computer and transferring it from the server computer to other computers via a network.
A computer that executes such a program first stores, for example, the program recorded on the portable recording medium or transferred from the server computer in its own storage device. When executing the processing, the computer reads the program stored in its own recording medium and executes the processing according to the read program. As other ways of executing the program, the computer may read the program directly from the portable recording medium and execute the processing according to it, or it may execute the processing according to the received program successively each time the program is transferred to it from the server computer. The above processing may also be executed by a so-called ASP (Application Service Provider) type service, which realizes the processing functions only through execution instructions and result acquisition, without transferring the program from the server computer to the computer. The program in this embodiment includes information that is used for processing by an electronic computer and is equivalent to a program (such as data that are not direct commands to the computer but have the property of defining the computer's processing).
In this embodiment the device is configured by executing a predetermined program on a computer, but at least part of the processing content may instead be realized in hardware.

Claims (8)

1. A microphone position presentation method comprising:
an acquisition step of acquiring point cloud data of a space in which sound is collected;
a microphone position calculation step of calculating where in the shape of the space estimated from the point cloud data to install a microphone so as to satisfy a condition relating to a desired sound; and
a presentation step of presenting the calculated microphone installation position.
2. The microphone position presentation method according to claim 1, wherein
the microphone position calculation step calculates a spatial transfer function in the space using the shape of the space, and calculates the installation position of the microphone using the calculated spatial transfer function.
3. The microphone position presentation method according to claim 1, wherein the microphone position calculation step includes:
a wall surface extraction step of extracting, using the shape of the space, the wall surfaces forming the space; and
a defect complementing step of, when an object other than a wall surface is present in the space and the shape of part of a wall surface cannot be estimated, estimating the shape of that part from the estimated shapes of the other parts of the wall surfaces, and complementing the shape of the space using the estimated part of the wall surfaces.
4. A microphone position presentation method comprising:
an input step of accepting input of information based on point cloud data of a space in which sound is collected;
a microphone position calculation step of calculating where in the shape of the space estimated from the information based on the point cloud data to install a microphone so as to satisfy a condition relating to a desired sound; and
an output step of outputting the calculated microphone installation position to presentation means.
5. A microphone position presentation method comprising:
a space shape estimation step of estimating the shape of a space in which sound is collected from point cloud data of the space;
a wall surface extraction step of extracting, using the estimated shape of the space, the wall surfaces forming the space;
a defect complementing step of, when an object other than a wall surface is present in the space and the shape of part of a wall surface cannot be estimated, estimating the shape of that part from the estimated shapes of the other parts of the wall surfaces, and complementing the shape of the space using the estimated part of the wall surfaces;
a spatial transfer function calculation step of calculating a spatial transfer function in the space using the complemented shape of the space; and
a sound pressure probability distribution estimation step of estimating a sound pressure probability distribution with respect to a virtual sound source position using the calculated spatial transfer function, and finding a microphone position satisfying a condition relating to a desired sound using the estimated sound pressure probability distribution.
6. A microphone position presentation device comprising:
an acquisition unit that acquires point cloud data of a space in which sound is collected;
a microphone position calculation unit that calculates where in the shape of the space estimated from the point cloud data to install a microphone so as to satisfy a condition relating to a desired sound; and
a presentation unit that presents the calculated microphone installation position.
7. A microphone position presentation device comprising:
an input unit that accepts input of information based on point cloud data of a space in which sound is collected;
a microphone position calculation unit that calculates where in the shape of the space estimated from the information based on the point cloud data to install a microphone so as to satisfy a condition relating to a desired sound; and
an output unit that outputs the calculated microphone installation position to presentation means.
8. A program for causing a computer to execute the microphone position presentation method according to claim 4.
PCT/JP2021/000667 2021-01-12 2021-01-12 Microphone position presentation method, device therefor, and program WO2022153359A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/JP2021/000667 WO2022153359A1 (en) 2021-01-12 2021-01-12 Microphone position presentation method, device therefor, and program

Publications (1)

Publication Number Publication Date
WO2022153359A1 true WO2022153359A1 (en) 2022-07-21

Family

ID=82447015

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2021/000667 WO2022153359A1 (en) 2021-01-12 2021-01-12 Microphone position presentation method, device therefor, and program

Country Status (1)

Country Link
WO (1) WO2022153359A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2005115291A (en) * 2003-10-10 2005-04-28 Yamaha Corp Audio equipment layout support apparatus, program, and acoustic system
JP2010505153A (en) * 2005-10-07 2010-02-18 アノクシス・アーゲー Method for monitoring space and apparatus for performing the method
CN106255031A (en) * 2016-07-26 2016-12-21 北京地平线信息技术有限公司 Virtual sound field generator and virtual sound field production method
US20190176982A1 (en) * 2017-12-07 2019-06-13 Harman International Industries, Incorporated Drone deployed speaker system

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 21919256; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 21919256; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: JP)