WO2023217079A1 - Sound source identification method and apparatus based on a microphone array, and electronic device - Google Patents

Sound source identification method and apparatus based on a microphone array, and electronic device

Info

Publication number
WO2023217079A1
WO2023217079A1 PCT/CN2023/092735 CN2023092735W WO2023217079A1 WO 2023217079 A1 WO2023217079 A1 WO 2023217079A1 CN 2023092735 W CN2023092735 W CN 2023092735W WO 2023217079 A1 WO2023217079 A1 WO 2023217079A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound source
identified
index set
microphone array
matrix
Prior art date
Application number
PCT/CN2023/092735
Other languages
English (en)
French (fr)
Inventor
匡正
毛峻伟
丁林宁
Original Assignee
苏州清听声学科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 苏州清听声学科技有限公司
Publication of WO2023217079A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves

Definitions

  • the present invention relates to the technical field of sound source identification, and in particular to sound source identification methods, devices and electronic equipment based on microphone arrays.
  • Microphone arrays are commonly used for sound source identification in fields such as aeroacoustic measurement and traffic noise control.
  • Small-aperture microphone arrays are used on a large scale in practical application scenarios because of their small size and portability.
  • Traditional sound source identification methods, such as delay-and-sum (DAS) beamforming, produce a wide output main lobe with a small-aperture array, so multiple sound sources interfere with one another and identification performance is seriously degraded. A sound source identification method that works with small-aperture arrays is therefore needed.
  • the purpose of the present invention is to provide a sound source identification method, device and electronic equipment based on a microphone array, which can achieve high sound source identification performance based on any arrangement of the microphone array.
  • a sound source identification method based on a microphone array includes:
  • the scanning matrix of the microphone array is determined based on the microphone array surface where the microphone array is arbitrarily arranged and the grid scanning surface to be identified; the grid scanning surface to be identified includes at least one target sound source to be identified;
  • the iteration is terminated when the preset iteration termination condition is met, to obtain the target scan matrix corresponding to the third atomic index set, and the sound source information of the identified target sound source included in the grid scanning surface to be identified is obtained according to the sample covariance matrix and the target scan matrix.
  • determining the scanning matrix of the microphone array based on the microphone array surface of the arbitrarily arranged microphone array and the grid scanning surface to be identified includes:
  • the scanning matrix of the microphone array is determined based on the microphone array surface and the grid scanning surface to be identified.
  • the microphone array includes M array elements, and the scan data is time-domain data; obtaining the corresponding sample covariance matrix based on the scan data of the grid scanning surface to be identified within a preset time period includes:
  • framing the scan data of the grid scanning surface to be identified acquired by the microphone array within the preset time period;
  • a sample covariance matrix within a preset time period is obtained according to the signal data.
  • the iterative search for the target index position in the scan matrix that has the largest orthogonal projection to the sample covariance matrix to update the first atomic index set to obtain the second atomic index set includes:
  • the first target index position is added to the first atomic index set to obtain a second atomic index set.
  • re-identifying any sound source in the second atomic index set after the current round of iteration to update the second atomic index set to obtain a third atomic index set includes:
  • the updated index position is added to the second atomic index set to obtain a third atomic index set.
  • the method further includes:
  • the corresponding second residual is calculated after the current round of iteration is completed and the index positions of all currently identified sound sources are updated;
  • the update termination condition is: the difference between the first residual and the second residual does not exceed a preset threshold, or a preset number of cycles is reached.
  • after the iteration is terminated to obtain the target scan matrix corresponding to the third atomic index set, the sound source information of the identified target sound source included in the grid scanning surface to be identified is obtained according to the sample covariance matrix and the target scan matrix, as shown in the following formula (1):
  • a sound source identification device based on a microphone array includes:
  • the first processing module is used to determine the scanning matrix of the microphone array based on the microphone array surface where the microphone array is arbitrarily arranged and the grid scanning surface to be identified; the grid scanning surface to be identified includes at least one target sound source to be identified;
  • the second processing module is used to obtain the corresponding sample covariance matrix based on the scanning data of the grid scanning surface to be identified within a preset time period;
  • the third processing module is used to iteratively search the target index position in the scan matrix that has the largest orthogonal projection to the sample covariance matrix to update the first atomic index set to obtain the second atomic index set, the first atomic index Any index position included in the set or the second atomic index set respectively corresponds to the corresponding identified sound source;
  • a fourth processing module configured to re-identify any sound source in the second atomic index set after the current round of iteration to update the second atomic index set to obtain a third atomic index set;
  • the fifth processing module is configured to terminate the iteration when the preset iteration termination condition is met to obtain the target scan matrix corresponding to the third atomic index set, and to obtain, according to the sample covariance matrix and the target scan matrix, the sound source information of the identified target sound source included in the grid scanning surface to be identified.
  • an electronic device including:
  • a memory associated with the one or more processors is used to store program instructions.
  • when the program instructions are read and executed by the one or more processors, the method described in any one of the first aspect is performed.
  • a fourth aspect provides a computer-readable storage medium on which a computer program is stored, characterized in that when the computer program is executed by one or more processors, the steps of the method according to any one of the first aspect are implemented.
  • the present invention has the following beneficial effects:
  • the present invention provides a sound source identification method, device and electronic equipment based on a microphone array.
  • the method includes: determining the scanning matrix of the microphone array based on the microphone array surface where the microphone array is arbitrarily arranged and the grid scanning surface to be identified; obtaining the corresponding sample covariance matrix from the scan data of the grid scanning surface to be identified within a preset time period; iteratively searching for the target index position in the scan matrix that has the largest orthogonal projection to the sample covariance matrix to update the first atomic index set and obtain the second atomic index set; and obtaining the sound source information of the identified target sound source included in the grid scanning surface to be identified according to the sample covariance matrix and the target scanning matrix.
  • the sound source identification method combines covariance-fitting sound source identification based on orthogonal least squares with the idea of global backtracking, which re-examines and corrects each added sound source.
  • This method can exploit the block sparsity of sparse coherent sound sources to identify the current source and its covariance with the previous source in one pass, which makes covariance matrix estimation of coherent sources practical and removes the restriction to a specific array element arrangement.
  • When the sound source frequency is too low, the sound source spacing is too close, or the measurement distance is too far, this method also reduces the impact of the correlation of the array's scanning matrix on the identification results, which reduces mutual interference among multiple sound sources at low frequencies and effectively improves identification performance and accuracy.
  • Figure 1 is a flow chart of the sound source identification method based on the microphone array in this embodiment
  • Figure 2 is a simulation diagram of the three-dimensional coordinate system of the microphone array, the microphone array, and the grid scanning surface to be identified established in this embodiment;
  • Figure 3 is a comparison diagram of the sound source identification results obtained by the simulation experiment between the sound source identification method based on the microphone array and the DAS beamforming method in this embodiment;
  • Figure 4 is a diagram of the root mean square error result of source positioning in the frequency dimension obtained by the simulation experiment in this embodiment;
  • Figure 5 is a diagram of the root mean square error result of the source intensity in the frequency dimension obtained by the simulation experiment in this embodiment
  • Figure 6 is a diagram of the root mean square error result of source positioning in the sound spacing dimension obtained from the simulation experiment in this embodiment
  • Figure 7 is a diagram of the root mean square error result of the source intensity in the sound spacing dimension obtained by the simulation experiment in this embodiment
  • Figure 8 is a diagram of the root mean square error result of source positioning in the measurement spacing dimension obtained by the simulation experiment in this embodiment
  • Figure 9 is a diagram of the root mean square error result of the source intensity in the measurement spacing dimension obtained by the simulation experiment in this embodiment.
  • Figure 10 is a schematic structural diagram of a computer-readable storage medium in this embodiment.
  • this embodiment provides a sound source identification method, device and electronic equipment based on a microphone array, which can effectively solve the above problems. Further detailed description will be given below with reference to specific embodiments.
  • this embodiment provides a sound source identification method based on a microphone array.
  • the method includes the following steps:
  • the grid scanning surface to be identified includes at least one target sound source to be identified.
  • step S1 includes:
  • the grid scanning plane to be identified includes two sound sources to be identified: source 1 and source 2. We need to identify the positions, power and source covariance of source 1 and source 2 respectively.
  • the position of each array element in the microphone array is determined.
  • the scan data is time domain data.
  • step S2 includes:
  • $p \in \mathbb{C}^{M \times 1}$ represents the data signal received by the M array elements.
  • the signal data includes sound source parameters such as sound pressure and sound intensity, but is not limited to these.
  • $f$ is the specified sound source detection frequency
  • $s \in \mathbb{C}^{N \times 1}$ is the intensity of the sound source signal at the grid points
  • $n \in \mathbb{C}^{M \times 1}$ represents environmental noise.
  • $A = [a_1, a_2, \ldots, a_N] \in \mathbb{C}^{M \times N}$ is the scanning matrix of the microphone array
  • $a_n \in \mathbb{C}^{M \times 1}$ represents the steering vector of the n-th grid point
  • $a_n$ is calculated as shown in formula (3):
  • $r_n$: the distance from the n-th grid point to the coordinate origin
  • $i$: the imaginary unit
  • $\omega$: the angular frequency
  • $c$: the sound speed
  • $T$: the transpose of the matrix
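Formula (3) is not reproduced in this text, so the following is only a sketch of how one column of the scanning matrix might be computed. It assumes a common free-field monopole form referenced to the coordinate origin; the function name `steering_vector`, the default sound speed of 343 m/s, and the normalization are assumptions for illustration, not the patent's definition.

```python
import cmath
import math


def steering_vector(mic_positions, grid_point, freq_hz, c=343.0):
    """Hypothetical free-field steering vector for one grid point.

    mic_positions: list of (x, y, z) tuples, one per array element.
    grid_point:    (x, y, z) of the scanned grid point.
    Assumed form:  (r_n / r_nm) * exp(-i * omega * (r_nm - r_n) / c),
    where r_n is the distance from the grid point to the origin and
    r_nm the distance from the grid point to the m-th array element.
    """
    omega = 2.0 * math.pi * freq_hz                     # angular frequency
    r_n = math.dist(grid_point, (0.0, 0.0, 0.0))
    vec = []
    for mic in mic_positions:
        r_nm = math.dist(grid_point, mic)
        vec.append((r_n / r_nm) * cmath.exp(-1j * omega * (r_nm - r_n) / c))
    return vec
```

Stacking such vectors for all N grid points, column by column, would give an M × N scanning matrix for an arbitrary element layout, since only the element coordinates enter the computation.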
  • sample covariance matrix G is calculated as follows:
  • Step S3: iteratively search for the target index position in the scan matrix that has the largest orthogonal projection to the sample covariance matrix to update the first atomic index set to obtain the second atomic index set.
  • In step S3, each round of iterative search may discover at least one new sound source to be identified, so step S3 is used to discover at least one new sound source to be identified and add it to the atomic index set.
  • In step S3, there is mutual interference between multiple sound sources, so after step S3, the sound sources in the atomic index set need to be re-examined after each iteration, as described in step S4 below.
  • initialize the residual $R_0 = G$
  • Step S3 includes:
  • step S4 Re-identify any sound source in the second atomic index set ⁇ l after the current round of iteration (l) to update the second atomic index set ⁇ l to obtain the third atomic index set.
  • step S4 includes:
  • the index position is re-searched to obtain the updated index position n ⁇ so that the orthogonal projection on the space spanned by the identified atoms and the selected index atoms is the largest.
  • the updated index position n ⁇ is calculated as shown in formula (10):
  • step S4 also includes:
  • the update termination condition is: the difference between the first residual and the second residual does not exceed a preset threshold, or reaches a preset number of cycles.
  • S47 and S48 can be executed selectively.
  • the preset iteration termination condition can be that the residual value after iteration is less than the preset empirical value, or on the premise that the number of sound sources is clear, the number of iterations is not less than the number of sound sources.
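The two alternative termination conditions described above can be written as a small predicate. This is a minimal sketch; the function name `should_stop` and its parameters are hypothetical, and the residual is assumed to be summarized as a single non-negative number (for example a matrix norm).

```python
def should_stop(residual_norm, iteration, residual_tol=None, num_sources=None):
    """Return True when either assumed termination condition holds.

    residual_tol: preset empirical value for the residual (optional).
    num_sources:  known number of sound sources (optional).
    """
    # Condition 1: the residual after this iteration is below the empirical value.
    if residual_tol is not None and residual_norm < residual_tol:
        return True
    # Condition 2: the source count is known and enough iterations have run.
    if num_sources is not None and iteration >= num_sources:
        return True
    return False
```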
  • step S5 the sound source information of the identified target sound source included in the grid scanning plane to be identified is obtained according to the sample covariance matrix and the target scanning matrix, as shown in the following formula (1):
  • the simulation experiment verification method is as follows:
  • FIG. 3 is a comparison diagram of the sound source identification results obtained by the simulation experiment between the sound source identification method based on the microphone array and the DAS beamforming method in this embodiment.
  • the asterisk represents the actual position of the sound source
  • the sound source output result of the DAS beamforming method is the peak of the cloud diagram
  • the circle represents the output result of the sound source identification in this embodiment, and their size is proportional to the sound source intensity. It can be seen that the main lobe of the DAS beamforming output result of the strong source (source 2) is too wide, interfering with source 1, causing the identification position of source 1 to be seriously shifted, and the identification accuracy is poor.
  • In contrast, the identification results of the sound source identification method based on the microphone array are not affected by the above interference and are more accurate.
  • the recognition error of the sound source identification method in this embodiment is calculated in the dimensions of frequency, sound source spacing, and measurement spacing.
  • the results are shown in Figures 4 to 9.
  • the root mean square error (RMS) performance parameter is introduced, defined as the following formula (13) (taking source 1 as an example).
  • the number of Monte Carlo simulation runs T is 200.
  • X 1,t represents the source position identified in the tth simulation of source 1 when describing the root mean square error of source positioning
  • X 1 represents the true position of source 1.
  • the root mean square error ($\mathrm{m}^2$) of source positioning in different dimensions is basically less than $10^{-3}$, and even less than $10^{-4}$ under some variables.
  • the root mean square error ($\mathrm{dB}^2$) of the source intensity in different dimensions is basically less than $10^{0}$, and even less than $10^{-1}$ and $10^{-2}$ under some variables. Therefore, the sound source identification method based on the microphone array has smaller identification error and higher accuracy.
  • the sound source identification method based on the microphone array provided in this embodiment has a small root mean square error of source positioning and of source intensity under different frequencies, sound source spacings and measurement spacings. It can be seen that combining covariance-fitting sound source identification based on orthogonal least squares with the idea of global backtracking not only exploits the block sparsity of sparse coherent sound sources to identify the current source and its covariance with the previous source in one pass, which makes covariance matrix estimation of coherent sources practical and no longer limited to a specific array element arrangement,
  • but also reduces the impact of the correlation of the array's scanning matrix on the identification results when the sound source frequency is too low, the sound source spacing is too close, or the measurement distance is too far, which reduces mutual interference among multiple sound sources at low frequencies and effectively improves identification performance and accuracy.
  • this embodiment further provides a sound source identification device based on a microphone array, which device includes:
  • the first processing module is used to determine the scanning matrix of the microphone array based on the microphone array surface where the microphone array is arbitrarily arranged and the grid scanning surface to be identified; the grid scanning surface to be identified includes at least one target sound source to be identified;
  • the second processing module is used to obtain the corresponding sample covariance matrix based on the scanning data of the grid scanning surface to be identified within a preset time period;
  • the third processing module is used to iteratively search the target index position in the scan matrix that has the largest orthogonal projection to the sample covariance matrix to update the first atomic index set to obtain the second atomic index set, the first atomic index Any index position included in the set or the second atomic index set respectively corresponds to the corresponding identified sound source;
  • a fourth processing module configured to re-identify any sound source in the second atomic index set after the current round of iteration to update the second atomic index set to obtain a third atomic index set;
  • the fifth processing module is configured to terminate the iteration when the preset iteration termination condition is met to obtain the target scan matrix corresponding to the third atomic index set, and obtain the identified scan matrix according to the sample covariance matrix and the target scan matrix.
  • the sound source information of the target sound source included in the grid scanning plane to be identified.
  • the first processing module includes:
  • a construction unit for establishing a three-dimensional coordinate system of the microphone array determining the microphone array surface where the microphone array is arbitrarily arranged and the grid scanning surface to be identified in the three-dimensional coordinate system of the microphone array;
  • a first processing unit configured to determine the scanning matrix of the microphone array based on the microphone array surface and the grid scanning surface to be identified.
  • the second processing module includes:
  • the second processing unit is used to divide into frames the scan data of the grid scanning surface to be identified acquired by the microphone array within the preset time period;
  • a conversion unit configured to convert the framed scanning data into frequency domain data through fast Fourier transform
  • An acquisition unit configured to acquire signal data of M array elements on the microphone based on the frequency domain data
  • the third processing unit obtains a sample covariance matrix within a preset time period based on the signal data.
  • the third processing module includes:
  • a search unit configured to search the first target index position in the scan matrix that has the largest orthogonal projection to the sample covariance matrix and calculate the corresponding first residual when performing the current round of iteration;
  • a first adding unit configured to add the first target index position to the first atomic index set to obtain the second atomic index set.
  • the fourth processing module includes:
  • the determination unit is used to determine all currently recognized sound sources after the current round of iteration is completed
  • the deletion unit is used to delete the target index position corresponding to any first sound source among all currently recognized sound sources;
  • a first calculation unit configured to calculate and obtain the temporary residual corresponding to the first sound source based on the currently identified remaining sound sources except the first sound source and the sample covariance matrix
  • An identification unit configured to re-identify the updated index position of the first sound source based on the temporary residual
  • a second adding unit is configured to add the updated index position to the second atomic index set to obtain a third atomic index set.
  • the second calculation unit is used to calculate the corresponding second residual after the current round of iteration is completed and the index positions of all currently identified sound sources are updated;
  • the judgment unit is used to judge whether the preset update termination condition is met, and if so, start the next round of iteration; otherwise, cyclically re-identify any sound source in the third atomic index set to update the third atomic index set and obtain a fourth atomic index set.
  • the update termination condition is: the difference between the first residual and the second residual does not exceed a preset threshold, or reaches a preset number of cycles.
  • the fifth processing module is specifically used to, after the iteration is terminated to obtain the target scan matrix corresponding to the third atomic index set, obtain the sound source information of the identified target sound source included in the grid scanning surface to be identified according to the sample covariance matrix and the target scan matrix, as shown in the following formula (1):
  • when the sound source identification device based on the microphone array provided in the above embodiment triggers the sound source identification service based on the microphone array, the division into the above functional modules is only used as an example.
  • In practical applications, the above functions can be allocated to different functional modules as needed, that is, the internal structure of the system can be divided into different functional modules to complete all or part of the functions described above.
  • the sound source identification device based on the microphone array provided by the above embodiments and the sound source identification method based on the microphone array belong to the same concept, that is, the system is based on this method.
  • For the specific implementation process, please refer to the method embodiments; details are not repeated here.
  • this embodiment also provides an electronic device, including:
  • a memory associated with the one or more processors is used to store program instructions.
  • When the program instructions are read and executed by the one or more processors, the aforementioned sound source identification method based on the microphone array is performed.
  • this embodiment also provides a computer-readable storage medium 31 on which a computer program 310 is stored.
  • When the computer program is executed by one or more processors 32, the aforementioned sound source identification method based on the microphone array is implemented.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples (non-exhaustive list) of computer-readable storage media include: electrical connections having one or more conductors, portable computer disks, hard drives, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave carrying computer-readable program code therein. Such propagated data signals may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the above.
  • a computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • Program code embodied on a computer-readable storage medium may be transmitted using any suitable medium, including but not limited to wireless, wire, optical cable, RF, etc., or any suitable combination of the foregoing.
  • the client and server can communicate using any currently known or future developed network protocol such as HTTP (HyperText Transfer Protocol), and can be interconnected with digital data communication in any form or medium (e.g., a communications network).
  • Examples of communications networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; it may also exist independently without being assembled into the electronic device.
  • Computer program code for performing the operations of the present invention may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (such as through the Internet using an Internet service provider).
  • each block in the flowchart or block diagram may represent a module, segment, or portion of code that contains one or more executable instructions for implementing the specified logical functions.
  • the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown one after another may actually execute substantially in parallel, or they may sometimes execute in the reverse order, depending on the functionality involved.
  • each block of the block diagram and/or flowchart illustration, and combinations of blocks in the block diagram and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or operations, or can be implemented using a combination of special purpose hardware and computer instructions.
  • the units involved in the embodiments of the present disclosure can be implemented in software or hardware. Among them, the name of a unit does not constitute a limitation on the unit itself under certain circumstances.
  • FPGAs Field Programmable Gate Arrays
  • ASICs Application Specific Integrated Circuits
  • ASSPs Application Specific Standard Products
  • SOCs Systems on Chips
  • CPLD Complex Programmable Logical device
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, devices or devices, or any suitable combination of the foregoing.
  • machine-readable storage media would include one or more wire-based electrical connections, laptop disks, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage device, magnetic storage device, or any suitable combination of the above.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

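A sound source identification method and apparatus based on a microphone array, and an electronic device. The sound source identification method combines covariance-fitting sound source identification based on orthogonal least squares with the idea of global backtracking, which re-examines and corrects each added sound source. The method can exploit the block sparsity of sparse coherent sound sources to identify the current source and its covariance with the previous source in one pass, which makes covariance matrix estimation of coherent sources practical and no longer limited to a specific array element arrangement. When the sound source frequency is too low, the sound source spacing is too close, or the measurement distance is too far, it also reduces the impact of the correlation of the array's scanning matrix on the identification results, which reduces mutual interference among multiple sound sources at low frequencies and effectively improves identification performance and accuracy.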

Description

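Sound source identification method, apparatus, and electronic device based on a microphone array
Technical Field
The present invention relates to the technical field of sound source identification, and in particular to a sound source identification method, apparatus, and electronic device based on a microphone array.
Background Art
As modern society develops, people expect ever higher standards of the acoustic environment, so keeping it comfortable in everyday settings, by reducing noise or removing abnormal sounds, is necessary. A basic problem in doing so is identifying sound sources of different origins. With increasingly strict acoustic quality standards, particularly in the transportation industry, the need for dedicated techniques to localize, quantify, and rank sound sources has become critical.
Microphone arrays are commonly used for sound source identification in fields such as aeroacoustic measurement and traffic noise control. Small-aperture microphone arrays are compact and portable and are therefore widely deployed in practice. With a small-aperture array, conventional methods such as delay-and-sum (DAS) beamforming produce an output with a wide main lobe, so multiple sources interfere with one another and identification performance degrades severely. A sound source identification method suited to small-aperture arrays is therefore needed.
In practice, coherent source signals arise when the propagation environment is complex or when extended sources radiate distributed signals. The signal covariance matrix then becomes rank-deficient, and conventional identification methods give wrong results. Arrays with particular layouts (translation-invariant or symmetric) can address this with forward-backward smoothing, but at the price of a reduced array aperture and higher cost.
A sound source identification method is therefore needed that works with an arbitrarily arranged microphone array and effectively improves identification performance.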
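Summary of the Invention
The object of the present invention is to provide a sound source identification method, apparatus, and electronic device based on a microphone array that achieve high identification performance with an arbitrarily arranged microphone array.
To achieve this object, the present invention proposes the following technical solutions:
In a first aspect, a sound source identification method based on a microphone array is provided, the method comprising:
determining a scan matrix of the microphone array based on the microphone array plane of an arbitrarily arranged microphone array and a grid scanning plane to be identified, the grid scanning plane to be identified including at least one target sound source to be identified;
obtaining a corresponding sample covariance matrix from scan data of the grid scanning plane to be identified collected over a preset duration;
iteratively searching the scan matrix for the target index position whose orthogonal projection with the sample covariance matrix is largest, so as to update a first atom index set into a second atom index set, where every index position in the first atom index set or the second atom index set corresponds to an identified sound source;
re-identifying any sound source in the second atom index set after the current iteration, so as to update the second atom index set into a third atom index set;
terminating the iteration when a preset iteration termination condition is met to obtain a target scan matrix corresponding to the third atom index set, and obtaining, from the sample covariance matrix and the target scan matrix, sound source information of the identified target sound sources in the grid scanning plane to be identified.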
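In a preferred implementation, determining the scan matrix of the microphone array based on the microphone array plane of the arbitrarily arranged microphone array and the grid scanning plane to be identified includes:
establishing a microphone array three-dimensional coordinate system;
determining, in the microphone array three-dimensional coordinate system, the microphone array plane of the arbitrarily arranged microphone array and the grid scanning plane to be identified;
determining the scan matrix of the microphone array based on the microphone array plane and the grid scanning plane to be identified.
In a preferred implementation, the microphone array includes M array elements and the scan data are time-domain data; obtaining the corresponding sample covariance matrix from the scan data of the grid scanning plane to be identified collected over the preset duration includes:
dividing into frames the scan data of the grid scanning plane to be identified acquired by the microphone array over the preset duration;
converting the framed scan data into frequency-domain data by fast Fourier transform;
obtaining, from the frequency-domain data, the signal data of the M array elements of the microphone array;
obtaining the sample covariance matrix over the preset duration from the signal data.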
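In a preferred implementation, iteratively searching the scan matrix for the target index position whose orthogonal projection with the sample covariance matrix is largest, so as to update the first atom index set into the second atom index set, includes:
in the current iteration, searching the scan matrix for the first target index position whose orthogonal projection with the sample covariance matrix is largest, and computing the corresponding first residual;
adding the first target index position to the first atom index set to obtain the second atom index set.
In a preferred implementation, re-identifying any sound source in the second atom index set after the current iteration, so as to update the second atom index set into the third atom index set, includes:
after the current iteration is complete, determining all sound sources identified so far;
deleting the target index position corresponding to any first sound source among all sound sources identified so far;
computing a temporary residual corresponding to the first sound source from the remaining identified sound sources, other than the first sound source, and the sample covariance matrix;
re-identifying an updated index position of the first sound source based on the temporary residual;
adding the updated index position to the second atom index set to obtain the third atom index set.
In a preferred implementation, after the third atom index set is obtained, the method further includes:
computing the corresponding second residual after the current iteration is complete and the index positions of all sound sources identified so far have been updated;
starting the next iteration when a preset update termination condition is met;
otherwise, re-identifying any sound source in the third atom index set in a further cycle, so as to update the third atom index set into a fourth atom index set;
where the update termination condition is that the difference between the first residual and the second residual does not exceed a preset threshold, or that a preset number of cycles has been reached.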
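In a preferred implementation, after the iteration is terminated to obtain the target scan matrix corresponding to the third atom index set, the sound source information of the identified target sound sources in the grid scanning plane to be identified is obtained from the sample covariance matrix and the target scan matrix according to formula (1):
$$\hat{C} = A_{\Lambda}^{+}\, G\, \left(A_{\Lambda}^{+}\right)^{H} \tag{1}$$
where $\hat{C}$ is the source covariance matrix, $A_{\Lambda}^{+}$ is the Moore-Penrose inverse of the target scan matrix corresponding to the third atom index set, and $G$ is the sample covariance matrix.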
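In a second aspect, a sound source identification apparatus based on a microphone array is provided, the apparatus comprising:
a first processing module, configured to determine a scan matrix of the microphone array based on the microphone array plane of an arbitrarily arranged microphone array and a grid scanning plane to be identified, the grid scanning plane to be identified including at least one target sound source to be identified;
a second processing module, configured to obtain a corresponding sample covariance matrix from scan data of the grid scanning plane to be identified collected over a preset duration;
a third processing module, configured to iteratively search the scan matrix for the target index position whose orthogonal projection with the sample covariance matrix is largest, so as to update a first atom index set into a second atom index set, where every index position in the first atom index set or the second atom index set corresponds to an identified sound source;
a fourth processing module, configured to re-identify any sound source in the second atom index set after the current iteration, so as to update the second atom index set into a third atom index set;
a fifth processing module, configured to terminate the iteration when a preset iteration termination condition is met to obtain a target scan matrix corresponding to the third atom index set, and to obtain, from the sample covariance matrix and the target scan matrix, sound source information of the identified target sound sources in the grid scanning plane to be identified.
In a third aspect, an electronic device is provided, comprising:
one or more processors; and
a memory associated with the one or more processors, the memory being configured to store program instructions which, when read and executed by the one or more processors, perform the method of any implementation of the first aspect.
In a fourth aspect, a computer-readable storage medium is provided, on which a computer program is stored, the computer program, when executed by one or more processors, implementing the steps of the method of any implementation of the first aspect.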
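Compared with the prior art, the present invention has the following beneficial effects:
The present invention provides a sound source identification method, apparatus, and electronic device based on a microphone array. The method includes: determining a scan matrix of the microphone array based on the microphone array plane of an arbitrarily arranged microphone array and a grid scanning plane to be identified; obtaining a corresponding sample covariance matrix from scan data of the grid scanning plane collected over a preset duration; iteratively searching the scan matrix for the target index position whose orthogonal projection with the sample covariance matrix is largest, so as to update a first atom index set into a second atom index set; re-identifying any sound source in the second atom index set after the current iteration, so as to update the second atom index set into a third atom index set; and terminating the iteration when a preset iteration termination condition is met to obtain a target scan matrix corresponding to the third atom index set, and obtaining, from the sample covariance matrix and the target scan matrix, sound source information of the identified target sound sources in the grid scanning plane. The method combines covariance-fitting sound source identification by orthogonal least squares with a global backtracking scheme, in which the identified sound sources are re-examined and corrected each time a source is added. The method exploits the block sparsity of sparse coherent sources to identify the current source together with its covariance with the previous source in a single step, which makes covariance matrix estimation for coherent sources practical and removes the dependence on a particular array element layout. It also reduces the effect that correlation in the array's scan matrix has on the identification result when the source frequency is too low, the sources are too close together, or the measurement distance is too large, thereby mitigating mutual interference among multiple sources at low frequencies and improving identification performance and accuracy.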
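Brief Description of the Drawings
Fig. 1 is a flowchart of the sound source identification method based on a microphone array in this embodiment;
Fig. 2 is a simulation diagram of the microphone array three-dimensional coordinate system, the microphone array, and the grid scanning plane to be identified in this embodiment;
Fig. 3 compares the identification results obtained in simulation by the method of this embodiment and by DAS beamforming;
Fig. 4 shows the simulated source localization root-mean-square error as a function of frequency;
Fig. 5 shows the simulated source strength root-mean-square error as a function of frequency;
Fig. 6 shows the simulated source localization root-mean-square error as a function of source spacing;
Fig. 7 shows the simulated source strength root-mean-square error as a function of source spacing;
Fig. 8 shows the simulated source localization root-mean-square error as a function of measurement distance;
Fig. 9 shows the simulated source strength root-mean-square error as a function of measurement distance;
Fig. 10 is a schematic structural diagram of the computer-readable storage medium in this embodiment.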
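Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present invention.
In the description of the present invention, the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly specifying the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. Unless otherwise stated, "a plurality of" means two or more.
Small-aperture microphone arrays are now widely used for sound source identification, so a general-purpose identification method is needed that adapts to different array layouts, source frequencies, measurement distances, and other usage scenarios. To this end, this embodiment provides a sound source identification method, apparatus, and electronic device based on a microphone array that effectively address the problems above. A detailed description follows with reference to a specific embodiment.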
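Embodiment
As shown in Fig. 1, this embodiment provides a sound source identification method based on a microphone array, comprising the following steps:
S1. Determine the scan matrix of the microphone array based on the microphone array plane of an arbitrarily arranged microphone array and a grid scanning plane to be identified. The grid scanning plane to be identified includes at least one target sound source to be identified.
Specifically, step S1 includes:
S11. Establish the microphone array three-dimensional coordinate system shown in Fig. 2.
S12. Determine, in the microphone array three-dimensional coordinate system, the microphone array plane of the arbitrarily arranged microphone array and the grid scanning plane to be identified. The microphone array includes M array elements.
S13. Determine the scan matrix of the microphone array based on the microphone array plane and the grid scanning plane to be identified.
For example, as shown in Fig. 2, the grid scanning plane to be identified contains two sound sources to be identified, source 1 and source 2, and the position, power, and source covariance of each must be identified. In the microphone array three-dimensional coordinate system, the position of every array element is known.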
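Steps S11 to S13 can be sketched in Python as follows. This is a minimal illustration rather than the patented implementation: the function names `planar_grid` and `scan_matrix`, the random 16-element layout, the speed of sound of 343 m/s, and in particular the monopole steering-vector form are assumptions, because the text does not give the steering vector explicitly at this point.

```python
import numpy as np

def planar_grid(side=1.0, step=0.02, z=1.0):
    """Grid scanning plane: a square of the given side, centred on the z axis (assumed)."""
    n = int(round(side / step)) + 1
    axis = np.linspace(-side / 2, side / 2, n)
    gx, gy = np.meshgrid(axis, axis)
    return np.column_stack([gx.ravel(), gy.ravel(), np.full(gx.size, z)])

def scan_matrix(mic_xyz, grid_xyz, freq, c=343.0):
    """Return the M x N scan matrix whose n-th column is the steering vector a_n."""
    omega = 2.0 * np.pi * freq
    # r_nm: distance from grid point n to microphone m, stored with shape (M, N)
    r_nm = np.linalg.norm(mic_xyz[:, None, :] - grid_xyz[None, :, :], axis=2)
    # r_n: distance from grid point n to the coordinate origin, shape (N,)
    r_n = np.linalg.norm(grid_xyz, axis=1)
    # Assumed monopole form: amplitude ratio r_n / r_nm and phase delay relative to the origin
    return (r_n[None, :] / r_nm) * np.exp(-1j * omega * (r_nm - r_n[None, :]) / c)

# Hypothetical array: 16 microphones placed arbitrarily in the plane z = 0
rng = np.random.default_rng(0)
mics = np.column_stack([rng.uniform(-0.1, 0.1, (16, 2)), np.zeros(16)])
grid = planar_grid()
A = scan_matrix(mics, grid, freq=3000.0)
```

Because the microphone coordinates enter only through the distances `r_nm`, nothing in this sketch requires a regular or symmetric layout.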
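S2. Obtain the corresponding sample covariance matrix from scan data of the grid scanning plane to be identified collected over a preset duration. The scan data are time-domain data.
Specifically, step S2 includes:
S21. Divide into frames the scan data of the grid scanning plane to be identified acquired by the microphone array over the preset duration, giving frame 1, frame 2, …, frame K.
S22. Convert the framed scan data into frequency-domain data by fast Fourier transform (FFT).
S23. Obtain, from the frequency-domain data, the signal data $p$ of the M array elements of the microphone array, where $p \in \mathbb{C}^{M \times 1}$ denotes the data signal received by the M elements. The signal data include, but are not limited to, sound source parameters such as sound pressure and sound intensity. The signal data $p$ are calculated by formula (2):
$$p(k,f) = A\,s(k,f) + n(k,f) \tag{2}$$
where $k = 1, 2, \dots, K$ indexes the frames of signal data, $f$ is the specified detection frequency, $s \in \mathbb{C}^{N \times 1}$ is the strength of the source signal at the grid points, and $n \in \mathbb{C}^{M \times 1}$ is the ambient noise. $A = [a_1, a_2, \dots, a_N] \in \mathbb{C}^{M \times N}$ is the scan matrix of the microphone array, and $a_n \in \mathbb{C}^{M \times 1}$ is the steering vector of the n-th grid point, calculated by formula (3):
[Formula (3) is not reproduced in this text.]
where $r_{nm}$ is the distance from the n-th grid point to the m-th microphone, $r_n$ is the distance from the n-th grid point to the coordinate origin, $i$ is the imaginary unit, $\omega$ is the angular frequency, $c$ is the speed of sound, and $T$ denotes the matrix transpose.
S24. Obtain the sample covariance matrix $G$ over the preset duration from the signal data. The sample covariance matrix $G$ is calculated by formula (4):
$$G = \frac{1}{K}\sum_{k=1}^{K} p(k,f)\,p(k,f)^{H} \tag{4}$$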
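Steps S21 to S24 can be sketched as below. This is an illustrative sketch under stated assumptions: the function name `sample_covariance`, the frame length of 1024 samples, the Hann window, non-overlapping frames, and the choice of the FFT bin nearest the detection frequency are not taken from the text, and the sample covariance is assumed to be the frame average of the outer products of the per-frame spectra.

```python
import numpy as np

def sample_covariance(x, fs, freq, frame_len=1024):
    """x: (M, n_samples) time-domain microphone signals.
    Returns the M x M sample covariance matrix G at the FFT bin nearest freq."""
    M, n_samples = x.shape
    K = n_samples // frame_len                       # number of complete frames (S21)
    frames = x[:, :K * frame_len].reshape(M, K, frame_len)
    window = np.hanning(frame_len)                   # assumed analysis window
    spectra = np.fft.rfft(frames * window, axis=2)   # FFT of each frame (S22)
    bin_idx = int(round(freq * frame_len / fs))      # bin of the detection frequency
    P = spectra[:, :, bin_idx]                       # (M, K); column k is p(k, f) (S23)
    return P @ P.conj().T / K                        # average of p p^H over frames (S24)

# Hypothetical input: 8 channels of white noise sampled at 48 kHz
rng = np.random.default_rng(1)
x = rng.standard_normal((8, 16 * 1024))
G = sample_covariance(x, fs=48000.0, freq=3000.0)
```

The result is Hermitian and positive semidefinite by construction, which the later projection steps rely on.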
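S3. Iteratively search the scan matrix for the target index position whose orthogonal projection with the sample covariance matrix is largest, so as to update the first atom index set into the second atom index set, where every index position in the first atom index set or the second atom index set corresponds to an identified sound source. In this embodiment, each round of the iterative search may find at least one new sound source to be identified, so step S3 serves to find such sources and add them to the atom index set. As noted above, however, multiple sources interfere with one another, so after each iteration of step S3 the sources in the atom index set must be re-examined, as described in step S4 below.
Specifically, before the first iterative search, initialize the residual $R_0 = G$, the atom index set $\Lambda_0 = \varnothing$, and the iteration count $l = 1$.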
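Step S3 includes:
S31. In the current iteration $l$, search the scan matrix $A$ for the first target index position $n^*$ whose orthogonal projection with the sample covariance matrix $G$ is largest, and compute the corresponding first residual $R_l$. $n^*$ and $R_l$ are calculated by formulas (5) and (6), respectively:
[Formula (5) is not reproduced in this text.]
$$R_l = G - \Pi_{F_l}(G) \tag{6}$$
where $\Pi_{F_l}(G)$ denotes the orthogonal projection of the sample covariance matrix $G$ onto the space $F_l$ spanned by $a_n a_m^{H}$, $n, m \in \Lambda_l$, obtained by formula (7), in which $A_{\Lambda_l}$ collects the columns of $A$ indexed by $\Lambda_l$:
$$\Pi_{F_l}(G) = A_{\Lambda_l} A_{\Lambda_l}^{+}\, G\, \left(A_{\Lambda_l} A_{\Lambda_l}^{+}\right)^{H} \tag{7}$$
S32. Add the first target index position to the first atom index set to obtain the second atom index set $\Lambda_l$, given by formula (8):
$$\Lambda_l = \Lambda_{l-1} \cup \{n^*\} \tag{8}$$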
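One iteration of step S3 can be sketched as follows. The sketch assumes that the orthogonal projection of G onto the span of the selected atoms is P G P^H, where P is the projector onto the column space of the selected scan-matrix columns, and that "largest projection" is measured by the Frobenius norm. The function names `project` and `add_atom` are hypothetical, and the exhaustive loop over grid points is chosen for clarity, not speed.

```python
import numpy as np

def project(G, A, idx):
    """Orthogonal projection of G onto span{a_n a_m^H : n, m in idx}."""
    A_sel = A[:, idx]
    P = A_sel @ np.linalg.pinv(A_sel)    # projector onto the column space of A_sel
    return P @ G @ P.conj().T

def add_atom(G, A, idx):
    """Extend idx by the grid index that maximises the projection norm (S31, S32).
    Returns the new index list and the residual G - projection."""
    best_n, best_val = None, -np.inf
    for n in range(A.shape[1]):
        if n in idx:
            continue
        val = np.linalg.norm(project(G, A, idx + [n]))   # Frobenius norm
        if val > best_val:
            best_n, best_val = n, val
    new_idx = idx + [best_n]
    return new_idx, G - project(G, A, new_idx)
```

Called with an empty index list, `add_atom` performs the first iteration; each later call adds one more index and returns the first residual for that iteration.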
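S4. Re-identify any sound source in the second atom index set $\Lambda_l$ after the current iteration $l$, so as to update the second atom index set $\Lambda_l$ into the third atom index set. Specifically, step S4 includes:
S41. After the current iteration is complete, determine all sound sources identified so far. After step S41, initialize the cycle count $i = 1$ and the order of the selected atom $j = 1$, with $j \le l$.
S42. Delete the target index position corresponding to any first sound source (atom order $j$) among all sound sources identified so far; that is, keep the other sources fixed and re-identify the first sound source.
S43. Compute the temporary residual $R_l'$ corresponding to the first sound source from the remaining identified sound sources, other than the first sound source, and the sample covariance matrix. $R_l'$ is calculated by formula (9):
$$R_l' = G - \Pi_{F_l'}(G) \tag{9}$$
S44. Re-identify the updated index position of the first sound source based on the temporary residual.
Specifically, search again for an index position to obtain the updated index position $n^{**}$ such that the orthogonal projection onto the space spanned by the already identified atoms and the selected index atom is largest. The updated index position $n^{**}$ is calculated by formula (10):
[Formula (10) is not reproduced in this text.]
S45. Add the updated index position $n^{**}$ to the second atom index set $\Lambda_l$ to obtain the third atom index set $\Lambda_l'$, $\Lambda_l' = \Lambda_l \cup \{n^{**}\}$.
After atom $j$ has been re-identified, atom $j+1$ in the index set is re-identified in the same way, until all atoms have been re-identified.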
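To further improve identification accuracy, after step S45, step S4 also includes:
S46. After the current iteration is complete and the index positions of all sound sources identified so far have been updated, compute the corresponding second residual $R_l'$.
S47. When the preset update termination condition is met, start the next iteration.
The update termination condition is that the difference between the first residual and the second residual does not exceed a preset threshold, or that a preset number of cycles has been reached.
S48. Otherwise, re-identify any sound source in the third atom index set in a further cycle, so as to update the third atom index set into a fourth atom index set.
S47 and S48 are alternatives; only one of them is executed.
Thus, after each iteration, every atom in the current index set is re-identified to account for the effect of inter-source interference on accuracy. One pass over all atoms constitutes one cycle, and one, two, or more cycles may be run. This implements global backtracking each time an iteration identifies a new source, and effectively avoids the degradation in identification performance caused by increased correlation in the array's scan matrix when the source frequency is too low, the sources are too close together, or the measurement distance is too large.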
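The global backtracking of steps S41 to S48 can be sketched as below. The function names `project`, `best_atom`, and `backtrack`, the default threshold, and the default of three cycles are hypothetical. Two simplifications are assumptions of this sketch: the temporary residual is not formed explicitly, because maximising the projection onto the remaining atoms plus one candidate is equivalent to minimising the residual, and in later cycles the new residual is compared with that of the previous cycle rather than with the first residual.

```python
import numpy as np

def project(G, A, idx):
    """Orthogonal projection of G onto span{a_n a_m^H : n, m in idx}."""
    A_sel = A[:, idx]
    P = A_sel @ np.linalg.pinv(A_sel)
    return P @ G @ P.conj().T

def best_atom(G, A, fixed):
    """Grid index that, together with the fixed atoms, gives the largest projection."""
    candidates = [n for n in range(A.shape[1]) if n not in fixed]
    scores = [np.linalg.norm(project(G, A, fixed + [n])) for n in candidates]
    return candidates[int(np.argmax(scores))]

def backtrack(G, A, idx, first_residual_norm, tol=1e-6, max_cycles=3):
    """Re-identify every atom in turn; one pass over all atoms is one cycle."""
    idx = list(idx)
    prev = first_residual_norm
    for _ in range(max_cycles):                        # preset number of cycles
        for j in range(len(idx)):
            others = idx[:j] + idx[j + 1:]             # S42: drop atom j, keep the rest
            idx[j] = best_atom(G, A, others)           # S43 to S45: re-identify atom j
        res = np.linalg.norm(G - project(G, A, idx))   # S46: second residual
        if abs(prev - res) <= tol:                     # S47: update termination condition
            break
        prev = res                                     # S48: run another cycle
    return idx, res
```

Overwriting position `j` in place keeps the size of the index set unchanged, so the number of identified sources is governed only by the outer iteration of step S3.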
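S5. Terminate the iteration when the preset iteration termination condition is met to obtain the target scan matrix corresponding to the third atom index set, and obtain, from the sample covariance matrix and the target scan matrix, the sound source information of the identified target sound sources in the grid scanning plane to be identified.
The preset iteration termination condition may be that the residual after an iteration is smaller than a preset empirical value or, when the number of sources is known, that the number of iterations is not less than the number of sources.
Further, in step S5 the sound source information of the identified target sound sources in the grid scanning plane to be identified is obtained from the sample covariance matrix and the target scan matrix according to formula (1):
$$\hat{C} = A_{\Lambda}^{+}\, G\, \left(A_{\Lambda}^{+}\right)^{H} \tag{1}$$
where $\hat{C}$ is the source covariance matrix, $A_{\Lambda}^{+}$ is the Moore-Penrose inverse of the target scan matrix corresponding to the third atom index set, and $G$ is the sample covariance matrix.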
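It should be noted that the covariance matrix $\Gamma$ and the source covariance matrix $C$ are related by formula (11):
$$\Gamma = A C A^{H} + \sigma^{2} I \tag{11}$$
In practice, $G$ and $\Gamma$ satisfy formula (12):
[Formula (12) is not reproduced in this text.]
The sample covariance matrix $G$ can therefore be used to estimate the source covariance matrix $C$, which yields the relation in formula (1) and thus an estimate of the sound source information, achieving sound source identification.
Executing S5 after steps S1 and S2 alone would also yield sound source information for the target sound sources, but without the global backtracking of this embodiment the identification accuracy is poorer.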
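The estimation step can be sketched as follows, assuming the source covariance is obtained by applying the Moore-Penrose inverse of the selected scan-matrix columns on both sides of G, which is the least-squares fit of A C A^H to G. The function name `estimate_sources` is hypothetical, and reading the source powers off the diagonal is an interpretation, not a statement from the text.

```python
import numpy as np

def estimate_sources(G, A, idx):
    """Least-squares source covariance for the selected grid indices."""
    A_pinv = np.linalg.pinv(A[:, idx])        # Moore-Penrose inverse of the target scan matrix
    C_hat = A_pinv @ G @ A_pinv.conj().T      # estimated source covariance matrix
    powers = np.real(np.diag(C_hat))          # diagonal: power of each identified source
    return C_hat, powers
```

The off-diagonal entries of `C_hat` carry the cross-covariances between sources, which is what allows coherent sources to be described. When G contains a noise term of the form sigma^2 I, this estimate is biased by sigma^2 times the inverse of A_sel^H A_sel, where A_sel denotes the selected columns; the sketch does not correct for that.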
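A simulation experiment on the sound source identification method of this embodiment is described below to verify its accuracy.
The simulation is set up as follows:
As shown in Fig. 2, a 1 m × 1 m grid scanning plane to be identified is placed 1 m from the microphone array plane and discretized with a step of 0.02 m, dividing the plane into 51 × 51 grid points. Two coherent sources are considered, with a spacing of 0.4 m, source strengths of 32 dB and 40 dB, a source frequency of 3 kHz, and a signal-to-noise ratio of 0 dB. The simulation result is shown in Fig. 3, which compares the identification results of the method of this embodiment and of DAS beamforming. Asterisks mark the true source positions, the DAS output appears as peaks in the contour map, and circles mark the output of the method of this embodiment, with circle size proportional to source strength. The main lobe of the DAS output for the strong source (source 2) is too wide and interferes with source 1, so that the identified position of source 1 is badly shifted and the accuracy is poor. The method of this embodiment is not affected by this interference and gives a more accurate result.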
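A synthetic covariance matrix for two fully coherent sources, as in this simulation, could be generated along the following lines. This is a sketch under stated assumptions: the function name `coherent_covariance` is hypothetical, the dB levels are converted to amplitudes relative to an arbitrary reference, both sources are given the same phase, and the signal-to-noise ratio is defined as the mean per-microphone signal power divided by the noise variance. None of these conventions is specified in the text.

```python
import numpy as np

def coherent_covariance(A_src, levels_db, snr_db):
    """Noisy array covariance A C A^H + sigma^2 I for fully coherent sources.
    A_src holds the steering vectors of the true sources as columns."""
    amps = 10.0 ** (np.asarray(levels_db, dtype=float) / 20.0)   # dB -> amplitude ratio
    s = amps.astype(complex)                  # identical phase: fully coherent sources
    C = np.outer(s, s.conj())                 # rank-one source covariance matrix
    signal = A_src @ C @ A_src.conj().T
    M = A_src.shape[0]
    signal_power = np.real(np.trace(signal)) / M
    sigma2 = signal_power / 10.0 ** (snr_db / 10.0)   # noise variance per microphone
    return signal + sigma2 * np.eye(M), C, sigma2
```

The rank-one source covariance is what makes the coherent case hard for methods that assume a full-rank signal subspace.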
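Under the same simulation conditions, the identification error of the method of this embodiment was computed along the frequency, source spacing, and measurement distance dimensions; the results are shown in Figs. 4 to 9. To describe identification performance quantitatively, a root-mean-square (RMS) error metric is introduced, defined by formula (13), taking source 1 as an example. The number of Monte Carlo runs $T$ is 200.
[Formula (13) is not reproduced in this text.]
where, for the source localization RMS error, $X_{1,t}$ is the position of source 1 identified in the t-th run and $X_1$ is the true position of source 1. The source strength RMS error is defined analogously, and source 2 is treated in the same way as source 1.
The source localization RMS error (m²) is generally below $10^{-3}$ along every dimension, and below $10^{-4}$ for some parameter values. The source strength RMS error (dB²) is generally below $10^{0}$, and below $10^{-1}$ or $10^{-2}$ for some parameter values. The identification error of the microphone-array-based method is therefore small and its accuracy high.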
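In summary, the sound source identification method of this embodiment yields small source localization and source strength RMS errors across different frequencies, source spacings, and measurement distances. Combining covariance-fitting sound source identification by orthogonal least squares with global backtracking exploits the block sparsity of sparse coherent sources to identify the current source together with its covariance with the previous source in a single step, which makes covariance matrix estimation for coherent sources practical and removes the dependence on a particular array element layout. It also reduces the effect that correlation in the array's scan matrix has on the identification result when the source frequency is too low, the sources are too close together, or the measurement distance is too large, thereby mitigating mutual interference among multiple sources at low frequencies and improving identification performance and accuracy.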
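Corresponding to the method above, this embodiment further provides a sound source identification apparatus based on a microphone array, the apparatus comprising:
a first processing module, configured to determine a scan matrix of the microphone array based on the microphone array plane of an arbitrarily arranged microphone array and a grid scanning plane to be identified, the grid scanning plane to be identified including at least one target sound source to be identified;
a second processing module, configured to obtain a corresponding sample covariance matrix from scan data of the grid scanning plane to be identified collected over a preset duration;
a third processing module, configured to iteratively search the scan matrix for the target index position whose orthogonal projection with the sample covariance matrix is largest, so as to update a first atom index set into a second atom index set, where every index position in the first atom index set or the second atom index set corresponds to an identified sound source;
a fourth processing module, configured to re-identify any sound source in the second atom index set after the current iteration, so as to update the second atom index set into a third atom index set;
a fifth processing module, configured to terminate the iteration when a preset iteration termination condition is met to obtain a target scan matrix corresponding to the third atom index set, and to obtain, from the sample covariance matrix and the target scan matrix, sound source information of the identified target sound sources in the grid scanning plane to be identified.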
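Further, the first processing module includes:
a construction unit, configured to establish a microphone array three-dimensional coordinate system and to determine, in that coordinate system, the microphone array plane of the arbitrarily arranged microphone array and the grid scanning plane to be identified;
a first processing unit, configured to determine the scan matrix of the microphone array based on the microphone array plane and the grid scanning plane to be identified.
The second processing module includes:
a second processing unit, configured to divide into frames the scan data of the grid scanning plane to be identified acquired by the microphone array over the preset duration;
a conversion unit, configured to convert the framed scan data into frequency-domain data by fast Fourier transform;
an acquisition unit, configured to obtain, from the frequency-domain data, the signal data of the M array elements of the microphone array;
a third processing unit, configured to obtain the sample covariance matrix over the preset duration from the signal data.
The third processing module includes:
a search unit, configured to search, in the current iteration, the scan matrix for the first target index position whose orthogonal projection with the sample covariance matrix is largest, and to compute the corresponding first residual;
a first adding unit, configured to add the first target index position to the first atom index set to obtain the second atom index set.
The fourth processing module includes:
a determination unit, configured to determine, after the current iteration is complete, all sound sources identified so far;
a deletion unit, configured to delete the target index position corresponding to any first sound source among all sound sources identified so far;
a first calculation unit, configured to compute a temporary residual corresponding to the first sound source from the remaining identified sound sources, other than the first sound source, and the sample covariance matrix;
an identification unit, configured to re-identify an updated index position of the first sound source based on the temporary residual;
a second adding unit, configured to add the updated index position to the second atom index set to obtain the third atom index set;
a second calculation unit, configured to compute the corresponding second residual after the current iteration is complete and the index positions of all sound sources identified so far have been updated;
a judgment unit, configured to judge whether the preset update termination condition is met and, if so, to start the next iteration; otherwise, to re-identify any sound source in the third atom index set in a further cycle, so as to update the third atom index set into a fourth atom index set; where the update termination condition is that the difference between the first residual and the second residual does not exceed a preset threshold, or that a preset number of cycles has been reached.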
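The fifth processing module is specifically configured, after the iteration is terminated to obtain the target scan matrix corresponding to the third atom index set, to obtain the sound source information of the identified target sound sources in the grid scanning plane to be identified from the sample covariance matrix and the target scan matrix according to formula (1):
$$\hat{C} = A_{\Lambda}^{+}\, G\, \left(A_{\Lambda}^{+}\right)^{H} \tag{1}$$
where $\hat{C}$ is the source covariance matrix, $A_{\Lambda}^{+}$ is the Moore-Penrose inverse of the target scan matrix corresponding to the third atom index set, and $G$ is the sample covariance matrix.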
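It should be noted that when the apparatus of the above embodiment carries out sound source identification, the division into the functional modules described above is only an example. In practice, the functions may be assigned to different functional modules as needed; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above. In addition, the apparatus embodiment and the method embodiment belong to the same concept, the apparatus being based on the method; for the specific implementation, see the method embodiment, which is not repeated here.
This embodiment further provides an electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors, the memory being configured to store program instructions which, when read and executed by the one or more processors, perform the sound source identification method based on a microphone array described above.
The data processing performed when the program instructions are executed, its details, and its beneficial effects are the same as described for the method above and are not repeated here.
As shown in Fig. 10, this embodiment further provides a computer-readable storage medium 31 on which a computer program 310 is stored; the computer program, when executed by one or more processors 32, implements the sound source identification method based on a microphone array described above.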
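Specifically, any combination of one or more computer-readable media may be used. A computer-readable medium may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples of computer-readable storage media (a non-exhaustive list) include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave and carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the two. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless links, electrical wire, optical cable, RF, or any suitable combination of the above.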
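In some implementations, the client and server may communicate using any currently known or future network protocol, such as HTTP (Hypertext Transfer Protocol), and may be interconnected with digital data communication in any form or medium (for example, a communication network). Examples of communication networks include local area networks ("LAN"), wide area networks ("WAN"), internetworks (for example, the Internet), peer-to-peer networks (for example, ad hoc peer-to-peer networks), and any currently known or future network.
The computer-readable medium may be contained in the electronic device described above, or may exist separately without being assembled into the electronic device.
Computer program code for carrying out the operations of the present invention may be written in one or more programming languages or a combination of them, including object-oriented languages such as Java, Smalltalk, and C++, and conventional procedural languages such as the "C" language or similar. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, it may be connected to the user's computer through any kind of network, including a local area network (LAN) or wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. Each block in a flowchart or block diagram may represent a module, program segment, or portion of code that contains one or more executable instructions for implementing the specified logical function. In some alternative implementations, the functions noted in the blocks may occur in an order different from that shown in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functionality involved. Each block of the block diagrams and/or flowcharts, and combinations of such blocks, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.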
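The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases the name of a unit does not limit the unit itself.
The functions described above may be performed at least in part by one or more hardware logic components. For example, and without limitation, hardware logic components that may be used include field-programmable gate arrays (FPGA), application-specific integrated circuits (ASIC), application-specific standard products (ASSP), systems on chip (SOC), complex programmable logic devices (CPLD), and so on.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. A machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. Machine-readable media may include, but are not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatuses, or devices, or any suitable combination of the foregoing. More specific examples of machine-readable storage media include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
All of the optional technical solutions above may be combined in any way to form optional embodiments of the present invention; that is, any number of embodiments may be combined to meet the needs of different application scenarios, and all such combinations fall within the scope of protection of the present application. They are not described one by one here.
It should be noted that the above are only preferred embodiments of the present invention and do not limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.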

Claims (10)

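  1. A sound source identification method based on a microphone array, characterized in that the method comprises:
    determining a scan matrix of the microphone array based on the microphone array plane of an arbitrarily arranged microphone array and a grid scanning plane to be identified, the grid scanning plane to be identified including at least one target sound source to be identified;
    obtaining a corresponding sample covariance matrix from scan data of the grid scanning plane to be identified collected over a preset duration;
    iteratively searching the scan matrix for the target index position whose orthogonal projection with the sample covariance matrix is largest, so as to update a first atom index set into a second atom index set, where every index position in the first atom index set or the second atom index set corresponds to an identified sound source;
    re-identifying any sound source in the second atom index set after the current iteration, so as to update the second atom index set into a third atom index set;
    terminating the iteration when a preset iteration termination condition is met to obtain a target scan matrix corresponding to the third atom index set, and obtaining, from the sample covariance matrix and the target scan matrix, sound source information of the identified target sound sources in the grid scanning plane to be identified.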
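  2. The method of claim 1, characterized in that determining the scan matrix of the microphone array based on the microphone array plane of the arbitrarily arranged microphone array and the grid scanning plane to be identified comprises:
    establishing a microphone array three-dimensional coordinate system;
    determining, in the microphone array three-dimensional coordinate system, the microphone array plane of the arbitrarily arranged microphone array and the grid scanning plane to be identified;
    determining the scan matrix of the microphone array based on the microphone array plane and the grid scanning plane to be identified.
  3. The method of claim 1, characterized in that the microphone array includes M array elements and the scan data are time-domain data, and obtaining the corresponding sample covariance matrix from the scan data of the grid scanning plane to be identified collected over the preset duration comprises:
    dividing into frames the scan data of the grid scanning plane to be identified acquired by the microphone array over the preset duration;
    converting the framed scan data into frequency-domain data by fast Fourier transform;
    obtaining, from the frequency-domain data, the signal data of the M array elements of the microphone array;
    obtaining the sample covariance matrix over the preset duration from the signal data.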
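  4. The method of claim 1, characterized in that iteratively searching the scan matrix for the target index position whose orthogonal projection with the sample covariance matrix is largest, so as to update the first atom index set into the second atom index set, comprises:
    in the current iteration, searching the scan matrix for the first target index position whose orthogonal projection with the sample covariance matrix is largest, and computing the corresponding first residual;
    adding the first target index position to the first atom index set to obtain the second atom index set.
  5. The method of claim 4, characterized in that re-identifying any sound source in the second atom index set after the current iteration, so as to update the second atom index set into the third atom index set, comprises:
    after the current iteration is complete, determining all sound sources identified so far;
    deleting the target index position corresponding to any first sound source among all sound sources identified so far;
    computing a temporary residual corresponding to the first sound source from the remaining identified sound sources, other than the first sound source, and the sample covariance matrix;
    re-identifying an updated index position of the first sound source based on the temporary residual;
    adding the updated index position to the second atom index set to obtain the third atom index set.
  6. The method of claim 5, characterized in that, after the third atom index set is obtained, the method further comprises:
    computing the corresponding second residual after the current iteration is complete and the index positions of all sound sources identified so far have been updated;
    starting the next iteration when a preset update termination condition is met;
    otherwise, re-identifying any sound source in the third atom index set in a further cycle, so as to update the third atom index set into a fourth atom index set;
    where the update termination condition is that the difference between the first residual and the second residual does not exceed a preset threshold, or that a preset number of cycles has been reached.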
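  7. The method of claim 1, characterized in that, after the iteration is terminated to obtain the target scan matrix corresponding to the third atom index set, the sound source information of the identified target sound sources in the grid scanning plane to be identified is obtained from the sample covariance matrix and the target scan matrix according to formula (1):
    $$\hat{C} = A_{\Lambda}^{+}\, G\, \left(A_{\Lambda}^{+}\right)^{H} \tag{1}$$
    where $\hat{C}$ is the source covariance matrix, $A_{\Lambda}^{+}$ is the Moore-Penrose inverse of the target scan matrix corresponding to the third atom index set, and $G$ is the sample covariance matrix.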
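  8. A sound source identification apparatus based on a microphone array, characterized in that the apparatus comprises:
    a first processing module, configured to determine a scan matrix of the microphone array based on the microphone array plane of an arbitrarily arranged microphone array and a grid scanning plane to be identified, the grid scanning plane to be identified including at least one target sound source to be identified;
    a second processing module, configured to obtain a corresponding sample covariance matrix from scan data of the grid scanning plane to be identified collected over a preset duration;
    a third processing module, configured to iteratively search the scan matrix for the target index position whose orthogonal projection with the sample covariance matrix is largest, so as to update a first atom index set into a second atom index set, where every index position in the first atom index set or the second atom index set corresponds to an identified sound source;
    a fourth processing module, configured to re-identify any sound source in the second atom index set after the current iteration, so as to update the second atom index set into a third atom index set;
    a fifth processing module, configured to terminate the iteration when a preset iteration termination condition is met to obtain a target scan matrix corresponding to the third atom index set, and to obtain, from the sample covariance matrix and the target scan matrix, sound source information of the identified target sound sources in the grid scanning plane to be identified.
  9. An electronic device, characterized in that it comprises:
    one or more processors; and
    a memory associated with the one or more processors, the memory being configured to store program instructions which, when read and executed by the one or more processors, perform the method of any one of claims 1 to 7.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by one or more processors, implements the steps of the method of any one of claims 1 to 7.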
PCT/CN2023/092735 2022-05-12 2023-05-08 Sound source identification method, apparatus, and electronic device based on a microphone array WO2023217079A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210520321.9 2022-05-12
CN202210520321.9A CN115113139B (zh) 2022-05-12 2022-05-12 Sound source identification method, apparatus, and electronic device based on a microphone array

Publications (1)

Publication Number Publication Date
WO2023217079A1 true WO2023217079A1 (zh) 2023-11-16

Family

ID=83327275

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/092735 WO2023217079A1 (zh) 2022-05-12 2023-05-08 Sound source identification method, apparatus, and electronic device based on a microphone array

Country Status (2)

Country Link
CN (1) CN115113139B (zh)
WO (1) WO2023217079A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115113139B (zh) * 2022-05-12 2024-02-02 苏州清听声学科技有限公司 Sound source identification method, apparatus, and electronic device based on a microphone array

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040252845A1 (en) * 2003-06-16 2004-12-16 Ivan Tashev System and process for sound source localization using microphone array beamsteering
US20150341723A1 (en) * 2014-05-22 2015-11-26 The United States Of America As Represented By The Secretary Of The Navy Multitask learning method for broadband source-location mapping of acoustic sources
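CN106646376A (zh) * 2016-12-05 2017-05-10 哈尔滨理工大学 P-norm noise source localization and identification method based on weighted correction parameters
CN107247251A (zh) * 2017-06-20 2017-10-13 西北工业大学 Three-dimensional sound source localization method based on compressed sensing
CN109375171A (zh) * 2018-11-21 2019-02-22 合肥工业大学 Sound source localization method based on a novel orthogonal matching pursuit algorithm
CN110109058A (zh) * 2019-05-05 2019-08-09 中国航发湖南动力机械研究所 Planar array deconvolution sound source identification method
CN115113139A (zh) * 2022-05-12 2022-09-27 苏州清听声学科技有限公司 Sound source identification method, apparatus, and electronic device based on a microphone array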

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
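CN106093921B (zh) * 2016-07-25 2019-04-26 中国电子科技集团公司第五十四研究所 Wideband direction-finding method for acoustic vector arrays based on sparse decomposition theory
CN106443587B (zh) * 2016-11-18 2019-04-05 合肥工业大学 High-resolution fast deconvolution sound source imaging algorithm
US11064294B1 (en) * 2020-01-10 2021-07-13 Synaptics Incorporated Multiple-source tracking and voice activity detections for planar microphone arrays
CN111693942A (zh) * 2020-07-08 2020-09-22 湖北省电力装备有限公司 Sound source localization method based on a microphone array
CN114089276A (zh) * 2021-11-23 2022-02-25 青岛海洋科学与技术国家实验室发展中心 Adaptive passive localization method and system for underwater sound sources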

Also Published As

Publication number Publication date
CN115113139B (zh) 2024-02-02
CN115113139A (zh) 2022-09-27

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23802849

Country of ref document: EP

Kind code of ref document: A1