WO2023088156A1 - Method and apparatus for sound velocity correction - Google Patents

Method and apparatus for sound velocity correction

Info

Publication number
WO2023088156A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
sound source
microphone array
sound velocity
correction signal
Prior art date
Application number
PCT/CN2022/131002
Other languages
English (en)
Chinese (zh)
Inventor
张磊
简旻捷
郭臻鸿
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2023088156A1


Classifications

    • G — PHYSICS
    • G01 — MEASURING; TESTING
    • G01H — MEASUREMENT OF MECHANICAL VIBRATIONS OR ULTRASONIC, SONIC OR INFRASONIC WAVES
    • G01H 5/00 — Measuring propagation velocity of ultrasonic, sonic or infrasonic waves, e.g. of pressure waves
    • G — PHYSICS
    • G01 — MEASURING; TESTING
    • G01S — RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S 5/00 — Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S 5/18 — Position-fixing using ultrasonic, sonic, or infrasonic waves
    • G01S 5/20 — Position of source determined by a plurality of spaced direction-finders
    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 25/00 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/03 — Speech or voice analysis techniques characterised by the type of extracted parameters
    • G10L 25/21 — Speech or voice analysis techniques characterised by the extracted parameters being power information
    • H — ELECTRICITY
    • H04 — ELECTRIC COMMUNICATION TECHNIQUE
    • H04R — LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R 29/00 — Monitoring arrangements; Testing arrangements

Definitions

  • the embodiments of the present application relate to the field of audio processing, and in particular, to a sound velocity correction method and device.
  • a microphone array is often used in the conference terminal to pick up the participants' voices and locate the sound source, so as to locate and track the participants' positions and support functions such as on-screen speaker display and sound amplification.
  • the sound velocity is one of the important parameters in the microphone array sound source localization algorithm.
  • the sound velocity depends on the ambient temperature. In an actual conference scene, temperature changes can cause the sound velocity to vary from about 337 m/s to 350 m/s, which affects the accurate positioning and tracking of participants.
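As a rough illustration (the linear dry-air approximation below is a standard physics formula, not taken from this application), the quoted 337–350 m/s range corresponds to roughly 10–30 °C:

```python
def speed_of_sound_m_per_s(temp_celsius: float) -> float:
    # Common linear approximation for the speed of sound in dry air
    return 331.3 + 0.606 * temp_celsius

print(speed_of_sound_m_per_s(10))  # ~337.4 m/s
print(speed_of_sound_m_per_s(30))  # ~349.5 m/s
```

This is why a fixed preset sound velocity degrades localization as room temperature drifts.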
  • face recognition is used to correct the speed of sound.
  • the angle of the speaker is determined through face recognition, referred to as the face angle.
  • the microphone array performs sound source localization based on the preset sound speed to measure the speaker angle, referred to as the sound angle.
  • the current actual sound speed can be determined by continuously fine-tuning the assumed sound speed until the difference between the sound angle and the face angle falls within a preset range.
  • because this requires repeated iterations, the real-time performance of sound source localization is poor.
  • the present application provides a sound velocity correction method and device, which are used to reduce the calculation amount of sound velocity correction and improve the real-time performance of sound velocity correction.
  • the first aspect of the present application provides a sound velocity correction method, the method comprising: receiving a sound velocity correction signal from a correction sound source through a microphone array; and determining a target sound velocity, where the target sound velocity is related to the spatial relative distance between the microphone array and the correction sound source, and to the sound velocity correction signal.
  • the execution subject of the embodiments of the present application may be a sound source localization device. A correction sound source may be provided in the conference venue, the correction sound source may output a sound velocity correction signal, and the microphone array included in the sound source localization device may collect the sound velocity correction signal.
  • the sound velocity correction signal can be used to calculate the time delay between the microphones in the microphone array. The sound source localization device can obtain the position information of the correction sound source and the microphone array, determine their spatial relative distance based on that information, and then obtain the target sound velocity from the relationship between time and distance.
  • the sound source localization device does not need face recognition, and can determine the target sound velocity at one time, reducing the calculation amount of sound velocity correction and improving the real-time performance of sound velocity correction.
  • the sound velocity correction signal is an ultrasonic wave or the first sound, and the frequency of the first sound is outside the preset frequency range.
  • the above step of receiving the sound velocity correction signal from the correction sound source through the microphone array includes: receiving the sound velocity correction signal from the correction sound source in real time through the microphone array.
  • in this case the correction sound source is an ultrasonic transmitter. Since the human ear cannot hear ultrasonic waves, the ultrasonic transmitter can output the sound velocity correction signal in real time without affecting the voice signals of the participants at the meeting site.
  • the sound velocity correction signal is the first sound
  • in this case the correction sound source can be a loudspeaker, and the preset frequency range is the human voice range; since the sound source localization device only localizes sounds within the human voice frequency range when performing sound source localization, the first sound will not affect sound source localization.
  • the sound source localization device can also receive the sound speed correction signal in real time to update the target sound speed and improve the accuracy of the target sound speed.
  • the sound velocity correction signal is the second sound
  • the frequency of the second sound is within the preset frequency range
  • the above step of receiving the sound velocity correction signal from the correction sound source through the microphone array includes: periodically receiving the sound velocity correction signal from the correction sound source through the microphone array.
  • in this case the correction sound source can be a loudspeaker, and the preset frequency range is the frequency range of human vocalization. Because the frequency band of the second sound overlaps with the human voice range, collecting the second sound and human voices at the same time may make sound source localization inaccurate. If someone is speaking at the conference site, the loudspeaker should send the second sound to the microphone array only periodically, so that the sound source localization device periodically corrects the target sound velocity at times staggered from speech, improving the accuracy of sound source localization.
  • the sound output by the loudspeaker during far-end single-talk can also be used as the second sound, so the sound source localization device does not need to provide the second sound itself.
  • the sound velocity correction signal is used to determine the time delay between the microphones in the microphone array; the target sound velocity is proportional to the spatial relative distance and inversely proportional to the time delay.
  • after the sound source localization device receives the sound velocity correction signal through the microphone array, it can measure the time delays with which different microphones in the array receive the signal, and then determine the target sound velocity based on the differences in the spatial relative distances between the correction sound source and the different microphones. Since those distance differences are fixed, the lower the delay, the higher the target sound velocity. This provides a way to determine the target sound velocity directly, improving real-time performance.
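A minimal sketch of this distance/delay relation, with hypothetical distances and delay (none of the numbers come from the application):

```python
def target_sound_velocity(dist_i: float, dist_j: float, delay_ij: float) -> float:
    """Speed of sound from the fixed path-length difference between the
    correction sound source and microphones i and j, divided by the
    measured inter-microphone delay: proportional to the distance
    difference, inversely proportional to the delay."""
    return (dist_i - dist_j) / delay_ij

# Hypothetical: mic i is 2.000 m and mic j is 1.850 m from the source,
# and the signal arrives at mic j 0.436 ms earlier than at mic i.
c = target_sound_velocity(2.000, 1.850, 0.436e-3)
print(round(c, 1))  # ≈ 344 m/s
```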
  • the method further includes: acquiring the position of the correction sound source through a camera; and determining the spatial relative distance according to the position of the correction sound source and the position of the microphone array.
  • the step of determining the spatial relative distance according to the position of the corrected sound source and the position of the microphone array includes: determining the first coordinate of the corrected sound source in the three-dimensional space coordinate system according to the position of the corrected sound source; The position of the array determines the second coordinates of the microphone array in the three-dimensional space coordinate system; the spatial relative distance is determined according to the geometric relationship between the first coordinates and the second coordinates.
  • the sound source localization device determines the spatial relative distance according to the position of the correction sound source and the position of the microphone array by establishing a three-dimensional space coordinate system, so that both the correction sound source and the microphone array have coordinates in that system; the spatial relative distance is then obtained from the geometric relationship between the coordinates, improving the accuracy of the distance calculation.
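The geometric relationship between the first and second coordinates is an ordinary Euclidean distance; a minimal sketch with made-up coordinates (metres, array centre as origin):

```python
import math

def spatial_relative_distance(first_coordinate, second_coordinate):
    """Euclidean distance between the correction sound source (first
    coordinate) and a microphone (second coordinate) in the
    three-dimensional space coordinate system."""
    return math.dist(first_coordinate, second_coordinate)

# Hypothetical coordinates: source at (1.2, 0.5, 0.3), mic at the origin
d = spatial_relative_distance((1.2, 0.5, 0.3), (0.0, 0.0, 0.0))
print(round(d, 3))  # 1.334
```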
  • the method further includes: performing sound source localization according to a target sound velocity.
  • after the sound source localization device determines the target sound velocity, it can use the target sound velocity to localize the positions of the participants at the meeting site, which reduces the influence of temperature on sound source localization and improves its accuracy.
  • the second aspect of the present application provides a sound velocity correction device, which can implement the method in the first aspect or any possible implementation manner of the first aspect.
  • the apparatus includes corresponding units or modules for performing the above method.
  • the units or modules included in the device can be realized by means of software and/or hardware.
  • the device can be, for example, a network device, or a chip, a chip system, or a processor that supports the network device to implement the above method, or a logic module or software that can realize all or part of the functions of the network device.
  • the third aspect of the present application provides a computer device, including a processor coupled with a memory, where the memory is used to store instructions; when the instructions are executed, the device implements the method in the first aspect or any possible implementation manner of the first aspect.
  • the apparatus may be, for example, a network device, or may be a chip or a chip system that supports the network device to implement the foregoing method.
  • the fourth aspect of the present application provides a computer-readable storage medium in which instructions are stored; when the instructions are executed, the computer executes the method provided in the first aspect or any possible implementation manner of the first aspect.
  • the fifth aspect of the present application provides a computer program product.
  • the computer program product includes computer program code.
  • when the computer program code is executed, the computer executes the method in the aforementioned first aspect or any possible implementation manner of the first aspect.
  • FIG. 1 is a schematic structural diagram of a sound source localization device provided by an embodiment of the present application
  • FIG. 2 is a schematic structural diagram of a uniform linear microphone array provided in an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a uniform circular microphone array provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of a uniform spherical microphone array provided in an embodiment of the present application.
  • FIG. 5 is a schematic structural diagram of a three-dimensional uniform linear microphone array provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a sound velocity correction method provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of a delay estimation process provided by an embodiment of the present application.
  • FIG. 8 is a schematic diagram of sound source localization provided by an embodiment of the present application.
  • FIG. 9 is a schematic diagram of another sound source localization provided by an embodiment of the present application.
  • FIG. 10 is a schematic diagram of a sound velocity correction process provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of the position of a correction sound source provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of the spatial relative distance between the correction sound source and the microphone array provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of the position of another correction sound source provided by an embodiment of the present application.
  • FIG. 14 is another schematic diagram of the spatial relative distance between the correction sound source and the microphone array provided by an embodiment of the present application.
  • FIG. 15 is a schematic diagram of the position of yet another correction sound source provided by an embodiment of the present application.
  • FIG. 16 is yet another schematic diagram of the spatial relative distance between the correction sound source and the microphone array provided by an embodiment of the present application.
  • FIG. 17 is a schematic structural diagram of a sound velocity correction device provided by an embodiment of the present application.
  • FIG. 18 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
  • Embodiments of the present application provide a sound velocity correction method and device, which are used to reduce the calculation amount of sound velocity correction and improve the real-time performance of sound velocity correction.
  • Microphone array A system composed of a certain number of acoustic sensors (usually microphones) used to sample and process the spatial characteristics of the sound field.
  • mean(·) denotes taking the average, and argmax(f(s)) denotes the value of s at which f(s) is largest.
  • the sound source correction method provided in this application can be executed by a sound source localization device, and is applied to various scenarios that require sound pickup, for example, video calls, voice calls, multi-person conferences, recording or video recording, and other scenarios.
  • the sound source localization device may include a variety of terminals capable of picking up sound, such as a large-screen conference terminal, a TV, a tablet computer, a head-mounted display (HMD), augmented reality (AR) equipment, mixed reality (MR) equipment, a personal digital assistant (PDA), vehicle electronic equipment, a laptop computer, a personal computer (PC), monitoring equipment, a robot, a vehicle-mounted terminal, a wearable device, or a self-driving vehicle, etc.
  • in the following, a large-screen conference terminal is taken as an example of the terminal.
  • the structure of a sound source localization device may be shown in FIG. 1 , and the sound source localization device 1 may include a microphone array 11 and a processor 12 .
  • the microphone array 11 may include an array composed of multiple microphones for collecting voice signals.
  • the structure formed by the plurality of microphones may include a centralized array structure or a distributed array structure. For example, when the sound pressure of the user's voice exceeds the sound source detection threshold, the voice signal is collected through the microphone array.
  • Each microphone can form one voice signal, and the multiple voice signals are fused to form the data collected in the current environment.
  • the microphones in this application can be ordinary omnidirectional microphones, and a microphone array formed by multiple microphones according to a certain topology can take any array form, for example: a uniform linear microphone array formed by 8 ordinary omnidirectional microphones as shown in Figure 2, where the distance between adjacent microphones is d; a uniform circular microphone array formed by 8 ordinary omnidirectional microphones as shown in Figure 3, where the angle between the lines connecting adjacent microphones to the center of the circle is o; a uniform spherical microphone array composed of 18 ordinary omnidirectional microphones as shown in Figure 4; and a three-dimensional uniform linear microphone array formed by 10 ordinary omnidirectional microphones as shown in Figure 5, where the distance between adjacent microphones in each dimension is d.
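The regular layouts above can be expressed as simple coordinate generators; this sketch uses assumed spacings (4 cm pitch, 5 cm radius) purely for illustration:

```python
import math

def uniform_linear_array(n_mics: int, pitch: float):
    """n_mics microphones spaced `pitch` metres apart on the x-axis,
    as in the Figure 2 layout."""
    return [(i * pitch, 0.0, 0.0) for i in range(n_mics)]

def uniform_circular_array(n_mics: int, radius: float):
    """n_mics microphones evenly spaced on a circle (Figure 3);
    adjacent microphones subtend an angle of 360/n_mics degrees
    at the centre."""
    return [(radius * math.cos(2 * math.pi * i / n_mics),
             radius * math.sin(2 * math.pi * i / n_mics),
             0.0)
            for i in range(n_mics)]

linear = uniform_linear_array(8, 0.04)     # 8 mics, 4 cm pitch
circular = uniform_circular_array(8, 0.05)  # 8 mics on a 5 cm circle
```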
  • a uniform linear microphone array composed of a plurality of ordinary omnidirectional microphones is taken as an example.
  • the processor 12 can be used to process the data collected by the microphone array, so as to extract the voice data corresponding to the sound source. It can be understood that the steps of the sound source correction method provided in this application can be executed by the processor 12 .
  • the sound source localization device may also be a device such as an Octopus conferencing device, an Internet of Things (IoT) device, a smart speaker, or a smart robot.
  • Please refer to FIG. 6.
  • a sound velocity correction method provided by the embodiment of the present application is shown, and the flow of the method is specifically described as follows.
  • Step 601. The corrected sound source sends a sound velocity correction signal to the microphone array, and accordingly, the sound source localization device receives the sound velocity correction signal from the corrected sound source through the microphone array.
  • a corrected sound source may be set in the meeting place, and the corrected sound source may output a sound velocity correction signal, and the microphone array included in the sound source localization device may collect the sound velocity correction signal.
  • the sound velocity correction signal may be any sound wave signal, that is, any sound wave signal may be selected to participate in the sound velocity correction process.
  • the microphone array picks up all the sound wave signals within the range of the microphone.
  • the pickup distance of the microphone can be determined according to the specific application environment. For example, if the room is 5 meters long, 10 meters wide, and 4 meters high, and the microphone array is required to process all the sounds in the room, its pickup distance should be at least 10 meters.
  • the microphone array usually collects sound wave signals whose sound pressure exceeds a certain threshold, for example the sound pressure of speech. This threshold is referred to as the sound source detection threshold, and sound wave signals that do not exceed it are usually discarded.
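One plausible way to apply such a threshold (the application does not specify the exact level measure; the per-frame RMS used here is an assumption):

```python
import math

def exceeds_detection_threshold(frame, threshold_rms):
    """Gate pickup on a per-frame basis: keep the frame only when its
    root-mean-square level exceeds the sound source detection threshold."""
    rms = math.sqrt(sum(x * x for x in frame) / len(frame))
    return rms > threshold_rms

loud = [0.5, -0.5, 0.5, -0.5]     # e.g. nearby speech: kept
quiet = [0.01, -0.01, 0.0, 0.0]   # background noise: discarded
```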
  • Step 602. The sound source localization device determines the target sound velocity, and the target sound velocity is related to the spatial relative distance between the microphone array and the corrected sound source, and the sound velocity correction signal.
  • after the sound source localization device obtains the sound velocity correction signal through the microphone array, it can first obtain the position information of the correction sound source and the microphone array, determine the spatial relative distance between them based on that information, and use the sound velocity correction signal to calculate the time delay between the microphones in the array, so that the target sound velocity can be obtained from the relationship between time and distance.
  • the position of the correction sound source may be predetermined, that is, after placing the correction sound source, the user inputs its position information into the sound source localization device.
  • the position of the correction sound source can also be obtained by the sound source localization device by capturing images through a camera and performing image recognition; the manner of obtaining it is not limited here.
  • the way for the sound source localization device to determine the spatial relative distance may be to set a three-dimensional space coordinate system in the three-dimensional space within the sound pickup range of the microphone array, and the origin of the three-dimensional space coordinate system may be at any position within the sound pickup range, Exemplarily, in this embodiment, the origin of the three-dimensional space coordinates may be the center position of the microphone array, or the position of any microphone in the microphone array, or other positions.
  • the second coordinates of each microphone and the first coordinate of the correction sound source can be determined according to the positions of the microphone array and the correction sound source in the three-dimensional space coordinate system, and the spatial relative distance between the correction sound source and the microphone array is obtained from the geometric relationship between the second coordinates and the first coordinate.
  • after the sound source localization device receives the sound velocity correction signal through the microphone array, it can measure the time delays with which different microphones in the array receive the signal, and then determine the target sound velocity based on the differences in the spatial relative distances between the correction sound source and the different microphones; since those distance differences are fixed, the lower the delay, the higher the target sound velocity.
  • the target sound velocity estimate can be obtained, for each microphone pair, through the relation c = (d_i − d_j) / t_ij
  • where d_i and d_j are the spatial relative distances between the correction sound source and the i-th and j-th microphones, and t_ij represents the time delay between the i-th microphone and the j-th microphone
  • FIG. 7 is a delay estimation flow chart provided by the embodiment of the present application.
  • because the microphones are at different distances from the correction sound source, the phases of the sound velocity correction signals they receive are different;
  • Step 702: Perform a fast Fourier transform (FFT) on the sound velocity correction signal of the i-th microphone to obtain signal 1, and perform an FFT on the sound velocity correction signal of the j-th microphone followed by a conjugation operation to obtain signal 2;
  • Step 703: Multiply signal 1 and signal 2 to obtain signal 3 (the cross-spectrum);
  • Step 704: Perform power spectrum weighting on signal 3 to obtain signal 4;
  • Step 705: Perform an inverse fast Fourier transform (IFFT) on signal 4 to obtain signal 5;
  • Step 706: Use the time delay corresponding to the peak of signal 5 as the time delay between the i-th microphone and the j-th microphone.
  • ψ_ij(ω) is a weighting function; a commonly used choice is the PHAT weighting ψ_ij(ω) = 1 / |G_ij(ω)|, where G_ij(ω) is the cross-spectrum (signal 3), τ is the delay parameter, and ω is the angular frequency. The cross-spectrum is calculated as G_ij(ω) = X_i(ω) X_j*(ω),
  • where X_i(ω) is the spectrum of the signal received by the i-th microphone (signal 1), and X_j*(ω) is the conjugate of the spectrum of the signal received by the j-th microphone (signal 2).
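Steps 702–706 amount to generalized cross-correlation with PHAT weighting. A self-contained NumPy sketch (the 16 kHz rate and the noise test signal are made up for illustration):

```python
import numpy as np

def gcc_phat_delay(sig_i, sig_j, fs):
    """Delay of sig_i relative to sig_j, mirroring steps 702-706:
    FFT both signals (702), multiply signal 1 by the conjugated
    signal 2 to form the cross-spectrum (703), apply PHAT
    power-spectrum weighting (704), inverse FFT (705), and take
    the lag of the correlation peak (706)."""
    n = len(sig_i) + len(sig_j)                  # zero-pad against circular wrap
    X1 = np.fft.rfft(sig_i, n=n)                 # signal 1
    X2 = np.conj(np.fft.rfft(sig_j, n=n))        # signal 2 (conjugated)
    cross = X1 * X2                              # signal 3: cross-spectrum
    weighted = cross / (np.abs(cross) + 1e-12)   # signal 4: PHAT weighting
    cc = np.fft.irfft(weighted, n=n)             # signal 5
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / fs

# Hypothetical test: broadband noise, a copy delayed by 5 samples
fs = 16000
rng = np.random.default_rng(0)
s = rng.standard_normal(1024)
tau = gcc_phat_delay(np.roll(s, 5), s, fs)   # expect 5 / fs
```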
  • the correction sound source may be a speaker or an ultrasonic transmitter, which is not limited here.
  • the sound velocity correction signal is ultrasonic. Since the human body cannot hear ultrasonic waves, the ultrasonic transmitter can output the sound velocity correction signal in real time.
  • the sound source localization device can receive the sound velocity correction signal in real time to update the target sound velocity, further reducing the influence of temperature on sound source localization and improving the real-time performance of sound velocity correction.
  • Step 801: The microphone array picks up sound and samples it;
  • Step 802: Use a high-pass filter to extract the sound velocity correction signal and a low-pass filter to extract the human voice signal;
  • Step 803: Estimate the time delay of the sound velocity correction signal and determine the target sound velocity;
  • Step 804: Perform sound source localization on the human voice signal based on the target sound velocity.
  • the loudspeaker may be controlled by a sound source localization device, or may be controlled by other devices, which is not limited here.
  • the sound output by the loudspeaker may be the first sound, whose frequency is outside the preset frequency range. The human voice range lies roughly between 100 Hz (bass) and 10 kHz (soprano), so the preset frequency range may be set to 100 Hz to 10 kHz. If the frequency of the first sound is, for example, 18 kHz, its frequency band does not overlap with the human voice, so it does not affect the collection of human voices by the sound source localization device; in this case the device can also receive the sound velocity correction signal in real time to update the target sound velocity.
  • Step 901: The microphone array picks up sound and samples it;
  • Step 902: Use a band-pass filter to extract the sound velocity correction signal (exemplarily, the band collected by the band-pass filter can be 10 kHz to 20 kHz), and use a low-pass filter to extract the human voice signal;
  • Step 903: Estimate the time delay of the sound velocity correction signal and determine the target sound velocity;
  • Step 904: Perform sound source localization on the human voice signal based on the target sound velocity.
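A toy illustration of the step-902 band split, using crude FFT brick-wall filtering in place of real band-pass/low-pass filters (the 48 kHz rate and the 300 Hz / 18 kHz test tones are assumptions):

```python
import numpy as np

def fft_bandpass(x, fs, f_lo, f_hi):
    """Zero out all spectral content outside [f_lo, f_hi]; a crude
    stand-in for the band-pass and low-pass filters of step 902."""
    X = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1.0 / fs)
    X[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(X, n=len(x))

fs = 48000
t = np.arange(4800) / fs
# A 300 Hz "voice" tone mixed with an 18 kHz correction tone
mixed = np.sin(2 * np.pi * 300 * t) + np.sin(2 * np.pi * 18000 * t)

correction_signal = fft_bandpass(mixed, fs, 10000, 20000)  # keeps the 18 kHz tone
voice_signal = fft_bandpass(mixed, fs, 0, 10000)           # keeps the 300 Hz tone
```

The correction band is then fed to the delay estimator (step 903) while the voice band goes to sound source localization (step 904).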
  • the sound source localization device can update the target sound velocity only periodically, staggering the times at which the loudspeaker outputs the sound velocity correction signal from the times at which sound sources at the meeting site are active, to avoid affecting sound source localization.
  • when the loudspeaker of the sound source localization device outputs the voice signal of far-end single-talk, since the position of the loudspeaker is already known, the target sound velocity can be determined directly from the far-end single-talk voice signal and the loudspeaker position, and used as an input parameter for subsequent sound source localization; in this case there is no need to provide a sound velocity correction signal locally.
  • the process of correcting the sound velocity may refer to the schematic flow chart of the sound velocity correction shown in FIG. 10 .
  • Step 1001: Determine whether far-end single-talk is currently occurring; if so, execute Step 1002, otherwise execute Step 1003;
  • Step 1002: Determine the target sound velocity based on the voice signal currently output by the loudspeaker;
  • Step 1003: Perform sound source localization based on the current target sound velocity.
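Steps 1001–1003 reduce to a simple dispatch; the function and argument names below are hypothetical:

```python
def correction_step(is_far_end_single_talk, loudspeaker_signal,
                    current_velocity, estimate_velocity, localize):
    """Step 1001: branch on whether far-end single-talk is occurring.
    Step 1002: reuse the loudspeaker's far-end voice signal as the
    correction signal to re-estimate the target sound velocity.
    Step 1003: otherwise localize with the current target velocity."""
    if is_far_end_single_talk:
        return estimate_velocity(loudspeaker_signal)
    localize(current_velocity)
    return current_velocity

# Hypothetical usage with stubbed-in estimation and localization
v = correction_step(True, "far_end_frame", 343.0,
                    estimate_velocity=lambda sig: 345.2,
                    localize=lambda c: None)
```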
  • there is no limitation on the location of the correction sound source.
  • when the correction sound source is controlled by the sound source localization device, if the microphone array is built into the sound source localization device, the correction sound source may also be built into the sound source localization device. FIG. 11 is a schematic diagram of the location of the correction sound source provided by an embodiment of the present application; the correction sound source can be at any position in the sound source localization device, such as any of the dotted-circle positions in FIG. 11. Taking one of those positions as an example, the spatial relative distance between the correction sound source and the microphone array is shown in FIG. 12.
  • alternatively, the correction sound source can be built into the microphone array; FIG. 13 is a schematic diagram of the position of another correction sound source provided by an embodiment of the present application. The correction sound source can be at any position in the microphone array, such as any of the dotted-circle positions in FIG. 13.
  • the correction sound source can also be placed outside the microphone array, and it may or may not be connected to the sound source localization device, which is not limited here. In this case the spatial relative distance between the correction sound source and the microphone array is shown in FIG. 14. FIG. 15 is a schematic diagram of the location of another correction sound source provided by an embodiment of the present application; the correction sound source can be set at any position outside the microphone array, either inside or outside the sound source localization device. Taking one of those positions as an example, the spatial relative distance between the correction sound source and the microphone array is shown in FIG. 16.
  • after the sound source localization device determines the target sound velocity, it can use the target sound velocity to locate the positions of the participants at the conference site, which reduces the influence of temperature on sound source localization and improves its accuracy.
  • in summary, the correction sound source sends a sound velocity correction signal to the microphone array; the sound source localization device receives the sound velocity correction signal through the microphone array and determines the target sound velocity according to the spatial relative position of the correction sound source and the microphone array, together with the sound velocity correction signal. The sound source localization device does not need face recognition and can determine the target sound velocity in one pass, which reduces the amount of computation for sound velocity correction and improves its real-time performance.
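Numerically, the core relation is simple: if the distances from the correction source to two microphones are known, the measured arrival-time difference yields the sound velocity directly. A minimal sketch (the distances and delay below are invented for illustration):

```python
def target_sound_velocity(d1_m: float, d2_m: float, delay_s: float) -> float:
    """Sound velocity from a known path-length difference and a measured delay.

    d1_m, d2_m: known distances (m) from the correction source to mics 1 and 2.
    delay_s:    measured arrival-time difference t2 - t1, in seconds.
    """
    return (d2_m - d1_m) / delay_s

# Path difference of 0.1715 m observed as a 0.5 ms delay -> 343 m/s
v = target_sound_velocity(1.0, 1.1715, 0.0005)
```

The estimate grows with the path-length difference and shrinks as the measured delay grows, matching the proportionality the application states for distance and time delay.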
  • the device 170 includes:
  • a receiving unit 1701 configured to receive a sound velocity correction signal from a correction sound source through a microphone array
  • the determining unit 1702 is configured to determine a target sound velocity; the target sound velocity is related to the spatial relative distance between the microphone array and the correction sound source, and to the sound velocity correction signal.
  • in one implementation, the sound velocity correction signal is an ultrasonic wave or a first sound whose frequency is outside the preset frequency range; in this case, the receiving unit 1701 is specifically configured to receive the sound velocity correction signal from the correction sound source in real time through the microphone array.
  • in another implementation, the sound velocity correction signal is a second sound whose frequency is within the preset frequency range; in this case, the receiving unit 1701 is specifically configured to periodically receive the sound velocity correction signal from the correction sound source through the microphone array.
  • the sound velocity correction signal is used to determine the time delay between the microphones in the microphone array; the target sound velocity is proportional to the spatial relative distance and inversely proportional to the time delay.
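The application does not spell out how the inter-microphone time delay is measured; a common choice, shown here purely as a plausible sketch, is to cross-correlate the two microphone channels and take the lag of the correlation peak:

```python
import numpy as np

def estimate_delay(sig_a: np.ndarray, sig_b: np.ndarray, fs: float) -> float:
    """Estimate the delay (seconds) of sig_b relative to sig_a by cross-correlation.

    A positive result means sig_b lags sig_a; resolution is one sample (1/fs).
    """
    corr = np.correlate(sig_b, sig_a, mode="full")
    lag = int(np.argmax(corr)) - (len(sig_a) - 1)
    return lag / fs

# Synthetic check: a Hann-windowed 4 kHz tone delayed by 8 samples at 48 kHz
fs = 48_000.0
n = np.arange(256)
tone = np.sin(2 * np.pi * 4000 * n / fs) * np.hanning(256)
delayed = np.concatenate([np.zeros(8), tone])[:256]
tau = estimate_delay(tone, delayed, fs)  # 8 samples -> about 0.167 ms
```

Sample-level resolution (1/fs) limits accuracy; practical systems often interpolate around the correlation peak or use phase-based estimators such as GCC-PHAT.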
  • the device 170 further includes an acquisition unit 1703, which is specifically configured to obtain the position of the correction sound source through a camera; the determining unit 1702 is further configured to determine the spatial relative distance.
  • the determining unit 1702 is specifically configured to: determine the first coordinate of the correction sound source in a three-dimensional coordinate system according to the position of the correction sound source; determine the second coordinate of the microphone array in the same coordinate system according to the position of the microphone array; and determine the spatial relative distance according to the geometric relationship between the first coordinate and the second coordinate.
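The "geometric relationship" in this step reduces to the Euclidean distance between the two coordinates. A minimal sketch, with coordinate values invented for illustration:

```python
import math

def spatial_relative_distance(first: tuple[float, float, float],
                              second: tuple[float, float, float]) -> float:
    """Euclidean distance between the correction-source coordinate (first)
    and the microphone-array coordinate (second), in the same 3-D frame."""
    return math.dist(first, second)

# Hypothetical coordinates in metres within one camera-derived frame
d = spatial_relative_distance((0.3, 0.0, 1.2), (0.0, 0.4, 1.2))  # 0.5 m
```

Both coordinates must be expressed in the same frame, which is why the camera-derived source position and the known array position are first mapped into one three-dimensional coordinate system.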
  • the device 170 further includes a positioning unit 1704, which is specifically configured to perform sound source localization according to the target sound velocity.
  • the receiving unit 1701 of the apparatus 170 is configured to execute step 601 in FIG. 6, and the determining unit 1702 of the apparatus 170 is configured to execute step 602 in FIG. 6; details are not repeated here.
  • FIG. 18 is a schematic diagram of a possible logical structure of a computer device 180 provided by an embodiment of the present application.
  • the computer device 180 includes: a processor 1801 , a communication interface 1802 , a storage system 1803 and a bus 1804 .
  • the processor 1801 , the communication interface 1802 and the storage system 1803 are connected to each other through a bus 1804 .
  • the processor 1801 is used to control and manage the actions of the computer device 180; for example, the processor 1801 is used to execute the steps performed by the sound source localization device in the method embodiment of FIG. 6.
  • the communication interface 1802 is used to support the computer device 180 in communicating.
  • the storage system 1803 is used for storing program codes and data of the computer device 180 .
  • the processor 1801 may be a central processing unit, a general processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor 1801 may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, a combination of a digital signal processor and a microprocessor, and the like.
  • the bus 1804 can be a peripheral component interconnect (PCI) bus, an extended industry standard architecture (EISA) bus, or the like.
  • the bus can be divided into an address bus, a data bus, a control bus, and so on. For ease of representation, only one thick line is used in FIG. 18, but this does not mean that there is only one bus or one type of bus.
  • the receiving unit 1701 in the device 170 is equivalent to the communication interface 1802 in the computer device 180
  • the determining unit 1702 , obtaining unit 1703 and positioning unit 1704 in the device 170 are equivalent to the processor 1801 in the computer device 180 .
  • the computer device 180 in this embodiment may correspond to the sound source localization device in the above-mentioned method embodiment in FIG. 6, and may implement the operations and/or steps performed by that device; for the sake of brevity, details are not repeated here.
  • each unit in the device can be implemented in the form of software called by a processing element, or in the form of hardware; some units can be implemented in software called by a processing element while other units are implemented in hardware.
  • each unit can be a separate processing element, or it can be integrated into a chip of the device; it can also be stored in the memory in the form of a program that is called and executed by a processing element of the device to perform the unit's function.
  • all or part of these units can be integrated together, or implemented independently.
  • the processing element mentioned here may also be a processor, which may be an integrated circuit with signal processing capabilities.
  • in the implementation process, each step of the above method or each of the above units can be implemented by an integrated hardware logic circuit in the processing element or in the form of software called by the processing element.
  • the units in any of the above devices may be one or more integrated circuits configured to implement the above method, for example: one or more application-specific integrated circuits (ASICs), one or more digital signal processors (DSPs), one or more field-programmable gate arrays (FPGAs), or a combination of at least two of these integrated circuit forms.
  • alternatively, the units in the device can be implemented in the form of a program scheduled by a processing element, and the processing element can be a general-purpose processor, such as a central processing unit (CPU) or another processor that can call programs.
  • these units can be integrated together and implemented in the form of a system-on-a-chip (SOC).
  • in another embodiment of the present application, a computer-readable storage medium is also provided; computer-executable instructions are stored in the computer-readable storage medium, and when the processor of a device executes the computer-executable instructions, the device performs the method of the above-mentioned method embodiment.
  • a computer program product is provided in another embodiment of the present application; it includes computer-executable instructions stored in a computer-readable storage medium, and when the processor of a device executes the computer-executable instructions, the device performs the method performed by the sound source localization device in the foregoing method embodiments.
  • the disclosed system, device and method can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division, and in actual implementation there may be other division methods; for example, multiple units or components can be combined or integrated into another system, or some features can be ignored or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or may be distributed to multiple network units. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if the integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or other media that can store program code.


Abstract

Disclosed are a sound velocity correction method and apparatus. The method comprises the following steps: a correction sound source sends a sound velocity correction signal to a microphone array (11) (601); a sound source localization device (1) receives the sound velocity correction signal by means of the microphone array (11), and determines a target sound velocity according to the spatial relative position of the correction sound source and the microphone array (11), as well as the sound velocity correction signal (602). The sound source localization device (1) can determine the target sound velocity in one pass without needing to perform face recognition, thereby reducing the amount of computation for sound velocity correction and improving the real-time performance of sound velocity correction.
PCT/CN2022/131002 2021-11-22 2022-11-10 Sound velocity correction method and apparatus WO2023088156A1 (fr)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202111382634.4 2021-11-22
CN202111382634 2021-11-22
CN202111672798.0 2021-12-31
CN202111672798.0A CN116148769A (zh) Sound velocity correction method and apparatus

Publications (1)

Publication Number Publication Date
WO2023088156A1 true WO2023088156A1 (fr) 2023-05-25

Family

ID=86357001

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/131002 WO2023088156A1 (fr) 2021-11-22 2022-11-10 Sound velocity correction method and apparatus

Country Status (2)

Country Link
CN (1) CN116148769A (fr)
WO (1) WO2023088156A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001141578A * 1999-11-10 2001-05-25 Ishikawajima Harima Heavy Ind Co Ltd Temperature detection method and temperature detection device
JP2004309265A * 2003-04-04 2004-11-04 Nec Corp Multistatic underwater sound velocity measurement method and system
CN201247251Y * 2008-08-21 2009-05-27 No. 711 Research Institute of China Shipbuilding Industry Corporation Pipeline gas flow velocity and sound velocity meter
CN105307063A * 2014-07-15 2016-02-03 Panasonic Intellectual Property Management Co., Ltd. Sound velocity correction device
CN109164414A * 2018-09-07 2019-01-08 Shenzhen Tianbozhi Technology Co., Ltd. Positioning method and device based on microphone array, and storage medium


Also Published As

Publication number Publication date
CN116148769A (zh) 2023-05-23

Similar Documents

Publication Publication Date Title
US10397722B2 (en) Distributed audio capture and mixing
CN107534725B (zh) 一种语音信号处理方法及装置
JP6246792B2 (ja) ユーザのグループのうちのアクティブに話しているユーザを識別するための装置及び方法
  • KR101659712B1 Estimating a sound source position using particle filtering
  • WO2014161309A1 Method and apparatus for a mobile terminal to implement sound source tracking
  • WO2021037129A1 Sound pickup method and apparatus
  • JP2017022718A Generation of a surround sound field
CN109804559A (zh) 空间音频系统中的增益控制
WO2016014254A1 (fr) Système et procédé pour déterminer un contexte audio dans des applications de réalité augmentée
  • WO2015184893A1 Voice call noise reduction method and device for a mobile terminal
  • WO2014101429A1 Noise reduction method and device for dual microphones of a terminal
Farmani et al. Informed sound source localization using relative transfer functions for hearing aid applications
  • WO2015106401A1 Speech processing method and apparatus
  • WO2019061678A1 Motion detection method and apparatus, and monitoring device
  • WO2019200722A1 Sound source direction estimation method and apparatus
  • WO2022007030A1 Audio signal processing method and apparatus, device, and readable medium
  • CN111551921A Sound source orientation system and method with audio-visual linkage
  • WO2022062531A1 Multi-channel audio signal acquisition method and apparatus, and system
US11068233B2 (en) Selecting a microphone based on estimated proximity to sound source
  • WO2023088156A1 Sound velocity correction method and apparatus
  • WO2023056905A1 Sound source localization method and apparatus, and device
US11902754B2 (en) Audio processing method, apparatus, electronic device and storage medium
  • JP2001313992A Sound collection device and sound collection method
  • CN112466325 Sound source localization method and device, and computer storage medium
US12003948B1 (en) Multi-device localization

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22894693

Country of ref document: EP

Kind code of ref document: A1