CN111323751B - Sound source positioning method, device and storage medium - Google Patents

Publication number
CN111323751B
Authority
CN
China
Prior art keywords
coordinate
calibration
coordinate system
sound source
correction matrix
Legal status
Active
Application number
CN202010217565.0A
Other languages
Chinese (zh)
Other versions
CN111323751A (en)
Inventor
赵玉垒
浦宏杰
修平平
朱赛男
鄢仁祥
Current Assignee
Suzhou Keda Technology Co Ltd
Original Assignee
Suzhou Keda Technology Co Ltd
Application filed by Suzhou Keda Technology Co Ltd
Priority to CN202010217565.0A
Publication of CN111323751A
Application granted
Publication of CN111323751B

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S5/00 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
    • G01S5/18 Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Stereophonic System (AREA)
  • Measurement Of Velocity Or Position Using Acoustic Or Ultrasonic Waves (AREA)

Abstract

The application relates to a sound source positioning method, a sound source positioning device and a storage medium, and belongs to the technical field of computers. The method comprises: determining a first coordinate value of the position of a sound source in a first coordinate system when a target sound signal emitted by the sound source is acquired; acquiring a coordinate conversion relation and a correction matrix; and converting the first coordinate value to a second coordinate system by using the coordinate conversion relation and the correction matrix to obtain a second coordinate value, so as to trigger an image acquisition assembly to acquire an image of the sound source corresponding to the second coordinate value. This addresses the problem that, when a whistle position is converted through a coordinate conversion matrix, inaccuracy in that matrix makes the converted whistle position monitoring result insufficiently accurate. Because the correction matrix corrects the errors introduced by the coordinate conversion relation, the accuracy of the determined sound source position can be improved.

Description

Sound source positioning method, device and storage medium
Technical Field
The application relates to a sound source positioning method, a sound source positioning device and a storage medium, and belongs to the technical field of computers.
Background
Vehicle whistling in urban areas can seriously disturb residents, so most regions prohibit vehicles from whistling in urban areas. To monitor illegal whistling, the whistle position must be located and transmitted to a license plate snapshot device, which captures and stores the license plate number of the offending vehicle.
In a typical sound source positioning method, when a whistling monitoring device and a license plate snapshot device are installed, the distance and the deflection angle of the whistling monitoring device relative to the license plate snapshot device are obtained; constructing a coordinate transformation matrix by using the distance and the deflection angle; when the whistle monitoring device collects the whistle signal, the coordinate conversion matrix is used for converting the position of the whistle signal in the own coordinate system of the whistle monitoring device into a common coordinate system shared by the whistle monitoring device and the license plate snapshot device, so that a whistle position monitoring result is obtained.
However, the accuracy of the coordinate transformation matrix affects the transformation result, so that an error exists between the result after the coordinate system transformation and the actual whistle position.
Disclosure of Invention
The application provides a sound source positioning method, a sound source positioning device and a storage medium, which can solve the problem that when the whistle coordinate position is converted through a coordinate conversion matrix, the converted whistle position monitoring result is not accurate enough due to the inaccuracy of the coordinate conversion matrix. The application provides the following technical scheme:
in a first aspect, a sound source localization method is provided, the method including:
when a target sound signal emitted by the sound source is collected, determining a first coordinate value of the position of the sound source in a first coordinate system, wherein the first coordinate system is established based on the position of an audio collection assembly, and the audio collection assembly is used for collecting the target sound signal;
acquiring a coordinate conversion relation, wherein the coordinate conversion relation is used for converting the position of the sound source from the first coordinate system to a second coordinate system, the second coordinate system is established based on the position of the audio acquisition assembly and the position of an image acquisition assembly, and the image acquisition assembly is used for acquiring the image of the sound source;
acquiring a correction matrix, wherein the correction matrix is used for correcting errors in coordinate conversion of the coordinate conversion relation;
and converting the first coordinate value to the second coordinate system by using the coordinate conversion relation and the correction matrix to obtain a second coordinate value so as to trigger an image acquisition assembly to acquire an image of the sound source corresponding to the second coordinate value.
Optionally, the obtaining a correction matrix includes:
when n calibration sound signals emitted from each of m calibration points are acquired, determining a first calibration coordinate value of the calibration sound signal at each calibration point in the first coordinate system, where m and n are both positive integers;
converting each first calibration coordinate value to the second coordinate system by using the coordinate conversion relation to obtain a second calibration coordinate value;
acquiring real coordinate values of the m calibration points in the second coordinate system;
determining the correction matrix based on the second calibration coordinate values and the real coordinate values.
Optionally, said determining said correction matrix based on said second calibration coordinate values and said real coordinate values comprises:
acquiring a matrix model of the correction matrix;
and setting the product of the estimated position matrix, formed by the second calibration coordinate values, and the matrix model equal to the actual position matrix, formed by the corresponding real coordinate values, and solving for the matrix model by the least squares method to obtain the correction matrix.
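For reference, the least squares step described above can be written out as follows. This is a sketch under an assumed convention that the source does not state: each calibration point contributes one row, so the estimated position matrix \hat{P} and the actual position matrix P are m × 3, and the matrix model C is taken to be 3 × 3, as in the embodiment's example.

```latex
\begin{aligned}
\hat{P}\,C &= P, \qquad \hat{P},\,P \in \mathbb{R}^{m \times 3},\quad C \in \mathbb{R}^{3 \times 3} \\
C^{*} &= \arg\min_{C}\,\lVert \hat{P}\,C - P \rVert_{F}^{2}
       = \left(\hat{P}^{\top}\hat{P}\right)^{-1}\hat{P}^{\top}P
\end{aligned}
```

The closed form on the second line holds only when \hat{P}^{\top}\hat{P} is invertible, which requires at least three calibration points whose estimated positions are linearly independent.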
Optionally, when acquiring the n times of calibration sound signals emitted from each of the m calibration points, determining a first calibration coordinate value of the calibration sound signal at each calibration point in the first coordinate system includes:
and for each calibration point, calculating the average of the first calibration coordinate values of the n calibration sound signals emitted from that calibration point to obtain the first calibration coordinate value of the calibration sound signal at that calibration point in the first coordinate system, where n is an integer greater than 1.
Optionally, the obtaining a correction matrix includes:
reading the stored correction matrix, where the correction matrix is determined, after the audio acquisition assembly is installed, by emitting calibration sound signals at m calibration points and using the difference between the real coordinate values and the second calibration coordinate values of the calibration sound signals in the second coordinate system.
Optionally, the obtaining the coordinate transformation relationship includes:
determining a deflection angle of the first coordinate system relative to the second coordinate system;
acquiring the origin position of the coordinate origin of the first coordinate system in the second coordinate system;
and taking the origin position and the deflection angle as deflection parameters of a preset coordinate conversion formula to obtain the coordinate conversion relation.
Optionally, the coordinate conversion formula includes:
[Five formula images, BDA0002424899310000031 to BDA0002424899310000035, are not reproduced in this text.]
wherein the matrix shown in image BDA0002424899310000036 is formed by the position of the coordinate origin of the first coordinate system in the second coordinate system; the expression shown in image BDA0002424899310000037 represents the first coordinate value, where r is the length of the line connecting the position of the sound source and the coordinate origin of the first coordinate system, the angle shown in image BDA0002424899310000038 is the included angle between the projection of that connecting line on the x'o'y' plane of the first coordinate system and the positive direction of the x' axis, and θ is the included angle between the connecting line and the z' axis of the first coordinate system; α, β, γ are the deflection angles.
In a second aspect, there is provided a sound source localization apparatus, the apparatus comprising:
the first coordinate determination module is used for determining a first coordinate value of the position of the sound source in a first coordinate system when a target sound signal emitted by the sound source is acquired, wherein the first coordinate system is established based on the position of an audio acquisition assembly, and the audio acquisition assembly is used for acquiring the target sound signal;
a conversion relation obtaining module, configured to obtain a coordinate conversion relation, where the coordinate conversion relation is used to convert the position of the sound source from the first coordinate system to a second coordinate system, the second coordinate system is established based on the position of the audio acquisition component and the position of an image acquisition component, and the image acquisition component is used to acquire an image of the sound source;
a correction matrix obtaining module, configured to obtain a correction matrix, where the correction matrix is used to correct an error in coordinate conversion performed by the coordinate conversion relationship;
and the second coordinate determination module is used for converting the first coordinate value to the second coordinate system by using the coordinate conversion relation and the correction matrix to obtain a second coordinate value so as to trigger an image acquisition assembly to acquire an image of the sound source corresponding to the second coordinate value.
In a third aspect, there is provided a sound source localization apparatus, the apparatus comprising a processor and a memory; the memory stores therein a program that is loaded and executed by the processor to implement the sound source localization method according to the first aspect.
In a fourth aspect, there is provided a computer-readable storage medium having a program stored therein, the program being loaded and executed by the processor to implement the sound source localization method of the first aspect.
The beneficial effect of this application lies in: determining a first coordinate value of the position of a sound source in a first coordinate system when a target sound signal emitted by the sound source is collected; acquiring a coordinate conversion relation, wherein the coordinate conversion relation is used for converting the position of a sound source from a first coordinate system to a second coordinate system; acquiring a correction matrix; converting the first coordinate value to a second coordinate system by using the coordinate conversion relation and the correction matrix to obtain a second coordinate value so as to trigger the image acquisition assembly to acquire the image of the sound source corresponding to the second coordinate value; the problem that when the whistle coordinate position is converted through the coordinate conversion matrix, the converted whistle position monitoring result is not accurate enough due to the inaccuracy of the coordinate conversion matrix can be solved; the correction matrix can correct errors in coordinate conversion of the coordinate conversion relation, so that the accuracy of the determined position of the sound source can be improved;
in addition, the coordinate conversion relation can be obtained through the deflection condition of the audio acquisition assembly relative to the image acquisition assembly, so that the influence of the installation offset of the audio acquisition assembly on the conversion of the positioning result can be reduced by correcting the coordinate conversion relation;
in addition, the correction matrix is determined and stored after the audio acquisition assembly is installed, the correction matrix can be repeatedly used in the subsequent sound source positioning process, the output positioning result of the sound source positioning can be rapidly completed, and the sound source positioning efficiency is improved.
The foregoing description is only an overview of the technical solutions of the present application. To make these technical solutions clearer and to enable them to be implemented according to the content of the description, a detailed description follows with reference to the preferred embodiments of the present application and the accompanying drawings.
Drawings
FIG. 1 is a schematic diagram of a sound source localization system according to an embodiment of the present application;
FIG. 2 is a flow chart of a sound source localization method provided by an embodiment of the present application;
FIG. 3 is a schematic diagram of a first coordinate system provided by one embodiment of the present application;
FIG. 4 is a schematic illustration of a first coordinate system and a second coordinate system provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of calibration points provided by one embodiment of the present application;
FIG. 6 is a block diagram of a sound source localization apparatus provided in one embodiment of the present application;
FIG. 7 is a block diagram of a sound source localization apparatus according to an embodiment of the present application.
Detailed Description
The following detailed description of embodiments of the present application will be described in conjunction with the accompanying drawings and examples. The following examples are intended to illustrate the present application but are not intended to limit the scope of the present application.
Fig. 1 is a schematic structural diagram of a sound source localization system according to an embodiment of the present application. As shown in fig. 1, the system includes at least an audio capture component 110 and an image capture component 120.
The audio capturing component 110 may be a microphone or a microphone array or other devices having a function of capturing sound signals, and the embodiment does not limit the type of the audio capturing component 110. The structure of the microphone array may be circular, rectangular, multi-arm spiral, spherical, etc., and the structure of the microphone array is not limited in this embodiment.
Optionally, the audio capture component 110 is configured to: when the target sound signal is collected, determining a first coordinate value of the position of a sound source of the target sound signal in a first coordinate system; acquiring a coordinate conversion relation; acquiring a correction matrix; and converting the first coordinate value to a second coordinate system by using the coordinate conversion relation and the correction matrix to obtain a second coordinate value so as to trigger the image acquisition assembly to acquire the image of the sound source corresponding to the second coordinate value.
Wherein the first coordinate system is established based on the position of the audio acquisition component. The first coordinate value may be a coordinate value of a spherical coordinate system; of course, the coordinate values may be in a cartesian coordinate system, and the expression form of the first coordinate value is not limited in this embodiment. Optionally, the origin of coordinates of the first coordinate system is a position where the audio capturing component is located.
The second coordinate system is established based on the locations of the audio capture component 110 and the image capture component 120. In other words, the second coordinate system is a coordinate system common to the audio capture component 110 and the image capture component 120. Optionally, one coordinate plane of the second coordinate system is parallel to the ground. The second coordinate value may be a coordinate value of a spherical coordinate system; of course, the coordinate values may be in a cartesian coordinate system, and the expression form of the second coordinate values is not limited in this embodiment.
Wherein the coordinate transformation relationship is used to transform the position of the sound source from a first coordinate system to a second coordinate system. The correction matrix is used for correcting errors in coordinate conversion in the coordinate conversion relation.
The audio capture component 110 is communicatively coupled to the image capture component 120 via a wired or wireless connection.
The image capturing assembly 120 may be a video camera, a still camera, or other devices with image capturing function, and the present embodiment does not limit the type of the image capturing assembly 120.
The image collecting component 120 is configured to collect an image of the sound source corresponding to the second coordinate value.
It should be added that, after the audio acquisition component 110 obtains the first coordinate value, it may instead send the first coordinate value to another device, which then acquires the coordinate transformation relationship and the correction matrix and converts the first coordinate value to the second coordinate system by using them to obtain the second coordinate value. The other device may be a computer, a mobile phone, the image capturing component 120, a tablet computer, a server, etc., and the present embodiment does not limit the type of the other device.
In addition, the application scenarios of the sound source localization system and the corresponding sound source localization method include, but are not limited to, at least one of the following:
1. A whistle monitoring scenario. That is, the audio acquisition component 110 locates the whistle position and triggers the image acquisition component 120 to capture an image of the vehicle at the whistle position.
2. A classroom monitoring scenario. That is, the audio acquisition component 110 locates the position of the student who is speaking and triggers the image acquisition component 120 to capture an image of the student at that position.
3. A video conference scenario. That is, the audio capture component 110 locates the position of the participant who is currently speaking and triggers the image capture component 120 to capture an image of the participant at that position.
Of course, the sound source localization system and the sound source localization method can also be applied to other similar scenes, and the application is not listed here.
Fig. 2 is a flowchart of a sound source localization method according to an embodiment of the present application, where the method is applied to the sound source localization system shown in fig. 1, and a main execution subject of each step is an audio acquisition component 110 in the system. The method at least comprises the following steps:
step 201, when a target sound signal emitted by a sound source is collected, determining a first coordinate value of the position of the sound source in a first coordinate system.
The first coordinate system is established based on a position of an audio acquisition component used to acquire a target sound signal.
Optionally, the first coordinate system is a three-dimensional coordinate system, and the origin of coordinates is a position of the audio capturing component.
Illustratively, the audio acquisition component may determine the first coordinate value of the position of the sound source of the target sound signal in the first coordinate system by using a microphone array sound source localization method. Such methods include beamforming-based methods, methods based on high-resolution spectral estimation, methods based on time delay differences, and the like; the embodiment does not limit the type of the sound source localization method.
Referring to fig. 3, which shows the relationship between the first coordinate system (with coordinate origin o', x' axis, y' axis, and z' axis) and the position of the sound source, the first coordinate value of the sound source s in the first coordinate system is expressed in spherical coordinates by r, φ, and θ (formula image BDA0002424899310000071 is not reproduced in this text; the horizontal angle, whose symbol appears only as an image in the source, is written φ here).
In practical implementation, the first coordinate value may also be expressed in Cartesian coordinates, that is, (r × sin θ × cos φ, r × sin θ × sin φ, r × cos θ); the present application does not limit the manner in which the first coordinate value is expressed.
Here, φ is the included angle (also called the horizontal angle) between the positive direction of the x' axis and the projection, on the x'o'y' plane, of the line connecting the position of the sound source and the coordinate origin of the first coordinate system; θ is the included angle (also called the pitch angle) between that connecting line and the z' axis; and r is the length of the connecting line.
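The relationship between the spherical and Cartesian forms of the first coordinate value can be sketched in a few lines. The function name `spherical_to_cartesian` and the parameter name `phi` for the horizontal angle are illustrative assumptions (the source shows that symbol only as an image), and angles are assumed to be in radians.

```python
import math


def spherical_to_cartesian(r, phi, theta):
    """Convert a first coordinate value from spherical to Cartesian form.

    r     -- length of the line from the origin o' to the sound source
    phi   -- horizontal angle: angle between the positive x' axis and the
             projection of that line on the x'o'y' plane (radians)
    theta -- pitch angle: angle between that line and the z' axis (radians)
    """
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return (x, y, z)
```

With this convention, a source lying on the positive x' axis has phi = 0 and theta = π/2, and a source directly on the z' axis has theta = 0.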
Step 202, obtaining a coordinate transformation relation.
The coordinate transformation relationship is used to transform the position of the sound source from a first coordinate system to a second coordinate system.
The second coordinate system is established based on the position of the audio capture component and the position of the image capture component, i.e., the second coordinate system is a coordinate system common to both the audio capture component and the image capture component. At this time, the coordinate values determined by the audio acquisition component are also applicable to the image acquisition component. The image acquisition assembly is used for acquiring images of the sound source.
The first coordinate system has included angles (α', β', γ') in three directions relative to the second coordinate system, where α' is the included angle of the x' axis of the first coordinate system relative to the x axis of the second coordinate system; β' is the included angle of the y' axis of the first coordinate system relative to the y axis of the second coordinate system; and γ' is the included angle of the z' axis of the first coordinate system relative to the z axis of the second coordinate system. Fig. 4 shows the relative positional relationship between the first coordinate system and the second coordinate system (with coordinate origin o, x axis, y axis, and z axis).
In one example, the coordinate transformation relationship is stored in a storage medium, such as: the coordinate transformation relation is pre-written in a Read-Only Memory (ROM) through RS232/485 or Ethernet, and the audio acquisition component 110 reads the coordinate transformation relation from the ROM.
In another example, the coordinate transformation relationship is obtained through multiple calibration processes. In this case, acquiring the coordinate conversion relationship includes: determining a deflection angle of the first coordinate system relative to the second coordinate system; acquiring the origin position of the coordinate origin of the first coordinate system in the second coordinate system; and taking the origin position and the deflection angle as deflection parameters of a preset coordinate conversion formula to obtain the coordinate conversion relationship.
Optionally, the coordinate conversion formula comprises:
[Five formula images, BDA0002424899310000083 to BDA0002424899310000093, are not reproduced in this text.]
where the matrix shown in image BDA0002424899310000094 is formed by the position of the coordinate origin of the first coordinate system in the second coordinate system; the expression shown in image BDA0002424899310000095 represents the first coordinate value, where r is the length of the line connecting the position of the sound source and the coordinate origin of the first coordinate system, the angle shown in image BDA0002424899310000096 is the included angle between the projection of that connecting line on the x'o'y' plane of the first coordinate system and the positive direction of the x' axis, and θ is the included angle between the connecting line and the z' axis of the first coordinate system; α, β, and γ are the deflection angles.
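Because the formula images are not available in this text, the following is only a sketch of one common form such a conversion can take, not the patent's exact formula. It assumes that α, β, and γ are rotations about the x, y, and z axes composed as Rz(γ)·Ry(β)·Rx(α), that the rotated point is then shifted by the origin position, and that angles are in radians. The names `rotation_matrix`, `convert_to_second_system`, and `phi` are hypothetical.

```python
import math


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def rotation_matrix(alpha, beta, gamma):
    """Rotation built from the deflection angles.

    Assumption: R = Rz(gamma) @ Ry(beta) @ Rx(alpha); the source's actual
    composition order and sign convention are not recoverable here.
    """
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    cg, sg = math.cos(gamma), math.sin(gamma)
    rx = [[1, 0, 0], [0, ca, -sa], [0, sa, ca]]
    ry = [[cb, 0, sb], [0, 1, 0], [-sb, 0, cb]]
    rz = [[cg, -sg, 0], [sg, cg, 0], [0, 0, 1]]
    return matmul(rz, matmul(ry, rx))


def convert_to_second_system(r, phi, theta, angles, origin):
    """Map a spherical first coordinate value into the second coordinate system."""
    # Spherical (r, phi, theta) -> Cartesian in the first coordinate system.
    p1 = (
        r * math.sin(theta) * math.cos(phi),
        r * math.sin(theta) * math.sin(phi),
        r * math.cos(theta),
    )
    rot = rotation_matrix(*angles)
    # Rotate into the second system's axes, then add the position of the
    # first system's origin expressed in the second system.
    return tuple(sum(rot[i][k] * p1[k] for k in range(3)) + origin[i] for i in range(3))
```

With all deflection angles equal to zero the function reduces to a pure translation by the origin position, which is a quick sanity check after installation.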
Step 203, acquiring a correction matrix.
The correction matrix is used for correcting errors in coordinate conversion in the coordinate conversion relation.
In one example, acquiring the correction matrix comprises: when n calibration sound signals emitted from each of m calibration points are acquired, determining a first calibration coordinate value of the calibration sound signal at each calibration point in the first coordinate system; converting each first calibration coordinate value to the second coordinate system by using the coordinate conversion relation to obtain a second calibration coordinate value; acquiring real coordinate values of the m calibration points in the second coordinate system; and determining the correction matrix based on the second calibration coordinate values and the real coordinate values. Both m and n are positive integers.
Wherein determining a correction matrix based on the second calibration coordinate value and the real coordinate value comprises: acquiring a matrix model of a correction matrix; and multiplying the estimated position matrix formed by each second calibration coordinate value by the matrix model, taking the actual position matrix formed by each corresponding real coordinate value as a product result, and determining the solution of the matrix model based on a least square method to obtain a correction matrix.
When acquiring n times of calibration sound signals emitted from each of m calibration points, determining a first calibration coordinate value of the calibration sound signal at each calibration point in a first coordinate system, including:
and for each calibration point, calculating the average value of the first calibration coordinate values of the calibration sound signals emitted from the calibration point for n times to obtain the first calibration coordinate values of the calibration sound signals on the calibration point in the first coordinate system, wherein n is an integer greater than 1.
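The per-point averaging can be sketched as below. The function name `mean_calibration_coordinate` is hypothetical, and the sketch assumes each measurement is already in Cartesian form; averaging raw spherical angles would give wrong results when the horizontal angle wraps around near ±180°.

```python
def mean_calibration_coordinate(measurements):
    """Average the n first calibration coordinate values of one calibration point.

    measurements -- list of n (x, y, z) tuples measured for the same point
    (assumed Cartesian; see the note on angle wrap-around above).
    """
    n = len(measurements)
    if n == 0:
        raise ValueError("at least one measurement is required")
    # Component-wise arithmetic mean over the n measurements.
    return tuple(sum(m[i] for m in measurements) / n for i in range(3))
```

Averaging over repeated emissions reduces the effect of random localization noise on each calibration point before the correction matrix is fitted.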
Optionally, m and n are both integers greater than 1. Referring to the calibration point location diagram of fig. 5, fig. 5 includes 18 calibration points, all of which lie within the effective range in which the audio capture assembly can capture audio and the image capture assembly can capture images. The sound source is controlled to emit n (e.g., 10) calibration sound signals at each calibration point, and the audio acquisition assembly acquires the calibration sound signals emitted from each calibration point. The first calibration coordinate values of the n calibration sound signals obtained at a given calibration point are written as in formula image BDA0002424899310000101 (not reproduced in this text).
Averaging the n first calibration coordinate values of each of the m calibration points gives the values in formula image BDA0002424899310000102 (not reproduced in this text).
The real coordinate values of the respective calibration points are written as in formula image BDA0002424899310000103 (not reproduced in this text).
The matrix model of the correction matrix is assumed to have the form in formula image BDA0002424899310000104 (not reproduced in this text).
Substituting these quantities gives the relation in formula image BDA0002424899310000105 (not reproduced in this text), and solving this relation by the least squares method yields the correction matrix.
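A minimal pure-Python sketch of the least squares fit follows. It is an illustration under stated assumptions rather than the patent's procedure: each calibration point is one row of an m × 3 matrix, the correction matrix C is 3 × 3 and applied on the right (estimated · C ≈ actual), and the fit is computed through the normal equations, which is one of several ways to solve a least squares problem. The names `fit_correction_matrix` and `solve` are hypothetical.

```python
def transpose(a):
    """Transpose a matrix given as nested lists."""
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        # Pick the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("calibration points do not determine the matrix")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [v / pv for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_correction_matrix(estimated, actual):
    """Least squares C minimising ||estimated @ C - actual||.

    estimated -- m rows of converted (second) calibration coordinates
    actual    -- m rows of real coordinates of the same calibration points
    """
    et = transpose(estimated)
    # Normal equations: (E^T E) C = E^T A.
    return solve(matmul(et, estimated), matmul(et, actual))
```

For noisy or nearly degenerate calibration layouts, a QR- or SVD-based solver from a numerical library would generally be more robust than the normal equations used here.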
In this embodiment, a matrix with a correction matrix of 3 × 3 is taken as an example for explanation, and in actual implementation, the correction matrix may be a matrix with other dimensions, and the dimension of the correction matrix is not limited in this embodiment.
In another example, acquiring the correction matrix comprises reading the stored correction matrix. The correction matrix is determined, after the audio acquisition assembly is installed, by emitting calibration sound signals at m calibration points and using the difference between the real coordinate values and the second calibration coordinate values of the calibration sound signals in the second coordinate system. That is, after the audio capture component is installed, the correction matrix is calculated as in the first example and then stored in a storage medium, for example in the ROM.
Step 204, converting the first coordinate value to the second coordinate system by using the coordinate conversion relation and the correction matrix to obtain a second coordinate value, so as to trigger the image acquisition assembly to acquire the image of the sound source corresponding to the second coordinate value.
Since the coordinate conversion relationship takes the first coordinate value as a variable, the second coordinate value can be obtained by inputting the current first coordinate value to the coordinate conversion relationship and multiplying the obtained value by the correction matrix.
Illustratively, with α, β, γ in the above coordinate conversion formula already determined, inputting the first coordinate value
Figure BDA0002424899310000111
yields the second coordinate value.
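The conversion in step 204 can be sketched as follows. The actual coordinate conversion formula appears only as figures in the source, so this sketch rests on stated assumptions: the first coordinate value is given as (r, φ, θ) with φ measured from the x' axis in the x'o'y' plane and θ measured from the z' axis, the deflection angles α, β, γ are treated as rotations about the x, y and z axes composed as Rz(γ)·Ry(β)·Rx(α), and the correction matrix is applied by right-multiplication. The rotation order and all function names are hypothetical.

```python
import numpy as np


def spherical_to_cartesian(r, phi, theta):
    """Convert (r, phi, theta) to Cartesian coordinates in the first coordinate system."""
    return np.array([
        r * np.sin(theta) * np.cos(phi),
        r * np.sin(theta) * np.sin(phi),
        r * np.cos(theta),
    ])


def rotation_matrix(alpha, beta, gamma):
    """Rotation built from the deflection angles (assumed order: Rz @ Ry @ Rx)."""
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    cg, sg = np.cos(gamma), np.sin(gamma)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, ca, -sa], [0.0, sa, ca]])
    ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    rz = np.array([[cg, -sg, 0.0], [sg, cg, 0.0], [0.0, 0.0, 1.0]])
    return rz @ ry @ rx


def to_second_coordinate(first, origin, angles, H):
    """Apply the coordinate conversion, then multiply by the correction matrix H."""
    p1 = spherical_to_cartesian(*first)
    converted = rotation_matrix(*angles) @ p1 + np.asarray(origin, dtype=float)
    return converted @ np.asarray(H, dtype=float)


# Demo 1: no deflection, pure translation, identity correction.
second = to_second_coordinate((2.0, 0.0, np.pi / 2),
                              origin=(1.0, 2.0, 3.0),
                              angles=(0.0, 0.0, 0.0),
                              H=np.eye(3))
# Demo 2: a 90-degree deflection about the z axis maps the x' direction onto y.
rotated = to_second_coordinate((1.0, 0.0, np.pi / 2),
                               origin=(0.0, 0.0, 0.0),
                               angles=(0.0, 0.0, np.pi / 2),
                               H=np.eye(3))
```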
In summary, in the sound source positioning method provided in this embodiment, when a target sound signal emitted by a sound source is collected, a first coordinate value of the position of the sound source in a first coordinate system is determined; a coordinate conversion relation is acquired, the coordinate conversion relation being used for converting the position of the sound source from the first coordinate system to a second coordinate system; a correction matrix is acquired; and the first coordinate value is converted to the second coordinate system by using the coordinate conversion relation and the correction matrix to obtain a second coordinate value, so as to trigger the image acquisition assembly to acquire the image of the sound source corresponding to the second coordinate value. This solves the problem that, when the whistle coordinate position is converted through a coordinate conversion matrix, the converted whistle position monitoring result is not accurate enough because the coordinate conversion matrix is inaccurate. Since the correction matrix corrects the errors that arise when coordinates are converted with the coordinate conversion relation, the accuracy of the determined position of the sound source can be improved.
In addition, the coordinate conversion relation can be obtained through the deflection condition of the audio acquisition assembly relative to the image acquisition assembly, so that the influence of the installation offset of the audio acquisition assembly on the conversion of the positioning result can be reduced by correcting the coordinate conversion relation.
In addition, the correction matrix is determined and stored after the audio acquisition assembly is installed and can be reused in subsequent sound source positioning, so that the positioning result can be output quickly and the sound source positioning efficiency is improved.
Fig. 6 is a block diagram of a sound source localization apparatus according to an embodiment of the present application, and this embodiment is described by taking an example of the application of the apparatus to the audio acquisition component 110 in the sound source localization system shown in fig. 1. The device at least comprises the following modules: a first coordinate determination module 610, a transformation relation acquisition module 620, a correction matrix acquisition module 630, and a second coordinate determination module 640.
A first coordinate determination module 610, configured to determine, when a target sound signal emitted by the sound source is acquired, a first coordinate value of a position of the sound source in a first coordinate system, where the first coordinate system is established based on a position of an audio acquisition component, and the audio acquisition component is configured to acquire the target sound signal;
a transformation relation obtaining module 620, configured to obtain a coordinate transformation relation, where the coordinate transformation relation is used to transform the position of the sound source from the first coordinate system to a second coordinate system, where the second coordinate system is established based on the position of the audio acquisition component and the position of an image acquisition component, and the image acquisition component is used to acquire an image of the sound source;
a correction matrix obtaining module 630, configured to obtain a correction matrix, where the correction matrix is used to correct an error when the coordinate conversion is performed on the coordinate conversion relationship;
and a second coordinate determining module 640, configured to convert the first coordinate value to the second coordinate system by using the coordinate conversion relationship and the correction matrix to obtain a second coordinate value, so as to trigger an image acquisition component to acquire an image of a sound source corresponding to the second coordinate value.
For relevant details reference is made to the above-described method embodiments.
It should be noted that: in the sound source positioning device provided in the above embodiment, when performing sound source positioning, only the division of the above functional modules is taken as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the sound source positioning device is divided into different functional modules, so as to complete all or part of the above described functions. In addition, the sound source positioning device and the sound source positioning method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are detailed in the method embodiments and are not described herein again.
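The division into modules 610 to 640 can be pictured with the following sketch. It is only one possible arrangement under stated assumptions: each module is modeled as an injected callable or value, the correction matrix is a 3 × 3 nested list applied by right-multiplication, and the class name `SoundSourceLocator`, its fields, and the stub functions in the demo are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class SoundSourceLocator:
    locate: Callable[[Sequence[float]], Vec3]   # role of module 610: first coordinate value
    convert: Callable[[Vec3], Vec3]             # role of module 620: coordinate conversion relation
    correction: Sequence[Sequence[float]]       # role of module 630: 3x3 correction matrix
    trigger_camera: Callable[[Vec3], None]      # notified with the second coordinate value

    def handle(self, audio_frame: Sequence[float]) -> Vec3:
        """Role of module 640: convert, correct, and trigger image acquisition."""
        first = self.locate(audio_frame)
        converted = self.convert(first)
        # Row vector times matrix: second[j] = sum_i converted[i] * correction[i][j].
        second = tuple(
            sum(converted[i] * self.correction[i][j] for i in range(3))
            for j in range(3)
        )
        self.trigger_camera(second)
        return second


# Demo with stub modules (values are made up for illustration).
captured = []
locator = SoundSourceLocator(
    locate=lambda frame: (1.0, 2.0, 3.0),
    convert=lambda p: (p[0] + 1.0, p[1], p[2]),
    correction=[[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 2.0]],
    trigger_camera=captured.append,
)
result = locator.handle([0.0, 0.0, 0.0, 0.0])
```

Injecting the modules as callables mirrors the note above that the functions may be distributed across different functional modules as needed.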
Fig. 7 is a block diagram of a sound source localization apparatus provided in an embodiment of the present application, which may be an apparatus including the audio acquisition component 110 in the sound source localization system shown in fig. 1, such as: a whistling monitoring device, a smartphone, a tablet computer, a laptop computer, a desktop computer, or a server. The sound source localization apparatus may also be referred to as a user equipment, a portable terminal, a laptop terminal, a desktop terminal, a control terminal, etc., which is not limited in this embodiment. The apparatus includes at least a processor 701 and a memory 702.
Processor 701 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 701 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor, where the main processor is a processor for processing data in an awake state, and is also called a Central Processing Unit (CPU); a coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content required to be displayed on the display screen. In some embodiments, the processor 701 may further include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
Memory 702 may include one or more computer-readable storage media, which may be non-transitory. Memory 702 may also include high-speed random access memory, as well as non-volatile memory, such as one or more magnetic disk storage devices, flash memory storage devices. In some embodiments, a non-transitory computer readable storage medium in memory 702 is used to store at least one instruction for execution by processor 701 to implement a sound source localization method as provided by method embodiments herein.
In some embodiments, the sound source positioning device may further include: a peripheral interface and at least one peripheral. The processor 701, memory 702, and peripheral interface may be connected by bus or signal lines. Each peripheral may be connected to the peripheral interface via a bus, signal line, or circuit board. Illustratively, peripheral devices include, but are not limited to: radio frequency circuit, touch display screen, audio circuit, power supply, etc.
Of course, the sound source positioning device may also include fewer or more components, and the embodiment is not limited thereto.
Optionally, the present application further provides a computer-readable storage medium, in which a program is stored, and the program is loaded and executed by a processor to implement the sound source localization method of the above-mentioned method embodiment.
Optionally, the present application further provides a computer product, which includes a computer-readable storage medium, in which a program is stored, and the program is loaded and executed by a processor to implement the sound source localization method of the above-mentioned method embodiment.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (8)

1. A sound source localization method, characterized in that the method comprises:
when a target sound signal emitted by the sound source is collected, determining a first coordinate value of the position of the sound source in a first coordinate system, wherein the first coordinate system is established based on the position of an audio collection assembly, and the audio collection assembly is used for collecting the target sound signal;
acquiring a coordinate conversion relation, wherein the coordinate conversion relation is used for converting the position of the sound source from the first coordinate system to a second coordinate system, the second coordinate system is established based on the position of the audio acquisition assembly and the position of an image acquisition assembly, and the image acquisition assembly is used for acquiring the image of the sound source;
acquiring a correction matrix, wherein the correction matrix is used for correcting errors in coordinate conversion of the coordinate conversion relation;
converting the first coordinate value to the second coordinate system by using the coordinate conversion relation and the correction matrix to obtain a second coordinate value so as to trigger an image acquisition assembly to acquire an image of a sound source corresponding to the second coordinate value;
the acquiring a correction matrix includes:
when n calibration sound signals emitted from each of m calibration points are acquired, determining a first calibration coordinate value of the calibration sound signal on each calibration point in the first coordinate system; both m and n are positive integers;
converting each first calibration coordinate value to the second coordinate system by using the coordinate conversion relation to obtain a second calibration coordinate value;
acquiring real coordinate values of the m calibration points in the second coordinate system;
determining the correction matrix based on the second calibration coordinate values and the real coordinate values;
the step of determining the correction matrix based on the second calibration coordinate values and the real coordinate values includes:
acquiring a matrix model of the correction matrix;
and multiplying the estimated position matrix formed by each second calibration coordinate value by the matrix model, taking the actual position matrix formed by each corresponding real coordinate value as a multiplication result, and determining the solution of the matrix model based on a least square method to obtain the correction matrix.
2. The method of claim 1, wherein determining a first calibration coordinate value of the calibration sound signal at each of the m calibration points in the first coordinate system when acquiring the n calibration sound signals emitted at each of the m calibration points comprises:
and for each calibration point, calculating the average value of the first calibration coordinate values of the calibration sound signals of n times sent out from the calibration point to obtain the first calibration coordinate values of the calibration sound signals on the calibration point in a first coordinate system, wherein n is an integer greater than 1.
3. The method of claim 1, wherein obtaining the correction matrix comprises:
reading the stored correction matrix; wherein the correction matrix is determined, after the audio acquisition assembly is installed and calibration sound signals are emitted at m calibration points, based on a difference between real coordinate values and second calibration coordinate values of the calibration sound signals in the second coordinate system.
4. The method according to any one of claims 1 to 3, wherein the obtaining the coordinate transformation relationship comprises:
determining a deflection angle of the first coordinate system relative to the second coordinate system;
acquiring the origin position of the coordinate origin of the first coordinate system in the second coordinate system;
and taking the origin position and the deflection angle as deflection parameters of a preset coordinate conversion formula to obtain the coordinate conversion relation.
5. The method of claim 4, wherein the coordinate conversion formula comprises:
Figure FDA0003659746400000021
Figure FDA0003659746400000022
Figure FDA0003659746400000023
Figure FDA0003659746400000024
Figure FDA0003659746400000025
wherein,
Figure FDA0003659746400000031
a matrix formed by the position of the origin of coordinates of the first coordinate system in the second coordinate system;
Figure FDA0003659746400000032
for representing the first coordinate value, wherein r is a length of a line connecting a position of the sound source and a coordinate origin of the first coordinate system,
Figure FDA0003659746400000033
is an included angle between the projection of the connecting line on the x'o'y' plane of the first coordinate system and the positive direction of the x' axis, and θ is an included angle between the connecting line and the z' axis of the first coordinate system; α, β, γ are the deflection angles.
6. A sound source localization apparatus, comprising:
the first coordinate determination module is used for determining a first coordinate value of the position of the sound source in a first coordinate system when a target sound signal emitted by the sound source is acquired, wherein the first coordinate system is established based on the position of an audio acquisition assembly, and the audio acquisition assembly is used for acquiring the target sound signal;
a conversion relation obtaining module, configured to obtain a coordinate conversion relation, where the coordinate conversion relation is used to convert the position of the sound source from the first coordinate system to a second coordinate system, the second coordinate system is established based on the position of the audio acquisition component and the position of an image acquisition component, and the image acquisition component is used to acquire an image of the sound source;
a correction matrix obtaining module, configured to obtain a correction matrix, where the correction matrix is used to correct an error in coordinate conversion performed by the coordinate conversion relationship; the acquiring a correction matrix includes: when n calibration sound signals emitted from each of m calibration points are acquired, determining a first calibration coordinate value of the calibration sound signal on each calibration point in the first coordinate system; both m and n are positive integers; converting each first calibration coordinate value to the second coordinate system by using the coordinate conversion relation to obtain a second calibration coordinate value; acquiring real coordinate values of the m calibration points in the second coordinate system; determining the correction matrix based on the second calibration coordinate values and the real coordinate values; the step of determining the correction matrix based on the second calibration coordinate values and the real coordinate values includes: acquiring a matrix model of the correction matrix; multiplying the estimated position matrix formed by each second calibration coordinate value by the matrix model, taking the actual position matrix formed by each corresponding real coordinate value as a multiplication result, and determining the solution of the matrix model based on a least square method to obtain the correction matrix;
and the second coordinate determination module is used for converting the first coordinate value to the second coordinate system by using the coordinate conversion relation and the correction matrix to obtain a second coordinate value so as to trigger an image acquisition assembly to acquire the image of the sound source corresponding to the second coordinate value.
7. A sound source localization arrangement, the arrangement comprising a processor and a memory; the memory stores therein a program that is loaded and executed by the processor to implement the sound source localization method according to any one of claims 1 to 5.
8. A computer-readable storage medium, characterized in that the storage medium has stored therein a program which, when being executed by a processor, is adapted to implement the sound source localization method according to any one of claims 1 to 5.
CN202010217565.0A 2020-03-25 2020-03-25 Sound source positioning method, device and storage medium Active CN111323751B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010217565.0A CN111323751B (en) 2020-03-25 2020-03-25 Sound source positioning method, device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010217565.0A CN111323751B (en) 2020-03-25 2020-03-25 Sound source positioning method, device and storage medium

Publications (2)

Publication Number Publication Date
CN111323751A CN111323751A (en) 2020-06-23
CN111323751B true CN111323751B (en) 2022-08-02

Family

ID=71169464

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010217565.0A Active CN111323751B (en) 2020-03-25 2020-03-25 Sound source positioning method, device and storage medium

Country Status (1)

Country Link
CN (1) CN111323751B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111998733B (en) * 2020-08-12 2023-03-31 军鹏特种装备股份公司 Automatic calibration method for shock wave target
CN113176538A (en) * 2021-04-16 2021-07-27 杭州爱华仪器有限公司 Sound source imaging method based on microphone array
CN113747349A (en) * 2021-08-12 2021-12-03 广东博智林机器人有限公司 Positioning method, positioning device, electronic equipment and storage medium
CN114510679B (en) * 2021-12-15 2024-04-12 成都飞机工业(集团)有限责任公司 Device position information obtaining method and device, terminal device and storage medium
CN114966547B (en) * 2022-05-18 2023-05-12 珠海视熙科技有限公司 Compensation method, system and device for improving sound source positioning accuracy

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106653041A (en) * 2017-01-17 2017-05-10 北京地平线信息技术有限公司 Audio signal processing equipment and method as well as electronic equipment
CN106875678A (en) * 2017-01-23 2017-06-20 上海良相智能化工程有限公司 A kind of vehicle whistle law enforcement evidence-obtaining system
WO2017211408A1 (en) * 2016-06-08 2017-12-14 Telefonaktiebolaget Lm Ericsson (Publ) Method for calibrating an antenna system, control device, computer program and computer program products
CN110146869A (en) * 2019-05-21 2019-08-20 北京百度网讯科技有限公司 Determine method, apparatus, electronic equipment and the storage medium of coordinate system conversion parameter
CN110632582A (en) * 2019-09-25 2019-12-31 苏州科达科技股份有限公司 Sound source positioning method, device and storage medium
CN110873863A (en) * 2018-08-29 2020-03-10 杭州海康威视数字技术股份有限公司 Target display method, radar system and electronic equipment

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2001343448A (en) * 2000-05-31 2001-12-14 Oki Electric Ind Co Ltd System for measuring location of generation of impulse sound
US9423250B1 (en) * 2009-12-17 2016-08-23 The Boeing Company Position measurement correction using loop-closure and movement data
CN106737683A (en) * 2017-01-11 2017-05-31 吉林省凯迪科技有限公司 The method of correction industrial robot off-line programing error in the field
CN107421476A (en) * 2017-05-11 2017-12-01 成都飞机工业(集团)有限责任公司 A kind of spatial hole position Measuring datum error compensation method
CN109916351B (en) * 2017-12-13 2020-09-08 北京柏惠维康科技有限公司 Method and device for acquiring TCP (Transmission control protocol) coordinates of robot
EP3557523B1 (en) * 2018-04-18 2021-07-28 B&R Industrial Automation GmbH Method for generating a correcting model of a camera for correcting an imaging error
CN109254266A (en) * 2018-11-07 2019-01-22 苏州科达科技股份有限公司 Sound localization method, device and storage medium based on microphone array

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"Registration method for uncalibrated cameras applicable to SFM point clouds"; Jing Yanzhe et al.; 《信号处理》 (Signal Processing); 20131231; pp. 274-278 *

Also Published As

Publication number Publication date
CN111323751A (en) 2020-06-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant