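WO2021168620A1 - Sound source tracking control method and control device, and sound source tracking system - Google Patents
Sound source tracking control method and control device, and sound source tracking system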
- Publication number
- WO2021168620A1 (application PCT/CN2020/076462)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio
- audio segment
- sound source
- segment
- collection circuit
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 48
- 239000000284 extract Substances 0.000 claims abstract description 16
- 238000005070 sampling Methods 0.000 claims description 30
- 238000006243 chemical reaction Methods 0.000 claims description 25
- 230000005236 sound signal Effects 0.000 claims description 15
- 230000004044 response Effects 0.000 claims description 6
- 238000000605 extraction Methods 0.000 claims description 5
- 238000010586 diagram Methods 0.000 description 12
- 238000004891 communication Methods 0.000 description 8
- 238000004364 calculation method Methods 0.000 description 7
- 238000005516 engineering process Methods 0.000 description 6
- 238000001914 filtration Methods 0.000 description 2
- 230000010365 information processing Effects 0.000 description 1
- 239000000463 material Substances 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 238000011160 research Methods 0.000 description 1
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S5/00—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations
- G01S5/18—Position-fixing by co-ordinating two or more direction or position line determinations; Position-fixing by co-ordinating two or more distance determinations using ultrasonic, sonic, or infrasonic waves
- G01S5/22—Position of source determined by co-ordinating a plurality of position lines defined by path-difference measurements
Definitions
- the present disclosure relates to the field of information processing, and in particular to a sound source tracking control method and control device, and a sound source tracking system.
- the first solution is to track the sound source at a fixed position. Personnel turn on the microphone when speaking, and turn off the microphone when not speaking. By monitoring the on-off state of the microphone and controlling the camera to aim at the speaker, the sound source tracking is achieved.
- the second solution combines voice recognition and face recognition. Audio features are identified by detecting the voice, the face image information of the speaker is queried from a database based on the audio features, the queried face image information is then used to identify the speaker in the current scene, and the camera is controlled to aim at the speaker, thereby achieving sound source tracking.
- a sound source tracking control method including: extracting a first audio segment from first audio information collected by a first audio collection circuit, and synchronously extracting a second audio segment from second audio information collected by a second audio collection circuit; determining a first time offset between the first audio segment and the second audio segment according to the deviation between preset peak values in the first audio segment and the second audio segment; determining, according to the first time offset, a first distance difference between a first distance from the sound source to the first audio collection circuit and a second distance from the sound source to the second audio collection circuit; determining a first offset angle of the sound source according to the first distance difference; and adjusting the video capture direction of a video capture circuit according to the first offset angle, so that the video capture circuit is aimed at the sound source.
- determining the first offset angle of the sound source according to the first distance difference includes: determining a first distance parameter using the first distance difference and the distance between the first audio collection circuit and the second audio collection circuit; and determining the first offset angle of the sound source according to the ratio of the first distance parameter to the first distance difference.
- determining the first time offset between the first audio segment and the second audio segment according to the deviation between the preset peak values in the first audio segment and the second audio segment includes: selecting corresponding effective positive peaks in the first audio segment and the second audio segment according to a first difference between the maximum positive peak sample number in the first audio segment and the maximum positive peak sample number in the second audio segment, wherein the first audio segment and the second audio segment each include multiple sample values; selecting corresponding effective negative peaks in the first audio segment and the second audio segment according to a second difference between the minimum negative peak sample number in the first audio segment and the minimum negative peak sample number in the second audio segment; determining a first sampling clock deviation between the first audio segment and the second audio segment according to the sample number deviations of the corresponding effective positive peaks and the sample number deviations of the corresponding effective negative peaks in the first audio segment and the second audio segment; and determining the first time offset according to the first sampling clock deviation.
- the difference between the first difference and the deviation between an effective positive peak sample number in the first audio segment and the corresponding effective positive peak sample number in the second audio segment is within a first preset range; the difference between the second difference and the deviation between an effective negative peak sample number in the first audio segment and the corresponding effective negative peak sample number in the second audio segment is within a second preset range.
- the above method further includes: determining whether the first sum of the effective positive peak value and the effective negative peak value in the first audio segment or the second audio segment is less than a first preset threshold; If the first sum value is less than the first preset threshold, the video acquisition circuit is controlled to perform panoramic shooting.
- the above method further includes: if the first sum is not less than the first preset threshold, determining whether the number of effective positive peaks in the first audio segment or the second audio segment is the same as the number of effective negative peaks; if the number of effective positive peaks is the same as the number of effective negative peaks, further calculating a second sum of the total number of positive peaks and the total number of negative peaks in the first audio segment or the second audio segment; and in response to the ratio of the first sum to the second sum being greater than a second preset threshold, controlling the video capture circuit to perform panoramic shooting.
- the above method further includes: calculating a third difference between the maximum positive peak sample number and the minimum negative peak sample number in the first audio segment; calculating a fourth difference between the maximum positive peak sample number and the minimum negative peak sample number in the second audio segment; and in response to the third difference and the fourth difference having the same sign and the difference between the third difference and the fourth difference being within a third preset range, selecting corresponding effective positive peaks in the first audio segment and the second audio segment.
- the above method further includes: calculating a fifth difference between the total number of positive peaks in the first audio segment and the total number of positive peaks in the second audio segment, and a third sum of the total number of positive peaks in the first audio segment and the total number of positive peaks in the second audio segment; calculating a sixth difference between the total number of negative peaks in the first audio segment and the total number of negative peaks in the second audio segment, and a fourth sum of the total number of negative peaks in the first audio segment and the total number of negative peaks in the second audio segment; and in response to the ratio of the fifth difference to the third sum being within a fourth preset range and the ratio of the sixth difference to the fourth sum being within a fifth preset range, selecting corresponding effective positive peaks in the first audio segment and the second audio segment.
- the above method further includes: synchronously extracting a third audio segment from third audio information collected by a third audio collection circuit and a fourth audio segment from fourth audio information collected by a fourth audio collection circuit; determining a second time offset between the third audio segment and the fourth audio segment; determining, according to the second time offset, a second distance difference between a third distance from the sound source to the third audio collection circuit and a fourth distance from the sound source to the fourth audio collection circuit; determining a second offset angle of the sound source according to the second distance difference; and adjusting the video capture direction of the video capture circuit according to the first offset angle and the second offset angle, so that the video capture circuit is aimed at the sound source.
- a sound source tracking control device including: an extraction module configured to extract a first audio segment from first audio information collected by a first audio collection circuit, and to synchronously extract a second audio segment from second audio information collected by a second audio collection circuit; a time offset determination module configured to determine a first time offset between the first audio segment and the second audio segment according to the deviation between preset peak values in the first audio segment and the second audio segment; a distance difference determination module configured to determine, according to the first time offset, a first distance difference between a first distance from the sound source to the first audio collection circuit and a second distance from the sound source to the second audio collection circuit; an offset angle determination module configured to determine a first offset angle of the sound source according to the first distance difference; and a direction adjustment module configured to adjust the video capture direction of the video capture circuit according to the first offset angle, so that the video capture circuit is aimed at the sound source.
- a sound source tracking control device including: a memory configured to store instructions; and a processor coupled to the memory, the processor being configured to execute, based on the instructions stored in the memory, the method described in any of the above embodiments.
- a sound source tracking system including the sound source tracking control device described in any of the above embodiments, and: a video capture circuit configured to adjust the video capture direction under the control of the sound source tracking control device; and a first audio collection circuit and a second audio collection circuit, wherein the first audio collection circuit and the second audio collection circuit are symmetrically arranged on both sides of the video capture circuit.
- the ratio of the distance from the sound source to the video collection circuit to the distance from the first audio collection circuit to the second audio collection circuit is greater than a preset distance threshold.
- the tracking system further includes: an analog-to-digital converter configured to perform analog-to-digital conversion on the audio signal collected by the first audio collection circuit to generate the first audio information, and to perform analog-to-digital conversion on the audio signal collected by the second audio collection circuit to generate the second audio information.
- the video capture circuit includes: a direction control platform and a camera arranged on the direction control platform, the direction control platform being configured to adjust its direction under the control of the sound source tracking control device.
- a computer-readable storage medium wherein the computer-readable storage medium stores computer instructions, and when the instructions are executed by a processor, a method related to any of the above-mentioned embodiments is implemented.
- Fig. 1 is a schematic flowchart of a sound source tracking control method according to an embodiment of the present disclosure.
- Fig. 2 is a schematic flowchart of a method for calculating a time offset according to an embodiment of the present disclosure.
- Fig. 3 is a schematic diagram of a hyperbolic model according to an embodiment of the present disclosure.
- Fig. 4 is a schematic flowchart of a sound source tracking control method according to another embodiment of the present disclosure.
- Fig. 5 is a schematic flowchart of a method for calculating a time offset according to another embodiment of the present disclosure.
- Fig. 6 is a schematic structural diagram of a sound source tracking control device according to an embodiment of the present disclosure.
- Fig. 7 is a schematic structural diagram of a sound source tracking control device according to an embodiment of the present disclosure.
- Fig. 8 is a schematic structural diagram of a sound source tracking system according to an embodiment of the present disclosure.
- Fig. 9 is a schematic structural diagram of a sound source tracking system according to another embodiment of the present disclosure.
- Fig. 10 is a schematic structural diagram of a sound source tracking system according to another embodiment of the present disclosure.
- in the second related technology, the need for both voice recognition and face recognition makes the calculation cost high.
- the recognition rate of voice recognition and face recognition also affects the accuracy of sound source tracking.
- the present disclosure proposes a solution that can easily and quickly implement sound source tracking.
- Fig. 1 is a schematic flowchart of a sound source tracking control method according to an embodiment of the present disclosure. In some embodiments, the following steps of the sound source tracking control method are executed by the sound source tracking control device.
- in step 101, the first audio segment is extracted from the first audio information collected by the first audio collection circuit, and the second audio segment is synchronously extracted from the second audio information collected by the second audio collection circuit.
- the first audio collection circuit and the second audio collection circuit are pickups.
- the duration of the first audio segment and the second audio segment is 50-100ms.
- the first audio collection circuit and the second audio collection circuit are symmetrically arranged on both sides of the video collection circuit.
- the distance from the video acquisition circuit to the first audio acquisition circuit is the same as the distance from the video acquisition circuit to the second audio acquisition circuit.
- the first audio collection circuit, the second audio collection circuit, and the video collection circuit are located on the first straight line.
- the video capture circuit includes a direction control platform and a camera arranged on the direction control platform.
- the direction control platform is PTZ.
- the control parameters are sent to the direction control platform to adjust the direction of the direction control platform, thereby adjusting the video capture direction of the camera.
- the communication protocol used is the UART (Universal Asynchronous Receiver/Transmitter) protocol.
- the first straight line is a horizontal direction.
- the first audio collecting circuit and the second audio collecting circuit are respectively arranged on the left and right sides of the video collecting circuit. Analog-to-digital conversion is performed on the audio signal collected by the first audio collection circuit to generate first audio information, and the analog-to-digital conversion is performed on the audio signal collected by the second audio collection circuit to generate second audio information.
- in step 102, the first time offset between the first audio segment and the second audio segment is determined according to the deviation between the preset peak values in the first audio segment and the second audio segment.
- Fig. 2 is a schematic flowchart of a method for calculating a time offset according to an embodiment of the present disclosure. In some embodiments, the following steps of the time offset calculation method are executed by the sound source tracking control device.
- in step 201, the largest positive peak sample number and the smallest negative peak sample number in the first audio segment, and the largest positive peak sample number and the smallest negative peak sample number in the second audio segment, are identified.
- the first audio segment and the second audio segment each include multiple sample values.
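Step 201 can be illustrated with a minimal sketch. The text does not say how peaks are detected, so the local-extremum rule below is an assumption, and the function names `find_peaks` and `extreme_peak_indices` are hypothetical.

```python
from typing import List, Optional, Tuple


def find_peaks(samples: List[float]) -> Tuple[List[int], List[int]]:
    """Return the sample numbers of local positive peaks and local negative peaks.

    Assumption: a positive peak is a sample above zero that is higher than its
    left neighbour and not lower than its right neighbour; negative peaks mirror this.
    """
    positive, negative = [], []
    for i in range(1, len(samples) - 1):
        prev, cur, nxt = samples[i - 1], samples[i], samples[i + 1]
        if cur > 0 and cur > prev and cur >= nxt:
            positive.append(i)
        elif cur < 0 and cur < prev and cur <= nxt:
            negative.append(i)
    return positive, negative


def extreme_peak_indices(samples: List[float]) -> Optional[Tuple[int, int]]:
    """Return (largest positive peak sample number, smallest negative peak sample number)."""
    positive, negative = find_peaks(samples)
    if not positive or not negative:
        return None  # no usable peaks, for example a silent segment
    largest_pos = max(positive, key=lambda i: samples[i])
    smallest_neg = min(negative, key=lambda i: samples[i])
    return largest_pos, smallest_neg
```

Running `extreme_peak_indices` on each of the two segments would give the four sample numbers that the later steps compare.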
- after the first audio segment and the second audio segment are identified, it can also be detected whether the first audio segment and the second audio segment correspond to each other.
- in the first audio segment, the largest positive peak sample number is $L_{max}$
- and the smallest negative peak sample number is $L_{min}$
- in the second audio segment, the largest positive peak sample number is $R_{max}$
- and the smallest negative peak sample number is $R_{min}$.
- ⁇ 1 is the preset threshold.
- the total number of positive peaks in the first audio segment is $L_{Ptotal}$
- the total number of negative peaks in the first audio segment is $L_{ntotal}$
- the total number of positive peaks in the second audio segment is $R_{Ptotal}$
- the total number of negative peaks in the second audio segment is $R_{ntotal}$.
- ⁇ 1 and ⁇ 2 are preset thresholds. ⁇ 1 and ⁇ 2 may be the same or different.
- if the positions of the largest positive peak and the smallest negative peak in the first audio segment and the second audio segment correspond, and the total number of positive peaks and the total number of negative peaks in the first audio segment and the second audio segment are within a reasonable range, the calculation accuracy of the time offset can be ensured. If the positions of the largest positive peak and the smallest negative peak in the first audio segment and the second audio segment do not correspond, or the total number of positive peaks and the total number of negative peaks in the first audio segment and the second audio segment are not within a reasonable range, the first audio segment and the second audio segment have been disturbed by external interference. In this case, the first audio segment must be re-extracted from the first audio information collected by the first audio collection circuit, and the second audio segment must be synchronously re-extracted from the second audio information collected by the second audio collection circuit.
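The correspondence check can be sketched as follows, based on the conditions stated in the definitions above: the two peak gaps have the same sign and are close to each other, and the peak counts of the two segments are similar. The exact formulas and thresholds are not legible in this text, so the inequality forms, the function name, and the threshold parameters are assumptions.

```python
def segments_correspond(
    l_max: int, l_min: int, r_max: int, r_min: int,
    l_ptotal: int, l_ntotal: int, r_ptotal: int, r_ntotal: int,
    max_gap_diff: int, max_pos_ratio: float, max_neg_ratio: float,
) -> bool:
    """Return True if the two audio segments plausibly capture the same sound.

    max_gap_diff, max_pos_ratio and max_neg_ratio stand in for the preset
    thresholds, whose symbols and values are not given here (hypothetical).
    """
    gap_left = l_max - l_min    # largest positive peak vs. smallest negative peak
    gap_right = r_max - r_min
    if gap_left * gap_right <= 0:            # the gaps must have the same sign
        return False
    if abs(gap_left - gap_right) > max_gap_diff:
        return False

    pos_sum = l_ptotal + r_ptotal
    neg_sum = l_ntotal + r_ntotal
    if pos_sum == 0 or neg_sum == 0:         # no peaks at all: nothing to compare
        return False
    pos_ratio = abs(l_ptotal - r_ptotal) / pos_sum
    neg_ratio = abs(l_ntotal - r_ntotal) / neg_sum
    return pos_ratio <= max_pos_ratio and neg_ratio <= max_neg_ratio
```

When the function returns False, the segments would be discarded and re-extracted, as described above.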
- in step 202, the effective positive peaks and the effective negative peaks in the first audio segment and the second audio segment are obtained.
- according to the difference between the largest positive peak sample number in the first audio segment and the largest positive peak sample number in the second audio segment, the corresponding effective positive peaks are selected in the first audio segment and the second audio segment.
- according to the difference between the smallest negative peak sample number in the first audio segment and the smallest negative peak sample number in the second audio segment, the corresponding effective negative peaks are selected in the first audio segment and the second audio segment.
- $L_i$ is the sample number of the i-th effective positive peak $DL_i$ in the first audio segment
- $R_j$ is the sample number of the j-th effective positive peak $DR_j$ in the second audio segment.
- the maximum positive peak sample number in the first audio segment is $L_{max}$
- the maximum positive peak sample number in the second audio segment is $R_{max}$.
- $L_i$ is the sample number of the i-th effective negative peak $DL_i$ in the first audio segment
- $R_j$ is the sample number of the j-th effective negative peak $DR_j$ in the second audio segment. If the minimum negative peak sample number in the first audio segment is $L_{min}$, and the minimum negative peak sample number in the second audio segment is $R_{min}$, the following formula (6) holds when $DL_i$ corresponds to $DR_j$.
- ⁇ 1 and ⁇ 2 are preset thresholds. ⁇ 1 and ⁇ 2 may be the same or different.
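The selection of corresponding effective peaks can be sketched as below. The sketch assumes, following the definitions above, that a peak in the first segment corresponds to a peak in the second segment when their sample number deviation stays within a preset range of the reference difference (for positive peaks, the largest-peak difference; for negative peaks, the smallest-peak difference). The greedy one-to-one pairing, the function name, and the `tolerance` parameter are illustrative assumptions.

```python
from typing import List, Tuple


def match_effective_peaks(
    left_peaks: List[int],
    right_peaks: List[int],
    ref_diff: int,
    tolerance: int,
) -> List[Tuple[int, int]]:
    """Pair peak sample numbers (L_i, R_j) whose deviation is close to ref_diff.

    ref_diff is L_max - R_max for positive peaks or L_min - R_min for negative
    peaks; tolerance stands in for the preset threshold (hypothetical value).
    """
    pairs = []
    used = set()
    for li in left_peaks:
        best = None
        for rj in right_peaks:
            if rj in used:
                continue
            error = abs((li - rj) - ref_diff)
            if error <= tolerance and (best is None or error < best[0]):
                best = (error, rj)
        if best is not None:
            used.add(best[1])
            pairs.append((li, best[1]))
    return pairs
```

Peaks that find no partner within the tolerance are simply dropped, so only matched pairs count as effective peaks in this sketch.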
- the first sampling clock deviation between the first audio segment and the second audio segment is determined according to the sample number deviations of the corresponding effective positive peaks in the first audio segment and the second audio segment, and the sample number deviations of the corresponding effective negative peaks in the first audio segment and the second audio segment.
- the sample number deviation represents the number of sampling clocks between corresponding positive peaks or corresponding negative peaks. Therefore, the first sampling clock deviation between the first audio segment and the second audio segment can be determined from the sample number deviations.
- from the sample number deviations of the corresponding effective positive peaks and of the corresponding effective negative peaks in the first audio segment and the second audio segment, the first sampling clock deviation between the first audio segment and the second audio segment can be determined by calculating the arithmetic mean, geometric mean, or standard deviation value.
- the sampling sequence number deviation of the i-th effective peak value in the first audio segment and the corresponding j-th effective peak value in the second audio segment is ⁇ i.
- the standard deviation M1 of the deviation of the sampling sequence number is calculated by using the following formula (7) as the deviation of the first sampling clock between the first audio segment and the second audio segment.
- the first time offset is determined according to the first sampling clock deviation and the sampling conversion frequency.
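A minimal sketch of the last two steps follows. It uses the arithmetic mean of the sample number deviations (one of the options mentioned above, not necessarily the one in formula (7)) and assumes the time offset equals the sampling clock deviation divided by the sampling frequency, since each sampling clock lasts one sampling period. The function names and the 48 kHz rate in the usage note are assumptions.

```python
from statistics import mean
from typing import List, Tuple


def sampling_clock_deviation(
    positive_pairs: List[Tuple[int, int]],
    negative_pairs: List[Tuple[int, int]],
) -> float:
    """Average sample number deviation L_i - R_j over all matched peak pairs."""
    deviations = [li - rj for li, rj in positive_pairs + negative_pairs]
    if not deviations:
        raise ValueError("no matched peaks to estimate the deviation from")
    return mean(deviations)


def time_offset_seconds(clock_deviation: float, sampling_rate_hz: float) -> float:
    """Convert a deviation in sampling clocks to seconds (assumed relation)."""
    return clock_deviation / sampling_rate_hz
```

For example, a deviation of -3 samples at an assumed sampling rate of 48 kHz gives `time_offset_seconds(-3, 48000)`, about -62.5 microseconds.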
- it can be further determined whether the effective positive peak value $M_{Valid}$ and the effective negative peak value $N_{Valid}$ in the first audio segment or the second audio segment satisfy the following formula (9).
- D1 is the preset threshold. If the above formula (9) is established, it indicates that there are too few effective peaks in the first audio segment and the second audio segment. This is usually caused by the silence of the current scene. In this case, control the video capture circuit to perform panoramic shooting.
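The silence check can be sketched as below. It assumes the condition is that the number of effective positive peaks plus the number of effective negative peaks is less than the preset threshold D1, which is how the definitions above describe the first sum; the function name and the example threshold are hypothetical.

```python
def should_shoot_panorama(
    valid_positive_peaks: int,
    valid_negative_peaks: int,
    min_valid_peaks: int,
) -> bool:
    """Return True when too few effective peaks were found (scene assumed silent).

    min_valid_peaks plays the role of the preset threshold D1 (value is an assumption).
    """
    return (valid_positive_peaks + valid_negative_peaks) < min_valid_peaks
```

With an assumed threshold of 4, a segment with one effective positive and one effective negative peak would trigger panoramic shooting, while a segment with five of each would not.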
- the video capture direction of the video capture circuit is perpendicular to the plane where the first audio capture circuit and the second audio capture circuit are located. Therefore, the video capture circuit can fully cover the current scene.
- $L_{Ptotal}$ is the total number of positive peaks in the first audio segment
- $L_{ntotal}$ is the total number of negative peaks in the first audio segment
- $R_{Ptotal}$ is the total number of positive peaks in the second audio segment
- $R_{ntotal}$ is the total number of negative peaks in the second audio segment.
- in step 103, the first distance difference between the first distance from the sound source to the first audio collection circuit and the second distance from the sound source to the second audio collection circuit is determined according to the first time offset.
- the first distance difference a1 is calculated using formula (11).
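Formula (11) is not legible in this text. A minimal sketch, assuming the distance difference is the time offset multiplied by the speed of sound, is shown below; the value 343 m/s (air at roughly 20 °C) and the function name are assumptions.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at about 20 degrees Celsius


def distance_difference_m(
    time_offset_s: float,
    speed_of_sound_m_s: float = SPEED_OF_SOUND_M_S,
) -> float:
    """Path-length difference between the sound source and the two pickups."""
    return speed_of_sound_m_s * time_offset_s
```

A time offset of -62.5 microseconds corresponds to a path difference of about -2.1 cm under this assumption; the sign only indicates which pickup the sound reached first.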
- in step 104, the first offset angle of the sound source is determined according to the first distance difference.
- Fig. 3 is a schematic diagram of a hyperbolic model according to an embodiment of the present disclosure.
- F1 is the first audio collection circuit
- F2 is the second audio collection circuit
- P is the speaker
- O is the video collection circuit.
- the distance between F1 and F2 (for example, 10-30 cm) is much less than the distance between the video capture circuit and the speaker (for example, 2-5 meters), so the asymptote equation of the hyperbola can be used to solve the problem.
- the ratio of the distance D from the sound source to the video capture circuit to the distance d from the first audio collection circuit to the second audio collection circuit is greater than a preset distance threshold. If the value of D/d is greater than the preset distance threshold, the distance between the video capture circuit and the speaker is sufficiently large relative to the distance between F1 and F2. In this case, the hyperbolic model is applicable.
- the preset distance threshold is 5.
- for the first distance difference $a_1$, the corresponding hyperbolic equation is shown in the following formula (12).
- c is the distance between F1 and F2, and the distance parameter b satisfies the following formula (13).
- the first deviation angle of the sound source is obtained according to the slope of the asymptote.
- the first offset angle ⁇ 1 is calculated using the following formula (15).
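Formulas (12) to (15) are not legible in this text, so the sketch below uses the standard far-field result rather than the patent's exact expressions. For a hyperbola whose foci are the two pickups (spacing c) and whose path difference is a1, the semi-axes are a1/2 and sqrt(c² − a1²)/2, and the asymptote makes an angle with the pickup axis whose cosine is a1/c. The offset from the camera's straight-ahead direction is then arcsin(a1/c). The function names and the sign convention are assumptions.

```python
import math


def asymptote_slope(distance_diff_m: float, mic_spacing_m: float) -> float:
    """Slope b / a of the hyperbola asymptote, with the pickups on the x-axis."""
    if abs(distance_diff_m) > mic_spacing_m:
        raise ValueError("path difference cannot exceed the pickup spacing")
    a = abs(distance_diff_m) / 2.0
    if a == 0.0:
        return math.inf  # source straight ahead: the asymptote is vertical
    b = math.sqrt(mic_spacing_m ** 2 - distance_diff_m ** 2) / 2.0
    return b / a


def offset_angle_deg(distance_diff_m: float, mic_spacing_m: float) -> float:
    """Offset angle from the straight-ahead direction, in degrees (far-field)."""
    if mic_spacing_m <= 0:
        raise ValueError("pickup spacing must be positive")
    ratio = max(-1.0, min(1.0, distance_diff_m / mic_spacing_m))
    return math.degrees(math.asin(ratio))
```

For a path difference of 10 cm and a pickup spacing of 20 cm, both routes agree: the asymptote slope is about 1.732 (60° from the pickup axis), so the source sits 30° off the straight-ahead direction.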
- in step 105, the video capture direction of the video capture circuit is adjusted according to the first offset angle, so that the video capture circuit is aimed at the sound source.
- the first audio collection circuit, the second audio collection circuit, and the video collection circuit are located on a first straight line, and the first straight line is a horizontal direction.
- the sound source tracking control device uses the first offset angle to control the deflection angle of the video capture circuit in the left and right directions. Therefore, the sound source tracking can be realized on the horizontal plane.
- the offset angle of the sound source is determined from the difference between the distances from the sound source to the first audio collection circuit and to the second audio collection circuit.
- the direction of the video capture circuit is adjusted according to the determined offset angle so that it aims at the sound source for shooting, which realizes sound source tracking easily and quickly.
- Fig. 4 is a schematic flowchart of a sound source tracking control method according to another embodiment of the present disclosure. In some embodiments, the following steps of the sound source tracking control method are executed by the sound source tracking control device.
- in step 401, the first audio segment is extracted from the first audio information collected by the first audio collection circuit, and, synchronously, the second audio segment is extracted from the second audio information collected by the second audio collection circuit, the third audio segment is extracted from the third audio information collected by the third audio collection circuit, and the fourth audio segment is extracted from the fourth audio information collected by the fourth audio collection circuit.
- the first through fourth audio collection circuits are pickups.
- each of the first through fourth audio segments has a duration of 50-100 ms.
- the first audio collection circuit and the second audio collection circuit are symmetrically arranged on both sides of the video collection circuit.
- the distance from the video acquisition circuit to the first audio acquisition circuit is the same as the distance from the video acquisition circuit to the second audio acquisition circuit.
- the third audio collecting circuit and the fourth audio collecting circuit are symmetrically arranged on the other two sides of the video collecting circuit.
- the distance from the video acquisition circuit to the third audio acquisition circuit is the same as the distance from the video acquisition circuit to the fourth audio acquisition circuit.
- the first audio collection circuit, the second audio collection circuit, and the video collection circuit are located on the first straight line.
- the third audio collection circuit, the fourth audio collection circuit and the video collection circuit are located on the second straight line.
- the first straight line is perpendicular to the second straight line.
- the video capture circuit includes a direction control platform and a camera arranged on the direction control platform.
- the direction control platform is PTZ.
- the control parameters are sent to the direction control platform to adjust the direction of the direction control platform, thereby adjusting the video capture direction of the camera.
- the communication protocol used is the UART protocol.
- the first straight line is a horizontal direction.
- the first audio collecting circuit and the second audio collecting circuit are respectively arranged on the left and right sides of the video collecting circuit.
- the second straight line is the vertical direction.
- the third audio collection circuit and the fourth audio collection circuit are respectively arranged on the upper and lower sides of the video capture circuit. Analog-to-digital conversion is performed on the audio signal collected by the first audio collection circuit to generate the first audio information, on the audio signal collected by the second audio collection circuit to generate the second audio information, on the audio signal collected by the third audio collection circuit to generate the third audio information, and on the audio signal collected by the fourth audio collection circuit to generate the fourth audio information.
- in step 402, the first time offset between the first audio segment and the second audio segment is determined according to the deviation between the preset peak values in the first audio segment and the second audio segment, and the second time offset between the third audio segment and the fourth audio segment is determined according to the deviation between the preset peak values in the third audio segment and the fourth audio segment.
- the time offset calculation method described in any of the embodiments in Fig. 2 is used to calculate the first time offset, and the time offset calculation method described in any of the embodiments in Fig. 5 below is used to calculate the second time offset.
- Fig. 5 is a schematic flowchart of a method for calculating a time offset according to another embodiment of the present disclosure. In some embodiments, the following steps of the time offset calculation method are executed by the sound source tracking control device.
- in step 501, the largest positive peak sample number and the smallest negative peak sample number in the third audio segment, and the largest positive peak sample number and the smallest negative peak sample number in the fourth audio segment, are identified.
- the third audio segment and the fourth audio segment respectively include multiple sample values.
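The peak identification in step 501 can be illustrated with a minimal Python sketch. The text does not define how a peak is detected, so the rule used here (a positive peak is a strict local maximum above zero, a negative peak is a strict local minimum below zero) and the function names `find_peaks` and `extreme_peak_numbers` are assumptions made for illustration only.

```python
from typing import List, Optional, Tuple


def find_peaks(samples: List[float]) -> Tuple[List[int], List[int]]:
    """Return (positive peak sample numbers, negative peak sample numbers).

    Assumed rule: a positive peak is a strict local maximum above zero,
    a negative peak is a strict local minimum below zero.
    """
    pos, neg = [], []
    for k in range(1, len(samples) - 1):
        prev, cur, nxt = samples[k - 1], samples[k], samples[k + 1]
        if cur > 0 and cur > prev and cur > nxt:
            pos.append(k)
        elif cur < 0 and cur < prev and cur < nxt:
            neg.append(k)
    return pos, neg


def extreme_peak_numbers(samples: List[float]) -> Optional[Tuple[int, int]]:
    """Sample numbers of the largest positive and smallest negative peak."""
    pos, neg = find_peaks(samples)
    if not pos or not neg:
        return None  # no usable peaks in this segment
    n_max = max(pos, key=lambda k: samples[k])
    n_min = min(neg, key=lambda k: samples[k])
    return n_max, n_min


# Illustrative segment (made-up sample values).
segment = [0.0, 0.4, 0.1, -0.3, -0.9, -0.2, 0.8, 0.3, -0.1, 0.2, 0.0]
print(find_peaks(segment))
print(extreme_peak_numbers(segment))
```

Applied to the third and fourth audio segments, the two returned sample numbers would play the roles of U max / U min and D max / D min respectively.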
- after the third audio segment and the fourth audio segment are identified, it can also be detected whether the third audio segment and the fourth audio segment correspond.
- in the third audio segment, the largest positive peak sample number is U max
- and the smallest negative peak sample number is U min
- in the fourth audio segment, the largest positive peak sample number is D max
- and the smallest negative peak sample number is D min .
- ⁇ 2 is the preset threshold.
- the total number of positive peaks in the third audio segment is U Ptotal
- the total number of negative peaks in the third audio segment is U ntotal
- the total number of positive peaks in the fourth audio segment is D Ptotal
- the total number of negative peaks in the fourth audio segment is D ntotal .
- ⁇ 3 and ⁇ 4 are preset thresholds. ⁇ 3 and ⁇ 4 may be the same or different.
- when the positions of the largest positive peak and the smallest negative peak in the third audio segment and the fourth audio segment correspond, and the total number of positive peaks and the total number of negative peaks in the third audio segment and the fourth audio segment are within a reasonable range, the calculation accuracy of the time offset can be ensured. If the positions of the largest positive peaks and the smallest negative peaks in the third audio segment and the fourth audio segment do not correspond, or the total number of positive peaks and the total number of negative peaks in the third audio segment and the fourth audio segment are not within a reasonable range, the third audio segment and the fourth audio segment have been disturbed by external interference.
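The correspondence check can be sketched in Python as follows. The formulas themselves are not reproduced in this text, so the sketch follows the wording of claims 7 and 8 (sign agreement of the gap between the extreme peaks, and normalized differences of the peak totals). Treating "within a preset range" as "absolute value not above a threshold", and the parameter names `thr_gap`, `thr_pos`, and `thr_neg`, are assumptions.

```python
def segments_correspond(u_max: int, u_min: int, d_max: int, d_min: int,
                        u_ptotal: int, u_ntotal: int,
                        d_ptotal: int, d_ntotal: int,
                        thr_gap: float, thr_pos: float, thr_neg: float) -> bool:
    """Check whether two audio segments plausibly contain the same waveform."""
    # Gap between the largest positive and smallest negative peak, per segment.
    gap_u = u_max - u_min
    gap_d = d_max - d_min
    same_sign = (gap_u > 0) == (gap_d > 0)
    gaps_close = abs(gap_u - gap_d) <= thr_gap

    # Normalized difference of the peak totals (assumed form of the check).
    pos_sum = u_ptotal + d_ptotal
    neg_sum = u_ntotal + d_ntotal
    if pos_sum == 0 or neg_sum == 0:
        return False  # no peaks at all: nothing to compare
    pos_ok = abs(u_ptotal - d_ptotal) / pos_sum <= thr_pos
    neg_ok = abs(u_ntotal - d_ntotal) / neg_sum <= thr_neg

    return same_sign and gaps_close and pos_ok and neg_ok


# Illustrative values only.
print(segments_correspond(120, 95, 126, 101, 14, 13, 15, 13, 4, 0.2, 0.2))
print(segments_correspond(120, 95, 90, 101, 14, 13, 15, 13, 4, 0.2, 0.2))
```

In the second call the largest positive peak precedes the smallest negative peak in one segment but follows it in the other, so the segments are treated as disturbed.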
- the third audio segment is synchronously extracted from the third audio information collected by the third audio collection circuit
- the fourth audio segment is synchronously extracted from the fourth audio information collected by the fourth audio collection circuit.
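A minimal sketch of synchronous extraction, assuming both audio information streams are sampled on a common clock and stored as equally indexed sample lists. The function name `extract_synchronized` and the window parameters are hypothetical; the text does not specify how the window is chosen.

```python
from typing import List, Sequence, Tuple


def extract_synchronized(info_a: Sequence[float], info_b: Sequence[float],
                         start: int, length: int) -> Tuple[List[float], List[float]]:
    """Cut the same sample window out of two audio information streams."""
    end = start + length
    if start < 0 or length <= 0 or end > min(len(info_a), len(info_b)):
        raise ValueError("window does not fit inside both streams")
    return list(info_a[start:end]), list(info_b[start:end])


# Illustrative streams (made-up values).
third_info = [0.0, 0.1, 0.5, 0.2, -0.4, -0.1, 0.3]
fourth_info = [0.0, 0.0, 0.1, 0.5, 0.2, -0.4, -0.1]
third_segment, fourth_segment = extract_synchronized(third_info, fourth_info, 1, 4)
print(third_segment, fourth_segment)
```

Because both segments cover the same sample numbers, any shift between matching peaks reflects a difference in arrival time rather than a difference in where the windows were cut.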
- in step 502, the effective positive peaks and the effective negative peaks in the third audio segment and the fourth audio segment are obtained.
- according to the difference between the largest positive peak sample number in the third audio segment and the largest positive peak sample number in the fourth audio segment, the corresponding effective positive peaks are selected in the third audio segment and the fourth audio segment.
- according to the difference between the smallest negative peak sample number in the third audio segment and the smallest negative peak sample number in the fourth audio segment, the corresponding effective negative peaks are selected in the third audio segment and the fourth audio segment.
- U i is the sample number of the i-th effective positive peak DU i in the third audio segment
- D j is the sample number of the j-th effective positive peak DD j in the fourth audio segment.
- the maximum positive peak sample number in the third audio segment is U max
- the maximum positive peak sample number in the fourth audio segment is D max .
- U i is the sample number of the i-th effective negative peak DU i in the third audio segment
- D j is the sample number of the j-th effective negative peak DD j in the fourth audio segment. If the smallest negative peak sample number in the third audio segment is U min and the smallest negative peak sample number in the fourth audio segment is D min , the following formula (21) holds when DU i corresponds to DD j.
- ⁇ 3 and ⁇ 4 are preset thresholds. ⁇ 3 and ⁇ 4 may be the same or different.
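The selection of corresponding effective peaks can be sketched as below. The formulas are not reproduced in this text, so the sketch follows claim 4: a pair of peaks corresponds when the difference of their sample numbers is close to the difference between the reference peaks (U max and D max for positive peaks, U min and D min for negative peaks). The greedy one-to-one matching and the name `match_effective_peaks` are assumptions.

```python
from typing import List, Tuple


def match_effective_peaks(u_peaks: List[int], d_peaks: List[int],
                          u_ref: int, d_ref: int,
                          tolerance: int) -> List[Tuple[int, int]]:
    """Pair peak sample numbers whose offset is close to the reference offset."""
    ref_offset = u_ref - d_ref
    used = set()
    pairs = []
    for u in u_peaks:
        best = None  # (index into d_peaks, error)
        for j, d in enumerate(d_peaks):
            if j in used:
                continue
            err = abs((u - d) - ref_offset)
            if err <= tolerance and (best is None or err < best[1]):
                best = (j, err)
        if best is not None:
            used.add(best[0])
            pairs.append((u, d_peaks[best[0]]))
    return pairs


# Illustrative positive peak sample numbers; reference peaks are 40 and 44.
pairs = match_effective_peaks([10, 40, 70, 100], [13, 44, 72, 130], 40, 44, 2)
print(pairs)
```

Peaks that find no partner within the tolerance, such as sample 100 in this example, are simply not counted as effective.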
- the second sampling clock deviation of the third audio segment and the fourth audio segment is determined according to the sample number deviations of the corresponding effective positive peaks in the third audio segment and the fourth audio segment, and the sample number deviations of the corresponding effective negative peaks in the third audio segment and the fourth audio segment.
- for the sample number deviations of the corresponding effective positive peaks and of the corresponding effective negative peaks in the third audio segment and the fourth audio segment, the arithmetic mean, geometric mean, or standard deviation can be calculated to determine the second sampling clock deviation between the third audio segment and the fourth audio segment.
- the sampling sequence number deviation between the i-th effective peak in the third audio segment and the corresponding j-th effective peak in the fourth audio segment is ⁇ i.
- the standard deviation M2 of the deviation of the sampling sequence number is calculated by using the following formula (22) as the second sampling clock deviation of the third audio segment and the fourth audio segment.
- in step 504, the second time offset is determined according to the second sampling clock deviation and the sampling conversion frequency.
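A minimal sketch of the last two steps, under stated assumptions. The text allows the arithmetic mean, geometric mean, or standard deviation of the sample number deviations; this sketch uses the arithmetic mean because it keeps the sign of the shift. Converting a deviation in samples to seconds by dividing by the sampling frequency is also an assumption, since the corresponding formulas are not reproduced here.

```python
from statistics import mean
from typing import List, Tuple


def sampling_clock_deviation(pos_pairs: List[Tuple[int, int]],
                             neg_pairs: List[Tuple[int, int]]) -> float:
    """Average sample number deviation over all matched effective peaks."""
    deviations = [u - d for u, d in pos_pairs + neg_pairs]
    if not deviations:
        raise ValueError("no matched effective peaks")
    return mean(deviations)


def time_offset_seconds(deviation_samples: float, sampling_rate_hz: float) -> float:
    """Convert a deviation measured in samples to seconds (assumed relation)."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    return deviation_samples / sampling_rate_hz


# Illustrative matched peaks and a hypothetical 48 kHz conversion frequency.
deviation = sampling_clock_deviation([(10, 13), (40, 44)], [(25, 28), (55, 59)])
offset = time_offset_seconds(deviation, 48_000)
print(deviation, offset)
```

A negative result here means the peaks appear earlier in the first list of each pair, which is one possible sign convention rather than the one fixed by the text.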
- it can further be determined whether the number of effective positive peaks M Vaild and the number of effective negative peaks N Vaild in the third audio segment or the fourth audio segment satisfy the following formula (24).
- D3 is the preset threshold. If the above formula (24) holds, there are too few effective peaks in the third audio segment and the fourth audio segment. This is usually because the current scene is silent. In this case, the video capture circuit is controlled to perform panoramic shooting.
- D4 is the preset threshold
- U Ptotal is the total number of positive peaks in the third audio segment
- U ntotal is the total number of negative peaks in the third audio segment
- D Ptotal is the total number of positive peaks in the fourth audio segment
- D ntotal is the total number of negative peaks in the fourth audio segment.
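The fallback to panoramic shooting can be sketched as a small decision function. Formulas (24) and (25) are not reproduced in this text, so the logic below follows the wording of claims 5 and 6 and is an assumption: the first check compares the sum of effective peak counts with D3, and the second compares the ratio of that sum to a peak total with D4. How the peak total is formed from U Ptotal, U ntotal, D Ptotal, and D ntotal is left to the caller.

```python
def should_use_panorama(m_valid: int, n_valid: int, peak_total: int,
                        d3: float, d4: float) -> bool:
    """Decide whether to fall back to panoramic shooting.

    m_valid / n_valid: counts of effective positive / negative peaks.
    peak_total: total number of positive and negative peaks considered.
    d3, d4: preset thresholds (values are application specific).
    """
    first_sum = m_valid + n_valid
    if first_sum < d3:
        return True  # too few effective peaks, e.g. a silent scene
    if m_valid == n_valid and peak_total > 0:
        # Second check, following the wording of claim 6 (assumed form).
        return first_sum / peak_total > d4
    return False


# Illustrative values only.
print(should_use_panorama(1, 1, 40, d3=6, d4=0.9))
print(should_use_panorama(8, 8, 40, d3=6, d4=0.9))
```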
- in step 403, the first distance difference between the first distance from the sound source to the first audio collection circuit and the second distance from the sound source to the second audio collection circuit is determined according to the first time offset, and the second distance difference between the third distance from the sound source to the third audio collection circuit and the fourth distance from the sound source to the fourth audio collection circuit is determined according to the second time offset.
- the first distance difference a1 is calculated using the above formula (11).
- formula (26) is used to calculate the second distance difference a2.
- the propagation speed v of sound in the air is 340 m/s.
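As a worked illustration, the distance difference follows from the time offset and the speed of sound. Formula (26) is not reproduced in this text; the product form used below is the usual time-difference-of-arrival relation and is assumed here.

```python
SPEED_OF_SOUND_M_S = 340.0  # propagation speed of sound in air used in the text


def distance_difference(time_offset_s: float,
                        speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Path length difference (metres) implied by an arrival time offset."""
    return speed_m_s * time_offset_s


# A 0.5 ms offset corresponds to roughly 0.17 m of extra path length.
a2 = distance_difference(0.0005)
print(round(a2, 6))
```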
- in step 404, the first offset angle of the sound source is determined according to the first distance difference, and the second offset angle of the sound source is determined according to the second distance difference.
- the first offset angle ⁇ 1 is calculated using the above formula (15).
- the corresponding hyperbolic equation is shown in the following formula (27).
- c is the distance between the third audio collection circuit and the fourth audio collection circuit, and the distance parameter b satisfies the following formula (28).
- the second offset angle of the sound source is obtained according to the slope of the asymptote.
- the second offset angle ⁇ 2 is calculated using the following formula (30).
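The asymptote construction can be sketched with standard hyperbola geometry. Formulas (27) to (30) are not reproduced in this text, so the following is an assumption rather than the document's exact formula: the two audio collection circuits are the foci, separated by c; the distance difference a fixes the hyperbola; and for a distant source the direction is given by the asymptote. The angle is measured here from the perpendicular bisector of the microphone pair (taken as the camera axis), and the distance parameter is taken as b = sqrt(c^2 - a^2).

```python
import math


def offset_angle_deg(distance_diff_m: float, mic_spacing_m: float) -> float:
    """Far-field offset angle (degrees) from the perpendicular bisector.

    distance_diff_m: signed distance difference a between the two paths.
    mic_spacing_m:   distance c between the two audio collection circuits.
    """
    a, c = distance_diff_m, mic_spacing_m
    if c <= 0:
        raise ValueError("microphone spacing must be positive")
    if abs(a) >= c:
        raise ValueError("distance difference must be smaller than the spacing")
    b = math.sqrt(c * c - a * a)  # assumed form of the distance parameter
    # The asymptote slope relative to the microphone axis is b / a, so the
    # angle from the perpendicular bisector is atan(a / b).
    return math.degrees(math.atan2(a, b))


# With c = 0.2 m and a = 0.1 m the source is about 30 degrees off axis.
print(round(offset_angle_deg(0.1, 0.2), 3))
```

The sign of the result follows the sign of the distance difference, which indicates on which side of the camera axis the source lies. The far-field assumption matches the condition in claim 13 that the source distance is large relative to the spacing of the audio collection circuits.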
- in step 405, the video capture direction of the video capture circuit is adjusted according to the first offset angle and the second offset angle, so that the video capture circuit is aligned with the sound source.
- the first audio collection circuit, the second audio collection circuit, and the video collection circuit are located on a first straight line, and the first straight line is a horizontal direction.
- the third audio collection circuit, the fourth audio collection circuit, and the video collection circuit are located on the second straight line, and the second straight line is the vertical direction.
- the first offset angle can be used to control the deflection angle of the video capture circuit in the left and right directions
- the second offset angle can be used to control the deflection angle of the video capture circuit in the up and down directions. Therefore, sound source tracking can be achieved in three-dimensional space.
- Fig. 6 is a schematic structural diagram of a sound source tracking control device according to an embodiment of the present disclosure.
- the sound source tracking control device includes an extraction module 61, a time offset determination module 62, a distance difference determination module 63, an offset angle determination module 64 and a direction adjustment module 65.
- the extraction module 61 extracts the first audio segment from the first audio information collected by the first audio collection circuit, and synchronously extracts the second audio segment from the second audio information collected by the second audio collection circuit.
- the time offset determination module 62 determines the first time offset of the first audio segment and the second audio segment according to the deviation between the preset peak values in the first audio segment and the second audio segment.
- the time offset determination module 62 calculates the first time offset between the first audio segment and the second audio segment using the process shown in FIG. 2 described above.
- the distance difference determining module 63 determines the first distance difference between the first distance between the sound source and the first audio collection circuit and the second distance between the sound source and the second audio collection circuit according to the first time offset.
- the offset angle determination module 64 determines the first offset angle of the sound source according to the first distance difference.
- the offset angle determination module 64 uses the above formula (15) to calculate the first offset angle of the sound source.
- the direction adjustment module 65 adjusts the video capture direction of the video capture circuit according to the first offset angle, so that the video capture circuit is aligned with the sound source.
- the first audio collection circuit and the second audio collection circuit are symmetrically arranged on both sides of the video collection circuit.
- the first audio collection circuit, the second audio collection circuit, and the video collection circuit are located on the first straight line. If the first straight line is the horizontal direction, the sound source tracking control device uses the first offset angle to control the deflection angle of the video capture circuit in the left and right directions. Therefore, sound source tracking can be achieved on the horizontal plane.
- the extraction module 61 extracts the first audio segment from the first audio information collected by the first audio collection circuit, and synchronously extracts the second audio segment from the second audio information collected by the second audio collection circuit, The third audio segment is synchronously extracted from the third audio information collected by the third audio collecting circuit, and the fourth audio segment is synchronously extracted from the fourth audio information collected by the fourth audio collecting circuit.
- the time offset determination module 62 determines the first time offset of the first audio segment and the second audio segment according to the deviation between the preset peak values in the first audio segment and the second audio segment. The time offset determination module 62 also determines the second time offset of the third audio segment and the fourth audio segment according to the deviation between the preset peak values in the third audio segment and the fourth audio segment.
- the time offset determination module 62 calculates the first time offset between the first audio segment and the second audio segment using the process shown in FIG. 2 described above.
- the time offset determination module 62 calculates the second time offset of the third audio segment and the fourth audio segment by using the process shown in FIG. 5 above.
- the distance difference determining module 63 determines the first distance difference between the first distance between the sound source and the first audio collection circuit and the second distance between the sound source and the second audio collection circuit according to the first time offset. The distance difference determining module 63 also determines the second distance difference between the third distance of the sound source from the third audio collecting circuit and the fourth distance of the sound source from the fourth audio collecting circuit according to the second time offset.
- the offset angle determination module 64 determines the first offset angle of the sound source according to the first distance difference. The offset angle determination module 64 also determines the second offset angle of the sound source according to the second distance difference.
- the offset angle determination module 64 uses the above formula (15) to calculate the first offset angle of the sound source.
- the offset angle determination module 64 uses the above formula (30) to calculate the second offset angle of the sound source.
- the direction adjustment module 65 adjusts the video capture direction of the video capture circuit according to the first offset angle and the second offset angle, so that the video capture circuit is aligned with the sound source.
- the first audio collection circuit and the second audio collection circuit are symmetrically arranged on both sides of the video collection circuit.
- the third audio collecting circuit and the fourth audio collecting circuit are symmetrically arranged on the other two sides of the video collecting circuit.
- the first audio collection circuit, the second audio collection circuit, and the video collection circuit are located on the first straight line.
- the third audio collection circuit, the fourth audio collection circuit and the video collection circuit are located on the second straight line.
- the first straight line is perpendicular to the second straight line.
- the sound source tracking control device uses the first offset angle to control the deflection angle of the video capture circuit in the left and right directions, and the second offset angle to control the deflection angle of the video capture circuit in the up and down directions. Therefore, sound source tracking can be realized in three-dimensional space.
- Fig. 7 is a schematic structural diagram of a sound source tracking control device according to an embodiment of the present disclosure. As shown in FIG. 7, the sound source tracking control device includes a memory 701 and a processor 702.
- the memory 701 is used to store instructions, the processor 702 is coupled to the memory 701, and the processor 702 is configured to execute the method involved in any one of the embodiments in FIG. 1, FIG. 2, FIG. 4, and FIG. 5 based on the instructions stored in the memory.
- the sound source tracking control device further includes a communication interface 703, which is used to exchange information with other devices.
- the sound source tracking control device also includes a bus 704. The processor 702, the communication interface 703, and the memory 701 communicate with each other through the bus 704.
- the memory 701 may include a high-speed RAM memory, and may also include a non-volatile memory (non-volatile memory), for example, at least one disk memory.
- the memory 701 may also be a memory array.
- the memory 701 may also be divided into blocks, and the blocks may be combined into a virtual volume according to certain rules.
- the processor 702 may be a central processing unit (CPU), an application specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present disclosure.
- the present disclosure also relates to a computer-readable storage medium that stores computer instructions. When the instructions are executed by a processor, the method involved in any one of the embodiments shown in FIG. 1, FIG. 2, FIG. 4, and FIG. 5 is implemented.
- Fig. 8 is a schematic structural diagram of a sound source tracking system according to an embodiment of the present disclosure.
- the sound source tracking system includes a first audio acquisition circuit 811, a second audio acquisition circuit 812, a sound source tracking control device 82 and a video acquisition circuit 83.
- the sound source tracking control device 82 is a sound source tracking control device related to any one of the embodiments in FIG. 6 or FIG. 7.
- the first audio collection circuit 811 and the second audio collection circuit 812 are symmetrically arranged on both sides of the video collection circuit 83.
- the distance from the video acquisition circuit to the first audio acquisition circuit is the same as the distance from the video acquisition circuit to the second audio acquisition circuit.
- the first audio collection circuit, the second audio collection circuit, and the video collection circuit are located on the first straight line.
- the first audio collection circuit 811 and the second audio collection circuit 812 are microphones.
- the first straight line is a horizontal direction.
- the sound source tracking control device 82 uses the calculated first offset angle to control the left and right deflection angle of the video capture circuit 83, so as to realize the sound source tracking on the horizontal plane.
- Fig. 9 is a schematic structural diagram of a sound source tracking system according to another embodiment of the present disclosure.
- the video capture circuit 83 includes a direction control platform 831 and a camera 832 provided on the direction control platform 831.
- the direction control platform 831 is a pan-tilt.
- the sound source tracking control device 82 sends control parameters to the direction control platform 831 using the communication protocol supported by the direction control platform 831, so as to adjust the direction of the direction control platform 831 and thereby the video capture direction of the camera 832.
- the communication protocol used is the UART protocol.
- the sound source tracking system further includes an analog-to-digital converter 84.
- the analog-to-digital converter 84 performs analog-to-digital conversion on the audio signal collected by the first audio collection circuit 811 to generate first audio information.
- the analog-to-digital converter 84 performs analog-to-digital conversion on the audio signal collected by the second audio collection circuit 812 to generate second audio information.
- the analog-to-digital converter 84 is provided with multiple independent conversion modules. Therefore, the first conversion module in the analog-to-digital converter 84 can be used to perform analog-to-digital conversion on the audio signal collected by the first audio collection circuit 811 to generate first audio information, and the second conversion module in the analog-to-digital converter 84 can be used to perform analog-to-digital conversion on the audio signal collected by the second audio collection circuit 812 to generate second audio information.
- the analog-to-digital converter 84 is a pipelined analog-to-digital converter, a successive approximation register (SAR) analog-to-digital converter, or a sigma-delta analog-to-digital converter.
- Fig. 10 is a schematic structural diagram of a sound source tracking system according to another embodiment of the present disclosure. The difference between FIG. 10 and FIG. 9 is that in the embodiment shown in FIG. 10, the sound source tracking system further includes a third audio collection circuit 813 and a fourth audio collection circuit 814.
- the third audio collection circuit 813 and the fourth audio collection circuit 814 are symmetrically arranged on the other two sides of the video collection circuit 83.
- the distance from the video capture circuit 83 to the third audio capture circuit 813 is the same as the distance from the video capture circuit 83 to the fourth audio capture circuit 814.
- the first audio collection circuit 811, the second audio collection circuit 812, and the video collection circuit 83 are located on the first straight line.
- the third audio collection circuit 813, the fourth audio collection circuit 814, and the video collection circuit 83 are located on the second straight line.
- the first straight line is perpendicular to the second straight line.
- the first straight line is a horizontal direction
- the second straight line is a vertical direction.
- the sound source tracking control device 82 uses the first offset angle to control the deflection angle of the video capture circuit 83 in the left-right direction.
- the sound source tracking control device 82 uses the second offset angle to control the deflection angle of the video capture circuit 83 in the vertical direction. This enables sound source tracking in a three-dimensional space.
Landscapes
- Physics & Mathematics (AREA)
- Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Radar, Positioning & Navigation (AREA)
- Remote Sensing (AREA)
- Studio Devices (AREA)
Claims (15)
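- A sound source tracking control method, comprising: extracting a first audio segment from first audio information collected by a first audio collection circuit, and synchronously extracting a second audio segment from second audio information collected by a second audio collection circuit; determining a first time offset of the first audio segment and the second audio segment according to the deviation between preset peak values in the first audio segment and the second audio segment; determining, according to the first time offset, a first distance difference between a first distance from a sound source to the first audio collection circuit and a second distance from the sound source to the second audio collection circuit; determining a first offset angle of the sound source according to the first distance difference; and adjusting a video capture direction of a video capture circuit according to the first offset angle, so that the video capture circuit is aligned with the sound source.
- The control method according to claim 1, wherein determining the first offset angle of the sound source according to the first distance difference comprises: determining a first distance parameter using the first distance difference and the distance between the first audio collection circuit and the second audio collection circuit; and determining the first offset angle of the sound source according to the ratio of the first distance parameter to the first distance difference.
- The control method according to claim 1, wherein determining the first time offset of the first audio segment and the second audio segment according to the deviation between the preset peak values in the first audio segment and the second audio segment comprises: selecting corresponding effective positive peaks in the first audio segment and the second audio segment according to a first difference between the largest positive peak sample number in the first audio segment and the largest positive peak sample number in the second audio segment, wherein the first audio segment and the second audio segment each include multiple sample values; selecting corresponding effective negative peaks in the first audio segment and the second audio segment according to a second difference between the smallest negative peak sample number in the first audio segment and the smallest negative peak sample number in the second audio segment; determining a first sampling clock deviation of the first audio segment and the second audio segment according to the sample number deviations of the corresponding effective positive peaks in the first audio segment and the second audio segment and the sample number deviations of the corresponding effective negative peaks in the first audio segment and the second audio segment; and determining the first time offset according to the first sampling clock deviation and the sampling conversion frequency.
- The control method according to claim 3, wherein: the difference between an effective positive peak sample number in the first audio segment and the corresponding effective positive peak sample number in the second audio segment differs from the first difference by an amount within a first preset range; and the difference between an effective negative peak sample number in the first audio segment and the corresponding effective negative peak sample number in the second audio segment differs from the second difference by an amount within a second preset range.
- The control method according to claim 3, further comprising: determining whether a first sum of the effective positive peaks and the effective negative peaks in the first audio segment or the second audio segment is less than a first preset threshold; and if the first sum is less than the first preset threshold, controlling the video capture circuit to perform panoramic shooting.
- The control method according to claim 5, further comprising: if the first sum is not less than the first preset threshold, determining whether the number of effective positive peaks and the number of effective negative peaks in the first audio segment or the second audio segment are the same; when the number of effective positive peaks and the number of effective negative peaks in the first audio segment or the second audio segment are the same, further calculating a second sum of the total number of positive peaks and the total number of negative peaks in the first audio segment or the second audio segment; and in response to the ratio of the first sum to the second sum being greater than a second preset threshold, controlling the video capture circuit to perform panoramic shooting.
- The control method according to claim 3, further comprising: calculating a third difference between the largest positive peak sample number and the smallest negative peak sample number in the first audio segment; calculating a fourth difference between the largest positive peak sample number and the smallest negative peak sample number in the second audio segment; and in response to the third difference and the fourth difference having the same sign, and the difference between the third difference and the fourth difference being within a third preset range, selecting the corresponding effective positive peaks in the first audio segment and the second audio segment.
- The control method according to claim 3, further comprising: calculating a fifth difference between the total number of positive peaks in the first audio segment and the total number of positive peaks in the second audio segment, and a third sum of the total number of positive peaks in the first audio segment and the total number of positive peaks in the second audio segment; calculating a sixth difference between the total number of negative peaks in the first audio segment and the total number of negative peaks in the second audio segment, and a fourth sum of the total number of negative peaks in the first audio segment and the total number of negative peaks in the second audio segment; and in response to the ratio of the fifth difference to the third sum being within a fourth predetermined range, and the ratio of the sixth difference to the fourth sum being within a fifth predetermined range, selecting the corresponding effective positive peaks in the first audio segment and the second audio segment.
- The control method according to any one of claims 1-8, further comprising: synchronously extracting a third audio segment from third audio information collected by a third audio collection circuit, and extracting a fourth audio segment from fourth audio information collected by a fourth audio collection circuit; determining a second time offset of the third audio segment and the fourth audio segment according to the deviation between preset peak values in the third audio segment and the fourth audio segment; determining, according to the second time offset, a second distance difference between a third distance from the sound source to the third audio collection circuit and a fourth distance from the sound source to the fourth audio collection circuit; determining a second offset angle of the sound source according to the second distance difference; and adjusting the video capture direction of the video capture circuit according to the first offset angle and the second offset angle, so that the video capture circuit is aligned with the sound source.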
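- A sound source tracking control device, comprising: an extraction module configured to extract a first audio segment from first audio information collected by a first audio collection circuit, and to synchronously extract a second audio segment from second audio information collected by a second audio collection circuit; a time offset determination module configured to determine a first time offset of the first audio segment and the second audio segment according to the deviation between preset peak values in the first audio segment and the second audio segment; a distance difference determination module configured to determine, according to the first time offset, a first distance difference between a first distance from a sound source to the first audio collection circuit and a second distance from the sound source to the second audio collection circuit; an offset angle determination module configured to determine a first offset angle of the sound source according to the first distance difference; and a direction adjustment module configured to adjust a video capture direction of a video capture circuit according to the first offset angle, so that the video capture circuit is aligned with the sound source.
- A sound source tracking control device, comprising: a memory configured to store instructions; and a processor coupled to the memory, the processor being configured to execute, based on the instructions stored in the memory, the method according to any one of claims 1-9.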
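- A sound source tracking system, comprising: the sound source tracking control device according to claim 10 or 11; a video capture circuit configured to adjust a video capture direction under the control of the sound source tracking control device; and a first audio collection circuit and a second audio collection circuit, wherein the first audio collection circuit and the second audio collection circuit are symmetrically arranged on both sides of the video capture circuit.
- The tracking system according to claim 12, wherein: the ratio of the distance from the sound source to the video capture circuit to the distance from the first audio collection circuit to the second audio collection circuit is greater than a preset distance threshold.
- The tracking system according to claim 13, further comprising: an analog-to-digital converter configured to perform analog-to-digital conversion on the audio signal collected by the first audio collection circuit to generate the first audio information, and to perform analog-to-digital conversion on the audio signal collected by the second audio collection circuit to generate the second audio information; wherein the video capture circuit includes a direction control platform and a camera arranged on the direction control platform, the direction control platform being configured to adjust its direction under the control of the sound source tracking control device.
- A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the instructions, when executed by a processor, implement the method according to any one of claims 1-9.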
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
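CN202080000167.1A CN113631942B (zh) | 2020-02-24 | 2020-02-24 | Sound source tracking control method and control device, and sound source tracking system |
PCT/CN2020/076462 WO2021168620A1 (zh) | 2020-02-24 | 2020-02-24 | Sound source tracking control method and control device, and sound source tracking system |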
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2020/076462 WO2021168620A1 (zh) | 2020-02-24 | 2020-02-24 | Sound source tracking control method and control device, and sound source tracking system |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021168620A1 true WO2021168620A1 (zh) | 2021-09-02 |
Family
ID=77491742
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/076462 WO2021168620A1 (zh) | 2020-02-24 | 2020-02-24 | 声源跟踪控制方法和控制装置、声源跟踪系统 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN113631942B (zh) |
WO (1) | WO2021168620A1 (zh) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114926378A (zh) * | 2022-04-01 | 2022-08-19 | 浙江西图盟数字科技有限公司 | Method, system, and device for sound source tracking, and computer storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6826284B1 (en) * | 2000-02-04 | 2004-11-30 | Agere Systems Inc. | Method and apparatus for passive acoustic source localization for video camera steering applications |
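CN103235287A (zh) * | 2013-04-17 | 2013-08-07 | 华北电力大学(保定) | Sound source localization camera tracking device |
CN103797821A (zh) * | 2011-06-24 | 2014-05-14 | 若威尔士有限公司 | Time difference of arrival determination using direct sound |
CN103841357A (zh) * | 2012-11-21 | 2014-06-04 | 中兴通讯股份有限公司 | Microphone array sound source localization method, device, and system based on video tracking |
CN204517964U (zh) * | 2015-02-13 | 2015-07-29 | 上海赢谊电子设备有限公司 | Sound localization device for use in smart homes |
CN106842131A (zh) * | 2017-03-17 | 2017-06-13 | 浙江宇视科技有限公司 | Microphone array sound source localization method and device |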
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI230023B (en) * | 2003-11-20 | 2005-03-21 | Acer Inc | Sound-receiving method of microphone array associating positioning technology and system thereof |
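CN102695043A (zh) * | 2012-06-06 | 2012-09-26 | 郑州大学 | Dynamic video surveillance system based on sound source localization |
CN108231085A (zh) * | 2016-12-14 | 2018-06-29 | 杭州海康威视数字技术股份有限公司 | Sound source localization method and device |
JP6375475B1 (ja) * | 2017-06-07 | 2018-08-15 | 井上 時子 | Sound source direction tracking system |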
2020
- 2020-02-24 WO PCT/CN2020/076462 patent/WO2021168620A1/zh active Application Filing
- 2020-02-24 CN CN202080000167.1A patent/CN113631942B/zh active Active
Also Published As
Publication number | Publication date |
---|---|
CN113631942B (zh) | 2024-04-16 |
CN113631942A (zh) | 2021-11-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11398235B2 (en) | Methods, apparatuses, systems, devices, and computer-readable storage media for processing speech signals based on horizontal and pitch angles and distance of a sound source relative to a microphone array | |
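CN107408386B (zh) | Controlling an electronic device based on the direction of speech | |
EP3614377B1 (en) | Object recognition method, computer device and computer readable storage medium | |
CN106782584B (zh) | Audio signal processing device, method, and electronic device | |
CN107464564B (zh) | Voice interaction method, apparatus, and device | |
CN107346661B (zh) | Long-distance iris tracking and acquisition method based on a microphone array | |
WO2022127180A1 (zh) | Target tracking method and apparatus, electronic device, and storage medium | |
JP5456832B2 (ja) | Apparatus and method for determining the relevance of an input utterance | |
JP2019186929A (ja) | Camera shooting control method, device, intelligent device, and storage medium | |
WO2016131361A1 (zh) | Monitoring system and method | |
WO2012036424A2 (en) | Method and apparatus for performing microphone beamforming | |
CN107799126A (zh) | Voice endpoint detection method and device based on supervised machine learning | |
EP2836964A1 (en) | Object recognition using multi-modal matching scheme | |
WO2021008000A1 (zh) | Voice wake-up method and apparatus, electronic device, and storage medium | |
CN103685906A (zh) | Control method, control apparatus, and control device | |
CN111432115A (zh) | Face tracking method based on sound-assisted positioning, terminal, and storage device | |
WO2021168620A1 (zh) | Sound source tracking control method and control device, and sound source tracking system | |
WO2016137042A1 (ko) | Method and device for transforming feature vectors for user recognition | |
WO2019153382A1 (zh) | Smart speaker and playback control method | |
CN112925235A (zh) | Sound source localization method during interaction, device, and computer-readable storage medium | |
CN105516692A (zh) | Internet of Things smart device | |
WO2019227552A1 (zh) | Voice localization method and device based on behavior recognition | |
CN107533415B (zh) | Voiceprint detection method and apparatus | |
WO2023193803A1 (zh) | Volume control method and apparatus, storage medium, and electronic device | |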
Park et al. | Robust multi-channel speech recognition using frequency aligned network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 20921161 Country of ref document: EP Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20921161 Country of ref document: EP Kind code of ref document: A1 |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20921161 Country of ref document: EP Kind code of ref document: A1 |
 | 32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established | Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04/04/2023) |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 20921161 Country of ref document: EP Kind code of ref document: A1 |