WO2016013286A1

WO2016013286A1 - Phase difference calculation device, sound source direction detection device, and phase difference calculation method

Info

Publication number: WO2016013286A1
Application number: PCT/JP2015/064814
Authority: WO
Inventors: 中村　圭介
Original assignee: シャープ株式会社
Priority date: 2014-07-25
Filing date: 2015-05-22
Publication date: 2016-01-28
Also published as: JP2016031243A

Abstract

The present invention more accurately calculates the phase difference between audio signals output by different microphones. An audio signal evaluation unit (5) of a sound source direction detection device (1) creates phase shift audio signals in which the relative phase shift amounts of the audio signals output by a first microphone (2a) and a second microphone (2b) are changed in steps and calculates the phase difference between the two audio signals on the basis of the integrated value of the differences between the output values of the two audio signals within a fixed range in each phase shift audio signal.

Description

Phase difference calculation device, sound source direction detection device, and phase difference calculation method

The present invention relates to a phase difference calculation device that calculates a phase difference between audio signals obtained from a plurality of microphones, a sound source direction detection device including the phase difference calculation device, and a phase difference calculation method.

Conventionally, a technique for detecting the direction or position of a sound source by measuring a temporal shift (phase difference) of an audio signal obtained from each microphone using a plurality of microphones is known. As a technique for determining the phase difference of an audio signal obtained from each of a plurality of microphones, there is a technique for obtaining a phase difference spectrum in an acoustic signal as disclosed in Patent Document 1 and Patent Document 2, for example.

Patent Document 1 discloses a method in which a phase difference spectrum of two acoustic signals is obtained, all or part of the obtained phase difference spectrum is approximated by a linear function of frequency that passes through the origin, and the direction of the sound source is calculated from the slope of the linear function.

Further, Patent Document 2 discloses a method in which a phase difference spectrum between two acoustic signals is obtained, a power spectrum of at least one of the two acoustic signals is obtained, and the sound source direction of each sound source is determined based on the obtained phase difference spectrum and power spectrum.

Japanese Patent Publication “JP 2003-337164 A (published on November 28, 2003)”
Japanese Patent Publication “Japanese Patent Laid-Open No. 2007-183202 (Published July 19, 2007)”

The frequency component of human voice greatly fluctuates depending on the pronunciation of words. Therefore, when a human voice is a sound source, it is difficult to accurately measure the deviation of each audio signal obtained from a plurality of microphones. For example, the sound of “shi” includes many relatively high-frequency sounds, and “o” includes many low-frequency sounds.

In addition, the sound input to each microphone does not necessarily come from a single sound source, such as the utterance of a single person; it also includes ambient noise and sound reflected from nearby walls. As a result, the microphones do not simply receive the same waveform shifted along the time axis. It is therefore difficult to accurately measure the deviation between the signals when the ambient noise is loud or when a hard wall that easily reflects sound is nearby.

The present invention has been made in view of the above problems, and an object of the present invention is to provide a phase difference calculation device that can more accurately calculate a phase difference between audio signals output from different microphones, a sound source direction detection device including the phase difference calculation device, and a phase difference calculation method.

In order to solve the above-described problems, a phase difference calculation device according to one embodiment of the present invention includes: a plurality of microphones that are arranged at different positions and that convert external sound into audio signals and output the audio signals; a phase-shifted audio signal creating unit that creates, for each phase shift amount, a phase-shifted audio signal in which the relative phase shift amount between the audio signal output from one of the plurality of microphones and the audio signal output from another microphone is changed stepwise; an integrated value calculating unit that calculates an integrated value of the difference between the output values of the two audio signals within a certain range in the phase-shifted audio signal corresponding to each phase shift amount; and a phase difference calculating unit that calculates, based on the integrated value, the phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone.

In order to solve the above problem, a sound source direction detection method according to an aspect of the present invention includes: a step of converting external sound into audio signals using a plurality of microphones arranged at different positions and outputting the audio signals; a step of creating, for each phase shift amount, a phase-shifted audio signal in which the relative phase shift amount between the audio signal output from one of the plurality of microphones and the audio signal output from another microphone is changed stepwise; a step of calculating an integrated value of the difference between the output values of the two audio signals within a certain range in the phase-shifted audio signal corresponding to each phase shift amount; and a step of calculating, based on the integrated value, the phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone.

According to one aspect of the present invention, the phase difference between audio signals output from different microphones can be calculated more accurately.

FIG. 1 is a block diagram showing a sound source direction detection device according to one embodiment of the present invention.
FIG. 2 is a block diagram showing an audio signal evaluation unit according to one embodiment of the present invention.
FIG. 3 is a diagram for explaining a method of creating a deviation value graph from two pieces of audio data in an embodiment of the present invention.
FIG. 4 is a diagram for explaining a method of creating a deviation value graph from two pieces of audio data in an embodiment of the present invention.
FIG. 5 is a diagram showing an example of audio data suitable for calculating a phase difference.
FIG. 6 is a deviation value graph obtained when the audio signal from the first microphone in FIG. 5 is phase-shifted in the positive direction toward the audio signal from the second microphone.
FIG. 7 is a diagram for explaining a method of specifying a sound source direction from the phase difference between two audio signals in an embodiment of the present invention.
FIG. 8 is a diagram showing an example of audio data when there is a lot of ambient noise.
FIG. 9 is a deviation value graph obtained from the audio data of FIG. 8.
FIG. 10 is a diagram showing an example of high-frequency audio data exceeding a predetermined frequency.
FIG. 11 is a deviation value graph obtained from the audio data of FIG. 10.
FIG. 12 is a flowchart showing the flow of a sound source direction detection method according to one embodiment of the present invention.
FIG. 13 is a block diagram showing a sound source direction detection device according to another embodiment of the present invention.
FIG. 14 is a block diagram showing a sound source direction detection device according to still another embodiment of the present invention.
FIG. 15 is a diagram for explaining a method of specifying a sound source direction using four microphones in an embodiment of the present invention.

[Embodiment 1]
(Configuration of sound source direction detection device)
Hereinafter, a sound source direction detection device according to Embodiment 1 of the present invention will be described in detail with reference to FIG. 1. FIG. 1 is a block diagram showing a sound source direction detection device 1 according to this embodiment.

The sound source direction detection device 1 is a device that detects the direction of a sound source using a plurality of microphones. The sound source direction detection device 1 can be mounted on, for example, a robot that converses with a human. When the sound source direction detection device 1 detects the direction of the person who is speaking, the face or line of sight of the robot can be directed toward the speaker. Alternatively, the sound source direction detection device 1 can be mounted on the microphone unit of a telephone conference system. By detecting the direction of the person who is speaking among the participants in the conference, the sound source direction detection device 1 can identify that person. The sound source direction detection device 1 can also be installed in a security monitoring system. Even when an intruder is hidden behind an object and is not captured by the camera of the security monitoring system, the approximate position of the intruder can be detected as long as the sound source direction detection device 1 can detect sound.

As shown in FIG. 1, the sound source direction detection device 1 includes a first microphone 2a, a second microphone 2b, a first microphone signal input unit 3a, a second microphone signal input unit 3b, a first microphone signal storage unit 4a, a second microphone signal storage unit 4b, an audio signal evaluation unit 5, a sound source direction specifying unit 6 (specifying unit), and an angle table 7. The first microphone 2a, the second microphone 2b, the first microphone signal input unit 3a, the second microphone signal input unit 3b, the first microphone signal storage unit 4a, the second microphone signal storage unit 4b, and the audio signal evaluation unit 5 together function as the phase difference calculation device 10 according to the present invention.

The first microphone 2a and the second microphone 2b are microphones that convert external sounds into audio signals. The first microphone 2a and the second microphone 2b are arranged at different positions.

The first microphone signal input unit 3a creates sound data obtained by digitizing the sound signal converted by the first microphone 2a. Similarly, the second microphone signal input unit 3b creates audio data in which the audio signal converted by the second microphone 2b is digitized. Specifically, the audio data is data indicating the relationship between the output value of the audio signal and time (change in the output value of the audio signal over time).

The first microphone signal storage unit 4a stores the audio data created by the first microphone signal input unit 3a, and the second microphone signal storage unit 4b stores the audio data created by the second microphone signal input unit 3b. Both the first microphone signal storage unit 4a and the second microphone signal storage unit 4b always hold audio data for an arbitrary fixed length of time.

The audio signal evaluation unit 5 refers to the audio data stored in the first microphone signal storage unit 4a and the second microphone signal storage unit 4b, and obtains the phase difference between the two audio signals by measuring the temporal shift between the audio signals represented by the respective audio data. A block diagram of the audio signal evaluation unit 5 is shown in FIG. 2. As shown in FIG. 2, the audio signal evaluation unit 5 includes a phase-shifted audio signal creating unit 51, a deviation value calculation unit 52 (integrated value calculation unit), a deviation value graph creating unit 53, a determination unit 54, and a phase difference calculation unit 55. Details of these members and the method by which they calculate the phase difference will be described later.

The sound source direction specifying unit 6 specifies the sound source direction based on the phase difference calculated by the audio signal evaluation unit 5. Specifically, the angle table 7 stores the angle of the sound source direction corresponding to each phase difference; when it receives a phase difference from the sound source direction specifying unit 6, it sends the angle of the sound source direction corresponding to that phase difference to the sound source direction specifying unit 6. The angle of the sound source direction is the angle between the sound source direction and a reference direction. The sound source direction specifying unit 6 specifies the angular direction of the sound source from the angle obtained using the angle table 7, and outputs the specified angular direction to the outside as the detection result of the sound source direction.

(Calculation method of phase difference)
Hereinafter, a method in which the audio signal evaluation unit 5 calculates the phase difference between two audio signals based on the audio data stored in the first microphone signal storage unit 4a and the second microphone signal storage unit 4b will be described with reference to FIGS. 3 and 4. FIGS. 3 and 4 are diagrams for explaining a method of creating a deviation value graph from two pieces of audio data.

The explanation takes as an example the two pieces of audio data shown in (A) of FIG. 3. For example, it is assumed that the audio data indicated by a solid line represents the audio signal obtained from the first microphone 2a, and the audio data indicated by a dotted line represents the audio signal obtained from the second microphone 2b.

First, the phase-shifted audio signal creating unit 51 of the audio signal evaluation unit 5 creates, for each phase shift amount, a phase-shifted audio signal in which the relative phase shift amount between the two audio signals is changed stepwise. The deviation value calculation unit 52 then obtains, for the phase-shifted audio signal corresponding to each phase shift amount, an integrated value of the difference between the output values of the two audio signals within a certain range. The difference between the output values is the absolute value of the difference between the output values of the two audio signals at a specific point (time). The integrated value therefore corresponds to the area enclosed between the two audio signals within the certain range (the gray region in (A) of FIG. 3). Hereinafter, the integrated value of the difference between the output values of the two audio signals within the certain range is referred to as a “deviation value”. The larger the deviation value, the larger the phase difference between the two audio signals is considered to be, and the smaller the deviation value, the smaller the phase difference.

Then, the deviation value graph creating unit 53 creates a deviation value graph showing how the deviation value changes as the relative phase shift amount between the two audio signals is changed stepwise. (B) of FIG. 3 shows the state in which the audio signal (solid line) obtained from the first microphone 2a in (A) of FIG. 3 has been phase-shifted by +1 in the right direction (plus direction) of the drawing. (C) of FIG. 3 shows the state in which the audio signal obtained from the first microphone 2a has been phase-shifted by a further +1 in the right direction of the drawing, and (D) of FIG. 3 shows the state after yet another phase shift of +1. Plotting the deviation value in each state yields the deviation value graph shown in FIG. 3.
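
The deviation value computation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the function names `deviation_value` and `deviation_curve`, the representation of audio data as Python lists of samples, and the window parameters `start` and `length` are all introduced here for illustration. A phase shift of +s is modeled as delaying the signal of the first microphone by s samples.

```python
def deviation_value(a, b, shift, start, length):
    """Sum of |a[i - shift] - b[i]| over the window [start, start + length).

    a, b   : sample lists from the first and second microphone (assumed equal length)
    shift  : phase shift amount applied to a, in samples (+ means shift to the right)
    The caller must keep start - |shift| >= 0 and start + length + |shift| <= len(a).
    """
    return sum(abs(a[i - shift] - b[i]) for i in range(start, start + length))


def deviation_curve(a, b, max_shift, start, length):
    """Deviation value for every integer shift in [-max_shift, max_shift]."""
    return {
        s: deviation_value(a, b, s, start, length)
        for s in range(-max_shift, max_shift + 1)
    }
```

The returned dictionary plays the role of the deviation value graph: its keys are the phase shift amounts and its values are the corresponding deviation values.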

On the other hand, suppose that the two pieces of audio data shown in (A) of FIG. 4 are obtained. In this case, the audio signal (solid line) obtained from the first microphone 2a in (A) of FIG. 4 is phase-shifted by -1 in the left direction (minus direction) of the drawing. The state at this time is (B) of FIG. 4. (C) of FIG. 4 shows the state in which the audio signal obtained from the first microphone 2a has been phase-shifted by a further -1 in the left direction of the drawing, and (D) of FIG. 4 shows the state after yet another phase shift of -1. Plotting the deviation value in each state yields the deviation value graph shown in FIG. 4.

The above describes an example in which the audio signal obtained from the first microphone 2a is phase-shifted with reference to the audio signal obtained from the second microphone 2b, but the present invention is not necessarily limited to this. The audio signal obtained from the second microphone 2b may instead be phase-shifted with reference to the audio signal obtained from the first microphone 2a. In this case, the deviation value graph obtained when the audio signal from the second microphone 2b in (A) of FIG. 3 is phase-shifted in the left direction (minus direction) of the drawing is the deviation value graph shown in FIG. 3. Likewise, the deviation value graph obtained when the audio signal from the second microphone 2b in (A) of FIG. 4 is phase-shifted in the right direction (plus direction) of the drawing is the deviation value graph shown in FIG. 4.

The determination unit 54 determines, based on the deviation value graph created by the deviation value graph creating unit 53, whether or not the audio signals obtained from the first microphone 2a and the second microphone 2b are appropriate for calculating the phase difference. When the determination unit 54 determines that the audio signals are appropriate for calculating the phase difference, the phase difference calculation unit 55 calculates the phase difference between the audio signals obtained from the first microphone 2a and the second microphone 2b based on the deviation value graph created by the deviation value graph creating unit 53. The determination method used by the determination unit 54 will be described later; the phase difference calculation method used by the phase difference calculation unit 55 is described below.

FIG. 5 shows an example of audio data suitable for calculating the phase difference. FIG. 6 shows the deviation value graph obtained when the audio signal evaluation unit 5 phase-shifts the audio signal obtained from the first microphone 2a in FIG. 5 in the positive direction toward the audio signal obtained from the second microphone 2b.

In the deviation value graph shown in FIG. 6, the point where the deviation value is minimum is one point or two consecutive points. The deviation value is minimized when the two audio signals substantially match. Therefore, the phase shift amount when the deviation value between the two audio signals becomes the minimum is the temporal deviation (phase difference) between the two audio signals. For example, in FIG. 6, since the shift value is minimum when the phase shift amount is +9, the phase difference between the two audio signals is +9.

Thus, the phase difference between the audio signals output from the first microphone 2a and the second microphone 2b can be calculated based on the integrated value (deviation value) of the difference between the output values of the two signals within a certain range, obtained while the relative phase shift amount between the audio signals is changed stepwise. Therefore, the phase difference calculation unit 55 calculates the phase difference between the audio signals output from the first microphone 2a and the second microphone 2b based on the deviation value for each phase shift amount.
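
As a small illustration of this step, the phase difference can be read off as the phase shift amount whose deviation value is smallest. The function name `phase_difference`, the dictionary representation, and the sample values below are hypothetical and only serve the example. When two consecutive points share the minimum, this sketch returns the first of them, which is a choice made here and not something the description specifies.

```python
def phase_difference(deviation_by_shift):
    """Return the phase shift amount (key) with the smallest deviation value."""
    return min(deviation_by_shift, key=deviation_by_shift.get)


# Hypothetical deviation values around a minimum at +9 samples.
example_curve = {7: 40.0, 8: 22.0, 9: 5.0, 10: 19.0, 11: 37.0}
print(phase_difference(example_curve))
```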

(Sound source direction identification method)
Hereinafter, a method in which the sound source direction specifying unit 6 specifies the direction of the sound source based on the phase difference calculated by the audio signal evaluation unit 5 will be described with reference to FIG. 7. FIG. 7 is a diagram for explaining a method of specifying the sound source direction from the phase difference between two audio signals.

As shown in FIG. 7, the description takes as an example a case where the distance between the microphones is 100 mm, the speed of sound is 343.5 m/s, and the audio sampling rate is 48 kHz. The direction from the second microphone 2b to the first microphone 2a is the 0 degree direction, and the direction from the first microphone 2a to the second microphone 2b is the 180 degree direction. In this case, the phase difference is 14 when the sound source is in the 0 degree direction (0.1 m / 343.5 m/s × 48,000 samples/s ≈ 14 samples), and the phase difference is -14 when the sound source is in the 180 degree direction. The phase difference here refers to the phase shift amount of the audio signal obtained from the first microphone 2a with reference to the audio signal obtained from the second microphone 2b. When the phase shift amount of the audio signal obtained from the second microphone 2b with reference to the audio signal obtained from the first microphone 2a is used as the phase difference instead, the phase difference is -14 when the sound source is in the 0 degree direction and 14 when the sound source is in the 180 degree direction.

In general, the locus of points for which the difference between the distances to two fixed points is constant is a hyperbola with the two fixed points as its foci. If such a point is sufficiently far away compared with the distance between the two foci, the point can be regarded as lying on an asymptote of the hyperbola, and the slope of the asymptote can be regarded as the direction in which the point is located. That is, the sound source is located on an asymptote of the hyperbola whose foci are the positions of the first microphone 2a and the second microphone 2b, and the direction in which the sound source is located can be regarded as the slope of that asymptote.

Therefore, when the phase difference calculated by the audio signal evaluation unit 5 is +9, the angle of the sound source direction is
arccos(9/14) ≈ 50 degrees. This angle is the angle θ measured from the 0 degree direction. That is, the sound source direction is the 50 degree direction.
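
The arithmetic behind this example can be checked with a short sketch. The function names `max_phase_shift` and `source_angle_deg` are assumptions introduced here; the numeric inputs (100 mm, 343.5 m/s, 48 kHz, phase difference +9) are the ones given in the description. Rounding the maximum shift to the nearest integer and clamping the ratio to [-1, 1] are choices of this sketch.

```python
import math


def max_phase_shift(mic_distance_m, sound_speed_m_s, sample_rate_hz):
    """Largest possible phase difference in samples (source on the microphone axis)."""
    return round(mic_distance_m / sound_speed_m_s * sample_rate_hz)


def source_angle_deg(phase_diff, max_shift):
    """Angle from the 0 degree direction, using the far-field relation cos(theta) = d / d_max."""
    ratio = max(-1.0, min(1.0, phase_diff / max_shift))  # guard against |d| > d_max
    return math.degrees(math.acos(ratio))


d_max = max_phase_shift(0.100, 343.5, 48_000)  # 13.97... rounds to 14
print(d_max, round(source_angle_deg(9, d_max)))
```

With the values of the description, the sketch yields a maximum phase difference of 14 samples and an angle that rounds to 50 degrees.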

The result of the above calculation for each phase difference is stored in the angle table 7. That is, the angle table 7 stores the angle of the sound source direction corresponding to each phase difference. When the sound source direction specifying unit 6 sends the phase difference calculated by the audio signal evaluation unit 5 to the angle table 7, the sound source direction specifying unit 6 receives the angle of the sound source direction corresponding to the phase difference from the angle table 7. The sound source direction specifying unit 6 specifies the angle direction of the sound source from the angle obtained using the angle table 7 and outputs the specified angle direction to the outside as a detection result of the sound source direction.
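
One way to picture the angle table 7 is as a precomputed lookup from phase difference to angle. The following is only a sketch: the names `build_angle_table`, `ANGLE_TABLE`, and `lookup_angle` are hypothetical, the table range of ±14 is taken from the numerical example in the description, and returning `None` for a phase difference outside the table is an assumption of this sketch.

```python
import math


def build_angle_table(max_shift):
    """Map each integer phase difference in [-max_shift, max_shift] to an angle in degrees."""
    return {
        d: math.degrees(math.acos(d / max_shift))
        for d in range(-max_shift, max_shift + 1)
    }


ANGLE_TABLE = build_angle_table(14)  # 14 samples: 100 mm spacing at 48 kHz


def lookup_angle(phase_diff):
    """Return the stored angle, or None if the phase difference is not in the table."""
    return ANGLE_TABLE.get(phase_diff)
```

Precomputing the table means that specifying the direction at run time needs only a lookup, with no trigonometric calculation.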

It should be noted that the angular direction specified from the phase difference calculated by the audio signal evaluation unit 5 may lie on either the left or the right side of the straight line connecting the first microphone 2a and the second microphone 2b. Therefore, the angular direction specified from a single phase difference does not determine on which side of that straight line the sound source is located. For this reason, the sound source direction detection device 1 according to the present embodiment is preferably used to detect a sound source located on one side of the straight line connecting the first microphone 2a and the second microphone 2b, for example when the device is installed near a wall.

(Avoidance of audio data inappropriate for phase difference calculation)
FIG. 8 shows an example of audio data when there is a lot of ambient noise. In this figure, in addition to the main voice, which shows a large waveform, another voice showing a small waveform is emitted from a different direction.

The audio signal evaluation unit 5 creates a deviation value graph by the same method as described above. The created deviation value graph is as shown in FIG. 9. In the deviation value graph shown in FIG. 9, the point where the deviation value is minimum is one point or two consecutive points, but the minimum value is higher than in the deviation value graph shown in FIG. 6. This means that the degree of coincidence between the two audio signals is low. When the phase difference is calculated using such a deviation value graph, it is difficult to calculate an accurate phase difference, and the sound source direction is easily detected erroneously. For this reason, such a deviation value graph is inappropriate for calculating the phase difference and is not suitable for detecting the direction of the sound source.

Therefore, the determination unit 54 of the audio signal evaluation unit 5 sets a predetermined threshold value and, when a deviation value graph whose minimum value is equal to or greater than the threshold value is obtained, determines that the obtained audio data is inappropriate for calculating the phase difference. The phase difference calculation unit 55 then does not calculate a phase difference using that audio data. This prevents erroneous detection of the sound source direction caused by calculating the phase difference from audio data inappropriate for that calculation, and allows the sound source direction to be detected more accurately. The predetermined threshold value can be, for example, a value between the minimum value of a deviation value graph obtained from appropriate audio data (the deviation value graph shown in FIG. 6) and the minimum value of a deviation value graph obtained from inappropriate audio data (the deviation value graph shown in FIG. 9).
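
The threshold test can be sketched as below. The function names `suggest_threshold` and `passes_threshold` are hypothetical. The description only requires the threshold to lie between the minimum of a graph from appropriate data and the minimum of a graph from inappropriate data; taking the midpoint is an assumption made here for illustration.

```python
def suggest_threshold(good_minimum, bad_minimum):
    """Pick a threshold between the two reference minima (midpoint is an assumption)."""
    return (good_minimum + bad_minimum) / 2


def passes_threshold(deviation_values, threshold):
    """True if the minimum deviation value is below the threshold.

    A minimum equal to or greater than the threshold marks the audio data
    as inappropriate for calculating the phase difference.
    """
    return min(deviation_values) < threshold
```

For example, with hypothetical reference minima of 10 and 50 the suggested threshold is 30, so a graph whose minimum is 12 passes and a graph whose minimum is 48 is rejected.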

Next, FIG. 10 shows an example of high-frequency audio data exceeding a predetermined frequency. This figure shows a state in which sound having a dense waveform is emitted.

The audio signal evaluation unit 5 creates a deviation value graph by the same method as described above. The created deviation value graph is as shown in FIG. 11. In the deviation value graph shown in FIG. 11, there are two local minimum points. This is because the waveform is so dense that, when one audio signal is phase-shifted toward the other, the two audio signals coincide more than once regardless of whether the phase is shifted forward or backward. When the phase difference is calculated using such a deviation value graph, it is difficult to calculate an accurate phase difference, and the sound source direction is easily detected erroneously. For this reason, such a deviation value graph is inappropriate for calculating the phase difference and is not suitable for detecting the direction of the sound source.

Accordingly, the determination unit 54 of the audio signal evaluation unit 5 determines that the obtained audio data is inappropriate for calculating the phase difference when the deviation value graph has two or more local minimum points. The phase difference calculation unit 55 then does not calculate a phase difference using that audio data. This prevents erroneous detection of the sound source direction caused by calculating the phase difference from audio data inappropriate for that calculation, and allows the sound source direction to be detected more accurately. Note that the number of local minimum points in the deviation value graph can be obtained easily by counting the number of times the slope of the graph changes from − (negative value) to + (positive value).
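
Counting slope sign changes from negative to positive can be sketched as follows. The function name `count_local_minima` is hypothetical. Treating a flat run between a falling and a rising segment as a single minimum is a choice made in this sketch so that "two consecutive points" at the minimum are counted once; a graph that only rises or only falls yields zero, a case the description does not discuss.

```python
def count_local_minima(values):
    """Count how often the slope of the sequence changes from negative to positive."""
    count = 0
    previous_slope = 0  # last non-zero slope seen so far
    for left, right in zip(values, values[1:]):
        slope = right - left
        if slope > 0 and previous_slope < 0:
            count += 1
        if slope != 0:
            previous_slope = slope
    return count


print(count_local_minima([5, 3, 1, 3, 5]))  # one valley
print(count_local_minima([5, 1, 4, 1, 5]))  # two valleys
```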

Thus, because the determination unit 54 identifies audio data that is inappropriate for calculating the phase difference, the phase difference calculation unit 55 can calculate the phase difference using only audio data that is appropriate for that calculation. For this reason, the sound source direction detection device 1 can calculate the phase difference more accurately. The sound source direction specifying unit 6 can detect the sound source direction based on a phase difference calculated using only audio data appropriate for the calculation, so the sound source direction detection device 1 can also detect the sound source direction more accurately. For example, the frequency components of a human voice fluctuate greatly with the pronunciation of words. In addition, the audio signal may be disturbed intermittently by other noise from the surroundings. Even for such a sound source, calculating the deviation value and evaluating the audio signals based on it makes it possible to extract the portions of the audio signals from which the phase difference between the two can be calculated accurately. Thus, in this embodiment, the phase difference between the audio signals output from different microphones can be calculated more accurately.

In particular, if the sound source direction were detected using audio data such as that shown in FIGS. 9 and 11, many processing steps would be required before the sound source direction could be specified, and the sound source direction detection device would need many processing members. However, since the sound source direction detection device 1 according to the present embodiment does not detect the sound source direction using audio data such as that shown in FIGS. 9 and 11, the processing required to specify the sound source direction can be reduced, and the number of processing members required for the sound source direction detection device 1 can be kept down.

In addition, the determination unit 54 uses the deviation value graph to determine whether the audio data is appropriate for calculating the phase difference, and the phase difference calculation unit 55 uses the same deviation value graph to calculate the phase difference between the audio signals. In this way, the sound source direction detection device 1 can both determine whether the audio data is appropriate for calculating the phase difference and calculate the phase difference between the audio signals simply by creating a deviation value graph, which reduces the processing required to specify the sound source direction.

(Sound source direction detection procedure)
The flow of the above processing is as shown in FIG. 12.

First, the audio signal evaluation unit 5 acquires the audio data stored in the first microphone signal storage unit 4a and the second microphone signal storage unit 4b (step S1; hereinafter abbreviated as S1). The phase-shifted audio signal creating unit 51 of the audio signal evaluation unit 5 creates, for each phase shift amount, a phase-shifted audio signal in which the relative phase shift amount between the audio signals indicated by the two pieces of audio data is changed stepwise (S2). Then, the deviation value calculation unit 52 obtains the integrated value (deviation value) of the difference between the output values of the two audio signals within a certain range in the phase-shifted audio signal corresponding to each phase shift amount (S3), and the deviation value graph creating unit 53 creates a deviation value graph showing how the deviation value changes with the phase shift amount (S4).

The determination unit 54 determines, based on the deviation value graph created by the deviation value graph creating unit 53, whether or not the audio signals output from the first microphone 2a and the second microphone 2b are appropriate for calculating the phase difference. Specifically, the determination unit 54 determines whether or not the minimum value of the deviation value graph is larger than a predetermined threshold (S5). If the minimum value of the deviation value graph is larger than the predetermined threshold, the determination unit 54 determines that the audio signals output from the first microphone 2a and the second microphone 2b are inappropriate for calculating the phase difference. The process then returns to S1, and the processing from S2 onward is performed again using other audio data.

On the other hand, when the minimum value of the deviation value graph is smaller than the predetermined threshold, the determination unit 54 specifies the number of local minimum points of the deviation value graph (S6). The determination unit 54 then determines whether the number of specified local minimum points is one (S7). When the number of local minimum points in the deviation value graph is two or more, the determination unit 54 determines that the audio signals output from the first microphone 2a and the second microphone 2b are inappropriate for calculating the phase difference. The process then returns to S1, and the processing from S2 onward is performed again using other audio data.

On the other hand, when the number of local minimum points in the deviation value graph is one, the determination unit 54 determines that the audio signals output from the first microphone 2a and the second microphone 2b are appropriate for calculating the phase difference. Then, the phase difference calculation unit 55 calculates the phase shift amount at the minimum value of the deviation value graph as the phase difference between the two audio signals indicated by the two sets of audio data (S8).
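The checks in S5 to S8 can be sketched as follows. This is an illustrative sketch only: the function names and the threshold value are hypothetical, a local minimum point is assumed to be an interior point strictly lower than both neighbors, and a minimum exactly equal to the threshold is treated as inappropriate.

```python
from typing import List, Optional


def count_local_minima(values: List[float]) -> int:
    """Count interior points strictly lower than both neighbours.

    Simplification: end points and flat-bottomed minima are not counted.
    """
    return sum(1 for i in range(1, len(values) - 1)
               if values[i] < values[i - 1] and values[i] < values[i + 1])


def phase_difference(shifts: List[int], deviations: List[float],
                     threshold: float) -> Optional[int]:
    """Return the shift at the minimum deviation value, or None if rejected."""
    minimum = min(deviations)
    if minimum >= threshold:                    # S5: signals never match well
        return None
    if count_local_minima(deviations) != 1:     # S6, S7: ambiguous match
        return None
    return shifts[deviations.index(minimum)]    # S8


shifts = [-3, -2, -1, 0, 1, 2, 3]
single = [9, 7, 4, 3, 1, 6, 8]      # one clear minimum at shift +1
periodic = [1, 5, 1, 5, 1, 5, 1]    # repeating waveform, several minima
poor = [9, 8, 7, 6, 7, 8, 9]        # minimum stays above the threshold

accepted = phase_difference(shifts, single, threshold=2.0)
rejected_periodic = phase_difference(shifts, periodic, threshold=2.0)
rejected_poor = phase_difference(shifts, poor, threshold=2.0)
```

Returning `None` corresponds to going back to S1 and trying other audio data.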

The sound source direction specifying unit 6 specifies the sound source direction based on the phase difference calculated by the phase difference calculation unit 55 (S9). Specifically, the sound source direction specifying unit 6 sends the phase difference calculated by the phase difference calculation unit 55 to the angle table 7 and receives from the angle table 7 the angle of the sound source direction corresponding to that phase difference. Then, the sound source direction specifying unit 6 specifies the angle direction of the sound source from the angle obtained using the angle table 7, and outputs the specified angle direction to the outside as the detection result of the sound source direction (S10).
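How the angle table 7 is populated is not stated here. One common way to build such a table, shown purely as an assumption, is the far-field model in which the path difference between two microphones equals the microphone spacing multiplied by the cosine of the angle to the line through the microphones. The speed of sound, the spacing, the sampling rate, and the sign convention below are all hypothetical values chosen for the sketch.

```python
import math
from typing import Dict

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at about 20 degrees C


def build_angle_table(mic_distance_m: float,
                      sample_rate_hz: float) -> Dict[int, float]:
    """Map each feasible shift (in samples) to an angle in degrees.

    Far-field assumption: path difference = mic_distance * cos(angle), with
    the angle measured from the line through the two microphones.
    """
    max_shift = int(mic_distance_m * sample_rate_hz / SPEED_OF_SOUND)
    table: Dict[int, float] = {}
    for shift in range(-max_shift, max_shift + 1):
        cos_angle = shift * SPEED_OF_SOUND / (sample_rate_hz * mic_distance_m)
        cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
        table[shift] = math.degrees(math.acos(cos_angle))
    return table


# Hypothetical setup: microphones 0.2 m apart, sampled at 16 kHz.
table = build_angle_table(mic_distance_m=0.2, sample_rate_hz=16000.0)
broadside = table[0]  # zero phase difference: source at right angles to the line
```

With these values the table has 19 entries, so the angular resolution is limited by the sampling rate and the microphone spacing.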

[Embodiment 2]
The second embodiment of the present invention will be described below with reference to FIG. For convenience of explanation, members having the same functions as those described in the first embodiment are denoted by the same reference numerals and description thereof is omitted. FIG. 13 is a block diagram showing the sound source direction detection device 11 according to the present embodiment.

As shown in FIG. 13, the sound source direction detection device 11 includes a first microphone 2a, a second microphone 2b, a third microphone 2c, a first microphone signal input unit 3a, a second microphone signal input unit 3b, a third microphone signal input unit 3c, a first microphone signal storage unit 4a, a second microphone signal storage unit 4b, a third microphone signal storage unit 4c, an audio signal evaluation unit 5, a sound source direction specifying unit 6, and an angle table 7. The first microphone 2a, the second microphone 2b, the third microphone 2c, the first microphone signal input unit 3a, the second microphone signal input unit 3b, the third microphone signal input unit 3c, the first microphone signal storage unit 4a, the second microphone signal storage unit 4b, the third microphone signal storage unit 4c, and the audio signal evaluation unit 5 function as the phase difference calculation device 20 according to the present invention.

The sound source direction detection device 11 differs from the sound source direction detection device 1 in that it includes a third microphone 2c, a third microphone signal input unit 3c, and a third microphone signal storage unit 4c. The third microphone 2c is a microphone that converts external sound into an audio signal. The first microphone 2a, the second microphone 2b, and the third microphone 2c are arranged at mutually different positions such that they do not all lie on the same straight line.

The third microphone signal input unit 3c creates audio data by digitizing the audio signal converted by the third microphone 2c. The third microphone signal storage unit 4c stores the audio data created by the third microphone signal input unit 3c, and always holds audio data for an arbitrary fixed period of time.

Since the sound source direction detection device 11 has three microphones, that is, the first microphone 2a, the second microphone 2b, and the third microphone 2c, the audio signal evaluation unit 5 calculates, for each combination of two microphones out of the three, the phase difference between the audio signals obtained from those two microphones. That is, the audio signal evaluation unit 5 calculates the phase difference for each of the three combinations of microphones. In principle, it is sufficient for the audio signal evaluation unit 5 to calculate the phase difference for at least two combinations of microphones.

Here, the angle direction specified from a phase difference calculated by the audio signal evaluation unit 5 may lie on either the left side or the right side of the straight line connecting the two microphones. For this reason, an angle direction specified from a single phase difference cannot establish on which side of the straight line connecting the two microphones the sound source lies.

Therefore, the sound source direction specifying unit 6 specifies, as the sound source direction, the direction in which the angle directions specified from at least two phase differences calculated by the audio signal evaluation unit 5 coincide. That is, among the angle directions specified from one phase difference, the sound source direction specifying unit 6 specifies as the sound source direction the one that matches one of the angle directions specified from another phase difference. The sound source direction specifying unit 6 outputs the specified sound source direction to the outside as the detection result.
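The matching of angle directions between two microphone pairs can be sketched as follows. The sketch assumes a distant sound source, so that both pairs can be treated as measuring from the same point, and describes each pair by the bearing of its connecting line (`axis`) and the angle obtained from the phase difference (`angle`). These names, and the choice of the closest pair of candidates as the "matching" direction, are illustrative assumptions.

```python
from typing import Tuple


def candidates(axis_deg: float, angle_deg: float) -> Tuple[float, float]:
    """Two mirror-image bearings, one on each side of the microphone line."""
    return ((axis_deg + angle_deg) % 360.0, (axis_deg - angle_deg) % 360.0)


def angular_gap(a: float, b: float) -> float:
    """Smallest absolute difference between two bearings, in degrees."""
    diff = abs(a - b) % 360.0
    return min(diff, 360.0 - diff)


def resolve_direction(axis1: float, angle1: float,
                      axis2: float, angle2: float) -> float:
    """Pick the bearing from pair 1 that best agrees with a bearing from pair 2."""
    scored = [(angular_gap(c1, c2), c1)
              for c1 in candidates(axis1, angle1)
              for c2 in candidates(axis2, angle2)]
    return min(scored)[1]


# Hypothetical case: the true bearing is 50 degrees.
# Pair 1 lies along 0 degrees and measures 50; pair 2 lies along 60 degrees
# and measures 10. Candidates are {50, 310} and {70, 50}; only 50 is shared.
bearing = resolve_direction(axis1=0.0, angle1=50.0, axis2=60.0, angle2=10.0)
```

Because measured angles are quantized, an exact match is unlikely in practice, which is why the sketch picks the closest pair of candidates rather than testing for equality.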

Thereby, the sound source direction detection device 11 can detect the sound source direction over the full 360 degrees on the plane including the three microphones. With the sound source direction detection device 1 according to the first embodiment, the place of use is restricted, for example to installation near a wall. With the sound source direction detection device 11 according to the present embodiment, however, the place of use is not restricted, and the sound source direction detection device 11 can be installed wherever desired.

[Embodiment 3]
The third embodiment of the present invention will be described as follows. For convenience of explanation, members having the same functions as those described in the second embodiment are denoted by the same reference numerals and description thereof is omitted.

In the present embodiment, two sound source direction detection devices 11 according to the second embodiment are used, installed at locations separated from each other. By applying triangulation to the sound source directions detected by the two sound source direction detection devices 11, the distance from each sound source direction detection device 11 to the sound source can be calculated, and the position of the sound source can be specified.
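The triangulation step can be sketched in two dimensions as the intersection of two bearing lines. The coordinate system, the convention that bearings are measured counterclockwise from the x axis, and the function name are assumptions made for illustration; the sketch also does not check that the intersection lies in front of both devices.

```python
import math
from typing import Optional, Tuple

Point = Tuple[float, float]


def locate_source(p1: Point, bearing1_deg: float,
                  p2: Point, bearing2_deg: float) -> Optional[Point]:
    """Intersect two bearing lines; return None if they are nearly parallel."""
    d1 = (math.cos(math.radians(bearing1_deg)), math.sin(math.radians(bearing1_deg)))
    d2 = (math.cos(math.radians(bearing2_deg)), math.sin(math.radians(bearing2_deg)))
    denom = d1[0] * d2[1] - d1[1] * d2[0]       # 2-D cross product d1 x d2
    if abs(denom) < 1e-12:
        return None
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom       # distance along the first line
    return (p1[0] + t * d1[0], p1[1] + t * d1[1])


# Hypothetical layout: devices 2 m apart, bearings of 45 and 135 degrees.
device1, device2 = (0.0, 0.0), (2.0, 0.0)
source = locate_source(device1, 45.0, device2, 135.0)
distance1 = math.hypot(source[0] - device1[0], source[1] - device1[1])
```

When the two bearings are nearly parallel the position is poorly determined, so the devices need to be separated enough for the bearings to differ noticeably.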

Thus, in the present embodiment, the position of the sound source can be detected by using the two sound source direction detection devices 11. That is, in this embodiment, the two sound source direction detection devices 11 can be used as the sound source position detection device.

[Embodiment 4]
Embodiment 4 of the present invention will be described below with reference to FIG. For convenience of explanation, members having the same functions as those described in the second embodiment are denoted by the same reference numerals and description thereof is omitted. FIG. 14 is a block diagram showing the sound source direction detection device 21 according to this embodiment.

As shown in FIG. 14, the sound source direction detection device 21 includes a first microphone 2a, a second microphone 2b, a third microphone 2c, a fourth microphone 2d, a first microphone signal input unit 3a, a second microphone signal input unit 3b, a third microphone signal input unit 3c, a fourth microphone signal input unit 3d, a first microphone signal storage unit 4a, a second microphone signal storage unit 4b, a third microphone signal storage unit 4c, a fourth microphone signal storage unit 4d, an audio signal evaluation unit 5, a sound source direction specifying unit 6, and an angle table 7. The first microphone 2a, the second microphone 2b, the third microphone 2c, the fourth microphone 2d, the first microphone signal input unit 3a, the second microphone signal input unit 3b, the third microphone signal input unit 3c, the fourth microphone signal input unit 3d, the first microphone signal storage unit 4a, the second microphone signal storage unit 4b, the third microphone signal storage unit 4c, the fourth microphone signal storage unit 4d, and the audio signal evaluation unit 5 function as the phase difference calculation device 30 according to the present invention.

The sound source direction detection device 21 differs from the sound source direction detection device 11 in that it includes a fourth microphone 2d, a fourth microphone signal input unit 3d, and a fourth microphone signal storage unit 4d. The fourth microphone 2d is a microphone that converts external sound into an audio signal. The first microphone 2a, the second microphone 2b, the third microphone 2c, and the fourth microphone 2d are arranged at mutually different positions such that they do not all lie on the same plane.

The fourth microphone signal input unit 3d creates audio data by digitizing the audio signal converted by the fourth microphone 2d. The fourth microphone signal storage unit 4d stores the audio data created by the fourth microphone signal input unit 3d, and always holds audio data for an arbitrary fixed period of time.

A method in which the sound source direction identification unit 6 identifies the direction of the sound source based on the phase difference calculated by the audio signal evaluation unit 5 will be described with reference to FIG. FIG. 15 is a diagram for explaining a method of specifying a sound source direction using four microphones.

As shown in FIG. 15, assume a case in which the fourth microphone 2d is arranged on a straight line that is perpendicular to the plane including the first microphone 2a, the second microphone 2b, and the third microphone 2c and that passes through the third microphone 2c. In this case, the first microphone 2a, the second microphone 2b, and the third microphone 2c are used to specify, by the method described in the second embodiment, the sound source direction on the plane including the first microphone 2a, the second microphone 2b, and the third microphone 2c. Thereby, the angle θ1 of the sound source direction is obtained. The angle θ1 is the angle, within that plane, formed with the straight line passing through the first microphone 2a and the second microphone 2b.

Subsequently, the third microphone 2c and the fourth microphone 2d are used to specify, by the method described in the first embodiment, the sound source direction with respect to the straight line passing through the third microphone 2c and the fourth microphone 2d. Thereby, the angle θ2 of the sound source direction is obtained. The angle θ2 is the angle formed with that straight line. That is, the sound source direction lies on a conical surface having that straight line as its axis and an apex angle of 2 × θ2.

The sound source direction can be specified from the three-dimensional vector, in a polar coordinate system, obtained from the two angles θ1 and θ2 found as described above. Specifically, the sound source direction is the direction whose angle with the straight line passing through the third microphone 2c and the fourth microphone 2d is the angle θ2 and for which, within the plane including the first microphone 2a, the second microphone 2b, and the third microphone 2c, the angle with a line parallel to the straight line passing through the first microphone 2a and the second microphone 2b is the angle θ1 (= angle θ1′).
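As a minimal sketch of this conversion, the two angles can be turned into a unit direction vector using the usual spherical coordinate formulas. The coordinate system is an assumption made for illustration: the origin is at the third microphone 2c, the z axis runs along the straight line through the third microphone 2c and the fourth microphone 2d, and the x axis is parallel to the straight line through the first microphone 2a and the second microphone 2b.

```python
import math
from typing import Tuple


def direction_vector(theta1_deg: float,
                     theta2_deg: float) -> Tuple[float, float, float]:
    """Unit vector for in-plane angle theta1 and angle theta2 from the z axis."""
    t1, t2 = math.radians(theta1_deg), math.radians(theta2_deg)
    return (math.sin(t2) * math.cos(t1),   # component along the 2a-2b direction
            math.sin(t2) * math.sin(t1),   # in-plane component at right angles
            math.cos(t2))                  # component along the 2c-2d line


# Hypothetical reading: 30 degrees in the plane, 60 degrees from the 2c-2d line.
vx, vy, vz = direction_vector(theta1_deg=30.0, theta2_deg=60.0)
norm = math.sqrt(vx * vx + vy * vy + vz * vz)
```

The z component depends only on θ2, which reflects the fact that the third and fourth microphones alone fix the elevation of the source relative to the plane.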

The sound source direction specifying unit 6 outputs the sound source direction specified as described above to the outside as the detection result. As described above, the sound source direction detection device 21 can detect the direction of the sound source position in three-dimensional space rather than only the sound source direction on a plane. As a result, when the sound source is a human voice, the position of the speaker's mouth can be detected including the height direction, so the speaker's height can be estimated to some extent, which also makes it possible to distinguish between adults and children.

[Summary]
The phase difference calculation devices 10, 20, and 30 according to aspect 1 of the present invention include: a plurality of microphones that are arranged at different positions, convert external sound into audio signals, and output the audio signals; a phase shift audio signal creation unit 51 that creates, for each phase shift amount, a phase-shifted audio signal in which the relative phase shift amount between the audio signal output from one of the plurality of microphones and the audio signal output from another microphone is changed stepwise; an integrated value calculation unit (deviation value calculation unit 52) that calculates an integrated value of the differences between the output values, within a certain range, of the two audio signals in the phase-shifted audio signal corresponding to each phase shift amount; and a phase difference calculation unit 55 that calculates, based on the integrated value, the phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone.

According to the above configuration, the phase difference between the audio signal of one microphone and the audio signal of another microphone can be calculated based on the integrated value of the differences between the output values, within a certain range, of the two audio signals in the phase-shifted audio signals in which the relative phase shift amount between the two is changed stepwise. For example, the frequency components of a human voice fluctuate greatly with the pronunciation of words. In addition, the audio signal may become intermittent because of other noise mixed in from the surroundings. Even for such a sound source, calculating the integrated value of the differences between the output values of the two audio signals within a certain range for each phase shift amount, and evaluating the audio signal based on that integrated value, makes it possible to extract the portion of the audio signal from which the phase difference can be calculated accurately. As described above, the phase difference calculation device according to one aspect of the present invention can calculate the phase difference between audio signals output from different microphones more accurately.

In the phase difference calculation devices 10, 20, and 30 according to aspect 2 of the present invention, in aspect 1, the phase difference calculation unit 55 calculates, as the phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone, the phase shift amount at which the integrated value becomes the minimum in the change over time of the integrated value when the phase shift amount is changed stepwise.

The above integrated value is minimized when the two audio signals substantially coincide. Therefore, the phase shift amount at which the integrated value becomes the minimum is the time shift (phase difference) between the two audio signals. In the above configuration, therefore, the phase shift amount at which the integrated value is smallest, in the change over time of the integrated value when the audio signal output from the other microphone is phase-shifted relative to the audio signal output from the one microphone, is calculated as the phase difference between the audio signals output by the two microphones.

The phase difference calculation devices 10, 20, and 30 according to aspect 3 of the present invention, in aspect 1 or 2, further include a determination unit 54 that determines, based on the change over time of the integrated value when the phase shift amount is changed stepwise, whether or not the audio signal output from each microphone is appropriate for calculating the phase difference, and when the determination unit 54 determines that the audio signal output from each microphone is appropriate for calculating the phase difference, the phase difference calculation unit 55 calculates the phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone based on the change over time of the integrated value.

According to the above configuration, the determination unit 54 identifies audio signals that are inappropriate for calculating the phase difference, and the phase difference calculation unit 55 does not use those audio signals for calculating the phase difference, so the phase difference can be calculated using only appropriate audio signals. Therefore, the phase difference calculation devices 10, 20, and 30 according to one aspect of the present invention can calculate the phase difference more accurately.

In the phase difference calculation devices 10, 20, and 30 according to aspect 4 of the present invention, in aspect 3, the determination unit 54 determines that the audio signal output from each microphone is appropriate for calculating the phase difference when there is one local minimum point in the change over time of the integrated value.

Two or more local minimum points appear in the change over time of the integrated value when the two audio signals have a dense waveform such that, as one audio signal is phase-shifted relative to the other, the two audio signals coincide at multiple shift amounts regardless of whether the shift is forward or backward. When the phase difference is calculated using such audio signals, it is difficult to calculate an accurate phase difference. According to the above configuration, the phase difference is calculated only when there is one local minimum point in the change over time of the integrated value, so the phase difference can be calculated more accurately.

In the phase difference calculation devices 10, 20, and 30 according to aspect 5 of the present invention, in aspect 4, the determination unit 54 determines that the audio signal output from each microphone is appropriate for calculating the phase difference when the minimum value of the integrated value in the change over time of the integrated value is less than a predetermined threshold value.

When the minimum value of the integrated value in the change over time of the integrated value is equal to or greater than a predetermined threshold value, the degree of coincidence between the two audio signals is low. When the phase difference is calculated using such audio signals, it is difficult to calculate an accurate phase difference. According to the above configuration, the phase difference is calculated only when the minimum value of the integrated value in the change over time of the integrated value is less than the predetermined threshold value, so the phase difference can be calculated more accurately.

The sound source direction detection devices 1, 11, and 21 according to aspect 6 of the present invention specify the sound source direction of the sound based on the phase difference calculated by the phase difference calculation devices 10, 20, and 30 according to any one of aspects 1 to 5.

According to the above configuration, the sound source direction can be detected using the accurate phase difference calculated by the phase difference calculation devices 10, 20, and 30 according to one aspect of the present invention, so more accurate sound source direction detection is possible.

The sound source direction detection devices 11 and 21 according to aspect 7 of the present invention, in aspect 6, include three or more of the microphones; the phase difference calculation devices 20 and 30 calculate the phase differences for the audio signals output by the three or more microphones, and the sound source direction is specified based on two or more phase differences calculated by the phase difference calculation devices 20 and 30.

According to the above configuration, by using three microphones, it is possible to detect the sound source direction from all directions of 360 degrees on a plane including the three microphones. Further, by using four microphones, it is possible to detect the direction of the sound source position in the three-dimensional space, not the direction of the sound source on the plane.

A phase difference calculation method according to aspect 8 of the present invention includes: a step of converting external sound into audio signals and outputting the audio signals using a plurality of microphones arranged at different positions; a step of creating, for each phase shift amount, a phase-shifted audio signal in which the relative phase shift amount between the audio signal output from one of the plurality of microphones and the audio signal output from another microphone is changed stepwise; a step of calculating an integrated value of the differences between the output values, within a certain range, of the two audio signals in the phase-shifted audio signal corresponding to each phase shift amount; and a step of calculating, based on the integrated value, the phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone.

According to the above method, the same effects as those of the phase difference calculation apparatus according to one aspect of the present invention can be obtained.

The present invention is not limited to the above-described embodiments, and various modifications are possible within the scope shown in the claims. Embodiments obtained by appropriately combining technical means disclosed in different embodiments are also included in the technical scope of the present invention. Furthermore, a new technical feature can be formed by combining the technical means disclosed in each embodiment.

The present invention can be suitably used as a phase difference calculating device of a sound source direction detecting device used for a robot that talks with a human, a microphone unit of a telephone conference system, or a security monitoring system.

DESCRIPTION OF SYMBOLS
1 Sound source direction detection device
2a First microphone
2b Second microphone
2c Third microphone
2d Fourth microphone
3a First microphone signal input unit
3b Second microphone signal input unit
3c Third microphone signal input unit
3d Fourth microphone signal input unit
4a First microphone signal storage unit
4b Second microphone signal storage unit
4c Third microphone signal storage unit
4d Fourth microphone signal storage unit
5 Audio signal evaluation unit (phase shift unit, integrated value calculation unit, phase difference calculation unit, determination unit)
6 Sound source direction specifying unit (specifying unit)
7 Angle table
11 Sound source direction detection device
21 Sound source direction detection device
10 Phase difference calculation device
20 Phase difference calculation device
30 Phase difference calculation device

Claims

A phase difference calculation device comprising:
a plurality of microphones that are arranged at different positions, convert external sound into audio signals, and output the audio signals;
a phase shift audio signal creation unit that creates, for each phase shift amount, a phase-shifted audio signal in which the relative phase shift amount between the audio signal output from one of the plurality of microphones and the audio signal output from another microphone is changed stepwise;
an integrated value calculation unit that calculates an integrated value of the differences between the output values, within a certain range, of the two audio signals in the phase-shifted audio signal corresponding to each phase shift amount; and
a phase difference calculation unit that calculates, based on the integrated value, a phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone.
The phase difference calculation device according to claim 1, wherein the phase difference calculation unit calculates, as the phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone, the phase shift amount at which the integrated value is minimized in the change over time of the integrated value when the phase shift amount is changed stepwise.
The phase difference calculation device according to claim 1, further comprising a determination unit that determines, based on the change over time of the integrated value when the phase shift amount is changed stepwise, whether or not the audio signal output from each microphone is appropriate for calculating the phase difference,
wherein, when the determination unit determines that the audio signal output from each microphone is appropriate for calculating the phase difference, the phase difference calculation unit calculates the phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone based on the change over time of the integrated value.
The phase difference calculation device according to claim 3, wherein the determination unit determines that the audio signal output from each microphone is appropriate for calculating the phase difference when there is one local minimum point in the change over time of the integrated value.
The phase difference calculation device according to claim 3, wherein the determination unit determines that the audio signal output from each microphone is appropriate for calculating the phase difference when the minimum value of the integrated value in the change over time of the integrated value is less than a predetermined threshold.
A sound source direction detection device comprising:
the phase difference calculation device according to any one of claims 1 to 5; and
a sound source direction specifying unit that specifies a sound source direction of the sound based on the phase difference calculated by the phase difference calculation device.
The sound source direction detection device according to claim 6, wherein the phase difference calculation device includes three or more of the microphones and calculates the phase differences for the audio signals output by the three or more microphones, and
the sound source direction specifying unit specifies the sound source direction based on two or more of the phase differences calculated by the phase difference calculation device.
A phase difference calculation method comprising:
a step of converting external sound into audio signals and outputting the audio signals using a plurality of microphones arranged at different positions;
a step of creating, for each phase shift amount, a phase-shifted audio signal in which the relative phase shift amount between the audio signal output from one of the plurality of microphones and the audio signal output from another microphone is changed stepwise;
a step of calculating an integrated value of the differences between the output values, within a certain range, of the two audio signals in the phase-shifted audio signal corresponding to each phase shift amount; and
a step of calculating, based on the integrated value, a phase difference between the audio signal output from the one microphone and the audio signal output from the other microphone.