KR101678305B1 - 3D Hybrid Microphone Array System for Telepresence and Operating Method thereof - Google Patents

3D Hybrid Microphone Array System for Telepresence and Operating Method thereof Download PDF

Info

Publication number
KR101678305B1
Authority
KR
South Korea
Prior art keywords
sound source
signal
sound
directionality
housing
Prior art date
Application number
KR1020150095517A
Other languages
Korean (ko)
Inventor
전진용
무하마드임란
차용원
Original Assignee
한양대학교 산학협력단
재단법인 실감교류인체감응솔루션연구단
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 한양대학교 산학협력단 and 재단법인 실감교류인체감응솔루션연구단
Priority to KR1020150095517A
Application granted
Publication of KR101678305B1

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00Details of transducers, loudspeakers or microphones
    • H04R1/20Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H04R1/406Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers microphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • H04R3/005Circuits for transducers, loudspeakers or microphones for combining the signals of two or more microphones
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00Circuits for transducers, loudspeakers or microphones
    • H04R3/04Circuits for transducers, loudspeakers or microphones for correcting frequency response
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04RLOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R5/00Stereophonic arrangements
    • H04R5/027Spatial or constructional arrangements of microphones, e.g. in dummy heads

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Otolaryngology (AREA)
  • General Health & Medical Sciences (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Stereophonic System (AREA)

Abstract

Proposed are a hybrid 3D microphone array system for telepresence and an operating method thereof. The system includes: a housing; a microphone sensor provided on the surface of the housing to collect voice signals, with two channels on each of the x, y, and z axes so that the six channels are arrayed in mutually orthogonal directions; and a chipset provided inside the housing that converts the voice signals collected by the microphone sensor into digital signals, processes the data in real time, and transfers the output signal to a speaker. The chipset determines the directionality of the sound source from the voice signals collected through the channels of the microphone sensor, and the direction data is integrated into the voice data before being transferred to the speaker or a headphone.

Description

TECHNICAL FIELD [0001] The present invention relates to a hybrid 3D microphone array system for telepresence and an operating method thereof.

The following embodiments relate to a hybrid 3D microphone array system for telepresence and a method of operating the same, and more particularly to a system that is built into the microphone array itself and is capable of real-time data processing.

Research on indoor environments and coexistence spaces has increased interest in teleconference applications and telepresence applications using various microphone array systems.

Telepresence is a virtual videoconferencing system that allows participants to feel as though they are actually in the same room; it combines Internet technology with virtual reality (digital display) technology.

However, most current microphone array systems use a single-channel microphone, and noise and room conditions degrade and distort the quality of the captured voice signal.

Korean Patent Laid-Open No. 10-2009-0037692 relates to a method and apparatus for extracting, from a mixed sound containing various sound sources, only the target sound source signal desired by the user.

The embodiments describe a hybrid 3D microphone array system for telepresence and an operating method thereof; more specifically, a system that is built into the microphone array and capable of real-time data processing.

The embodiments aim to provide a hybrid 3D microphone array system and operating method for telepresence that accurately determines the directionality of a sound source in real time and integrates the direction data into the voice data before output, thereby improving the voice received and output through the microphone array, accurately estimating the position of the sound source, and enabling stereophonic rendering.

A hybrid 3D microphone array for telepresence in accordance with one embodiment includes: a housing; a microphone sensor formed on the surface of the housing for collecting voice signals, with two channels on each of the x, y, and z axes so that six channels are arranged in orthogonal directions; and a chipset formed inside the housing that converts the voice signals collected by the microphone sensor into digital signals, processes the data in real time, and transmits an output signal to a speaker. The directionality of the sound source with respect to the voice signals collected through the channels of the microphone sensor is determined in the chipset, and the direction data is integrated into the voice data to be transmitted to the speaker or a headphone.

Here, the housing may have a spherical shape, and the six cylindrical channels of the microphone sensor may protrude from the surface of the housing in directions orthogonal to each other.

The chipset may include: an adaptive noise canceling unit for removing noise from the collected voice signal using an adaptive filter; a Direction of Arrival (DOA) unit for determining the directionality of a sound source by calculating the arrival time difference of the sound source between voice signals collected through different channels; a binaural synthesis unit for determining the directionality of the sound source by reflecting the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears; and a signal transmission unit for integrating the result of the DOA unit and the result of the binaural synthesis unit to determine the directionality of the sound source, and for transmitting the final voice signal, whose directionality has been determined, to a speaker or a headset.

A hybrid 3D microphone array system for telepresence according to another embodiment includes: a housing; a microphone sensor whose six channels are arranged on the surface of the housing in mutually orthogonal directions; a voice signal collecting unit for collecting voice signals through the channels of the microphone sensor; a Direction of Arrival (DOA) unit for determining the directionality of a sound source by calculating the arrival time difference of the sound source between voice signals collected through different channels; a binaural synthesis unit for determining the directionality of the sound source by reflecting the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears; and a signal transmission unit for integrating the results of the DOA unit and the binaural synthesis unit to determine the directionality of the sound source, and for transmitting the final voice signal to a speaker or a headset. The voice signal collecting unit, the DOA unit, the binaural synthesis unit, and the signal transmission unit are formed inside the housing and process the data in real time to determine the directionality of the sound source, so that the direction data is integrated into the voice data and transmitted to the speaker or headphone.

The apparatus may further include an adaptive noise canceling unit formed inside the housing for removing noise from the voice signal collected by the voice signal collecting unit, using an adaptive filter.

The adaptive noise canceling unit may calculate the weights and coefficients of the filter using a least mean square (LMS) algorithm, and remove the noise by minimizing the mean square of the error between the output signal and the desired signal.

The Direction of Arrival (DOA) unit may determine the arrival time difference of the sound source between voice signals collected through different channels using cross-correlation, and derive the position of the sound source as an angle value.

The system may further include a crosstalk canceling unit, inside the housing, for separating and correcting the channels between the two ears; when outputting through the speaker, the output may be produced after correction by the crosstalk canceling unit.

According to another embodiment of the present invention, there is provided a method of operating a hybrid 3D microphone array for telepresence, comprising: collecting voice signals using a microphone sensor whose six channels are arranged orthogonally along the x, y, and z axes on the surface of a housing; determining the directionality of a sound source by calculating the arrival time difference of the sound source between voice signals collected through the channels of the microphone sensor, in a Direction of Arrival (DOA) unit formed inside the housing; determining the directionality of the sound source by reflecting the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears in the collected voice signals; and integrating the results of at least one of the arrival time difference, the ITD, and the ILD to determine the directionality of the sound source by processing the data in real time inside the housing, and transmitting the final voice signal, whose directionality has been determined, to a speaker or a headphone.

The method may further include removing noise from the collected voice signal using an adaptive filter inside the housing.

The step of removing the noise may calculate the weights and coefficients of the filter using a least mean square (LMS) algorithm, and remove the noise by minimizing the mean square of the error between the output signal and the desired signal.

The step of determining the directionality of the sound source by calculating its arrival time difference may obtain, via cross-correlation, the arrival time differences of the sound source among the voice signals collected through the channels of the microphone sensor, and derive the position of the sound source as an angle value.

The step of transmitting the final voice signal, whose directionality has been determined, to the speaker or the headphone may include separating and correcting the channels between the two ears through crosstalk canceling when outputting through the speaker.

According to the embodiments, a hybrid 3D microphone array system for telepresence that is built into the microphone array and capable of real-time data processing, together with a related operating method, can be provided.

According to the embodiments, the directionality of a sound source is determined in real time and the direction data is integrated into the voice data before output; this improves the voice received and output through the microphone array, allows the position of the sound source to be estimated accurately, and enables stereophonic rendering. A hybrid 3D microphone array system for telepresence and an operating method thereof providing these benefits can thus be realized.

1 is a diagram illustrating a hybrid 3D microphone array system for telepresence in accordance with one embodiment.
2 is a diagram illustrating a microphone array noise removal algorithm according to an embodiment.
3 is a diagram illustrating a sound source tracking algorithm according to an exemplary embodiment of the present invention.
4 is a diagram for explaining a sound source tracking algorithm and DOA according to an embodiment.
5 is a flow diagram illustrating a hybrid 3D microphone array operating method for telepresence in accordance with one embodiment.
6 is a perspective view showing a hybrid type 3D microphone array for telepresence according to another embodiment.
7 is an enlarged view of Fig. 6A.
8 is a cross-sectional view illustrating a hybrid 3D microphone array for telepresence according to another embodiment.
9 is a diagram showing a configuration used in a preamplifier of a microphone array according to another embodiment.

Hereinafter, embodiments will be described with reference to the accompanying drawings. However, the embodiments described may be modified in various other forms, and the scope of the present invention is not limited by the embodiments described below. In addition, various embodiments are provided to more fully describe the present invention to those skilled in the art. The shape and size of elements in the drawings may be exaggerated for clarity.

A hybrid 3D microphone array system for telepresence according to one embodiment includes: a housing; a microphone sensor whose six channels are arranged on the surface of the housing in mutually orthogonal directions; a voice signal collecting unit for collecting voice signals through the channels of the microphone sensor; a Direction of Arrival (DOA) unit for determining the directionality of a sound source by calculating the arrival time difference of the sound source between voice signals collected through different channels; a binaural synthesis unit for determining the directionality of the sound source by reflecting the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears; and a signal transmission unit for integrating the results of the DOA unit and the binaural synthesis unit to determine the directionality of the sound source, and for transmitting the final voice signal to a speaker or a headset. The voice signal collecting unit, the DOA unit, the binaural synthesis unit, and the signal transmission unit are formed inside the housing and process the data in real time to determine the directionality of the sound source, so that the direction data is integrated into the voice data and transmitted to the speaker or headphone.

The apparatus may further include an adaptive noise canceling unit formed inside the housing for removing noise from the voice signal collected by the voice signal collecting unit, using an adaptive filter.

By implementing the hybrid 3D microphone array system for telepresence, it is possible to improve the voice received and output through the microphone array, accurately estimate the position of the sound source, and render the sound stereophonically.

The hybrid 3D microphone array system for telepresence is described in more detail below.

1 is a diagram illustrating a hybrid 3D microphone array system for telepresence in accordance with one embodiment.

Referring to FIG. 1, a hybrid 3D microphone array system for telepresence converts the voice signal collected through the microphone array 100 into a digital signal, processes the data, and then outputs sound through the speaker 200 or the headphone 210. The system may be formed inside the microphone array 100.

The microphone array 100 may be spherical and may contain a built-in chipset for internal signal processing. A cylindrical microphone sensor with six channels arranged orthogonally to each other may be formed on the outer surface of the microphone array 100; however, the shape is not limited thereto.

The microphone array 100 can pick up sound with uniform, omnidirectional sensitivity, accepting sounds arriving from any direction around the microphone.

The voice signal collecting unit 11 can collect external voice signals by capturing data with the microphone sensor of the microphone array 100.

The adaptive noise canceling unit 12 can remove noise from the collected voice signal, and can be omitted if desired. It may use a Least Mean Square (LMS) adaptive filter: the weights and coefficients of the filter are calculated so as to minimize the mean square of the error between the output signal and the desired signal. The LMS filter can thereby remove noise from the voice signal (recording).

The Direction of Arrival (DOA) unit 13 can determine the directionality of the sound source by calculating the arrival time difference of the sound source between voice signals collected through different channels of the voice signal collecting unit 11. For example, after noise removal, the time difference between a voice signal received over two different path lengths reveals the direction of the sound source.

The DOA unit 13 obtains, via cross-correlation across the multiple microphones, the arrival time difference of the sound source between microphones when the sound reaches the array, and derives the position of the sound source as an angle value based on that cross-correlation.

Also, the positional accuracy of the sound source derived by the DOA unit can be improved by using the adaptive beamforming technique MVDR.

That is, the DOA unit 13 can track the position of a sound source by applying adaptive beamforming to the noise-canceled voice signal.

Meanwhile, the voice signal collected by the voice signal collecting unit 11, or the noise-canceled signal from the adaptive noise canceling unit 12, is passed through a spatial convolution to the binaural synthesis unit 16, where the directionality of the voice can be determined using the differences between the sounds at the two ears.

In order to generate a binaural model, which is a spatial algorithm, in the binaural synthesis unit 16, auditory processing is required.

For localizing the sound source, the head-related transfer function (HRTF) 17 can be considered. The HRTF describes how the head causes the sound arriving at the two ears from a specific source to differ. That is, the directionality of the voice can be determined by considering the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears.

Accordingly, by integrating the result obtained from the DOA 13 and the result obtained from the binaural synthesizing unit 16, the position of the sound source can be accurately determined.

The signal transmission unit is formed inside the housing and processes the data in real time to determine the directionality of the sound source, so that the direction data can be integrated into the sound data and transmitted to the speaker 200 or the headphone 210.

Thereafter, the final voice signal, with the direction data integrated, can be output through the speaker 200 or the headphone 210.

Here, when the speaker 200 is used, since the sound of the speaker is transmitted to both ears of the user, it is necessary to separate and correct the channel between the ears.

Accordingly, the housing further contains a crosstalk canceling unit 18, which can separate and correct the channels between the ears using a crosstalk cancellation (CTC) filter.

Therefore, a hybrid 3D microphone array system for telepresence that is built into the microphone array and can process data in real time can be provided. According to embodiments, the system is implemented in hardware and can be used in a videoconferencing system. In addition, the microphone array has a built-in chipset (DSP) that processes data in real time: it removes noise from the collected voice signal, tracks the location of the sound source, and can render the voice signal based on binaural integration and the like.

Below, the configurations for removing noise in the adaptive noise canceling unit 12, determining the directionality of the sound source by calculating its arrival time difference in the DOA unit 13, and determining the directionality of the voice from interaural differences in the binaural synthesis unit 16 are described in more detail through their respective algorithms.

2 is a diagram illustrating a microphone array noise removal algorithm according to an embodiment.

Referring to FIG. 2, in the microphone array noise cancellation algorithm, the measured signal d(n) may contain two components: the desired signal v(n) and an interference signal u(n).

The algorithm removes the interference from the measured signal using a reference signal x(n) that is correlated with the interference signal u(n).

This noise cancellation algorithm can use a Least Mean Square (LMS) adaptive filter: the weights and coefficients of the filter are calculated so as to minimize the mean square of the error between the output signal and the desired signal. The LMS filter can thereby remove noise from the recording.

For example, with two input signals, one may be a distorted signal and the other the desired signal: the distorted signal contains the recording plus filtered noise, while the reference contains the unfiltered noise. By driving the difference between the filter output and the desired signal toward zero, a clean recording can be obtained.

When the algorithm starts, noise can be heard along with the voice at first, but over time the adaptive filter removes the noise so that only the desired voice is heard.
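The LMS update described above can be sketched as follows. This is a minimal illustration in Python, not the patent's implementation; the step size, filter order, signal lengths, and the simulated FIR leakage path are all illustrative assumptions.

```python
import numpy as np

def lms_denoise(d, x, mu=0.002, taps=8):
    """Adaptive noise cancellation with an LMS filter.

    d: measured signal d(n) = desired v(n) + interference correlated with x
    x: reference signal x(n) correlated with the interference u(n)
    Returns the error signal e(n), which converges toward v(n).
    """
    w = np.zeros(taps)                      # adaptive filter weights
    e = np.zeros(len(d))
    for n in range(taps, len(d)):
        x_n = x[n - taps + 1:n + 1][::-1]   # most recent reference samples
        y = w @ x_n                         # filter output: interference estimate
        e[n] = d[n] - y                     # error = measured - estimated noise
        w += 2 * mu * e[n] * x_n            # LMS weight update
    return e

# Toy demo: a tone plus white noise leaking through an (assumed) FIR path.
rng = np.random.default_rng(0)
t = np.arange(4000)
v = np.sin(2 * np.pi * 0.05 * t)                    # desired signal v(n)
u = rng.standard_normal(t.size)                     # interference source
d = v + np.convolve(u, [0.6, 0.3, 0.1])[:t.size]    # measured signal d(n)
e = lms_denoise(d, u)
mse_before = np.mean((d[2000:] - v[2000:]) ** 2)
mse_after = np.mean((e[2000:] - v[2000:]) ** 2)     # noise largely removed
```

As the text notes, the residual is noisy at first and cleans up as the weights converge; here the second half of the record is markedly closer to v(n) than the raw measurement was.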

3 is a diagram illustrating a sound source tracking algorithm according to an exemplary embodiment of the present invention.

Referring to FIG. 3, the sound source tracking algorithm can track the position of a sound source using a DOA (Direction of Arrival) and an adaptive beamforming technique.

The Direction of Arrival (DOA) uses multiple microphones to obtain, via cross-correlation, the arrival time difference of the sound source between the microphones when the sound reaches them, and derives the position of the sound source as an angle value based on that cross-correlation.

In addition, the position accuracy of the sound source derived from the DOA can be improved by using an adaptive beamforming technique, MVDR (Minimum Variance Distortionless Response).
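The MVDR idea can be sketched as follows, assuming the common narrowband formulation with a uniform linear array; the microphone spacing, frequency, and look direction below are illustrative values, not taken from the patent.

```python
import numpy as np

def steering_vector(theta, n_mics, spacing, c, freq):
    """Narrowband steering vector for a uniform linear array."""
    k = 2 * np.pi * freq / c
    positions = np.arange(n_mics) * spacing
    return np.exp(-1j * k * positions * np.sin(theta))

def mvdr_weights(R, a):
    """MVDR (Capon) weights: minimize output power subject to a
    distortionless response (gain 1) in the look direction a."""
    Ri_a = np.linalg.solve(R, a)        # R^{-1} a
    return Ri_a / (a.conj() @ Ri_a)     # w = R^{-1} a / (a^H R^{-1} a)

# With spatially white noise (R = I) the distortionless constraint
# still holds exactly: the response in the look direction is 1.
n_mics = 6
a = steering_vector(np.deg2rad(30), n_mics, 0.05, 343.0, 1000.0)
w = mvdr_weights(np.eye(n_mics), a)
look_response = w.conj() @ a
```

With a measured covariance R from real microphone data, the same weights suppress interference and reverberation arriving from off-look directions, which is how MVDR sharpens the DOA estimate.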

4 is a diagram for explaining a sound source tracking algorithm and DOA according to an embodiment.

FIG. 4 shows a conceptual diagram of the sound source tracking algorithm and DOA, together with an example of how the results are presented.

The DOA uses multiple microphones to obtain the cross-correlation of the arrival time differences between microphones and to derive the position of the sound source as an angle value. The angle derivation and the cross-correlation can be expressed as the following formulas.

First, the angle value of the position of the sound source using the DOA can be derived as follows.

θ = sin⁻¹(c · τs / d)

Here, θ is the incident angle, τs is the arrival time difference of the sound, c is the speed of sound, and d is the distance between the microphones.
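The angle derivation can be evaluated directly; the speed of sound and microphone spacing below are illustrative values.

```python
import math

def doa_angle_deg(tau_s, d, c=343.0):
    """Incident angle from the inter-microphone delay: theta = asin(c*tau_s/d)."""
    return math.degrees(math.asin(c * tau_s / d))

# A 100 microsecond delay across a 5 cm microphone spacing:
theta = doa_angle_deg(100e-6, 0.05)   # roughly 43 degrees
```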

In addition, the cross-correlation of arrival time differences of the sound sources between the microphones can be expressed by the following functional expression.

φij(τ) = Σt xi(t) · xj(t + τ)

Here, φij denotes the cross-correlation function, τ the time lag, and xi and xj the signals received at time t by microphones i and j.
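Picking the lag that maximizes this cross-correlation gives the arrival time difference. A minimal sketch (the sampling rate and the 5-sample delay in the toy check are assumptions for illustration):

```python
import numpy as np

def tdoa_seconds(xi, xj, fs):
    """Arrival-time difference between two microphone signals, taken as
    the lag that maximizes their cross-correlation.
    A positive result means xi arrives later than xj."""
    cc = np.correlate(xi, xj, mode="full")
    lag = int(np.argmax(cc)) - (len(xj) - 1)   # lag in samples
    return lag / fs

# Toy check: xi is xj delayed by exactly 5 samples.
fs = 16000
rng = np.random.default_rng(1)
s = rng.standard_normal(1024)
xj = np.concatenate([s, np.zeros(5)])
xi = np.concatenate([np.zeros(5), s])
tau = tdoa_seconds(xi, xj, fs)   # 5 / 16000 seconds
```

The resulting τs can then be converted to an incident angle with the formula above.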

The binaural modeling algorithm is described below.

Binaural modeling algorithms reflect the dynamic sound environment and deliver the sound to the ears, so that when the sound of an actual situation is transmitted to another space, the user perceives the same voice direction.

Binaural modeling applies in two situations: when headphones are worn on the user's ears, and when speakers are installed away from the user.

To generate a binaural model, auditory processing is required, and the head-related transfer function (HRTF) can be considered for localizing sound. The HRTF describes how the head causes the sound arriving at the two ears from a specific source to differ.

On the other hand, humans track the location of a sound source by comparing the cues between the two ears, known as difference cues or binaural cues: the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD).

In order to generate a binaural model, auditory processing is required, and the voice direction is determined by considering the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears.
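Applying an ITD and ILD to a mono source can be sketched very simply as below. The Woodworth ITD approximation and the 6 dB maximum level difference are illustrative assumptions, not values from the patent; a real system would convolve with measured HRTFs instead.

```python
import numpy as np

def binaural_render(mono, azimuth_deg, fs, head_radius=0.0875, c=343.0):
    """Pan a mono signal with an interaural time difference (ITD) and a
    broadband interaural level difference (ILD). Positive azimuth is to
    the right, so the right ear is the near (earlier, louder) ear."""
    az = np.deg2rad(azimuth_deg)
    itd = (head_radius / c) * (abs(az) + np.sin(abs(az)))  # Woodworth ITD model
    delay = int(round(itd * fs))                           # ITD in whole samples
    ild_gain = 10 ** (-6.0 * abs(np.sin(az)) / 20)         # up to ~6 dB quieter
    near = mono
    far = np.concatenate([np.zeros(delay), mono[:len(mono) - delay]]) * ild_gain
    return (far, near) if azimuth_deg >= 0 else (near, far)  # (left, right)

fs = 16000
mono = np.sin(2 * np.pi * 440 * np.arange(fs // 10) / fs)
left, right = binaural_render(mono, 45.0, fs)   # source 45 degrees to the right
```

For a source on the right, the left-ear channel is both delayed and attenuated relative to the right-ear channel, which is exactly the ITD/ILD cue pair the text describes.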

In particular, when using a speaker, the sound of each speaker reaches both ears of the user, so it is necessary to separate and correct the channels between the ears. This correction process is called crosstalk cancellation (CTC), and a CTC filter can be used.

This can be expressed in the frequency domain by the following equation.

[Equation image not reproduced in this text.]
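One common way to realize a CTC filter in the frequency domain is a regularized inversion, per frequency bin, of the 2×2 matrix of speaker-to-ear transfer functions. The sketch below assumes that formulation; the patent's exact filter design is not reproduced here, and the leakage value in the toy check is illustrative.

```python
import numpy as np

def ctc_filters(C, beta=1e-6):
    """Per-bin crosstalk-cancellation filters.

    C: shape (n_bins, 2, 2); C[f, e, s] is the transfer function from
    speaker s to ear e at frequency bin f.
    Returns H such that C @ H is close to the identity, so each binaural
    channel reaches only its intended ear. beta regularizes the
    inversion at ill-conditioned bins.
    """
    H = np.empty_like(C)
    I = np.eye(2)
    for f in range(C.shape[0]):
        Ch = C[f].conj().T
        H[f] = np.linalg.solve(Ch @ C[f] + beta * I, Ch)  # (C^H C + beta I)^-1 C^H
    return H

# Toy check on one bin: ipsilateral gain 1, contralateral leakage 0.4.
C = np.array([[[1.0, 0.4],
               [0.4, 1.0]]], dtype=complex)
H = ctc_filters(C)
residual = np.max(np.abs(C[0] @ H[0] - np.eye(2)))   # near zero after cancellation
```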

5 is a flow diagram illustrating a hybrid 3D microphone array operating method for telepresence in accordance with one embodiment.

Referring to FIG. 5, a method of operating a hybrid type 3D microphone array for telepresence can be described. Here, a description overlapping with the hybrid type 3D microphone array system for telepresence will be omitted.

In step 510, the method of operating a hybrid 3D microphone array for telepresence collects voice signals using a microphone sensor whose six channels are arranged orthogonally along the x, y, and z axes on the surface of the housing.

In step 520, an adaptive filter inside the housing may be used to remove noise from the collected voice signal.

The noise is removed by calculating the weights and coefficients of the filter using a least mean square (LMS) algorithm so as to minimize the mean square of the error between the output signal and the desired signal. This noise removal step may be omitted.

In step 530, the directionality of the sound source can be determined by calculating the arrival time difference of the sound source between the voice signals collected through the channels of the microphone sensor, in the Direction of Arrival (DOA) unit configured inside the housing.

The directionality is determined by obtaining, via cross-correlation, the arrival time differences of the sound source among the voice signals collected through the channels of the microphone sensor, and the position of the sound source can be derived as an angle value.

In step 540, the directionality of the sound source can be determined by reflecting the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears in the collected voice signal.

In step 550, the directionality of the sound source is determined by integrating the results of at least one of the arrival time difference of the sound source, the ITD, and the ILD, with the data processed in real time inside the housing, so that the final voice signal, whose directionality has been determined, can be transmitted to the speaker or the headphone.

On the other hand, when the final voice signal is output through a speaker, the channels between the two ears can be separated and corrected through crosstalk cancellation.

According to the embodiments, the directionality of a sound source is determined in real time and the direction data is integrated into the voice data before output, so that the voice received and output through the microphone array can be improved and the position of the sound source estimated accurately. Furthermore, the sound can be rendered stereophonically in three-dimensional space and used in videoconferencing systems such as telepresence.

6 to 8 are views showing a hybrid type 3D microphone array for telepresence according to another embodiment.

FIG. 6 is a perspective view showing a hybrid type 3D microphone array for telepresence according to another embodiment, and FIG. 7 is an enlarged view of FIG. 6A. And FIG. 8 is a cross-sectional view illustrating a hybrid type 3D microphone array for telepresence according to another embodiment.

6-8, a hybrid 3D microphone array 100 for telepresence may include a housing 110, a microphone sensor 120, and a chipset (not shown).

The housing 110 represents the outer shape of the hybrid type 3D microphone array 100 for telepresence, and a chipset can be formed therein.

The housing 110 may have a spherical shape, with the six channels of the microphone sensor 120 protruding from its outer surface; however, the shape of the housing 110 is not limited thereto.

The microphone sensor 120 is formed on the surface of the housing 110 and collects voice signals; two channels are arranged on each of the x, y, and z axes, so that the six channels are arranged orthogonally to each other.

The microphone sensor may be an electret condenser microphone, in which a permanently charged (electret) material supplies the polarizing field by electrostatic induction, so that no external polarizing voltage is required. Such a microphone sensor may also include the characteristics of a preamplifier.

The chipset is formed inside the housing 110, converts the voice signal collected from the microphone sensor 120 into a digital signal, processes the data in real time, and transmits the output signal to the speaker. In other words, the chipset is a DSP that can continuously process the voice signals captured by the microphone sensor: it receives real-world signals such as voice and audio, removes noise, tracks the position of the sound source, and performs binaural rendering.

Accordingly, the output signal output through the speaker or the headphone can be a signal in which the sound signal and the direction signal are integrated.

FIG. 9 is a diagram showing a configuration used in a preamplifier of a microphone array according to another embodiment.

The chipset may include an adaptive noise canceling unit, a Direction of Arrival (DOA) unit, a binaural synthesis unit, and a signal transmission unit.

The adaptive noise canceling unit may remove noise from the collected voice signal using an adaptive filter.
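The adaptive filter is not detailed here, but the least mean square (LMS) filter named later in this document suggests a standard noise-cancelling structure. The sketch below is illustrative only; the function name, tap count, and step size `mu` are assumptions, not from the patent:

```python
import numpy as np

def lms_noise_cancel(noisy, reference, num_taps=8, mu=0.01):
    """Illustrative LMS adaptive noise canceller.

    `noisy` is the primary channel (speech plus noise) and `reference`
    is a correlated noise-only channel; the filter weights adapt to
    minimize the mean square error, and the error itself is the
    cleaned output.
    """
    w = np.zeros(num_taps)
    out = np.zeros(len(noisy))
    for n in range(num_taps, len(noisy)):
        x = reference[n - num_taps:n][::-1]   # newest reference samples first
        y = w @ x                             # current estimate of the noise
        e = noisy[n] - y                      # error = speech estimate
        w += 2 * mu * e * x                   # LMS weight update
        out[n] = e
    return out
```

This structure requires a noise reference that is correlated with the noise in the primary channel but not with the speech, as is typical for adaptive noise cancellation.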

The DOA unit can determine the directionality of a sound source by calculating the arrival time differences of the sound source between the voice signals collected through different channels.
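The cross-correlation approach described later, which derives the position of the sound source as an angle value, can be illustrated for a single microphone pair. This is a hedged sketch under a far-field plane-wave assumption, not the patent's implementation; the helper name and parameters are illustrative:

```python
import numpy as np

def tdoa_angle(sig_a, sig_b, fs, mic_distance, speed_of_sound=343.0):
    """Estimate the direction of arrival for one microphone pair via
    cross-correlation (hypothetical helper, far-field assumption).

    The lag maximizing the cross-correlation gives the arrival time
    difference; a positive lag means sig_a arrived later than sig_b.
    The time difference is then converted to an angle off broadside.
    """
    corr = np.correlate(sig_a, sig_b, mode="full")
    lag = np.argmax(corr) - (len(sig_b) - 1)       # lag in samples
    tdoa = lag / fs                                # lag in seconds
    # Clip so arcsin stays defined in the presence of noise
    s = np.clip(tdoa * speed_of_sound / mic_distance, -1.0, 1.0)
    return np.degrees(np.arcsin(s))
```

With three orthogonal pairs, as in the six-channel array described here, such pairwise estimates could in principle be combined into a full 3D direction.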

The binaural synthesis unit can determine the directionality of the sound source by reflecting the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears.
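As a rough illustration of how ITD and ILD cues can be applied, the toy panner below delays and attenuates the far-ear channel. The Woodworth ITD approximation and the assumed ~6 dB maximum ILD are modeling choices introduced here, not values from the patent:

```python
import numpy as np

def binaural_pan(mono, fs, azimuth_deg, head_radius=0.0875, speed_of_sound=343.0):
    """Toy binaural synthesis from ITD and ILD cues (illustrative model,
    not the patent's implementation).

    A source off to one side reaches the far ear later (ITD) and quieter
    (ILD); applying both cues to a mono signal gives a crude left/right pair.
    """
    az = np.radians(azimuth_deg)
    # Woodworth approximation of the interaural time difference
    itd = head_radius / speed_of_sound * (abs(az) + abs(np.sin(az)))
    delay = int(round(itd * fs))
    # Assumed frequency-independent ILD of up to ~6 dB at 90 degrees
    gain_far = 10 ** (-6.0 * abs(np.sin(az)) / 20)
    delayed = np.concatenate([np.zeros(delay), mono[:len(mono) - delay]]) * gain_far
    if azimuth_deg >= 0:                 # source on the right: left ear is far
        return delayed, np.copy(mono)    # (left, right)
    return np.copy(mono), delayed
```

A practical binaural synthesis unit would use measured head-related transfer functions rather than these two broadband cues alone.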

The signal transmission unit may determine the directionality of the sound source by integrating the result of the DOA and the result of the binaural synthesis unit, and may transmit the final sound signal whose directionality of the sound source is determined to the speaker or the headset.

Here, when the speaker is used, since the sound of the speaker is transmitted to both ears of the user, it is necessary to separate and correct the channel between the ears.

Accordingly, a crosstalk canceling unit is further included, in which the channel between both ears can be separated and corrected.
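A common way to realize such crosstalk cancellation is to invert the symmetric 2x2 matrix of speaker-to-ear transfer functions per frequency bin. The sketch below assumes known direct and cross paths (`h_direct`, `h_cross`) and is illustrative rather than the patented method:

```python
import numpy as np

def crosstalk_cancel(left, right, h_direct, h_cross, n_fft=1024):
    """Frequency-domain crosstalk canceller sketch (symmetric 2x2 model).

    Each ear receives the near speaker through h_direct and the far
    speaker through h_cross; inverting the 2x2 matrix per frequency bin
    yields speaker feeds that reproduce the binaural signals at the ears.
    """
    Hd = np.fft.rfft(h_direct, n_fft)
    Hc = np.fft.rfft(h_cross, n_fft)
    L = np.fft.rfft(left, n_fft)
    R = np.fft.rfft(right, n_fft)
    det = Hd * Hd - Hc * Hc                        # det of [[Hd, Hc], [Hc, Hd]]
    det = np.where(np.abs(det) < 1e-8, 1e-8, det)  # regularize near-singular bins
    Sl = (Hd * L - Hc * R) / det                   # speaker feeds from the inverse
    Sr = (Hd * R - Hc * L) / det
    return np.fft.irfft(Sl, n_fft), np.fft.irfft(Sr, n_fft)
```

The regularization of the determinant is needed because the matrix becomes ill-conditioned at frequencies where the direct and cross paths are nearly equal.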

Such a hybrid 3D microphone array for telepresence can have an embedded chipset (signal processing unit, DSP) for accurate analysis. The chipset includes a DOA unit for determining the directionality of a sound source by calculating the time difference of arrival of the sound source between a plurality of microphones, and a binaural synthesis unit for determining the directionality of the sound source from the interaural time and level differences, so that the motion of the user in the space can be accurately estimated based on cross-correlation estimation.

Here, the provided speech (binaural signal) can be synthesized from the six channels so as to give the listener a deliberate surround-sound space. Also, a wideband beamforming scheme may be implemented in the chipset (DSP), and spatial resolution, signal-to-noise ratio, and bandwidth increase may be considered to estimate the quality of the speech signal.
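The wideband beamforming mentioned above can be sketched, under a far-field assumption, as a delay-and-sum beamformer that aligns the channels with frequency-domain fractional delays before averaging. All names and geometry below are illustrative, not from the patent:

```python
import numpy as np

def delay_and_sum(channels, fs, mic_positions, direction, c=343.0):
    """Minimal wideband delay-and-sum beamformer (illustrative sketch).

    `direction` is the unit vector along which the incoming plane wave
    propagates; each channel is advanced by its geometric delay using a
    frequency-domain fractional shift, then the channels are averaged.
    """
    n = channels.shape[1]
    freqs = np.fft.rfftfreq(n, 1.0 / fs)       # bin frequencies in Hz
    out = np.zeros(n)
    for sig, pos in zip(channels, mic_positions):
        tau = np.dot(pos, direction) / c       # extra travel time to this mic
        spec = np.fft.rfft(sig) * np.exp(2j * np.pi * freqs * tau)
        out += np.fft.irfft(spec, n)           # time-advance the channel by tau
    return out / len(channels)
```

Because the delays work at any frequency bin, this structure is wideband; signals from the steered direction add coherently while off-axis sources are attenuated.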

According to embodiments, a hybrid type 3D microphone array system for telepresence can be implemented in hardware form and used in a video conferencing system such as a telepresence system. In addition, the microphone array has a built-in chipset (DSP) that processes data in real time, removes noise from the collected speech signal, tracks the location of the sound source, and renders the speech signal based on binauralization.

The apparatus described above may be implemented as a hardware component, a software component, and/or a combination of hardware and software components. For example, the apparatus and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. The processing device may execute an operating system (OS) and one or more software applications running on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to execution of the software. For ease of understanding, the processing device may be described as being used singly, but those skilled in the art will recognize that the processing device may include a plurality of processing elements and/or multiple types of processing elements. For example, the processing device may comprise a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of the foregoing, and may configure the processing device to operate as desired or may command the processing device independently or collectively. The software and/or data may be embodied, permanently or temporarily, in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, so as to be interpreted by the processing device or to provide instructions or data to the processing device. The software may be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented in the form of program instructions that can be executed through various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be those specially designed and configured for the embodiments, or may be those known and available to those skilled in the art of computer software. Examples of computer-readable media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include machine language code such as that produced by a compiler, as well as high-level language code that can be executed by a computer using an interpreter. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiments, and vice versa.

While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the invention is not limited to the disclosed embodiments. For example, appropriate results may be achieved even if the described techniques are performed in a different order than the described methods, and/or if components of the described systems, structures, devices, or circuits are combined or coupled in a different form, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents to the claims are also within the scope of the following claims.

Claims (13)

housing;
A microphone sensor formed on a surface of the housing for collecting voice signals and having two channels arranged on x, y and z axes and six channels arranged in orthogonal directions; And
A chipset for converting the voice signal collected from the microphone sensor into a digital signal to process the data in real time and transmitting the output signal to a speaker,
wherein the directionality of the sound source is determined in the chipset with respect to the voice signal collected through the plurality of channels of the microphone sensor, and the direction data is integrated into the voice data and transmitted to the speaker or headphone,
A cross-correlation is obtained between arrival times of sound sources of a plurality of the sound signals collected through the plurality of channels of the microphone sensor, and the position of the sound source is derived as an angle value
A hybrid 3D microphone array for telepresence.
The method according to claim 1,
The housing has a spherical shape, and the six cylindrical channels of the microphone sensor are disposed on the surface of the housing so as to protrude in directions perpendicular to each other
A hybrid 3D microphone array for telepresence.
3. The method according to claim 1 or 2,
The chipset
An adaptive noise eliminator for removing noise of the voice signal collected using the adaptive filter;
A Direction of Arrival (DOA) for determining the directionality of a sound source by calculating an arrival time difference of a sound source with respect to the sound signal collected through different channels;
A binaural synthesizer for determining the directionality of the sound source by reflecting the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears; And
A signal transmitting unit for receiving a result of the DOA and the result of the binaural combining unit to determine a directionality of the sound source and transmitting the final audio signal whose directionality of the sound source is determined to a speaker or a headset,
A hybrid 3D microphone array for telepresence.
housing;
A microphone sensor in which six channels are arranged on the surface of the housing in directions orthogonal to each other;
A voice signal collecting unit for collecting voice signals through the plurality of channels of the microphone sensor;
A Direction of Arrival (DOA) for determining the directionality of a sound source by calculating an arrival time difference of a sound source with respect to the sound signal collected through the different channels in the sound signal collecting unit;
Binaural, which determines the directionality of the sound source by reflecting the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) between the two ears, A synthesis section; And
A signal transmitting unit for receiving a result of the DOA and the result of the binaural combining unit to determine a directionality of the sound source and transmitting the final audio signal whose directionality of the sound source is determined to a speaker or a headset,
wherein
The Direction of Arrival (DOA) unit, the binaural synthesis unit, and the signal transfer unit are formed in the housing, and process data in real time so that the directionality of the sound source is determined and the final signal is transmitted to the speaker or headphone,
The DOA (Direction of Arrival)
A cross-correlation is used to obtain the arrival time difference of the sound source with respect to the voice signal collected through the different channels in the voice signal collecting unit, and the position of the sound source is derived as an angle value
Hybrid 3D microphone array system for telepresence.
5. The method of claim 4,
And an adaptive noise eliminator that is formed inside the housing and removes noise of the voice signal collected by the voice signal collection unit using an adaptive filter,
A hybrid 3D microphone array system for telepresence.
6. The method of claim 5,
The adaptive noise removing unit
Removing the noise by calculating the weight and coefficient of the filter using a least mean squared (LMS) filter and finally deriving the least mean square (LMS) of the error between the output signal and the desired signal
Hybrid 3D microphone array system for telepresence.
delete
5. The method of claim 4,
A crosstalk canceling unit for separating and correcting a channel between both ears in the housing,
Further comprising:
And outputting after correction through the crosstalk cancellation when outputting through the speaker
Hybrid 3D microphone array system for telepresence.
Collecting voice signals using a microphone sensor in which six channels are arranged on the surface of the housing in orthogonal directions on the x, y, and z axes;
Determining a directionality of a sound source by calculating a time difference of a sound source with respect to the sound signal collected through a plurality of channels of the microphone sensor in a DOA (Direction of Arrival) formed in the housing;
Determining the directionality of the sound source by reflecting the Interaural Time Difference (ITD) and the Interaural Level Difference (ILD) of the two ears in the collected voice signal; And
Determining the directionality of the sound source by integrating at least one of the arrival time difference of the sound source, the sound difference between the two ears (ITD) and the sound intensity difference (ILD), and performing data processing in real time within the housing Transmitting the final voice signal whose directionality of the sound source is determined to a speaker or a headphone
wherein
Wherein the step of calculating the arrival time difference of the sound source of the sound signal to determine the directionality of the sound source comprises:
A cross-correlation is obtained between arrival times of sound sources of a plurality of the sound signals collected through the plurality of channels of the microphone sensor, and the position of the sound source is derived as an angle value
An operating method of a hybrid 3D microphone array system for telepresence.
10. The method of claim 9,
Removing noise of the collected voice signal using an adaptive filter in the housing
Further comprising the above step,
An operating method of a hybrid 3D microphone array system for telepresence.
11. The method of claim 10,
The step of removing noise
The noise is removed by calculating the weight and coefficient of the filter using a least mean square (LMS) filter and finally deriving the least mean square of the error between the output signal and the desired signal
An operating method of a hybrid 3D microphone array system for telepresence.
delete
10. The method of claim 9,
Wherein the step of delivering the final voice signal, which is determined to have the directionality of the sound source, to a speaker or a headphone comprises:
Separating and correcting a channel between both ears through crosstalk cancellation when outputting through the speaker,
Further comprising the above step,
An operating method of a hybrid 3D microphone array system for telepresence.
KR1020150095517A 2015-07-03 2015-07-03 3D Hybrid Microphone Array System for Telepresence and Operating Method thereof KR101678305B1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
KR1020150095517A KR101678305B1 (en) 2015-07-03 2015-07-03 3D Hybrid Microphone Array System for Telepresence and Operating Method thereof

Publications (1)

Publication Number Publication Date
KR101678305B1 true KR101678305B1 (en) 2016-11-21

Family

ID=57537934

Family Applications (1)

Application Number Title Priority Date Filing Date
KR1020150095517A KR101678305B1 (en) 2015-07-03 2015-07-03 3D Hybrid Microphone Array System for Telepresence and Operating Method thereof

Country Status (1)

Country Link
KR (1) KR101678305B1 (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20100111071A (en) * 2009-04-06 2010-10-14 한국과학기술원 System for identifying the acoustic source position in real time and robot which reacts to or communicates with the acoustic source properly and has the system
KR20110101169A (en) * 2008-11-24 2011-09-15 콸콤 인코포레이티드 Systems, methods, apparatus, and computer program products for enhanced active noise cancellation
KR20130116271A (en) * 2010-10-25 2013-10-23 퀄컴 인코포레이티드 Three-dimensional sound capturing and reproducing with multi-microphones

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3677025A4 (en) * 2017-10-17 2021-04-14 Hewlett-Packard Development Company, L.P. Eliminating spatial collisions due to estimated directions of arrival of speech
US11317232B2 (en) 2017-10-17 2022-04-26 Hewlett-Packard Development Company, L.P. Eliminating spatial collisions due to estimated directions of arrival of speech
US20230050677A1 (en) * 2021-08-14 2023-02-16 Clearone, Inc. Wideband DOA Improvements for Fixed and Dynamic Beamformers
WO2023191333A1 (en) * 2022-03-28 2023-10-05 삼성전자 주식회사 Electronic device and system for location inference
WO2024039049A1 (en) * 2022-08-17 2024-02-22 삼성전자주식회사 Electronic device and control method therefor

Similar Documents

Publication Publication Date Title
JP6665379B2 (en) Hearing support system and hearing support device
CN106653041B (en) Audio signal processing apparatus, method and electronic apparatus
US10097921B2 (en) Methods circuits devices systems and associated computer executable code for acquiring acoustic signals
US10397722B2 (en) Distributed audio capture and mixing
US20220116723A1 (en) Filter selection for delivering spatial audio
JP6149818B2 (en) Sound collecting / reproducing system, sound collecting / reproducing apparatus, sound collecting / reproducing method, sound collecting / reproducing program, sound collecting system and reproducing system
CN104464739B (en) Acoustic signal processing method and device, Difference Beam forming method and device
KR101724514B1 (en) Sound signal processing method and apparatus
CN106797525B (en) For generating and the method and apparatus of playing back audio signal
JP6466968B2 (en) System, apparatus and method for consistent sound scene reproduction based on informed space filtering
JP4051408B2 (en) Sound collection / reproduction method and apparatus
WO2016183791A1 (en) Voice signal processing method and device
JP6834971B2 (en) Signal processing equipment, signal processing methods, and programs
CN104936125B (en) surround sound implementation method and device
KR101678305B1 (en) 3D Hybrid Microphone Array System for Telepresence and Operating Method thereof
KR20130116271A (en) Three-dimensional sound capturing and reproducing with multi-microphones
CN106872945A (en) Sound localization method, device and electronic equipment
JP5754595B2 (en) Trans oral system
KR20130109615A (en) Virtual sound producing method and apparatus for the same
JP6587047B2 (en) Realistic transmission system and realistic reproduction device
US20190306618A1 (en) Methods circuits devices systems and associated computer executable code for acquiring acoustic signals
WO2024069796A1 (en) Sound space construction device, sound space construction system, program, and sound space construction method
JP2022131067A (en) Audio signal processing device, stereophonic sound system and audio signal processing method
Lindqvist et al. Real-time multiple audio beamforming system
Cha et al. Development of an integrated smart sensor system for sound synthesis and reproduction in telepresence

Legal Events

Date Code Title Description
E701 Decision to grant or registration of patent right
GRNT Written decision to grant
FPAY Annual fee payment

Payment date: 20190905

Year of fee payment: 4