KR101970589B1

KR101970589B1 - Speech signal transmitting apparatus, speech signal receiving apparatus and method thereof

Info

Publication number: KR101970589B1
Application number: KR1020120017252A
Authority: KR
Inventors: 최병권; 권영도; 김동수; 노경식
Original assignee: 삼성전자주식회사
Priority date: 2011-11-28
Filing date: 2012-02-21
Publication date: 2019-04-19
Also published as: KR20130059250A

Abstract

The present invention includes: an extraction unit that extracts a speech signal from each of the sound source signals collected by a plurality of microphones; a power calculation unit that calculates the power of each of the multi-channel speech signals and sets one of the multi-channel speech signals as a reference speech signal; a sync adjustment unit that adjusts the sync of the remaining speech signals based on the reference speech signal; a signal generation unit that generates extracted signals by cancelling the reference speech signal from each of the sync-adjusted remaining speech signals; an encryption unit that compresses and encrypts the reference speech signal and each extracted signal; and a transmission unit that transmits the compressed and encrypted reference speech signal and each extracted signal.
According to one aspect, before the multi-channel speech signals are compressed, the data size of the remaining speech signals is reduced relative to the reference speech signal, which raises compression efficiency and shortens the time required. In addition, a compression efficiency of 1% to 3% can be obtained on a lossless-compression basis.

Description

The present invention relates to a speech signal transmitting apparatus, a speech signal receiving apparatus, and methods thereof, which compress and transmit a speech signal and restore the received speech signal.

In general, a speech signal transmitting apparatus relies on the observation that a speech signal can be regarded as the output of a resonance system excited by a sound source. It decomposes the speech signal into several parameters that represent the characteristics of the sound source and the resonance system and transmits those parameters, and a speech signal receiving apparatus synthesizes the original speech signal from them.

The speech signal transmitting apparatus and the speech signal receiving apparatus include a codec that encodes and decodes the speech signal frame by frame. One such codec, the G.729 codec, receives frames from a framing unit and encodes and decodes the speech signal in 10 ms units.

Here, the framing unit divides the samples that arrive continuously from outside at 8 kHz into 10 ms units, and supplies each group of 80 samples to the G.729 codec as one input frame.
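The framing arithmetic above (8 kHz × 10 ms = 80 samples per frame) can be sketched as follows. This is a minimal illustration only: the function name `split_into_frames` and the decision to drop a trailing partial frame are assumptions, not details from the patent.

```python
SAMPLE_RATE_HZ = 8000  # sampling rate stated in the text
FRAME_MS = 10          # frame duration stated in the text
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000  # samples per frame


def split_into_frames(samples):
    """Group a sample stream into complete frames of FRAME_LEN samples.

    Trailing samples that do not fill a whole frame are dropped here
    (an assumption; a real framing unit would likely buffer them).
    """
    n_frames = len(samples) // FRAME_LEN
    return [samples[i * FRAME_LEN:(i + 1) * FRAME_LEN] for i in range(n_frames)]
```

Each returned frame would then be handed to the codec as one unit of input.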

Such a G.729 codec may be implemented using a digital signal processor (DSP).

In this case, the memory structure of the DSP consists of a code section, which generates and stores execution code corresponding to the number of channels to be processed, and a data section, which stores global variables, each channel's buffers and stacks, and the like as program working space.

In such a codec, the number of channels that can be implemented is determined by the processing capability of the DSP. When the number of channels the DSP can process increases, execution code corresponding to the number of channels must be generated, so the amount of memory required also increases.

In addition, when multi-channel speech signals are compressed, whether lossy compressed data is required or lossless data is required to maximize performance, the amount of speech data to be transmitted grows with the number of microphones.

Furthermore, when speech signals are collected through multi-channel microphones, the sync of the speech signals varies with the positions and characteristics of the microphones and the power differs between the speech signals, so compression is difficult and compression efficiency is low.

One aspect provides a speech signal transmitting apparatus and method that adjust the power and sync of multi-channel speech signals using the correlation between a plurality of microphones, and then encrypt, compress, and transmit the signals.

Another aspect provides a speech signal receiving apparatus and method that restore a received speech signal using a power coefficient and a sync coefficient.

A speech signal transmitting apparatus according to one aspect includes: an extraction unit that extracts a speech signal from each of the sound source signals collected by a plurality of microphones; a power calculation unit that calculates the power of each of the multi-channel speech signals and sets one of the multi-channel speech signals as a reference speech signal; a sync adjustment unit that adjusts the sync of the remaining speech signals based on the reference speech signal; a signal generation unit that generates extracted signals by cancelling the reference speech signal from each of the sync-adjusted remaining speech signals; an encryption unit that compresses and encrypts the reference speech signal and each extracted signal; and a transmission unit that transmits the compressed and encrypted reference speech signal and each extracted signal.

The power calculation unit sets the speech signal with the largest power among the multi-channel speech signals as the reference speech signal.

The power calculation unit calculates a power coefficient for each remaining speech signal based on the ratio between the power of that remaining speech signal and the power of the reference speech signal.

The signal generation unit generates a cancellation signal for each remaining speech signal by applying the corresponding power coefficient to the reference speech signal, and generates each extracted signal by cancelling the corresponding cancellation signal from each remaining speech signal.

The signal generation unit performs the cancellation by subtracting the power of the reference speech signal from the power of each remaining speech signal.

For each extracted signal, the encryption unit encrypts the information on the microphone that collected the reference speech signal, the extracted signal itself, its own microphone information, its power coefficient, and its sync coefficient.

The sync adjustment unit calculates a sync coefficient for each remaining speech signal based on the distance between the microphone that collected the reference speech signal and the microphone that collected that remaining speech signal, and adjusts the sync of each remaining speech signal based on the calculated sync coefficient.

The sync adjustment unit adjusts the sync of each remaining speech signal using the correlation between the plurality of microphones.

A speech signal receiving apparatus according to another aspect includes: a receiving unit that receives multi-channel signals; a decoding unit that decodes the received multi-channel signals into a reference speech signal and at least one extracted signal; a power restoration unit that restores the power of the decoded at least one extracted signal to restore it to a speech signal; a sync restoration unit that restores the sync of the at least one power-restored speech signal; a multiplexing unit that multiplexes the reference speech signal and the at least one speech signal whose power and sync have been restored; and an output unit that outputs the multiplexed speech signal.

The receiving unit passes the reference speech signal and the at least one extracted signal from the received signals to the decoding unit, and passes the information on the at least one extracted signal to the power restoration unit and the sync restoration unit.

The decoding unit parses the headers of the received multi-channel signals to distinguish the reference speech signal from the extracted signals.

The information on the at least one extracted signal includes the information on the microphone that collected the reference speech signal, its own microphone information, the power coefficient, and the sync coefficient.

The power restoration unit restores the power of the extracted signal using the power coefficient, thereby restoring it to a speech signal.

The sync restoration unit restores the sync of the power-restored speech signal using the sync coefficient.

A speech signal transmitting method according to yet another aspect includes: collecting sound source signals from a plurality of microphones; extracting a speech signal from each of the collected sound source signals; calculating the power of each of the multi-channel speech signals; setting one of the multi-channel speech signals as a reference speech signal; adjusting the sync of the remaining speech signals based on the reference speech signal; generating extracted signals by cancelling the reference speech signal from each of the sync-adjusted remaining speech signals; compressing and encrypting the reference speech signal and each extracted signal; and transmitting the compressed and encrypted reference speech signal and each extracted signal.

Setting the reference speech signal includes setting the speech signal with the largest power among the multi-channel speech signals as the reference speech signal.

Generating the extracted signals includes: calculating a power coefficient for each remaining speech signal based on the ratio between the power of that remaining speech signal and the power of the reference speech signal; generating a cancellation signal for each remaining speech signal by applying the corresponding power coefficient to the reference speech signal; and generating each extracted signal by cancelling the corresponding cancellation signal from each remaining speech signal.

Compressing and encrypting the reference speech signal and each extracted signal includes encrypting, for each extracted signal, the information on the microphone that collected the reference speech signal, the extracted signal itself, its own microphone information, its own power coefficient, and its own sync coefficient.

Adjusting the sync of the remaining speech signals includes calculating a sync coefficient for each remaining speech signal based on the distance between the microphone that collected the reference speech signal and the microphone that collected that remaining speech signal, and adjusting the sync of each remaining speech signal based on the calculated sync coefficient.

A speech signal receiving method according to yet another aspect includes: receiving multi-channel signals; decoding the received multi-channel signals to produce a reference speech signal, at least one extracted signal, and information on the at least one extracted signal; and restoring the power and sync of the at least one extracted signal based on that information.

The speech signal receiving method further includes multiplexing the reference speech signal and the at least one speech signal whose power and sync have been restored, and outputting the multiplexed speech signal.

Restoring the power includes restoring the power of the extracted signal using the power coefficient.

Restoring the sync includes restoring the sync of the power-restored speech signal using the sync coefficient.

The power coefficient is the ratio between the power of the reference speech signal and the power of the at least one speech signal.

According to one aspect, before the multi-channel speech signals are compressed, the data size of the remaining speech signals is reduced relative to the reference speech signal, which raises compression efficiency and shortens the time required.

In addition, a compression efficiency of 1% to 3% can be obtained on a lossless-compression basis.

FIG. 1 is a configuration diagram of a speech signal transmitting apparatus and a speech signal receiving apparatus according to an embodiment.
FIG. 2 is a detailed configuration diagram of the speech signal transmitting apparatus according to the embodiment.
FIG. 3 is a detailed configuration diagram of the speech signal receiving apparatus according to the embodiment.
FIG. 4 is a flowchart of a speech signal transmitting method according to the embodiment.
FIG. 5 is a diagram illustrating an example of extracted-signal generation before speech signal transmission according to the embodiment.
FIG. 6 is a flowchart of a speech signal receiving method according to the embodiment.
FIG. 7 is a diagram illustrating an example of speech signal restoration after speech signal reception according to the embodiment.

Hereinafter, the present invention is described in detail with reference to the accompanying drawings.

FIG. 1 is a configuration diagram of a speech signal transmitting apparatus and a speech signal receiving apparatus according to an embodiment, FIG. 2 is a detailed configuration diagram of the speech signal transmitting apparatus according to the embodiment, and FIG. 3 is a detailed configuration diagram of the speech signal receiving apparatus according to the embodiment.

The speech signal transmitting apparatus 100 and the speech signal receiving apparatus 200 may be located in different terminals, and the terminals exchange speech signals over a network for purposes such as monitoring and speech recognition.

For example, a robot (terminal) equipped with multi-channel microphones receives speech signals and sends them to a remote base station and a remote client that process the multi-channel speech signals.

To keep this exchange smooth, the speech signal transmitting apparatus 100 compresses the speech signal before transmitting it, and the speech signal receiving apparatus 200 receives the compressed speech signal and restores it.

That is, the speech signal transmitting apparatus 100 collects sound, extracts speech signals from the collected sound, compresses and encrypts the extracted speech signals, and transmits them to the speech signal receiving apparatus 200.

When the compressed and encrypted speech signals are received, the speech signal receiving apparatus 200 decodes and restores them and outputs the decoded and restored speech signals to the outside.

As shown in FIG. 1, the speech signal transmitting apparatus 100 includes a collecting unit 110 (111 to 114), an extraction unit 120, a compression unit 130, and a transmission unit 140.

The collecting unit 110 includes a plurality of microphones 111 to 114 installed at a predetermined spacing from one another. Each of the microphones 111 to 114 is a device that receives sound waves or ultrasonic waves and generates an electrical signal corresponding to their vibration; this electrical signal is the sound source signal.

The predetermined spacing between the microphones is stored in advance; position information for the microphones may also be stored in advance.

The microphones 111 to 114 collect surrounding sound and transmit the collected sound source signals to the extraction unit 120.

The extraction unit 120 extracts speech signals from the multi-channel sound source signals delivered through the microphones 111 to 114.

The compression unit 130 sets the speech signal of one of the channels as the reference speech signal, reduces the data size of the remaining speech signals based on the correlation between the reference speech signal and the remaining speech signals, and encrypts and compresses the reference speech signal and the size-reduced remaining speech signals.

The transmission unit 140 transmits the compressed and encrypted reference speech signal and the modified remaining speech signals to the speech signal receiving apparatus 200.

The compression unit 130 is described in more detail with reference to FIG. 2.

The compression unit 130 includes a power calculation unit 131, a sync adjustment unit 132, a signal generation unit 133, and an encryption unit 134.

The power calculation unit 131 calculates the power of each of the multi-channel speech signals, sets the speech signal of one of the channels as the reference speech signal, and calculates a power coefficient for each remaining speech signal based on the ratio between the power of the reference speech signal and the power of that remaining speech signal.

Here, the reference speech signal is the speech signal with the largest power among the multi-channel speech signals.

For example, given a first speech signal collected by a first microphone, a second speech signal collected by a second microphone, a third speech signal collected by a third microphone, and a fourth speech signal collected by a fourth microphone, the power of each of the four speech signals is calculated and the speech signal with the largest power is set as the reference speech signal.

Alternatively, a reference microphone may be set in advance and the speech signal collected by that microphone set as the reference speech signal, or the speech signal with the smallest power may be set as the reference speech signal.

Here, the power is calculated as the mean square power.
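The reference selection described above can be sketched as follows: compute the mean square power of each channel and pick the channel with the largest value. The function names are illustrative assumptions, and the sketch covers only the largest-power option, not the preset-microphone or smallest-power alternatives.

```python
def mean_square_power(signal):
    """Mean square power: the average of the squared sample values."""
    return sum(x * x for x in signal) / len(signal)


def select_reference(channels):
    """Return (ref_index, powers), where ref_index is the channel
    with the largest mean square power."""
    powers = [mean_square_power(ch) for ch in channels]
    ref_index = max(range(len(powers)), key=powers.__getitem__)
    return ref_index, powers
```

For instance, with channels `[[1, -1, 1, -1], [2, -2, 2, -2]]` the second channel (index 1) has the larger power and would be chosen as the reference.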

Assuming here that the first speech signal is the reference speech signal, the first power coefficient, for the first speech signal, is 1; the second power coefficient is the ratio between the power of the reference speech signal and the power of the second speech signal; the third power coefficient is the ratio between the power of the reference speech signal and the power of the third speech signal; and the fourth power coefficient is the ratio between the power of the reference speech signal and the power of the fourth speech signal.

The power calculation unit 131 passes on the speech signals of the microphones, the reference speech signal, and the power coefficients of the remaining speech signals.

The sync adjustment unit 132 adjusts the sync of the remaining speech signals based on the reference speech signal.

The sync adjustment unit 132 aligns the sync using the correlation between the speech signals.
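One common way to align two channels by correlation is to search for the lag that maximizes their cross-correlation. The sketch below takes that approach as an assumption; the patent does not specify the exact procedure, and `best_lag` and `max_lag` are hypothetical names.

```python
def best_lag(reference, other, max_lag):
    """Return the lag (in samples) within [-max_lag, max_lag] that
    maximizes the cross-correlation of `other` against `reference`.

    A positive result means `other` is delayed relative to `reference`.
    """
    def corr_at(lag):
        total = 0.0
        for n, r in enumerate(reference):
            m = n + lag
            if 0 <= m < len(other):
                total += r * other[m]
        return total

    return max(range(-max_lag, max_lag + 1), key=corr_at)
```

The returned lag could serve as the sync coefficient for that channel: shifting `other` by the opposite amount lines it up with the reference.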

In addition, the sync coefficient may be calculated by finding the minimum difference value from the differences between the speech signals, or it may be calculated based on the distance between the microphones.
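The text does not say how a microphone distance is turned into a sync coefficient. One plausible reading, offered purely as an assumption, is that the spacing bounds the largest possible delay between two microphones, which in turn limits the lag range worth searching. The speed of sound (343 m/s) and the 8 kHz default below are illustrative values, and `max_delay_samples` is a hypothetical helper.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate value in air at room temperature (assumed)


def max_delay_samples(distance_m, sample_rate_hz=8000):
    """Upper bound, in samples, on the arrival-time difference between
    two microphones separated by `distance_m` meters."""
    return math.ceil(distance_m / SPEED_OF_SOUND_M_S * sample_rate_hz)
```

With a spacing of 0.1 m at 8 kHz the bound is 3 samples, so a correlation or minimum-difference search would only need to consider lags from -3 to 3.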

The sync adjustment unit 132 builds a sync table relative to the microphone that collected the reference speech signal, and adjusts the sync of the remaining speech signals based on this sync table.

The signal generation unit 133 generates the cancellation signals by applying each power coefficient to the reference speech signal. In other words, each cancellation signal is the reference speech signal modified by the power coefficient of the corresponding remaining speech signal.

For example, when the reference speech signal is to be cancelled from the second speech signal and the two differ in power, the power of the reference speech signal is adjusted with the second power coefficient so that it corresponds to the power of the second speech signal. The cancellation signal, that is, the power-adjusted reference speech signal, is then subtracted from the second speech signal to obtain the extracted signal.

In other words, the signal generation unit 133 generates a new signal by subtracting each cancellation signal from the corresponding remaining speech signal. This new signal, extracted after the cancellation signal has been subtracted from the remaining speech signal, is called the extracted signal.
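The subtraction step can be sketched as below. The patent does not state how the power coefficient is "applied" to the reference; this sketch assumes the coefficient is a power ratio, so the amplitude gain is its square root. It also assumes the two channels are already sync-aligned and of equal length. `make_extracted_signal` is a hypothetical name.

```python
import math


def make_extracted_signal(reference, other, power_coeff):
    """Subtract the power-matched reference from `other`.

    power_coeff is assumed to be power(other) / power(reference), so the
    amplitude gain applied to the reference is sqrt(power_coeff).
    """
    gain = math.sqrt(power_coeff)
    cancel = [gain * r for r in reference]         # cancellation signal
    return [o - c for o, c in zip(other, cancel)]  # extracted signal
```

When the channels are strongly correlated, the extracted signal is mostly near zero, which is what would make it cheaper to compress than the original channel.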

The encryption unit 134 compresses and encrypts the reference speech signal and each extracted signal channel by channel.

At this time, each channel transmits its own extracted signal together with associated information. This information includes information on the reference microphone that collected the reference speech signal, information on the channel's own microphone (the one being encrypted), its own power coefficient, its own sync coefficient, and so on; these items are bundled so that they are delivered as a single packet.
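The per-channel packet could be laid out as in the sketch below. The byte layout, field widths, and names (`ChannelPacket`, `pack`, `unpack`) are all hypothetical; the patent lists the fields but not their encoding. The `payload` stands in for the already compressed and encrypted extracted signal; compression and encryption themselves are omitted.

```python
import struct
from dataclasses import dataclass

# Hypothetical header: reference mic id, own mic id, power coefficient,
# sync coefficient, payload length (little-endian, no padding).
HEADER = struct.Struct("<BBdiI")


@dataclass
class ChannelPacket:
    reference_mic_id: int
    mic_id: int
    power_coeff: float
    sync_coeff: int
    payload: bytes


def pack(pkt):
    """Serialize header fields followed by the payload bytes."""
    header = HEADER.pack(pkt.reference_mic_id, pkt.mic_id,
                         pkt.power_coeff, pkt.sync_coeff, len(pkt.payload))
    return header + pkt.payload


def unpack(data):
    """Parse the header, then slice out the payload it describes."""
    ref_id, mic_id, power, sync, length = HEADER.unpack_from(data)
    payload = data[HEADER.size:HEADER.size + length]
    return ChannelPacket(ref_id, mic_id, power, sync, payload)
```

A receiver could read the two microphone ids from such a header to tell the reference channel apart from the extracted-signal channels.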

As shown in FIG. 1, the speech signal receiving apparatus 200 includes a receiving unit 210, a restoration unit 220, a multiplexing unit 230, an output unit 240, and a plurality of speakers 250 (251 to 252).

The receiving unit 210 receives the multi-channel reference speech signal, the at least one extracted signal, and the information on the extracted signal transmitted from the speech signal transmitting apparatus 100. It passes the received reference speech signal and the at least one extracted signal to the decoding unit 221, and passes the received information on the extracted signal to the power restoration unit 222 and the sync restoration unit 223.

The restoration unit 220 decompresses the compressed multi-channel reference speech signal and the at least one extracted signal, and restores the power and sync of the decompressed at least one extracted signal to generate at least one speech signal.

The multiplexing unit 230 conveys the speech signals of multiple channels simultaneously over a single channel. That is, the multiplexing unit 230 multiplexes the reference speech signal and the at least one speech signal.
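The patent does not specify the multiplexing scheme. As one simple possibility, assumed here for illustration, the channels can be interleaved sample by sample into a single stream; `multiplex` and `demultiplex` are hypothetical names, and equal-length channels are assumed.

```python
def multiplex(channels):
    """Interleave equal-length channels sample by sample into one stream."""
    return [sample for frame in zip(*channels) for sample in frame]


def demultiplex(stream, n_channels):
    """Inverse of multiplex: split an interleaved stream back into channels."""
    return [stream[i::n_channels] for i in range(n_channels)]
```

Interleaving `[1, 2, 3]` with `[10, 20, 30]` gives `[1, 10, 2, 20, 3, 30]`, and `demultiplex` with `n_channels=2` recovers the two channels.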

The output unit 240 outputs the multiplexed speech signal.

The output unit 240 may also convert the digital speech signal into an analog speech signal and amplify the converted analog speech signal.

Each of the speakers 250 (251, 252) is a device that converts an electrical signal into vibration of a diaphragm, which generates compression waves in the air and radiates sound waves; here the electrical signal is the restored speech signal.

The restoration unit 220 is described in more detail with reference to FIG. 3.

The restoration unit 220 includes a decoding unit 221, a power restoration unit 222, and a sync restoration unit 223.

The decoding unit 221 decompresses the multi-channel reference speech signal and the at least one extracted signal passed from the receiving unit 210.

The power restoration unit 222 restores the power of the at least one extracted signal using the power coefficient contained in the extracted-signal information passed from the receiving unit 210, thereby restoring it to a speech signal.

At this time, an additional signal is generated by applying the power coefficient to the reference speech signal, and the additional signal is added to the at least one speech signal to restore the speech signal.

The sync restoration unit 223 restores the sync of the at least one speech signal using the sync coefficient contained in the extracted-signal information passed from the receiving unit 210.

At this time, the at least one speech signal is shifted by the amount of the sync coefficient by which it was originally shifted.

FIG. 4 is a flowchart of a voice signal transmission method according to an embodiment, and is described with reference to FIG. 5.

First, the voice signal transmitting apparatus collects (201) surrounding sound source signals through a plurality of microphones 110 (111 to 114) installed at regular intervals from one another.

Next, the voice signal transmitting apparatus extracts (202) a voice signal from each of the multi-channel sound source signals, calculates (203) the power of each of the multi-channel voice signals, and sets (204) the voice signal of one of the channels as the reference voice signal.

Next, the voice signal transmitting apparatus calculates a power coefficient for each remaining voice signal based on the ratio between the power of the reference voice signal and the power of that remaining voice signal.

For example, if p1 is the power coefficient of the reference voice signal, the power coefficients p2, p3, and p4 are as follows.

Power coefficient of the second voice signal: p2 = (power of the second voice signal) / (power of the reference voice signal)

Power coefficient of the third voice signal: p3 = (power of the third voice signal) / (power of the reference voice signal)

Power coefficient of the fourth voice signal: p4 = (power of the fourth voice signal) / (power of the reference voice signal)

Here, the power of each voice signal is a value calculated using the mean square power, and is expressed as an integer.
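The power calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, rounding the mean square power to the nearest integer is one assumed reading of "expressed as an integer", and choosing the highest-power channel as the reference follows the dependent claim rather than the only option the description allows.

```python
def mean_square_power(samples):
    """Mean square power of one channel, rounded to an integer (assumed rounding rule)."""
    return round(sum(s * s for s in samples) / len(samples))


def pick_reference(channels):
    """Index of the channel with the highest mean square power."""
    powers = [mean_square_power(ch) for ch in channels]
    return max(range(len(channels)), key=powers.__getitem__)


def power_coefficients(channels, ref_index):
    """Map each non-reference channel index to (its power) / (reference power)."""
    ref_power = mean_square_power(channels[ref_index])
    if ref_power == 0:
        raise ValueError("reference channel is silent; ratio is undefined")
    return {
        i: mean_square_power(ch) / ref_power
        for i, ch in enumerate(channels)
        if i != ref_index
    }


# Hypothetical three-channel input with powers 9, 4 and 1.
channels = [[3, -3, 3, -3], [2, -2, 2, -2], [1, -1, 1, -1]]
ref = pick_reference(channels)
coeffs = power_coefficients(channels, ref)
```

With this input the first channel becomes the reference, and the other two channels receive the coefficients 4/9 and 1/9.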

Next, the voice signal transmitting apparatus aligns the sync using the correlation between the voice signals. At this time, the sync of the remaining voice signals is adjusted (205) with respect to the reference voice signal.

Here, sync refers to compensating for the delay time caused by the distance between microphones.

The sync coefficient is obtained using a minimum difference value or a correlation, and a cyclic sync is applied according to the obtained sync coefficient.

At this time, the leading portion of a voice signal that is shifted out by the sync adjustment is appended to the end of the signal.

In actual measurements, a signal arriving from the front of a linear microphone array usually has a sync of 0, and a signal from the side, or one collected with a circular microphone array, also has a small sync value, although it varies with the resolution.

With four microphones and the first voice signal as the reference voice signal, the sync-adjusted versions of the remaining voice signals are as follows.

Sync-adjusted second voice signal = second voice signal + s2 (cyclic)

Sync-adjusted third voice signal = third voice signal + s3 (cyclic)

Sync-adjusted fourth voice signal = fourth voice signal + s4 (cyclic)

Here, s2, s3, and s4 are the sync coefficients adjusted with respect to the first voice signal, which is the reference voice signal.
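A minimal sketch of the cyclic sync adjustment follows, using the minimum-difference criterion mentioned above. The function names are hypothetical, the sum of absolute differences is an assumed choice of difference measure, and treating a positive sync coefficient as a left rotation is an assumed sign convention (it matches the leftward shift used in the FIG. 5 example).

```python
def cyclic_shift(samples, s):
    """Rotate left by s samples; samples shifted out at the front re-enter at the end."""
    n = len(samples)
    s %= n  # also maps negative shifts to the equivalent left rotation
    return samples[s:] + samples[:s]


def find_sync_coefficient(reference, other):
    """Cyclic shift of `other` that minimises the sum of absolute differences."""
    def cost(s):
        shifted = cyclic_shift(other, s)
        return sum(abs(r - x) for r, x in zip(reference, shifted))

    return min(range(len(reference)), key=cost)


# Hypothetical data: `other` is the reference delayed by three samples (cyclically).
reference = [1, 5, 2, -3, 0, 4, -2, 7]
other = reference[-3:] + reference[:-3]
s2 = find_sync_coefficient(reference, other)
synced = cyclic_shift(other, s2)
```

Because the shift is cyclic, no samples are lost, which is what allows the receiver to undo it exactly with the same coefficient.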

Next, the voice signal transmitting apparatus applies each power coefficient to the reference voice signal to generate the respective cancel signals. That is, each cancel signal is the reference voice signal modified according to the power coefficient corresponding to one of the remaining voice signals.

Next, the voice signal transmitting apparatus subtracts each cancel signal from the corresponding remaining voice signal to generate (206) a new extracted signal. Here, the new extracted signal is the signal that remains after the cancel signal has been subtracted from the remaining voice signal.

For example, with four microphones and the first voice signal as the reference voice signal, the extracted signals corresponding to the remaining voice signals are generated as follows.

Second extracted signal = sync-adjusted second voice signal - (second power coefficient × reference voice signal)

Third extracted signal = sync-adjusted third voice signal - (third power coefficient × reference voice signal)

Fourth extracted signal = sync-adjusted fourth voice signal - (fourth power coefficient × reference voice signal)
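The subtraction above can be illustrated with a short Python sketch. The function names and the sample values of the reference signal are hypothetical, and rounding the scaled reference to the nearest integer is an assumption made so that the cancel signal stays integer-valued.

```python
def make_cancel_signal(reference, power_coeff):
    """Scale the reference by the power coefficient, rounded to integers (assumed)."""
    return [round(power_coeff * r) for r in reference]


def make_extracted_signal(synced, reference, power_coeff):
    """Extracted signal = sync-adjusted signal - (power coefficient x reference)."""
    cancel = make_cancel_signal(reference, power_coeff)
    return [x - c for x, c in zip(synced, cancel)]


# Hypothetical reference samples; 5/7 is the coefficient used in the FIG. 5 example.
reference = [0, 10, 0, -10]
synced_second = [1, 7, -1, -6]
cancel = make_cancel_signal(reference, 5 / 7)
extracted = make_extracted_signal(synced_second, reference, 5 / 7)
```

When the two channels are similar, the extracted signal has a much smaller amplitude than the original channel, which is what makes it cheaper to compress.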

Next, the voice signal transmitting apparatus encrypts and compresses (207) the reference voice signal and each extracted signal, channel by channel.

At this time, the reference voice signal, each extracted signal, and the information of each extracted signal are encrypted and compressed together.

Here, the information of an extracted signal includes its own microphone number, the number of the microphone that collected the reference voice data, the power coefficient, and the sync coefficient, and these are transmitted as a single packet.

In addition, the reference voice signal is transmitted together with its own microphone number, its own power coefficient, and its sync coefficient.

Next, the voice signal transmitting apparatus transmits (208) the encrypted and compressed reference voice signal and each extracted signal to the voice signal receiving apparatus 200.
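The per-channel side information listed above could be modelled roughly as follows. This is a hypothetical sketch only: the class name, field names, and field types are assumptions, the text does not specify a wire format, and compression and encryption are left out entirely.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChannelPacket:
    """One channel's payload plus the side information named in the text."""
    mic_number: int            # microphone that collected this channel
    reference_mic_number: int  # microphone that collected the reference voice signal
    power_coeff: float         # power of this channel / power of the reference
    sync_coeff: int            # cyclic shift applied at the transmitter
    samples: List[int]         # reference samples or extracted-signal samples

    def is_reference(self) -> bool:
        # Assumed rule: a packet is the reference when both microphone numbers match.
        return self.mic_number == self.reference_mic_number


# Hypothetical packets for a reference channel and one extracted channel.
ref_packet = ChannelPacket(1, 1, 1.0, 0, [0, 10, 0, -10])
ext_packet = ChannelPacket(2, 1, 5 / 7, 2, [1, 0, -1, 1])
```

Carrying the reference microphone number in every packet lets a receiver tell reference and extracted channels apart from the side information alone.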

The generation of the extracted signal is now described in more detail with reference to FIG. 5.

The first voice signal is collected through the microphone of the first channel (CH1), and the second voice signal is collected through the microphone of the second channel (CH2).

Here, the first voice signal collected through the microphone of the first channel (CH1) is as shown in (a) of FIG. 5, and the second voice signal collected through the microphone of the second channel (CH2) is as shown in (b) of FIG. 5.

Next, the power of the first voice signal and the power of the second voice signal are calculated. Here, the power of each voice signal is a value calculated using the mean square power, and is expressed as an integer.

The power of the first voice signal is as follows.

The power of the second voice signal is as follows.

The power of the first voice signal is 7 and the power of the second voice signal is 5. Because the power of the second voice signal is smaller than that of the first voice signal, the first voice signal is set as the reference voice signal, and the second voice signal collected through the microphone of the second channel (CH2) is converted into an extracted signal.

First, as shown in (c) of FIG. 5, the sync of the second voice signal is adjusted with respect to the reference voice signal. That is, the second voice signal is shifted to the left by 1/4 of a period so that the waveforms of the reference voice signal and the second voice signal match as closely as possible.

Next, the power coefficient is calculated. Here the power coefficient is the ratio of the power of the second voice signal to the power of the reference voice signal, which is 5/7.

Next, 5/7 is applied to each value of the reference voice signal, which is the first channel (CH1) signal shown in (a), to generate the cancel signal. Here the cancel signal can be expressed in integers.

As shown in (d) of FIG. 5, the cancel signal has the values 0, 7, 0, -7, 0, 7, 0, -7, 0.

That is, when the reference voice signal and the second voice signal differ in power, the power coefficient is applied to the reference voice signal to adjust each of its values so that its power corresponds to the power of the second voice signal. The reference voice signal whose values have been adjusted by the power coefficient becomes the cancel signal.

Next, as shown in (e) of FIG. 5, the cancel signal is subtracted from the sync-adjusted second voice signal to generate the extracted signal.

At this time, the extracted signal has the values 1 (= 1 - 0), 0 (= 7 - 7), -1 (= -1 - 0), 1 (= -6 - (-7)), 2 (= 2 - 0), -2 (= 5 - 7), 1 (= -1 - 0), 0 (= -7 - (-7)), -8 (= -8 - 0).

FIG. 6 is a flowchart of a voice signal reception method according to an embodiment, and is described with reference to FIG. 7.

The voice signal receiving apparatus receives (301) the multi-channel reference voice signal, the at least one extracted signal, and the information of the extracted signal transmitted from the voice signal transmitting apparatus 100, decompresses the received reference voice signal and the at least one extracted signal, and decodes (302) the decompressed reference voice signal and the at least one extracted signal.

At this time, the multi-channel reference voice signal, the at least one extracted signal, and the information of the extracted signal are obtained.

In addition, the voice signal receiving apparatus parses the header of the received signal to distinguish the reference voice signal from the extracted signals. The reference voice signal, once decompressed, is passed directly to the multiplexing unit for processing.

Next, the voice signal receiving apparatus restores (303) the power of the at least one extracted signal using the power coefficient in the information of the extracted signal, thereby producing at least one voice signal, and then restores (304) the sync of the power-restored voice signal using the sync coefficient in the information of the extracted signal, so that the original voice signal is recovered.

At this time, the at least one voice signal is shifted back by the sync coefficient by which it was originally shifted in the voice signal transmitting apparatus 100.

For example, when the sound source is collected through four microphones and the first voice signal is the reference signal, the power-restored signals and the sync-restored signals of the extracted signals corresponding to the second, third, and fourth voice signals are as follows.

Second power-restored signal = second extracted signal + second power coefficient × reference microphone signal

Third power-restored signal = third extracted signal + third power coefficient × reference microphone signal

Fourth power-restored signal = fourth extracted signal + fourth power coefficient × reference microphone signal

Second sync-restored signal = second power-restored signal - s2 (cyclic)

Third sync-restored signal = third power-restored signal - s3 (cyclic)

Fourth sync-restored signal = fourth power-restored signal - s4 (cyclic)
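The two restoration steps above can be sketched as the inverse of the transmitter-side processing. As before, this is an illustrative sketch under stated assumptions: the function names and sample values are hypothetical, the scaled reference is rounded to integers, and a positive sync coefficient is taken to mean a left rotation at the transmitter, so the receiver rotates right by the same amount.

```python
def cyclic_shift(samples, s):
    """Rotate left by s samples (a negative s rotates right)."""
    n = len(samples)
    s %= n
    return samples[s:] + samples[:s]


def restore_power(extracted, reference, power_coeff):
    """Power-restored signal = extracted signal + power coefficient x reference."""
    return [e + round(power_coeff * r) for e, r in zip(extracted, reference)]


def restore_sync(power_restored, sync_coeff):
    """Undo the transmitter's cyclic shift by shifting back by the same amount."""
    return cyclic_shift(power_restored, -sync_coeff)


def restore_voice_signal(extracted, reference, power_coeff, sync_coeff):
    """Apply power restoration first, then sync restoration, as in steps 303 and 304."""
    return restore_sync(restore_power(extracted, reference, power_coeff), sync_coeff)


# Hypothetical round trip: the extracted signal below was built from
# original = [5, -1, 7, 1, -6, 2, 3, -2] with sync coefficient 2 and power coefficient 5/7.
reference = [0, 10, 0, -10, 0, 10, 0, -10]
extracted = [7, -6, -6, 9, 3, -9, 5, 6]
restored = restore_voice_signal(extracted, reference, 5 / 7, 2)
```

The round trip is exact here because both sides round the scaled reference in the same way and the shift is cyclic, so nothing is discarded.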

Next, the voice signal receiving apparatus multiplexes (305) the multi-channel reference voice signal and the at least one voice signal, and outputs (306) the multiplexed voice signal through at least one speaker.

The restoration of the at least one extracted signal is described in detail with reference to FIG. 7.

When the extracted signal is received as shown in (a) of FIG. 7, the power coefficient in the information of the extracted signal is applied to the reference voice signal to generate an additional signal, and the generated additional signal is added to the extracted signal, as shown in (b) of FIG. 7.

Next, as shown in (c) of FIG. 7, the voice signal is shifted using the sync coefficient to restore the sync.

In this way, before the multi-channel voice signals are compressed, the data size of the remaining voice signals is reduced with respect to the reference voice signal, which improves compression efficiency, shortens processing time, and makes restoration straightforward.

100: voice signal transmitting apparatus 110: collecting unit
120: extraction unit 130: compression unit
140: transmitting unit 200: voice signal receiving apparatus
210: receiving unit 220: restoring unit
230: multiplexing unit 240: output unit
250: speaker

Claims

A voice signal transmitting apparatus comprising:
an extraction unit for extracting multi-channel voice signals from multi-channel sound source signals collected through a plurality of microphones;
a power calculation unit for calculating the power of each of the multi-channel voice signals and setting one of the multi-channel voice signals as a reference voice signal;
a sync adjusting unit for adjusting, based on the reference voice signal, the sync of the remaining voice signals other than the reference voice signal among the multi-channel voice signals;
a signal generator for generating extracted signals by canceling the reference voice signal from each of the sync-adjusted remaining voice signals;
an encryption unit for compressing and encrypting the reference voice signal and each extracted signal; and
a transmitter for transmitting the compressed and encrypted reference voice signal and each extracted signal,
wherein the power calculation unit calculates a power coefficient corresponding to each of the remaining voice signals based on the ratio of the power of that remaining voice signal to the power of the reference voice signal, and
wherein the signal generator generates a cancel signal corresponding to each of the remaining voice signals by applying the power coefficient corresponding to that remaining voice signal to the reference voice signal, and generates the extracted signals by subtracting the corresponding cancel signal from each of the remaining voice signals.

The apparatus of claim 1, wherein the power calculation unit
sets the voice signal having the highest power among the multi-channel voice signals as the reference voice signal.

delete

The apparatus of claim 1, wherein the encryption unit
encrypts together the reference voice signal, each extracted signal, the microphone information of each extracted signal, the power coefficient, and the sync coefficient.

The apparatus of claim 1, wherein the sync adjusting unit
calculates a sync coefficient for each of the remaining voice signals based on the distance between the microphone that collected the reference voice signal and the microphone that collected that remaining voice signal, and adjusts the sync of each of the remaining voice signals based on the calculated sync coefficient.

The apparatus of claim 1, wherein the sync adjusting unit
adjusts the sync of the remaining voice signals using the correlation between the plurality of microphones.

delete

A voice signal transmitting method comprising:
collecting sound source signals from a plurality of microphones;
extracting a voice signal from each of the collected sound source signals;
calculating the power of each of the multi-channel voice signals;
setting one of the multi-channel voice signals as a reference voice signal;
adjusting the sync of the remaining voice signals based on the reference voice signal;
generating extracted signals by canceling the reference voice signal from each of the sync-adjusted remaining voice signals;
compressing and encrypting the reference voice signal and each extracted signal; and
transmitting the compressed and encrypted reference voice signal and each extracted signal,
wherein generating the extracted signals comprises:
calculating a power coefficient corresponding to each of the remaining voice signals based on the ratio of the power of that remaining voice signal to the power of the reference voice signal;
generating a cancel signal corresponding to each of the remaining voice signals by applying the power coefficient corresponding to that remaining voice signal to the reference voice signal; and
generating the extracted signals by subtracting the corresponding cancel signal from each of the remaining voice signals.

16. The method of claim 15, wherein setting the reference voice signal comprises:
setting the voice signal having the highest power among the multi-channel voice signals as the reference voice signal.

delete

16. The method of claim 15, wherein compressing and encrypting the reference speech signal and each extracted signal comprises:
encrypting together the reference voice signal, each extracted signal, the microphone information of each extracted signal, the power coefficient, and the sync coefficient.

16. The method of claim 15, wherein adjusting the synchronization of the remaining audio signal comprises:
calculating a sync coefficient for each of the remaining voice signals based on the distance between the microphone that collected the reference voice signal and the microphone that collected that remaining voice signal; and
adjusting the sync of each of the remaining voice signals based on the calculated sync coefficient.

delete