KR20220164395A - Apparatus and method for sound signal processing

Info

Publication number: KR20220164395A
Application number: KR1020210171203A
Authority: KR
Inventors: 김재흥; 강현욱
Original assignee: 삼성전자주식회사
Priority date: 2021-06-04
Filing date: 2021-12-02
Publication date: 2022-12-13

Abstract

Disclosed is a sound signal processing apparatus comprising: a user microphone that receives an overall sound including a user voice and an external sound generated outside the user, and that generates a user voice signal in which the external sound is attenuated from the received overall sound, by having one plane that receives the overall sound arranged in a direction corresponding to the utterance point of the user voice; an ambient microphone that receives the overall sound and generates an overall sound signal from the received overall sound; and a processor that generates an external sound signal in which the user voice is attenuated by performing a difference operation of the user voice signal from the overall sound signal.

Description

Apparatus and method for sound signal processing

The present disclosure relates to a sound signal processing apparatus and method.

Acoustic sensors mounted in various electronic devices to sense sound are increasingly used. Electronic devices employ a plurality of acoustic sensors to distinguish among the various types of received sound or to sense only a specific sound. However, improving the accuracy of sensing a specific sound requires a large number of acoustic sensors, which increases process cost, complexity, and power consumption. Computing time delays for the sound signals received from the plurality of acoustic sensors also increases computational complexity. A technology for sensing a specific sound clearly and efficiently is therefore required.

In addition, wearable devices equipped with acoustic sensors are increasingly used. Because wearable devices can be used in a wide variety of acoustic environments, a technology is required for clearly distinguishing and sensing both the user's voice and sound generated outside the user.

An object is to provide a sound signal processing apparatus and method. Another object is to provide a computer-readable recording medium on which a program for executing the method on a computer is recorded. The technical problems to be solved by the present embodiments are not limited to those described above, and other technical problems may be inferred from the following embodiments.

As a means for solving the above technical problems, a sound signal processing apparatus according to one aspect may include: a user microphone that receives an overall sound including a user voice and an external sound generated outside the user, and that generates a user voice signal in which the external sound is attenuated from the received overall sound, by having one plane that receives the overall sound arranged in a direction corresponding to the utterance point of the user voice; an ambient microphone that receives the overall sound and generates an overall sound signal from the received overall sound; and a processor that generates an external sound signal in which the user voice is attenuated by performing a difference operation of the user voice signal from the overall sound signal.

A sound signal processing apparatus according to another aspect may include: the user microphone according to the one aspect; an ambient microphone that receives the overall sound and that generates a first external sound signal in which the user voice is attenuated from the received overall sound, by having one plane that receives the overall sound arranged in a direction that differs from the direction in which the user microphone is arranged and that corresponds to the point where the external sound is generated; and a processor that generates a second external sound signal, in which the user voice is attenuated more than in the first external sound signal, by performing a difference operation of the user voice signal from the first external sound signal.

A sound signal processing apparatus according to yet another aspect may include: a directional microphone that receives an overall sound including an output sound from a sound output device and an external sound generated outside the sound output device, and that generates an output sound signal in which the external sound is attenuated from the received overall sound, by having one plane that receives the overall sound arranged in a direction corresponding to the point where the output sound is generated; an ambient microphone that receives the overall sound and generates an overall sound signal from the received overall sound; and a processor that generates an external sound signal in which the output sound is attenuated by performing a difference operation of the output sound signal from the overall sound signal.

A sound signal processing method according to yet another aspect may include: receiving an overall sound including a user voice and an external sound generated outside the user; generating an overall sound signal from the received overall sound; generating a user voice signal in which the external sound is attenuated from the received overall sound; and generating an external sound signal in which the user voice is attenuated by performing a difference operation of the user voice signal from the overall sound signal.

A computer-readable recording medium according to yet another aspect may include a recording medium on which a program for executing the above method on a computer is recorded.

FIG. 1 is a block diagram showing the configuration of a sound signal processing apparatus according to an embodiment.
FIG. 2 is a block diagram showing the configuration of a user microphone according to an embodiment.
FIG. 3 is a diagram showing an example configuration of the user microphone.
FIGS. 4A to 4C are cross-sectional views of the sensing element of FIG. 3.
FIG. 5 is a diagram for explaining a sound sensing method using ambient microphones according to a comparative example.
FIG. 6 is a diagram for explaining the directivity pattern of a user microphone according to an embodiment.
FIG. 7 is a diagram showing the result of measuring the directivity pattern of the user microphone.
FIG. 8 is a diagram for explaining the signal processing of a sound signal processing apparatus according to an embodiment.
FIG. 9 is a graph of measured directivity patterns of a user microphone and an ambient microphone according to an embodiment.
FIGS. 10A and 10B are diagrams showing the arrangement of the vibration unit relative to the utterance point of the user voice.
FIG. 11 is a diagram showing the sound adjustment process of a sound adjustment unit according to an embodiment.
FIG. 12 is a diagram showing a user voice signal generated by a user microphone according to an embodiment.
FIG. 13 is a diagram for explaining a difference operation method according to an embodiment.
FIG. 14 is a diagram for explaining an example of the difference operation method of FIG. 13.
FIG. 15 is a diagram showing an external sound signal according to an embodiment.
FIGS. 16A and 16B are diagrams showing displays according to embodiments.
FIGS. 17A to 17C are diagrams showing embodiments in which the sound signal processing apparatus is a glasses-type wearable device.
FIGS. 18A and 18B are diagrams showing embodiments in which the result of performing a function is displayed on a display.
FIG. 19 is a block diagram showing the configuration of a sound signal processing apparatus according to another embodiment.
FIG. 20 is a diagram for explaining the arrangement of a user microphone and an ambient microphone according to the embodiment of FIG. 19.
FIG. 21 is a diagram for explaining a difference operation method according to the embodiment of FIG. 19.
FIG. 22 is a block diagram showing the configuration of a sound signal processing apparatus according to yet another embodiment.
FIG. 23 is a diagram for explaining a difference operation method according to the embodiment of FIG. 22.
FIG. 24 is a flowchart showing a sound signal processing method according to an embodiment.

The terms used in the present embodiments are, as far as possible, general terms in wide current use, selected with the functions of the present embodiments in mind. They may nevertheless vary with the intention of those skilled in the art, with precedent, or with the emergence of new technologies. In certain cases a term has been chosen arbitrarily, and its meaning is then described in detail in the description of the corresponding embodiment. The terms used in the present embodiments should therefore be defined by their meaning and by the overall content of the embodiments, not simply by their names.

In the descriptions of the embodiments, when a part is said to be connected to another part, this includes not only a direct connection but also an electrical connection with another component interposed between them. When a part is said to include a component, this means that it may further include other components, rather than excluding them, unless stated otherwise.

Terms such as "consists of" or "includes" used in the present embodiments should not be construed as necessarily including all of the components or steps described in the specification. Some of the components or steps may be omitted, and additional components or steps may be included.

Terms including ordinal numbers, such as "first" or "second", may be used in this specification to describe various components, but the components are not limited by these terms. The terms are used only to distinguish one component from another.

The description of the following embodiments should not be construed as limiting the scope of rights, and what a person skilled in the art can easily infer from it should be construed as falling within the scope of the embodiments. Embodiments provided for illustration only are described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram showing the configuration of a sound signal processing apparatus according to an embodiment.

Referring to FIG. 1, the sound signal processing apparatus 100 may include a user microphone 110, an ambient microphone 120, and a processor 130. FIG. 1 shows only the components of the sound signal processing apparatus 100 that relate to the present embodiments. It will be apparent to those skilled in the art that the sound signal processing apparatus 100 may further include other general-purpose components in addition to those shown in FIG. 1.

The sound signal processing apparatus 100 may be a wearable device worn by the user to receive the user's voice. Alternatively, the sound signal processing apparatus 100 may be a device that is not worn by the user and that is disposed close to a sound output device or included in the sound output device. These are only examples, and the sound signal processing apparatus 100 may be implemented in various other forms capable of receiving sound. Examples of the sound signal processing apparatus 100 are described later with reference to FIG. 17A.

The sound signal processing apparatus 100 may include different types of microphones in order to generate various sound signals from the received sound. Even when the same sound is received, the sound signal generated by a microphone may differ depending on the configuration and operation of the microphone. By including different types of microphones, the sound signal processing apparatus 100 can therefore generate the target sound signal. The sound signal processing apparatus 100 may include a user microphone 110 for detecting the user voice and an ambient microphone 120 for detecting the overall sound, which includes the user voice.

The user microphone 110 and the ambient microphone 120 may receive the overall sound, which includes the user voice and the external sound generated outside the user. The user voice may be the voice of the user who uses or wears the sound signal processing apparatus 100. The external sound is sound received from outside the user, that is, sound other than the user voice. For example, the external sound may include the voice of another person talking with the user, sound output from a video the user is watching, or sound generated in the user's surroundings. The overall sound includes both the user voice and the external sound, and may correspond to all sound that reaches (or is received by) the sound signal processing apparatus. The overall sound reaches (or is received by) the user microphone 110, but the structure or operation of the user microphone 110 may attenuate the external sound within the overall sound, so that a user voice signal is generated.

The user microphone 110 and the ambient microphone 120 may convert the received sound into an electrical signal that carries information such as frequency, amplitude, and time.

The user microphone 110 may generate the user voice signal by attenuating the external sound in the received overall sound. By attenuating the external sound, the user microphone 110 can generate a user voice signal in which the user voice is clearer. For example, to attenuate the received external sound, the user microphone 110 may have directivity toward the user voice, or may attenuate the signal corresponding to the external sound on the basis of a threshold value. The configuration and operation of the user microphone 110 are described later with reference to FIG. 2.

The user microphone 110 may receive sound through one plane formed by the user microphone 110. Here, the plane may be the plane formed by a vibration unit of the user microphone 110, or the plane formed by a plurality of vibration units arranged in a planar layout. The user microphone 110 may be disposed in the sound signal processing apparatus 100 such that the plane formed by the user microphone 110 is arranged in a direction corresponding to the utterance point of the user voice. With this arrangement, the user voice is sensed with high sensitivity and the external sound is sensed with low sensitivity. The external sound is therefore attenuated in the overall sound received by the user microphone 110, and the user voice signal generated by the user microphone 110 may be a signal in which the external sound is attenuated.

For example, the user microphone 110 may be disposed in the sound signal processing apparatus 100 such that the plane receiving the overall sound and the direction from the utterance point of the user voice toward that plane form an angle of 60° to 120°. The arrangement of the user microphone 110 (or of the vibration unit of the user microphone) is described later with reference to FIGS. 10A and 10B.

The ambient microphone 120 may generate the overall sound signal from the received overall sound. The ambient microphone 120 may generate an overall sound signal in which neither the user voice nor the external sound is attenuated or emphasized.

The processor 130 may receive the sound signals generated by the microphones and perform operations on them. The processor 130 may generate the external sound signal by performing a difference operation of the user voice signal from the overall sound signal. The external sound signal may be a signal in which the component corresponding to the user voice is attenuated from the overall sound signal. The external sound signal may therefore contain only the signal corresponding to the external sound, or may be a signal in which the signal corresponding to the external sound is emphasized. Here, emphasizing a specific signal does not mean amplifying it. It means that the specific signal becomes clearer because the other signals are attenuated.

The method by which the processor 130 performs the difference operation is described later with reference to FIGS. 13 and 14.
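As a rough illustration of the difference operation, the following sketch subtracts a user voice signal from an overall sound signal sample by sample. It is only a minimal sketch under assumptions that this passage does not state: the two signals are taken to be time-aligned, equal in length, and matched in level, and the function name `difference_signal`, the `gain` parameter, and the synthetic test tones are hypothetical. The method actually used in the embodiment is the one described with reference to FIGS. 13 and 14.

```python
import math


def difference_signal(overall, user_voice, gain=1.0):
    """Subtract a scaled user voice signal from the overall signal, sample by sample.

    Assumption (not from the source): both signals are time-aligned and
    have the same length; `gain` is a hypothetical level-matching factor.
    """
    if len(overall) != len(user_voice):
        raise ValueError("signals must have the same length")
    return [o - gain * u for o, u in zip(overall, user_voice)]


# Synthetic example: a 200 Hz tone stands in for the user voice and a
# 1 kHz tone stands in for the external sound.
fs = 16000  # sampling rate in Hz (illustrative value)
n = 160
voice = [math.sin(2 * math.pi * 200 * t / fs) for t in range(n)]
external = [0.5 * math.sin(2 * math.pi * 1000 * t / fs) for t in range(n)]
overall = [v + e for v, e in zip(voice, external)]

estimate = difference_signal(overall, voice)
# Largest deviation between the estimate and the true external tone.
residual = max(abs(a - b) for a, b in zip(estimate, external))
```

In this idealized case the estimate matches the external tone up to floating-point rounding. With real microphones the user voice signal still contains some external sound and the two channels differ in gain and delay, so the result would be an attenuation of the user voice rather than a perfect removal.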

The processor 130 may be implemented as an array of logic gates, or as a combination of a general-purpose microprocessor and a memory storing a program executable by the microprocessor. Those of ordinary skill in the art to which the present embodiment pertains will understand that the processor 130 may also be implemented in other forms of hardware.

As described above, the sound signal processing apparatus can generate both the user voice signal and the external sound signal, and can therefore distinguish the user voice from the external sound in the overall received sound. That is, even when the user voice and the external sound are received at the same time, the sound signal processing apparatus can distinguish the two sounds and generate a signal corresponding to each. The sound signal processing apparatus can therefore perform a function corresponding to each of the user voice and the external sound, or process a command corresponding to each, in any acoustic environment.

FIG. 2 is a block diagram showing the configuration of a user microphone according to an embodiment.

Referring to FIG. 2, the user microphone 110 may include a plurality of vibration structures. Each vibration structure 111 may include a vibration unit 112 and a vibration detection unit 113. FIG. 2 shows only the components of the user microphone 110 that relate to the present embodiments. It will be apparent to those skilled in the art that the user microphone 110 may further include other general-purpose components in addition to those shown in FIG. 2. For example, the user microphone 110 may further include a support (not shown) or a sound adjustment unit (not shown).

The user microphone 110 may include a plurality of vibration structures that sense sound in different frequency bands. The vibration structures may be formed with different geometries (for example, different length, thickness, shape, or weight) and may each have a resonant frequency determined by that geometry. Each vibration structure may sense sound in the frequency band corresponding to its resonant frequency. The detailed structure of the vibration structure is described later with reference to FIGS. 3 and 4A.

The vibration unit 112 may vibrate when the overall sound is received. For example, the vibration unit 112 may vibrate when sound with a frequency close to its resonant frequency is received. Each vibration unit 112 may form one plane that receives the overall sound. In addition, because the vibration units are arranged in a planar layout within the user microphone 110, the user microphone 110 may form one plane corresponding to the planes of the individual vibration units. When the overall sound is received, the vibration unit 112 may vibrate in the direction orthogonal to the plane, according to the frequency of the overall sound. The plane formed by the vibration unit 112 is described later with reference to FIG. 4A.

The vibration detection unit 113 may detect the vibration of the vibration unit 112 and generate an electrical signal corresponding to the detected vibration. Because the vibration detection unit 113 converts the vibration into an electrical signal, the sound signal processing apparatus can perform various processing and operations on the received sound.

FIG. 3 is a diagram showing an example configuration of the user microphone.

Referring to FIG. 3, the user microphone 110 may include a support 115 and a plurality of vibration structures. The support 115 may be formed such that a cavity 116 passes through it. A silicon substrate, for example, may be used as the support 115, but the support is not limited thereto.

The plurality of vibration structures may be arranged in a predetermined pattern over the cavity 116 of the support 115. The vibration structures 111 may be arranged in a planar layout without overlapping one another. As shown, each vibration structure 111 may be disposed such that one side is fixed to the support 115 and the other side extends toward the cavity 116.

The vibration structures 111 may be provided, for example, to sense sound frequencies in different bands. That is, the vibration structures 111 may be provided to have different center frequencies or resonant frequencies. To this end, the vibration structures 111 may be provided with different dimensions. The dimensions of each vibration structure 111 may be set in consideration of the resonant frequency desired for that vibration structure 111.

FIGS. 4A to 4C are cross-sectional views of the vibration structure of FIG. 3.

Referring to FIG. 4A, the vibration structure 111 may include a vibration unit 112, a vibration detection unit 113, and a mass body 114. As shown, the vibration structure 111 may be disposed such that one side is fixed to the support 115 and the other side extends toward the cavity.

Each vibration structure 111 may include a vibration unit 112 that vibrates in response to input sound and a vibration detection unit 113 that senses the movement of the vibration unit 112. The vibration structure 111 may further include a mass body 114 for providing a predetermined mass to the vibration unit 112.

The vibration unit 112 may vibrate according to the frequency of the received sound. The vibration unit 112 may vibrate more strongly the closer the frequency of the received sound is to its resonant frequency, and more weakly the farther the frequency is from the resonant frequency. Alternatively, the vibration unit 112 may vibrate when sound within its sensible frequency band is received and may not vibrate when sound outside that band is received.
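The frequency-selective behavior described above can be sketched with a textbook second-order resonator model. This model is an assumption made here for illustration and is not given in the source: the function names, the quality factor `q`, and the list of resonant frequencies are all hypothetical values chosen only to show that a structure responds most strongly to sound near its own resonant frequency.

```python
import math


def resonator_gain(freq_hz, resonant_hz, q=10.0):
    """Amplitude response of an idealized second-order resonator.

    Assumed model (not from the source); `q` is a hypothetical quality
    factor. The gain equals `q` when freq_hz == resonant_hz.
    """
    r = freq_hz / resonant_hz
    return 1.0 / math.sqrt((1.0 - r * r) ** 2 + (r / q) ** 2)


# Hypothetical resonant frequencies of four vibration structures, in Hz.
RESONANT_FREQS = [500.0, 1000.0, 2000.0, 4000.0]


def bank_response(freq_hz):
    """Response of every structure in the bank to a single tone."""
    return [resonator_gain(freq_hz, f0) for f0 in RESONANT_FREQS]


def strongest_structure(freq_hz):
    """Index of the structure that responds most strongly to the tone."""
    gains = bank_response(freq_hz)
    return gains.index(max(gains))
```

Under this model, a 1 kHz tone excites the 1 kHz structure far more than its neighbors, which is one way to picture how a bank of structures with different dimensions could divide the audio spectrum into bands.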

도 4b 및 도 4c를 참조하면, 진동부(112)는 음향을 수신하는 일 면(112a)을 형성할 수 있다.Referring to FIGS. 4B and 4C , the vibration unit 112 may form a surface 112a for receiving sound.

진동부(112)는 전체 음향이 수신됨에 따라 일 면(112a)에 직교하는 방향으로 진동할 수 있다. 진동부(112)는 수신되는 음향의 전파 방향(41)과 일 면(112a)이 이루는 각도에 기초한 세기로 진동할 수 있다. 진동부(112)는 음향의 전파 방향(41)과 일 면(112a)이 이루는 각도가 90°에 근접할수록 큰 진동 세기로 진동하고, 0°에 근접할수록 작은 진동 세기로 진동할 수 있다.The vibration unit 112 may vibrate in a direction orthogonal to one surface 112a as the entire sound is received. The vibration unit 112 may vibrate with an intensity based on an angle formed between the propagation direction 41 of the received sound and one surface 112a. The vibrating unit 112 may vibrate with a larger vibration intensity as the angle formed between the sound propagation direction 41 and the one surface 112a approaches 90°, and may vibrate with a smaller vibration intensity as the angle approaches 0°.

도 4b에 도시된 바와 같이 일 면(112a)에 대해 90°로 전파되는 음향이 수신되는 경우 진동부(112)는 가장 큰 진동 세기로 진동할 수 있다. 또한, 도 4c에 도시된 바와 같이 일 면(112a)에 대해 90°보다 작은 각도로 전파되는 음향이 수신되는 경우 진동부(112)는 도 4b에서보다 작은 진동 세기로 진동할 수 있다. As shown in FIG. 4B, when sound propagating at 90° with respect to the one surface 112a is received, the vibration unit 112 may vibrate with the greatest vibration intensity. In addition, as shown in FIG. 4C, when sound propagating at an angle smaller than 90° with respect to the one surface 112a is received, the vibration unit 112 may vibrate with a smaller vibration intensity than in FIG. 4B.

이와 같은 진동부(112)의 진동 동작에 기인하여 사용자 마이크(또는 진동 구조들)는 음향의 전파 방향(41)을 고려하여 음향 신호 처리 장치 내에 배치될 수 있다. 예를 들어, 사용자 마이크는 사용자 음성이 90°에 근접한 각도로 일 면(112a)에 전파되도록 음향 신호 처리 장치 내에 배치될 수 있다. 다시 말해, 사용자 마이크는 일 면(112a)이 사용자 음성의 발화 지점을 향하도록 배치될 수 있으며, 이러한 배치에 관해서는 도 10a 및 도 10b를 참조하여 후술하도록 한다. Because the vibration unit 112 vibrates in this manner, the user microphone (or its vibration structures) may be disposed in the sound signal processing device in consideration of the sound propagation direction 41. For example, the user microphone may be disposed in the sound signal processing device so that the user's voice propagates to the one surface 112a at an angle close to 90°. In other words, the user microphone may be disposed such that the one surface 112a faces the utterance point of the user's voice; this arrangement is described later with reference to FIGS. 10A and 10B.

도 5는 비교예에 따른 앰비언트 마이크들을 이용한 음향 센싱 방법을 설명하기 위한 도면이다. FIG. 5 is a diagram for explaining a sound sensing method using ambient microphones according to a comparative example.

도 5의 비교예에 따른 음향 센싱 방법은 특정 방향의 음향을 극대화하기 위하여, 복수의 앰비언트 마이크들(510)을 이용할 수 있다. 복수의 앰비언트 마이크들(510)은 소정 간격(D)을 두고 배치되며, 그 간격(D)으로 인해 음향이 각 앰비언트 마이크에 도달하는 시간 또는 위상 지연(phase delay)이 생기고, 그 시간 또는 위상 지연을 보상하는 정도를 다르게 함으로써 전체의 지향성이 조절될 수 있다. 이러한 지향성 조절 방법은 Time Difference of Arrival(TDOA)로 지칭될 수 있다. The sound sensing method according to the comparative example of FIG. 5 may use a plurality of ambient microphones 510 to maximize sound from a specific direction. The ambient microphones 510 are arranged at a predetermined interval D. The interval D causes a time delay or phase delay in the sound arriving at each ambient microphone, and the overall directivity can be adjusted by varying the degree to which that time or phase delay is compensated. This directivity control method may be referred to as Time Difference of Arrival (TDOA).

다만, 전술한 방법은 음향이 각 앰비언트 마이크에 도달하는 시간에 차이가 있다는 것을 전제하는 바, 가청 주파수대역의 파장(wavelength)을 고려하여 간격이 설정되어야 하므로 앰비언트 마이크들(510) 간의 간격 설정에 제약이 있을 수 있다. 간격 설정에 제약이 있기 때문에, 전술한 방법을 수행하는 장치의 소형화에 제약이 있을 수 있다. 특히, 낮은 주파수는 파장의 길이가 길어서, 낮은 주파수의 음향을 구분하기 위해서는 앰비언트 마이크들(510) 간의 간격이 넓고, 각 앰비언트 마이크의 신호 대 잡음비(SNR; signal-to-noise ratio)가 높아야 할 수 있다. However, the above method presupposes a difference in the time at which sound reaches each ambient microphone, so the interval must be set in consideration of the wavelengths of the audible frequency band, which may constrain the spacing between the ambient microphones 510. Because the spacing is constrained, miniaturization of a device that performs the method may also be limited. In particular, a low frequency has a long wavelength, so distinguishing low-frequency sound may require a wide spacing between the ambient microphones 510 and a high signal-to-noise ratio (SNR) for each ambient microphone.

이에 더해, 전술한 방법은 각 앰비언트 마이크에서 센싱되는 음향의 주파수 대역에 따라서 위상(phase)이 다르게 되므로, 각 주파수 별로 위상을 보상해주어야 할 수 있다. 각 주파수 별로 위상을 보상하기 위해서, 전술한 방법은 알맞은 웨이트를 각 주파수 별로 적용하는 복잡한 신호처리 과정이 요구될 수 있다.In addition, since the above method has a different phase depending on the frequency band of the sound sensed by each ambient microphone, the phase may need to be compensated for each frequency. In order to compensate the phase for each frequency, the above-described method may require a complex signal processing process of applying an appropriate weight for each frequency.
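
The constraints of the comparative example can be made concrete with a short sketch. It assumes a far-field plane wave, two microphones, and a speed of sound of 343 m/s; none of these values come from the source, and the function names and the 2 cm spacing are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C

def wavelength_m(freq_hz: float) -> float:
    """Wavelength of a tone at the given frequency."""
    return SPEED_OF_SOUND_M_S / freq_hz

def arrival_delay_s(spacing_m: float, angle_deg: float) -> float:
    """Far-field arrival-time difference between two microphones.

    angle_deg is measured from the line joining the microphones
    (0 degrees means the sound travels along the array axis).
    """
    return spacing_m * math.cos(math.radians(angle_deg)) / SPEED_OF_SOUND_M_S

def phase_delay_rad(spacing_m: float, angle_deg: float, freq_hz: float) -> float:
    """Phase difference produced by that delay at one frequency."""
    return 2.0 * math.pi * freq_hz * arrival_delay_s(spacing_m, angle_deg)

spacing = 0.02  # hypothetical spacing D of 2 cm
for freq in (100.0, 1000.0, 10000.0):
    print(
        f"{freq:7.0f} Hz: wavelength {wavelength_m(freq):6.3f} m, "
        f"phase delay {math.degrees(phase_delay_rad(spacing, 0.0, freq)):7.2f} deg"
    )
```

With a fixed spacing the time delay is the same at every frequency, but the phase delay grows in proportion to frequency and becomes very small at low frequencies. This is why the comparative method needs per-frequency compensation and a wide spacing for low-frequency sound.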

음향 신호 처리 장치는 도 5의 비교예와 달리 마이크들간의 간격에 제약이 없고, 복잡한 신호처리 없이 간단한 연산만으로 방향을 구분하여 특정 방향의 음향을 획득할 수 있다. 이하 도면들을 참고하여 음향 신호 처리 장치의 효율적인 구조 및 운용에 대해 상세히 설명한다.Unlike the comparative example of FIG. 5 , the sound signal processing apparatus has no restrictions on the distance between the microphones and can acquire sound in a specific direction by distinguishing directions with simple calculations without complex signal processing. An efficient structure and operation of the acoustic signal processing apparatus will be described in detail with reference to the following drawings.

도 6은 일 실시예에 따른 사용자 마이크의 지향 패턴을 설명하기 위한 도면이다. FIG. 6 is a diagram for explaining a directivity pattern of a user microphone according to an embodiment.

도 6을 참조하면, 사용자 마이크(110)는 양지향성 지향 패턴(61 및 62)을 가질 수 있다. 예를 들어, 양지향성 지향 패턴(61 및 62)은 사용자 마이크(110)의 전면(+z방향)을 지향하는 전면부(61)와 사용자 마이크(110)의 후면(-z방향)을 지향하는 후면부(62)로 구성되는 8자형(figure-8)의 지향 패턴일 수 있다. Referring to FIG. 6, the user microphone 110 may have a bidirectional directivity pattern 61 and 62. For example, the bidirectional directivity pattern 61 and 62 may be a figure-8 pattern consisting of a front portion 61 directed toward the front (+z direction) of the user microphone 110 and a rear portion 62 directed toward the rear (-z direction) of the user microphone 110.

진동부가 형성하는 일 면(112a)에 음향이 수직으로 전파되는 경우 진동부가 가장 민감하게 반응함으로써 큰 진동 세기로 진동할 수 있다. 따라서, 일 면(112a)에 직교하는 방향인 사용자 마이크(110)의 전면 방향(+z방향) 및 후면 방향(-z방향)에 기초한 지향 패턴이 형성될 수 있다. 이 경우 사용자 마이크(110)는 지향하지 않는 방향(예를 들어 +x방향 및 -x방향)에서 수신되는 음향에 대해서는 반응하는 민감도가 저하될 수 있다. 따라서 사용자 마이크(110)는 지향하지 않는 방향(예를 들어 +x방향 및 -x방향)에서 수신되는 음향을 감쇄시킬 수 있다. When sound propagates perpendicularly to the one surface 112a formed by the vibration unit, the vibration unit responds most sensitively and may vibrate with a large intensity. Accordingly, a directivity pattern may be formed along the front direction (+z direction) and the rear direction (-z direction) of the user microphone 110, which are the directions orthogonal to the one surface 112a. In this case, the user microphone 110 may respond with reduced sensitivity to sound received from directions it is not directed toward (for example, the +x and -x directions). The user microphone 110 may therefore attenuate sound received from those directions.

사용자 마이크(110)의 구조에 따라 한 면에는 음향이 수신되는 것을 블록(block)함으로써 +z방향 또는 -z방향의 단일지향성 패턴이 형성될 수도 있다. 상술한 사용자 마이크(110)의 지향 패턴들은 예시에 불과하며 진동 구조들(또는 진동부들)의 배치에 따라 지향 패턴은 다양하게 변형될 수 있다. Depending on the structure of the user microphone 110, a unidirectional pattern in the +z or -z direction may also be formed by blocking sound from being received on one side. The directivity patterns of the user microphone 110 described above are only examples, and the directivity pattern may be modified in various ways according to the arrangement of the vibration structures (or vibration units).

도 7은 사용자 마이크의 지향 패턴을 측정한 결과를 나타내는 도면이다. FIG. 7 is a diagram showing a result of measuring the directivity pattern of a user microphone.

도 7에 도시된 바와 같이, 사용자 마이크는 다양한 주파수에 대해 균일하게 양지향성의 지향 패턴을 가지는 것을 확인할 수 있다. 즉, 다양한 주파수에 대해 0° 방향과 180° 방향인 도 6의 +z 축 방향 및 -z 축 방향으로의 지향성을 가지고 있음을 확인할 수 있다. As shown in FIG. 7, the user microphone has a uniformly bidirectional directivity pattern across various frequencies. That is, for various frequencies it has directivity in the 0° and 180° directions, which correspond to the +z and -z axis directions of FIG. 6.

도 8은 일 실시예에 따른 음향 신호 처리 장치의 신호 처리를 설명하기 위한 도면이다. FIG. 8 is a diagram for explaining signal processing of a sound signal processing device according to an embodiment.

도 8을 참조하면, 사용자 마이크(110)는 양지향성 지향 패턴(81)을 가지고, 앰비언트 마이크(120)는 전지향성 또는 무지향성 지향 패턴(82)을 가질 수 있다. 사용자 마이크(110)는 앰비언트 마이크(120)가 센싱한 음향의 위상과 동위상(in-phase)인 음향을 양지향성 지향 패턴(81)의 전면 방향(예를 들어, 도 6의 +z방향)으로부터 센싱할 수 있고, 앰비언트 마이크(120)가 센싱한 음향의 위상과 반대위상(anti-phase)인 음향을 후면 방향(예를 들어, 도 6의 -z방향)으로부터 센싱할 수 있다. 다만 도 8에 도시된 사용자 마이크(110)의 지향 패턴은 예시에 불과하며 상술하였듯이 사용자 마이크(110)의 구조 및 진동 구조들(또는 진동부들)의 배치에 따라 지향 패턴은 다양하게 변형될 수 있다. Referring to FIG. 8, the user microphone 110 may have a bidirectional directivity pattern 81, and the ambient microphone 120 may have an omnidirectional (non-directional) directivity pattern 82. The user microphone 110 may sense, from the front direction of the bidirectional directivity pattern 81 (for example, the +z direction in FIG. 6), sound that is in phase with the sound sensed by the ambient microphone 120, and may sense, from the rear direction (for example, the -z direction in FIG. 6), sound that is in anti-phase with the sound sensed by the ambient microphone 120. However, the directivity pattern of the user microphone 110 shown in FIG. 8 is only an example; as described above, it may be modified in various ways according to the structure of the user microphone 110 and the arrangement of the vibration structures (or vibration units).

도 9는 일 실시예에 따른 사용자 마이크 및 앰비언트 마이크의 지향 패턴들을 측정한 그래프이다. FIG. 9 is a graph of measured directivity patterns of a user microphone and an ambient microphone according to an embodiment.

도 9를 참조하면, 사용자 마이크는 양지향성 지향 패턴을 가지고, 앰비언트 마이크는 전지향성(또는 무지향성) 지향 패턴을 가지는 것을 알 수 있다. 예를 들어, 사용자 마이크는 전면(도 6의 +z방향)에 대응하는 330°~30°(사용자 마이크가 형성하는 일 면 기준 60°~120°) 영역으로부터 전달된 음향을 센싱할 수 있고, 후면(도 6의 -z방향)에 대응하는 150°~210°(사용자 마이크가 형성하는 일 면 기준 240°~300°) 영역으로부터 전달된 음향을 센싱할 수 있다. 예를 들어, 사용자 마이크는 30°(사용자 마이크가 형성하는 일 면 기준 120°) 영역에서는 0°(사용자 마이크가 형성하는 일 면 기준 90°) 영역 대비 대략 0.85배의 크기의 음향을 센싱할 수 있다. Referring to FIG. 9, the user microphone has a bidirectional directivity pattern, and the ambient microphone has an omnidirectional (non-directional) directivity pattern. For example, the user microphone can sense sound arriving from the 330° to 30° region (60° to 120° relative to the one surface formed by the user microphone), which corresponds to the front (+z direction in FIG. 6), and sound arriving from the 150° to 210° region (240° to 300° relative to the one surface), which corresponds to the rear (-z direction in FIG. 6). For example, at 30° (120° relative to the one surface) the user microphone can sense sound at approximately 0.85 times the magnitude sensed at 0° (90° relative to the one surface).

앰비언트 마이크는 주변 360° 영역의 모든 방향으로부터 전달된 음향을 센싱할 수 있다.Ambient microphones can sense sound transmitted from all directions in a 360° area around them.

사용자 마이크는 90° 또는 270°(사용자 마이크가 형성하는 일 면 기준 0°)에 근접한 방향에서 수신되는 음향을 감쇄시킬 수 있다. 도 9에 따른 실시예에서 사용자 마이크는 60° 내지 120°의 방향에서 수신되는 음향에 대해서는 낮은 민감도로 반응하므로 해당 방향의 음향을 감쇄시킬 수 있다.The user's microphone may attenuate sound received from a direction close to 90° or 270° (0° based on one side formed by the user's microphone). In the embodiment of FIG. 9 , the user's microphone reacts with low sensitivity to sound received in a direction of 60° to 120°, and thus can attenuate sound in a corresponding direction.
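
The measured patterns can be compared with an idealised model. The sketch below assumes a pure cosine figure-8 response for the user microphone and a constant response for the ambient microphone; the source reports measurements only and does not state these formulas. The function names are hypothetical, and angles follow the FIG. 9 convention in which 0° is the +z direction.

```python
import math

def user_mic_response(theta_deg: float) -> float:
    """Assumed ideal figure-8 response, cos(theta).

    A positive value means in phase with the ambient microphone (front lobe),
    a negative value means anti-phase (rear lobe), and zero is a null.
    """
    return math.cos(math.radians(theta_deg))

def ambient_mic_response(theta_deg: float) -> float:
    """Assumed ideal omnidirectional response, equal in every direction."""
    return 1.0

def angle_to_surface_deg(theta_deg: float) -> float:
    """Convert a FIG. 9 angle to the angle relative to the one surface (0 deg -> 90 deg)."""
    return (theta_deg + 90.0) % 360.0

for theta in (0.0, 30.0, 90.0, 180.0, 270.0, 330.0):
    print(
        f"theta {theta:5.1f} deg (surface {angle_to_surface_deg(theta):5.1f} deg): "
        f"user {user_mic_response(theta):+.3f}, ambient {ambient_mic_response(theta):+.3f}"
    )
```

In this idealised model the response at 30° is cos 30°, about 0.87 of the on-axis value, which is close to the roughly 0.85 figure reported for FIG. 9, and the response vanishes at 90° and 270°.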

도 9에서는 하나의 주파수에 대한 결과만을 도시했으나, 도 7에서 전술하였듯이, 사용자 마이크는 다양한 주파수에 대해 균일한 민감도를 가질 수 있으므로, 다양한 주파수에 대한 결과들도 유사한 형태의 지향 패턴을 형성할 수 있음은 물론이다. 예를 들어, 다양한 주파수들은 가청 주파수 영역의 주파수들일 수 있고, 사용자 마이크에 대해 주파수의 고저와 무관하게 유사한 형태의 지향 패턴이 형성될 수 있다. Although FIG. 9 shows the result for only one frequency, the user microphone can have uniform sensitivity across various frequencies, as described above with reference to FIG. 7, so the results for other frequencies form directivity patterns of a similar shape. For example, the various frequencies may be frequencies in the audible range, and a directivity pattern of a similar shape may be formed for the user microphone regardless of whether the frequency is high or low.

도 10a 및 도 10b는 사용자 음성의 발화 지점에 대한 진동부의 배치를 도시한 도면들이다. FIGS. 10A and 10B are diagrams illustrating the arrangement of the vibration unit with respect to the utterance point of the user's voice.

도 10a 및 도 10b를 참조하면, 사용자 음성의 발화 지점(42)으로부터 전파된 사용자 음성이 진동부(112)가 형성하는 일 면(112a)에 수신될 수 있다. Referring to FIGS. 10A and 10B, the user's voice propagated from the utterance point 42 may be received on the one surface 112a formed by the vibration unit 112.

도 10a에 도시된 바와 같이 사용자 음성의 전파 방향과 진동부(112)가 형성하는 일 면(112a)이 직교하는 경우 진동부(112)가 가장 민감하게 반응하고, 사용자 음성이 가장 크게 센싱될 수 있다. 따라서, 사용자 마이크는 진동부(112)(또는 복수의 진동부들)가 형성하는 일 면(112a)이 사용자 음성의 발화 지점(42)에 대응되는 방향으로 배치되도록 음향 신호 처리 장치 내에 배치될 수 있다. As shown in FIG. 10A, when the propagation direction of the user's voice is orthogonal to the one surface 112a formed by the vibration unit 112, the vibration unit 112 responds most sensitively and the user's voice is sensed at its largest magnitude. Accordingly, the user microphone may be disposed in the sound signal processing device so that the one surface 112a formed by the vibration unit 112 (or the plurality of vibration units) is oriented in a direction corresponding to the utterance point 42 of the user's voice.

다시 말해, 사용자 마이크는 진동부(112)(또는 복수의 진동부들)가 형성하는 일 면(112a)과 사용자 음성의 발화 지점(42)으로부터 일 면(112a)으로 향하는 방향이 서로 대응되도록(바람직하게는 90°를 이루도록) 배치될 수 있다. In other words, the user microphone may be disposed such that the one surface 112a formed by the vibration unit 112 (or the plurality of vibration units) and the direction from the utterance point 42 of the user's voice toward the one surface 112a correspond to each other (preferably forming an angle of 90°).

한편, 일 면(112a)과 사용자 음성의 전파 방향이 이루는 각도가 90°인 경우 가장 큰 민감도로 음향이 센싱될 수 있지만, 공정상 또는 사용상의 여러 제약으로 인해 그 각도가 90°로 유지되기 어려울 수 있다. 예를 들어, 도 10b에 도시된 바와 같이 사용자 음성의 전파 방향과 일 면(112a)이 90° 미만의 각도를 이룰 수 있다. 다만, 이 경우에도 도 9에서 상술하였듯이 사용자 마이크는 사용자 음성을 센싱할 수 있다. Sound is sensed with the greatest sensitivity when the angle between the one surface 112a and the propagation direction of the user's voice is 90°, but various constraints in manufacturing or use may make it difficult to keep the angle at 90°. For example, as shown in FIG. 10B, the propagation direction of the user's voice and the one surface 112a may form an angle of less than 90°. Even in this case, the user microphone can sense the user's voice, as described above with reference to FIG. 9.

사용자 마이크는 공정상 및 사용상의 유연성을 확보하며 사용자 음성을 효과적으로 센싱하기 위한 각도로 음향 신호 처리 장치 내에 배치될 수 있다. 사용자 마이크는 진동부(112)(또는 복수의 진동부들)가 형성하는 일 면(112a)과 사용자 음성의 발화 지점(42)으로부터 일 면(112a)으로 향하는 방향이 60° 내지 120°를 이루도록 음향 신호 처리 장치에 배치될 수 있다. 도 9에 상술하였듯이 사용자 마이크가 60° 또는 120°로 음향을 수신하더라도 90°로 수신하는 것 대비 대략 0.85배의 크기로 음향을 수신할 수 있다. 따라서, 60° 내지 120°는 공정상 및 사용상의 유연성을 제공하며 사용자 음성을 센싱하기에 충분한 각도일 수 있다. The user microphone may be disposed in the sound signal processing device at an angle that secures flexibility in manufacturing and use while still sensing the user's voice effectively. The user microphone may be disposed such that the one surface 112a formed by the vibration unit 112 (or the plurality of vibration units) and the direction from the utterance point 42 of the user's voice toward the one surface 112a form an angle of 60° to 120°. As described above with reference to FIG. 9, even when the user microphone receives sound at 60° or 120°, it receives the sound at approximately 0.85 times the magnitude received at 90°. An angle of 60° to 120° therefore provides flexibility in manufacturing and use while remaining sufficient for sensing the user's voice.

이와 같이 사용자 마이크가 사용자 음성의 발화 지점(42)을 지향하도록 배치된 경우, 사용자 음성의 발화 지점(42)과 이격된 곳에서 발생하는 외부 음향에 대해서는 낮은 민감도로 반응할 수 있다. 따라서, 사용자 마이크는 외부 음향을 감쇄시킬 수 있다. When the user microphone is disposed so as to be directed toward the utterance point 42 of the user's voice in this way, it may respond with low sensitivity to external sound generated at locations away from the utterance point 42. The user microphone can therefore attenuate the external sound.

이러한 사용자 마이크의 배치가 음향 신호 처리 장치에 적용되는 실시예에 관해서는 도 17c를 참조하여 도시적으로 설명하도록 한다.An embodiment in which the arrangement of the user microphone is applied to the sound signal processing device will be illustrated with reference to FIG. 17C.

도 11은 일 실시예에 따른 음향 조정부의 음향 조정 과정을 도시한 도면이다. FIG. 11 is a diagram illustrating a sound adjustment process of a sound control unit according to an embodiment.

도 11을 참조하면, 2개의 시간 프레임들 각각에서, 상이한 주파수 대역을 센싱하는 3개의 진동 구조들에 의해 생성된 전기적 음향 신호 프레임들(1110a 내지 1110f)이 도시된다. 음향 신호 프레임들은 음향 조정부(1100)에 입력되며, 음향 조정부(1100)는 각 진동 구조에 하나씩 포함되거나, 사용자 마이크에 하나가 포함될 수도 있다. Referring to FIG. 11, electrical sound signal frames 1110a to 1110f, generated in each of two time frames by three vibration structures that sense different frequency bands, are shown. The sound signal frames are input to the sound control unit 1100; one sound control unit 1100 may be included in each vibration structure, or a single sound control unit 1100 may be included in the user microphone.

사용자 마이크의 음향 조정부(1100)는 임계값에 기초하여, 진동 구조들에 의해 생성된 전기적 신호들 중 감쇄시킬 전기적 신호를 결정할 수 있다. 음향 조정부(1100)는 결정된 전기적 신호를 감쇄시킬 수 있다. 여기서, 감쇄되는 전기적 신호는 외부 음향에 대응되는 신호일 수 있다. 음향 조정부(1100)에 의해 외부 음향에 대응되는 신호가 감쇄됨에 따라 사용자 음성이 극대화될 수 있다.The sound control unit 1100 of the user's microphone may determine an electrical signal to be attenuated among electrical signals generated by the vibration structures based on the threshold value. The sound controller 1100 may attenuate the determined electrical signal. Here, the attenuated electrical signal may be a signal corresponding to external sound. As a signal corresponding to external sound is attenuated by the sound controller 1100, the user's voice can be maximized.

“Frame 0”는 첫째 시간 구간에서 측정한 음향 신호 프레임을 나타낸다. “Frame j”는 상기 첫째 시간 구간 이후, j번째 시간 구간에서 측정한 음향 신호 프레임을 나타낸다. 제1 내지 제3 음향 신호 프레임들(1110a 내지 1110c)은 동일 시간 구간(첫째 시간 구간)에서 측정한 프레임들이고, 제4 내지 제6 음향 신호 프레임들(1110d 내지 1110f)도 동일 시간 구간(j번째 시간 구간)에서 측정한 프레임이다.“Frame 0” represents a sound signal frame measured in the first time interval. “Frame j” represents a sound signal frame measured in the j-th time interval after the first time interval. The first to third sound signal frames 1110a to 1110c are frames measured in the same time period (the first time period), and the fourth to sixth sound signal frames 1110d to 1110f are also the same time period (the jth time period). It is the frame measured in the time interval).

제1 및 제4 음향 신호 프레임(1110a, 1110d)은 동일 주파수 대역에 있고, 동일 진동 구조를 통해 음향 조정부(1100)에 입력될 수 있다. 제2 및 제5 음향 신호 프레임(1110b, 1110e)은 동일 주파수 대역에 있고, 동일 진동 구조를 통해 음향 조정부(1100)에 입력될 수 있다. 제3 및 제6 음향 신호 프레임(1110c, 1110f)은 동일 주파수 대역에 있고, 동일 진동 구조를 통해 음향 조정부(1100)에 입력될 수 있다. 제1 및 제4 음향 신호 프레임(1110a, 1110d)의 주파수 대역과 제2 및 제5 음향 신호 프레임(1110b, 1110e)의 주파수 대역과 제3 및 제6 음향 신호 프레임(1110c, 1110f)의 주파수 대역은 상이하다. The first and fourth sound signal frames 1110a and 1110d are in the same frequency band and may be input to the sound control unit 1100 through the same vibration structure. The second and fifth sound signal frames 1110b and 1110e are in the same frequency band and may be input to the sound control unit 1100 through the same vibration structure. The third and sixth sound signal frames 1110c and 1110f are in the same frequency band and may be input to the sound control unit 1100 through the same vibration structure. The frequency band of the first and fourth sound signal frames 1110a and 1110d, that of the second and fifth sound signal frames 1110b and 1110e, and that of the third and sixth sound signal frames 1110c and 1110f are different from one another.

도 11에서 “Drop”은 음향 조정부(1100)가 입력된 음향 신호를 감쇄시킬 음향 신호로 결정한 경우를 나타내고, “Add”는 음향 조정부(1100)가 입력된 음향 신호를 감쇄시키지 않는 경우를 나타낸다.In FIG. 11, “Drop” indicates a case in which the sound control unit 1100 determines the input sound signal as an audio signal to be attenuated, and “Add” indicates a case in which the sound control unit 1100 does not attenuate the input sound signal.

도 11을 참조하면, 제1 내지 제4 음향 신호 프레임(1110a 내지 1110d)의 경우처럼 음향 신호의 세기가 임계값(T)이하이거나 임계값(T)을 초과하더라도 초과 정도가 설정된 값 이하인 경우, 음향 조정부(1100)는 해당 음향 신호를 감쇄시킬 수 있다(Drop).Referring to FIG. 11, as in the case of the first to fourth sound signal frames 1110a to 1110d, even if the intensity of the sound signal is less than or equal to the threshold value T or exceeds the threshold value T, if the degree of excess is less than or equal to a set value, The sound control unit 1100 may attenuate (drop) the corresponding sound signal.

반면, 제5 및 제6 음향 신호 프레임(1110e, 1110f)과 같이 음향 신호의 세기가 임계값(T)을 초과하고, 초과 정도가 기 설정된 값을 초과하는 경우, 음향 조정부(1100)는 해당 음향 신호를 감쇄시키지 않을 수 있다(Add).On the other hand, as in the fifth and sixth sound signal frames 1110e and 1110f, when the intensity of the sound signal exceeds the threshold value T and the excess exceeds a preset value, the sound controller 1100 controls the corresponding sound signal. The signal may not be attenuated (Add).
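
The Drop/Add decision described above can be sketched as follows. This is a minimal illustration, not the sound control unit's actual implementation: using the peak absolute sample as the "intensity", the names `decide` and `adjust`, the gain applied to dropped frames, and all numeric values are assumptions.

```python
def decide(intensity: float, threshold: float, margin: float) -> str:
    """Return "Add" only when intensity exceeds threshold by more than margin."""
    return "Add" if intensity - threshold > margin else "Drop"

def adjust(frame: list[float], threshold: float, margin: float, drop_gain: float = 0.0) -> list[float]:
    """Attenuate a frame marked "Drop"; pass a frame marked "Add" unchanged.

    Intensity is taken as the peak absolute sample value (an assumption).
    """
    peak = max(abs(sample) for sample in frame)
    if decide(peak, threshold, margin) == "Drop":
        return [drop_gain * sample for sample in frame]
    return list(frame)

# Hypothetical frames: one below T, one barely above T, one well above T.
T, MARGIN = 0.5, 0.1
frames = {"below": [0.1, -0.2, 0.15], "barely": [0.3, -0.55, 0.2], "above": [0.9, -0.7, 0.4]}
for name, frame in frames.items():
    peak = max(abs(s) for s in frame)
    print(name, decide(peak, T, MARGIN), adjust(frame, T, MARGIN))
```

A frame that exceeds the threshold by no more than the margin is still dropped, which mirrors the two-part condition given for frames 1110a to 1110d.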

음향 조정부(1100)의 출력 결과는 예를 들어 증폭부 등을 거쳐 프로세서(130)에 전달될 수 있다.An output result of the sound control unit 1100 may be delivered to the processor 130 through, for example, an amplification unit.

도 12는 일 실시예에 따른 사용자 마이크로부터 생성된 사용자 음성 신호를 도시한 도면이다. FIG. 12 is a diagram illustrating a user voice signal generated by a user microphone according to an embodiment.

도 12를 참조하면, 도 5의 비교예에 따른 방법으로 사용자 음성을 센싱한 결과를 나타낸 제1 그래프(1210) 및 사용자 마이크가 사용자 음성을 센싱한 결과를 나타낸 제2 그래프(1220)가 도시된다.Referring to FIG. 12, a first graph 1210 showing the result of sensing the user's voice by the method according to the comparative example of FIG. 5 and a second graph 1220 showing the result of sensing the user's voice by the user's microphone are shown. .

제1 그래프(1210)는 도 5의 비교예에 따라 복수의 앰비언트 마이크들을 이용하여 외부 음향을 감쇄한 결과를 나타낸다. 제1 그래프(1210)에는 사용자 음성에 대응되는 신호(1210a) 및 외부 음향에 대응되는 신호(1210b)가 나타난다. 외부 음향에 대응되는 신호(1210b)는 사용자 음성에 대응되는 신호(1210a)보다는 감쇄된 것으로 확인되나 센싱될 수 있을 정도의 신호가 남아있음이 확인된다. The first graph 1210 shows the result of attenuating external sound using a plurality of ambient microphones according to the comparative example of FIG. 5. The first graph 1210 contains a signal 1210a corresponding to the user's voice and a signal 1210b corresponding to the external sound. The signal 1210b corresponding to the external sound is attenuated relative to the signal 1210a corresponding to the user's voice, but enough of it remains to be sensed.

제2 그래프(1220)는 사용자 마이크가 외부 음향 신호를 감쇄시킴으로써 생성한 사용자 음성 신호를 나타낸다. 제2 그래프(1220)에는 사용자 음성에 대응되는 신호(1220a) 및 외부 음향에 대응되는 신호(1220b)가 나타난다. 제2 그래프(1220)에서 외부 음향에 대응되는 신호(1220b)가 확연히 감쇄된 것으로 확인된다. 제2 그래프(1220)에서 외부 음향에 대응되는 신호(1220b)는 센싱되기 어려운 정도로서 무음에 가까운 수준으로 감쇄된 것이 확인된다.A second graph 1220 shows a user's voice signal generated by the user's microphone attenuating an external sound signal. A signal 1220a corresponding to the user's voice and a signal 1220b corresponding to external sound appear in the second graph 1220. In the second graph 1220, it is confirmed that the signal 1220b corresponding to the external sound is significantly attenuated. In the second graph 1220, it is confirmed that the signal 1220b corresponding to the external sound is attenuated to a level close to silence as it is difficult to sense.

사용자 마이크는 진동 구조들이 갖는 지향성에 기초한, 사용자 음성의 발화 지점을 향한 배치를 통해 외부 음향을 감쇄시킬 수 있다. 또는, 사용자 마이크는 임계값에 기초하여 진동 구조들에 의해 생성된 신호들 중 일부 신호를 감쇄시킴으로써 외부 음향을 감쇄시킬 수 있다. 결과적으로, 사용자 마이크는 상술한 두가지 방법들 중 하나 또는 두가지 방법을 모두 사용함으로써 외부 음향 신호를 감쇄시키고, 사용자 음성 신호를 생성할 수 있다. The user microphone may attenuate external sound by being oriented toward the utterance point of the user's voice, taking advantage of the directivity of the vibration structures. Alternatively, the user microphone may attenuate external sound by attenuating some of the signals generated by the vibration structures based on the threshold value. As a result, the user microphone can attenuate the external sound signal and generate the user voice signal by using either or both of these two methods.

도 13은 일 실시예에 따른 차분 연산 방법을 설명하기 위한 도면이다. FIG. 13 is a diagram for explaining a difference operation method according to an embodiment.

도 13을 참조하면, 앰비언트 마이크(120)로부터 생성된 전체 음향 신호 및 사용자 마이크(110)로부터 생성된 사용자 음성 신호가 프로세서(130)에 입력될 수 있다. 프로세서(130)는 입력된 신호들에 대한 연산을 통해 외부 음향 신호를 생성할 수 있다.Referring to FIG. 13 , the entire sound signal generated from the ambient microphone 120 and the user voice signal generated from the user microphone 110 may be input to the processor 130 . The processor 130 may generate external sound signals through calculations on input signals.

전체 음향은 외부 음향 및 사용자 음성을 포함하므로, 전체 음향에 대응되는 전체 음향 신호는 외부 음향에 대응되는 신호 및 사용자 음성에 대응되는 신호를 포함할 수 있다. 전체 음향 신호는 어느 종류의 음향도 감쇄되거나 강조되지 않은 신호일 수 있다. 사용자 음성 신호는 사용자 음성이 높은 민감도로 센싱되고 외부 음향이 낮은 민감도로 센싱됨으로써 전체 음향에서 외부 음향이 감쇄된 신호일 수 있다.Since the overall sound includes the external sound and the user's voice, the overall sound signal corresponding to the overall sound may include a signal corresponding to the external sound and a signal corresponding to the user's voice. The total acoustic signal may be a signal in which no kind of sound is attenuated or enhanced. The user voice signal may be a signal obtained by attenuating the external sound from the overall sound by sensing the user voice with high sensitivity and the external sound with low sensitivity.

따라서, 프로세서(130)는 전체 음향 신호로부터 사용자 음성 신호를 차분 연산함으로써 전체 음향 신호에서 사용자 음성에 대응되는 신호는 감쇄되고 외부 음향에 대응되는 신호는 유지되는 신호를 생성할 수 있다. 이와 같이 프로세서(130)는 외부 음향에 대응되는 신호가 강조된 외부 음향 신호를 생성할 수 있다. Accordingly, by performing a difference operation that subtracts the user voice signal from the entire sound signal, the processor 130 may generate a signal in which the component of the entire sound signal corresponding to the user's voice is attenuated while the component corresponding to the external sound is retained. In this way, the processor 130 may generate an external sound signal in which the signal corresponding to the external sound is emphasized.
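
The difference operation can be written out as a short worked relation. The symbols below are introduced only for illustration and do not appear in the source: $s(t)$ is the user's voice, $n(t)$ is the external sound, $a(t)$ is the entire sound signal from the ambient microphone 120, $u(t)$ is the user voice signal from the user microphone 110, and $\varepsilon$ is an assumed small residual sensitivity of the user microphone to external sound. The relation also assumes, as an idealisation, that both microphones pick up the voice with the same gain and timing.

```latex
\begin{align*}
a(t) &= s(t) + n(t) \\
u(t) &\approx s(t) + \varepsilon\, n(t), \qquad 0 \le \varepsilon \ll 1 \\
e(t) &= a(t) - u(t) \approx (1 - \varepsilon)\, n(t)
\end{align*}
```

Under these assumptions the voice term cancels and the result $e(t)$ is dominated by the external sound. In practice the gain and timing of the voice differ between the two microphones, which is the mismatch that the adaptive filter of FIG. 14 is described as compensating.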

도 14는 도 13의 차분 연산 방법의 일 예를 설명하기 위한 도면이다. FIG. 14 is a diagram for explaining an example of the difference operation method of FIG. 13.

도 14를 참조하면, 사용자 마이크(110)로부터 생성된 사용자 음성 신호가 적응 필터(1400)에 입력될 수 있다. 앰비언트 마이크(120)로부터 생성된 전체 음향 신호로부터, 적응 필터(1400)의 출력 신호가 차분 연산되고, 차분 연산의 결과인 피드백 신호가 적응 필터(1400)에 입력될 수 있다. 피드백된 적응 필터(1400)의 출력 신호가 전체 음향 신호로부터 차분 연산됨으로써 최종적으로 외부 음향 신호가 생성될 수 있다.Referring to FIG. 14 , a user voice signal generated from a user microphone 110 may be input to an adaptive filter 1400 . An output signal of the adaptive filter 1400 is differentially calculated from the entire acoustic signal generated by the ambient microphone 120, and a feedback signal resulting from the differential operation may be input to the adaptive filter 1400. The output signal of the feedbacked adaptive filter 1400 is differentially calculated from the entire acoustic signal, so that an external acoustic signal may be finally generated.

적응 필터(1400)는 피드백 신호에 기초하여 파라미터들을 조정할 수 있다. 여기서, 파라미터들은 차분 연산 결과 전체 음향 신호에서 사용자 음성이 감쇄될 수 있도록 조정될 수 있다. 적응 필터(1400)는 예를 들어, 오차 신호를 최소화하기 위한 최소 자승 평균(LMS) 알고리즘, 필터링된-X LMS(FXLMS) 알고리즘, 필터링된-오차 LMS(FELMS) 알고리즘, 경사 하강(steepest descent) 알고리즘 또는 재귀형 최소 자승(recursive least square, RLS) 알고리즘 등과 같은 다양한 알고리즘에 따라 동작할 수 있다. 파라미터들은 예를 들어, 신호들 간의 상관계수, 신호들의 딜레이 또는 신호들의 진폭에 관한 파라미터들을 포함할 수 있다. 상관계수는 스피어만 상관계수(Spearman correlation coefficient), 크론바흐 알파 계수(Cronbach's alpha) 또는 피어슨 상관계수(Pearson correlation coefficient) 등을 포함할 수 있다. The adaptive filter 1400 may adjust its parameters based on the feedback signal. Here, the parameters may be adjusted so that the user's voice is attenuated in the entire sound signal as a result of the difference operation. The adaptive filter 1400 may operate according to various algorithms for minimizing the error signal, such as the least mean squares (LMS) algorithm, the filtered-X LMS (FXLMS) algorithm, the filtered-error LMS (FELMS) algorithm, the steepest descent algorithm, or the recursive least squares (RLS) algorithm. The parameters may include, for example, parameters relating to a correlation coefficient between the signals, a delay of the signals, or an amplitude of the signals. The correlation coefficient may include the Spearman correlation coefficient, Cronbach's alpha, or the Pearson correlation coefficient.

프로세서는 입력된 신호들에 대한 연산을 통해 외부 음향 신호를 생성할 수 있다. 프로세서는 사용자 음성 신호가 적응 필터(1400)에 입력됨에 따라 적응 필터(1400)로부터 출력되는 신호를 전체 음향 신호로부터 차분 연산함으로써 피드백 신호를 생성할 수 있다. 프로세서는 피드백 신호를 적응 필터(1400)에 입력함으로써 파라미터들을 조정하도록 적응 필터(1400)를 제어할 수 있다. 프로세서는 피드백 신호가 인가됨에 따라 파라미터들이 조정된 적응 필터(1400)로부터의 출력 신호를 전체 음향 신호로부터 차분 연산함으로써 최종적으로 사용자 음성에 대응되는 신호가 감쇄된 외부 음향 신호를 생성할 수 있다.The processor may generate an external acoustic signal through an operation on input signals. As the user voice signal is input to the adaptive filter 1400, the processor may generate a feedback signal by differentially calculating a signal output from the adaptive filter 1400 from all sound signals. The processor may control the adaptive filter 1400 to adjust the parameters by inputting a feedback signal to the adaptive filter 1400. As the feedback signal is applied, the processor may differentially calculate the output signal from the adaptive filter 1400, the parameters of which are adjusted, from the entire acoustic signal, thereby finally generating an external acoustic signal in which the signal corresponding to the user's voice is attenuated.
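
The feedback loop of FIG. 14 can be sketched with a plain LMS update, one of the algorithms the description lists. This is a minimal illustration under stated assumptions, not the patented implementation: the tap count, step size, function names, and the synthetic test signals are all hypothetical.

```python
import math

def lms_external_sound(total, user_voice, num_taps=4, step=0.02):
    """Estimate the external sound by adaptively subtracting the user voice.

    total      : samples of the entire sound signal (ambient microphone)
    user_voice : samples of the user voice signal (user microphone)
    Returns (external_estimate, final_weights).
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent user-voice samples, newest first
    external = []
    for d, x in zip(total, user_voice):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # adaptive filter output
        e = d - y  # difference operation; e also serves as the feedback signal
        weights = [w + step * e * h for w, h in zip(weights, history)]  # LMS update
        external.append(e)
    return external, weights

def mse(a, b):
    """Mean squared difference between two equal-length sequences."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

# Synthetic signals (hypothetical): the voice reaches the ambient microphone
# with gain 0.8, and the external sound is a weaker, lower-frequency tone.
n = 3000
voice = [math.sin(2 * math.pi * 0.05 * k) for k in range(n)]
outside = [0.3 * math.sin(2 * math.pi * 0.013 * k) for k in range(n)]
total = [0.8 * v + o for v, o in zip(voice, outside)]

external, weights = lms_external_sound(total, voice)
print("error before filtering:", mse(total[-1000:], outside[-1000:]))
print("error after filtering: ", mse(external[-1000:], outside[-1000:]))
```

The step size trades convergence speed against residual fluctuation of the weights; a smaller step gives a cleaner estimate once converged but takes longer to adapt.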

다른 실시예에서, 음향 신호 처리 장치는 적응 필터(1400)를 사용하지 않고 뉴럴 네트워크(Neural network) 연산을 통해 차분 연산을 수행할 수도 있다. 예를 들어, 음향 신호 처리 장치는 CNN(Convolution neural network) 연산, DNN(Deep neural network) 연산 또는 RNN(Recurrent neural network) 연산 등을 통해 차분 연산을 수행할 수 있다. 다만, 음향 신호 처리 장치에 채용되는 뉴럴 네트워크의 종류는 이에 제한되지 않는다.In another embodiment, the acoustic signal processing apparatus may perform a difference operation through a neural network operation without using the adaptive filter 1400 . For example, the sound signal processing apparatus may perform a difference operation through a convolution neural network (CNN) operation, a deep neural network (DNN) operation, or a recurrent neural network (RNN) operation. However, the type of neural network employed in the acoustic signal processing apparatus is not limited thereto.

도 15는 일 실시예에 따른 외부 음향 신호를 도시한 도면이다. FIG. 15 is a diagram illustrating an external sound signal according to an embodiment.

도 15를 참조하면, 도 5의 비교예에 따른 방법으로 출력된 음향 신호를 나타낸 제1 그래프(1510) 및 음향 신호 처리 장치로부터 생성된 외부 음향 신호를 나타낸 제2 그래프(1520)가 도시된다. 제1 그래프(1510)에는 외부 음향에 대응되는 신호(1510b) 및 사용자 음성에 대응되는 신호(1510a)가 나타난다. 제2 그래프(1520)에 또한 외부 음향에 대응되는 신호(1520b) 및 사용자 음성에 대응되는 신호(1520a)가 나타난다.Referring to FIG. 15 , a first graph 1510 showing sound signals output by the method according to the comparative example of FIG. 5 and a second graph 1520 showing external sound signals generated from the sound signal processing device are shown. A signal 1510b corresponding to an external sound and a signal 1510a corresponding to a user's voice appear in the first graph 1510 . A signal 1520b corresponding to the external sound and a signal 1520a corresponding to the user's voice are also shown in the second graph 1520.

제1 그래프(1510)는 도 5의 비교예에 따라 복수의 앰비언트 마이크들을 이용하여 출력한 음향 신호를 나타낸다. 앰비언트 마이크에 근접한 지점에서 발화된 사용자 음성에 대응되는 신호(1510a)가 강조되며 외부 음향에 대응되는 신호(1510b)는 감쇄되어 나타난 것이 확인된다. 제1 그래프(1510)에 따르면 도 5의 비교예에서는 외부 음향이 명확히 센싱되지 않고, 외부 음향에 따른 기능들 또한 수행되기 어렵다.A first graph 1510 shows sound signals output using a plurality of ambient microphones according to the comparative example of FIG. 5 . It is confirmed that the signal 1510a corresponding to the user's voice uttered at a point close to the ambient microphone is emphasized and the signal 1510b corresponding to the external sound is attenuated. According to the first graph 1510, in the comparative example of FIG. 5, external sound is not clearly sensed, and it is difficult to perform functions according to external sound.

제2 그래프(1520)는 음향 신호 처리 장치가 생성한 외부 음향 신호를 나타낸다. 사용자 마이크 및 앰비언트 마이크와 사용자 음성의 발화 지점이 근접함에도 불구하고 사용자 음성에 대응되는 신호(1520a)가 감쇄된 것이 확인된다. 반면에 외부 음향에 대응되는 신호(1520b)는 강조된 것이 확인된다. 제2 그래프(1520)에 따르면 음향 신호 처리 장치는 사용자 음성을 배제하며 외부 음향을 명확히 센싱할 수 있고, 이에 따라 외부 음향에 응답하여 대응되는 기능을 수행할 수 있다. The second graph 1520 shows the external sound signal generated by the sound signal processing device. The signal 1520a corresponding to the user's voice is attenuated even though the utterance point of the user's voice is close to the user microphone and the ambient microphone. In contrast, the signal 1520b corresponding to the external sound is emphasized. According to the second graph 1520, the sound signal processing device can clearly sense the external sound while excluding the user's voice, and can therefore perform a corresponding function in response to the external sound.

In the first graph 1510, the signal 1510b corresponding to the external sound was measured at -34.45 dB and the signal 1510a corresponding to the user voice at -17.76 dB. In the second graph 1520, the signal 1520b corresponding to the external sound was measured at -19.98 dB and the signal 1520a corresponding to the user voice at -25.41 dB. Accordingly, the level of the external-sound signals 1510b and 1520b relative to the user-voice signals 1510a and 1520a is -16.69 dB in the first graph 1510 and 5.43 dB in the second graph 1520. This value indicates how strongly the external sound is emphasized relative to the user voice, and it differs by 22.12 dB between the first graph 1510 and the second graph 1520. The sound signal processing apparatus is thus confirmed to attenuate the user voice and emphasize the external sound by more than 22 dB compared with the comparative example of FIG. 5.
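The arithmetic behind these figures can be reproduced with a short script. This is only an illustrative check of the values quoted for FIG. 15; the variable and function names are hypothetical and do not come from the disclosure.

```python
# Levels in dB as quoted for FIG. 15 (names below are illustrative only).
comparative = {"external": -34.45, "user": -17.76}  # first graph 1510
proposed = {"external": -19.98, "user": -25.41}     # second graph 1520


def external_over_user(levels):
    """Level of the external sound relative to the user voice, in dB."""
    return levels["external"] - levels["user"]


ratio_comparative = external_over_user(comparative)
ratio_proposed = external_over_user(proposed)
# Improvement of the apparatus over the comparative example, in dB.
improvement = ratio_proposed - ratio_comparative

print(round(ratio_comparative, 2), round(ratio_proposed, 2), round(improvement, 2))
```

Because the levels are already logarithmic, the relative emphasis is a simple difference of dB values, and the improvement is the difference of those two differences.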

FIGS. 16A and 16B are diagrams illustrating displays according to embodiments.

The sound signal processing apparatus may further include a display 1600 that outputs visual information. The display 1600 may display various kinds of visual information under the control of the processor. The processor may perform a function corresponding to the user voice signal or a function corresponding to the external sound signal, and may display the result of performing the function on the display 1600. When the processor performs both a function corresponding to the user voice signal and a function corresponding to the external sound signal, it may display the result of each function in a different region of the display 1600.

Referring to FIG. 16A, the display 1600 may include a first region 1600a and a second region 1600b within a single frame. For example, the display 1600 may display the result of a function corresponding to the user voice signal in the first region 1600a and the result of a function corresponding to the external sound signal in the second region 1600b.

Referring to FIG. 16B, the display 1600 may include a first region 1600a and a second region 1600b formed as separate frames. For example, the display 1600 may display the result of a function corresponding to the user voice signal in the first region 1600a and the result of a function corresponding to the external sound signal in the second region 1600b.

FIGS. 17A to 17C are diagrams illustrating embodiments in which the sound signal processing apparatus is a glasses-type wearable device.

Referring to FIG. 17A, the sound signal processing apparatus 100 is a glasses-type wearable device and may include a glasses frame 1700. The glasses frame 1700 may include a bridge 1700a, rims 1700b, and temples 1700c.

The user microphone and the ambient microphone may be disposed on the glasses frame 1700, at various positions chosen according to the sound to be received. For example, the user microphone may be disposed on the bridge 1700a or the rims 1700b so that it receives the user voice from a closer position. The ambient microphone may be disposed on the rims 1700b or the temples 1700c.

Although FIG. 17A shows the sound signal processing apparatus 100 as a glasses-type wearable device, this is only an example. The sound signal processing apparatus 100 may be a wearable device of various other forms, such as a watch or bracelet worn on the wrist, a necklace worn around the neck, or earphones or headphones worn on the ears. Any form that the user can wear is applicable without limitation.

Referring to FIG. 17B, the user microphone 110 may be disposed on the bridge 1700a of the sound signal processing apparatus 100, and the ambient microphone 120 may be disposed on a temple 1700c of the sound signal processing apparatus 100.

Because the utterance point of the user voice corresponds to the user's mouth or lips, the user microphone 110 may be disposed on the bridge 1700a so as to face the utterance point. The ambient microphone 120 may be disposed on a temple 1700c so that it receives external sound from the user's side more effectively and is farther from the utterance point of the user voice. As described above, however, the microphones may be disposed at various positions on the glasses frame 1700.

FIG. 17C shows the user voice propagating from the utterance point 42 of the user voice to the user microphone 110.

The utterance point 42 of the user voice may be a position corresponding to the user's mouth or lips. The user voice propagates to the user microphone 110 and may be received on the one surface 112a formed by the vibrating unit 112 of the user microphone 110. When the user voice propagates perpendicularly to the one surface 112a formed by the vibrating unit 112, the user microphone 110 senses it with the highest sensitivity.

Accordingly, as shown in FIG. 17C, the user microphone 110 may be disposed in the sound signal processing apparatus 100 such that the direction from the utterance point 42 of the user voice toward the one surface 112a is perpendicular to that surface. When an outsider's voice arrives from the front or the side of the user, it arrives in a direction parallel to the one surface 112a of the user microphone 110, so the user microphone 110 senses it with the lowest sensitivity or does not sense it at all. With this arrangement, the user microphone 110 can attenuate the external sound and emphasize the user voice.

However, because manufacturing or usage constraints can make it difficult to maintain a perpendicular orientation, the user microphone 110 may instead be disposed such that the one surface 112a and the propagation direction of the user voice form an angle of 60° to 120°. As described above with reference to FIGS. 9 and 10B, the user microphone 110 can still sense the user voice effectively when disposed at such an angle, so an external sound signal in which the user voice is attenuated can be generated.
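To give a feel for why a 60° to 120° tolerance costs little sensitivity, the following sketch assumes an idealized figure-eight response in which relative sensitivity varies as the sine of the angle between the one surface and the propagation direction. This response model and the function name are assumptions made purely for illustration; the actual directivity is the one described with reference to FIGS. 9 and 10B.

```python
import math


def relative_sensitivity_db(angle_deg):
    """Sensitivity relative to perpendicular incidence, in dB.

    ASSUMPTION: idealized |sin(angle)| response, where angle_deg is the angle
    between the microphone surface and the sound propagation direction
    (90 = perpendicular to the surface, 0 = parallel to the surface).
    """
    s = abs(math.sin(math.radians(angle_deg)))
    return -math.inf if s == 0 else 20 * math.log10(s)


for angle in (90, 60, 120, 10):
    print(angle, round(relative_sensitivity_db(angle), 2))
```

Under this assumed model, the edges of the 60° to 120° range lose only a little more than 1 dB relative to perpendicular incidence, while sound arriving nearly parallel to the surface is attenuated far more strongly.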

FIGS. 18A and 18B are diagrams illustrating embodiments in which the result of performing a function is displayed on the display.

Referring to FIGS. 18A and 18B, the display may include a first region 1600a and a second region 1600b and may display different visual information in each region.

As shown in FIG. 18A, the user microphone 110 and the ambient microphone 120 may receive an outsider's voice, which is an external sound, and may receive the user voice at the same time. Because the outsider's voice arrives from in front of the user microphone 110, it may be received at an angle parallel or nearly parallel to the one surface formed by the vibrating unit. Accordingly, the user microphone 110 can attenuate the outsider's voice.

The sound signal processing apparatus 100 may generate the user voice signal using the user microphone 110 and may generate, through a differential operation between the signals, an external sound signal in which the user voice is attenuated. The sound signal processing apparatus 100 may display visual information 120a corresponding to the external sound signal in the first region 1600a and visual information 110a corresponding to the user voice signal in the second region 1600b.

In the embodiment of FIG. 18A, the sound signal processing apparatus 100 may recognize the outsider's voice, convert it into text, and display the text in the first region 1600a. For example, if the outsider is a foreigner, the sound signal processing apparatus 100 may translate the received voice and display the translation result in the first region 1600a. At the same time, the sound signal processing apparatus 100 may perform a function corresponding to the user voice and display the result in the second region 1600b.

As shown in FIG. 18B, the user microphone 110 and the ambient microphone 120 may receive external sound (for example, music) from a sound output device and may receive the user voice at the same time. Because the external sound arrives from in front of the user microphone 110, it may be received at an angle parallel or nearly parallel to the one surface formed by the vibrating unit. Accordingly, the user microphone 110 can attenuate the external sound.

In the embodiment of FIG. 18B, the sound signal processing apparatus 100 may make an audio or video recording of the external sound signal in which the user voice is attenuated. Because the sound signal processing apparatus 100 effectively attenuates the user voice even when it is received, only the external sound is recorded. For example, the sound signal processing apparatus 100 may display, in the first region 1600a, an indication that external sound is being recorded. At the same time, the sound signal processing apparatus 100 may perform a function corresponding to the user voice and display the result in the second region 1600b.

The functions described above as being performed on the basis of the received signals are merely examples and may be modified and implemented in various ways.

FIG. 19 is a block diagram showing the configuration of a sound signal processing apparatus according to another embodiment.

Referring to FIG. 19, a sound signal processing apparatus 1900 may include a user microphone 1910, an ambient microphone 1920, and a processor 1930. FIG. 19 shows only the components of the sound signal processing apparatus 1900 that are relevant to the present embodiments. It will therefore be apparent to those skilled in the art that the sound signal processing apparatus 1900 may further include general-purpose components other than those shown in FIG. 19. The user microphone 1910 of FIG. 19 corresponds to the user microphone of FIG. 1, whereas the ambient microphone 1920 of FIG. 19 differs in structure and operating method from the ambient microphone of FIG. 1 and the ambient microphone of FIG. 5, and may have a structure similar to that of the user microphone of FIG. 1.

Unlike the sound signal processing apparatus according to the embodiment of FIG. 1, the sound signal processing apparatus 1900 according to the embodiment of FIG. 19 may have an ambient microphone 1920 whose structure corresponds to that of the user microphone 1910. Like the user microphone 1910, the ambient microphone 1920 includes a plurality of vibrating structures and may be disposed in the sound signal processing apparatus 1900 in consideration of the propagation direction of the received sound.

The user microphone 1910 forms a first plane and may receive sound through the first plane. The user microphone 1910 may be disposed in the sound signal processing apparatus 1900 such that the first plane faces a direction corresponding to the utterance point of the user voice. With this arrangement, the user microphone 1910 can generate a user voice signal in which the external sound is attenuated.

The ambient microphone 1920 forms a second plane and may receive sound through the second plane. The ambient microphone 1920 may be disposed in the sound signal processing apparatus 1900 such that the second plane faces a direction different from the direction in which the user microphone is disposed. Because the ambient microphone 1920 is thus not oriented toward the utterance point of the user voice, it can generate a sound signal in which the user voice is attenuated compared with the sound signal generated by the user microphone 1910.

In addition, the ambient microphone 1920 may be disposed in the sound signal processing apparatus 1900 such that the second plane faces a direction corresponding to the generation point of the external sound. With this arrangement, the ambient microphone 1920 senses the external sound with high sensitivity and the user voice with low sensitivity. Accordingly, the user voice is attenuated in the entire sound received by the ambient microphone 1920, and the first external sound signal, which is the sound signal generated by the ambient microphone 1920, may be a signal in which the user voice is attenuated.

The processor 1930 may generate a second external sound signal by differentially operating the user voice signal from the first external sound signal. Although the user voice is already attenuated in the first external sound signal as sensed, the processor 1930 differentially operates the user voice signal of the user microphone 1910 from the first external sound signal of the ambient microphone 1920, and can thereby generate a second external sound signal in which the user voice is attenuated even further.

FIG. 20 is a diagram for explaining the arrangement of the user microphone and the ambient microphone according to the embodiment of FIG. 19.

Referring to FIG. 20, the user voice propagated from the utterance point 42 of the user voice may be received on the first plane 1912a formed by the vibrating unit 1912 of the user microphone. In addition, the external sound propagated from the generation point 43 of the external sound may be received on the second plane 1922a formed by the vibrating unit 1922 of the ambient microphone.

As shown in FIG. 20, when the propagation direction of the user voice is orthogonal to the first plane 1912a formed by the vibrating unit 1912 of the user microphone, the vibrating unit 1912 responds most sensitively and the user voice is sensed most strongly. Accordingly, the user microphone may be disposed in the sound signal processing apparatus such that the first plane 1912a formed by the vibrating unit 1912 (or by a plurality of vibrating units) faces a direction corresponding to the utterance point 42 of the user voice.

In other words, the user microphone may be disposed such that the first plane 1912a formed by the vibrating unit 1912 (or by a plurality of vibrating units) and the direction from the utterance point 42 of the user voice toward the first plane 1912a correspond to each other (preferably forming an angle of 90°). For example, the user microphone may be disposed in the sound signal processing apparatus such that the first plane 1912a and the direction from the utterance point 42 of the user voice toward the first plane 1912a form an angle of 60° to 120°.

Likewise, as shown in FIG. 20, when the propagation direction of the external sound is orthogonal to the second plane 1922a formed by the vibrating unit 1922 of the ambient microphone, the vibrating unit 1922 responds most sensitively and the external sound is sensed most strongly. Accordingly, the ambient microphone may be disposed in the sound signal processing apparatus such that the second plane 1922a formed by the vibrating unit 1922 (or by a plurality of vibrating units) faces a direction corresponding to the generation point 43 of the external sound.

In other words, the ambient microphone may be disposed such that the second plane 1922a formed by the vibrating unit 1922 (or by a plurality of vibrating units) and the direction from the generation point 43 of the external sound toward the second plane 1922a correspond to each other (preferably forming an angle of 90°). For example, the ambient microphone may be disposed in the sound signal processing apparatus such that the second plane 1922a and the direction from the generation point 43 of the external sound toward the second plane 1922a form an angle of 60° to 120°.

As shown in FIG. 20, because the user microphone and the ambient microphone are oriented toward different points within the sound signal processing apparatus, the external sound reaches the first plane 1912a of the user microphone at an angle far from 90° (that is, close to parallel), and the user voice reaches the second plane 1922a of the ambient microphone at an angle far from 90° (that is, close to parallel). Accordingly, the user microphone senses the external sound with low sensitivity and the ambient microphone senses the user voice with low sensitivity. That is, the user microphone can generate a user voice signal in which the external sound is attenuated, and the ambient microphone can generate a first external sound signal in which the user voice is attenuated.

As in the embodiments of FIGS. 17A to 17C, when the sound signal processing apparatus is a glasses-type wearable device, the user microphone may be disposed in the apparatus such that the first plane 1912a faces a direction corresponding to the user's lips or mouth, which is the utterance point of the user voice. The ambient microphone is oriented in a direction different from that of the user microphone and may be disposed in the apparatus such that the second plane 1922a faces a direction corresponding to the front or the side of the user.

Referring to the embodiment of FIG. 18A, an outsider's voice, which is an external sound, may be received from in front of the user. When the second plane 1922a of the ambient microphone faces a direction corresponding to the front of the user, which is the generation point of the external sound, the outsider's voice is sensed with high sensitivity by the ambient microphone and, at the same time, with low sensitivity by the user microphone. In this case, because the ambient microphone senses the external sound with high sensitivity, it can generate a first external sound signal in which the user voice is attenuated.

Similarly, in the embodiment of FIG. 18B, the external sound (for example, music) is received from in front of the user, so when the second plane 1922a of the ambient microphone faces a direction corresponding to the front of the user, which is the generation point of the external sound, the ambient microphone can sense the external sound with high sensitivity.

The direction in which the second plane 1922a of the ambient microphone faces is not limited to the front or the side of the user and may be chosen in various ways depending on the design.

FIG. 21 is a diagram for explaining a differential operation method according to the embodiment of FIG. 19.

Referring to FIG. 21, the first external sound signal generated by the ambient microphone 1920 and the user voice signal generated by the user microphone 1910 may be input to the processor 1930. The processor 1930 may generate the second external sound signal through an operation on the input signals.

The first external sound signal may be a signal in which the user voice is attenuated from the entire sound, because the external sound is sensed with high sensitivity and the user voice with low sensitivity. The user voice signal may be a signal in which the external sound is attenuated from the entire sound, because the user voice is sensed with high sensitivity and the external sound with low sensitivity.

By differentially operating the user voice signal from the first external sound signal, the processor 1930 can generate a signal in which the component of the first external sound signal corresponding to the user voice is further attenuated while the component corresponding to the external sound is preserved.

The processor 1930 may perform the differential operation between the signals using an adaptive filter, a neural network, or the like.
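As a rough illustration of how an adaptive filter could carry out such a differential operation, the sketch below uses a normalized least-mean-squares (NLMS) filter. The description does not specify which adaptive algorithm is used, so NLMS, the function name `nlms_difference`, the tap count, the step size, and the synthetic test signals are all assumptions chosen for this example only.

```python
import math


def nlms_difference(primary, reference, taps=8, mu=0.05, eps=1e-8):
    """Remove from `primary` the component predictable from `reference`.

    ASSUMPTION: NLMS is used here only as one possible adaptive filter.
    primary   -- e.g. the first external sound signal (ambient microphone)
    reference -- e.g. the user voice signal (user microphone)
    Returns the error signal, i.e. primary minus the estimated voice part.
    """
    w = [0.0] * taps    # adaptive filter weights
    buf = [0.0] * taps  # most recent reference samples, newest first
    out = []
    for d, x in zip(primary, reference):
        buf = [x] + buf[:-1]
        y = sum(wi * bi for wi, bi in zip(w, buf))  # estimated voice leakage
        e = d - y                                   # differential output
        norm = eps + sum(b * b for b in buf)
        w = [wi + mu * e * bi / norm for wi, bi in zip(w, buf)]
        out.append(e)
    return out


def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))


# Synthetic, idealized signals (hypothetical, for demonstration only).
n = 4000
voice = [math.sin(2 * math.pi * 0.031 * k) for k in range(n)]
external = [0.5 * math.sin(2 * math.pi * 0.113 * k + 1.0) for k in range(n)]
user_voice_signal = voice                                  # external sound fully attenuated
first_external_signal = [e + 0.3 * v for e, v in zip(external, voice)]  # voice leaks in

second_external_signal = nlms_difference(first_external_signal, user_voice_signal)

# Compare how far each signal is from the pure external sound after convergence.
tail = slice(n // 2, n)
residual_before = rms([a - b for a, b in zip(first_external_signal[tail], external[tail])])
residual_after = rms([a - b for a, b in zip(second_external_signal[tail], external[tail])])
print(residual_after < residual_before)
```

In this sketch the user voice signal serves as the reference and the first external sound signal as the primary input, so the filter learns how the voice leaks into the ambient channel and subtracts that estimate. A small step size is used so that the external sound, which is uncorrelated with the reference, is left largely intact.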

FIG. 22 is a block diagram showing the configuration of a sound signal processing apparatus according to yet another embodiment.

Referring to FIG. 22, a sound signal processing apparatus 2200 may include a directional microphone 2210, an ambient microphone 2220, and a processor 2230. The sound signal processing apparatus 2200 may receive output sound from a sound output device 2300. FIG. 22 shows only the components of the sound signal processing apparatus 2200 that are relevant to the present embodiments. It will therefore be apparent to those skilled in the art that the sound signal processing apparatus 2200 may further include general-purpose components other than those shown in FIG. 22. The directional microphone 2210 of FIG. 22 may correspond to the user microphone of FIG. 1.

In the embodiment of FIG. 22, the sound signal processing apparatus 2200 is a device that is not worn by the user, and may be disposed close to the sound output device 2300 or included in the sound output device 2300. The output sound in the embodiment of FIG. 22 may correspond to the user voice in the embodiment of FIG. 1.

Whereas the sound signal processing apparatus of the FIG. 1 embodiment uses the differential operation to attenuate the user voice generated nearby, the sound signal processing apparatus 2200 of the FIG. 22 embodiment is close to the sound output device 2300 and can therefore use the differential operation to attenuate the output sound generated nearby. In the embodiment of FIG. 22, the sound signal processing apparatus 2200 generates an external sound signal in which the output sound is attenuated, and the external sound may include the user voice. Accordingly, even while receiving the output sound, the sound signal processing apparatus 2200 can generate an external sound signal in which the output sound is attenuated and the user voice is emphasized.

The directional microphone 2210 may receive an entire sound that includes the output sound from the sound output device 2300 and external sound generated outside the sound output device 2300. The directional microphone 2210 may generate an output sound signal by attenuating the external sound in the received entire sound. The directional microphone 2210 may be disposed in the sound signal processing apparatus 2200 such that the one surface that receives the entire sound faces a direction corresponding to the generation point of the output sound. With this arrangement, the directional microphone 2210 senses the output sound with high sensitivity and the external sound with low sensitivity. Accordingly, the external sound is attenuated in the entire sound received by the directional microphone 2210, and the output sound signal, which is the sound signal generated by the directional microphone 2210, may be a signal in which the external sound is attenuated.

For example, the directional microphone 2210 may be arranged such that the surface that receives sound and the direction from the point where the output sound is generated toward that surface form an angle of 60° to 120°.

The ambient microphone 2220 may receive the entire sound and generate an entire sound signal from the received entire sound. The processor 2230 may generate an external sound signal in which the output sound is attenuated by performing a difference operation that subtracts the output sound signal from the entire sound signal.

The sound signal processing apparatus 2200 of FIG. 22 differs from the sound signal processing apparatus of FIG. 1 only in the type of signal that is attenuated or emphasized. Its operation method and arrangement correspond to those of the apparatus of FIG. 1, so redundant description is omitted.

FIG. 23 is a diagram illustrating a difference operation method according to the embodiment of FIG. 22.

Referring to FIG. 23, the entire sound signal generated by the ambient microphone 2220 and the output sound signal generated by the directional microphone 2210 may be input to the processor 2230. The processor 2230 may generate an external sound signal by operating on the input signals.

Since the entire sound includes the external sound and the output sound, the entire sound signal corresponding to the entire sound may include a signal corresponding to the external sound and a signal corresponding to the output sound. The external sound may include the user voice. The entire sound signal may be a signal in which no type of sound is attenuated or emphasized. The output sound signal may be a signal in which the external sound is attenuated relative to the entire sound, because the output sound is sensed with high sensitivity and the external sound is sensed with low sensitivity.

Accordingly, by performing a difference operation that subtracts the output sound signal from the entire sound signal, the processor 2230 may generate a signal in which the component of the entire sound signal corresponding to the output sound is attenuated and the component corresponding to the external sound (or the user voice) is maintained. In this way, the processor 2230 may generate an external sound signal in which the signal corresponding to the external sound is emphasized.
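
The difference operation described above can be sketched in Python as a sample-by-sample subtraction. This is only an idealized illustration: it assumes the two microphone signals are time-aligned, that the directional microphone 2210 captures the output sound alone, and that a single hypothetical `gain` factor matches the two channels. The function name `difference_operation` and the toy sample values are not taken from the disclosure.

```python
def difference_operation(entire_signal, output_signal, gain=1.0):
    """Subtract the (scaled) output sound signal from the entire sound signal.

    Assumption: both lists hold time-aligned samples of equal length.
    """
    if len(entire_signal) != len(output_signal):
        raise ValueError("signals must have the same number of samples")
    return [e - gain * o for e, o in zip(entire_signal, output_signal)]


# Toy frame: the ambient microphone picks up output sound plus external sound.
output_sound = [0.5, -0.5, 0.5, -0.5]
external_sound = [0.25, 0.25, -0.25, -0.25]
entire_signal = [o + x for o, x in zip(output_sound, external_sound)]

# Idealized directional microphone: it senses only the output sound.
external_estimate = difference_operation(entire_signal, output_sound)
print(external_estimate)
```

In practice the directional microphone signal still contains some external sound and the channel gains differ, which is why an adaptive filter or similar technique would be used instead of a fixed subtraction.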

The processor 2230 may perform the difference operation between the signals using an adaptive filter, a neural network, or the like.

FIG. 24 is a flowchart illustrating a sound signal processing method according to an embodiment.

Referring to FIG. 24, the sound signal processing method consists of operations processed in time series by the sound signal processing apparatus shown in FIG. 1. Therefore, even where omitted below, the description given above of the sound signal processing apparatus with reference to FIG. 1 and other figures also applies to the method of FIG. 24.

In operation 2410, the sound signal processing apparatus may receive an entire sound that includes a user voice and an external sound generated outside the user.

In the sound signal processing apparatus, each of a plurality of vibration structures that sense sounds of different frequency bands may vibrate, based on the frequency of the received entire sound, in a direction orthogonal to the surface formed to receive the entire sound.

The sound signal processing apparatus may vibrate with a vibration intensity that is based on the angle between the propagation direction of the received sound and the surface.

The sound signal processing apparatus may vibrate with a larger vibration intensity as the angle approaches 90° and with a smaller vibration intensity as the angle approaches 0°.
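
As a rough illustration of this angle dependence, the following Python sketch models the relative vibration intensity as the absolute sine of the angle between the propagation direction and the surface. The sine law and the name `relative_intensity` are assumptions made for illustration only; the description states only that the intensity is larger near 90° and smaller near 0°.

```python
import math


def relative_intensity(angle_deg):
    """Hypothetical model: intensity relative to the maximum at 90 degrees.

    angle_deg is the angle between the sound propagation direction and
    the receiving surface, in degrees.
    """
    return abs(math.sin(math.radians(angle_deg)))


for angle in (0, 30, 60, 90, 120):
    print(angle, round(relative_intensity(angle), 3))
```

Under this assumed model, any angle within the 60° to 120° arrangement range keeps the relative intensity above about 0.86, while sound arriving nearly parallel to the surface is strongly attenuated.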

The sound signal processing apparatus may generate an electrical signal corresponding to the vibration of each of the plurality of vibration structures.

In operation 2420, the sound signal processing apparatus may generate an entire sound signal from the received entire sound.

In operation 2430, the sound signal processing apparatus may generate a user voice signal in which the external sound is attenuated from the received entire sound.

The sound signal processing apparatus may determine, based on a threshold value, which of the electrical signals is to be attenuated, and may attenuate the determined electrical signal.

The sound signal processing apparatus may determine the threshold value based on the average magnitude of the electrical signals.
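
The threshold-based attenuation can be sketched as follows. The sketch assumes that each vibration structure contributes one magnitude value, that the threshold equals the plain average of those magnitudes multiplied by a hypothetical `scale` factor, and that signals below the threshold are the ones attenuated. The description does not specify these details, and the names `attenuate_by_threshold`, `scale`, and `attenuation` are illustrative.

```python
def attenuate_by_threshold(magnitudes, scale=1.0, attenuation=0.0):
    """Attenuate channels whose magnitude falls below an average-based threshold.

    Assumptions: threshold = scale * mean(magnitudes); channels below it
    are multiplied by `attenuation` (0.0 mutes them entirely).
    """
    if not magnitudes:
        raise ValueError("at least one channel magnitude is required")
    threshold = scale * sum(magnitudes) / len(magnitudes)
    processed = [m if m >= threshold else m * attenuation for m in magnitudes]
    return threshold, processed


# Toy magnitudes from four vibration structures.
channel_magnitudes = [0.5, 2.0, 0.25, 1.25]
threshold, processed = attenuate_by_threshold(channel_magnitudes)
print(threshold, processed)
```

With these toy values the average is 1.0, so the two weaker channels are muted and the two stronger channels pass unchanged.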

In operation 2440, the sound signal processing apparatus may generate an external sound signal in which the user voice is attenuated by performing a difference operation that subtracts the user voice signal from the entire sound signal.

The sound signal processing apparatus may input the user voice signal to an adaptive filter, generate a feedback signal by subtracting the signal output from the adaptive filter from the entire sound signal, and control the adaptive filter to adjust its parameters by inputting the feedback signal to the adaptive filter.
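
One way to picture this feedback loop is a least-mean-squares (LMS) adaptive filter, sketched below in plain Python. LMS is swapped in here as a common, simple adaptive algorithm; the description does not name a specific one. The function name `lms_cancel`, the tap count, the step size `mu`, and the synthetic sinusoidal test signals are all assumptions.

```python
import math


def lms_cancel(user_voice_signal, entire_signal, taps=4, mu=0.05):
    """Return (feedback, weights) for an LMS noise-canceller structure.

    The user voice signal is the filter input; the feedback (error) signal
    is the entire sound signal minus the filter output, and it drives the
    parameter update. The feedback doubles as the external sound estimate.
    """
    weights = [0.0] * taps
    history = [0.0] * taps          # most recent input samples, newest first
    feedback = []
    for x, d in zip(user_voice_signal, entire_signal):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))   # filter output
        e = d - y                                          # difference operation
        weights = [w + mu * e * h for w, h in zip(weights, history)]
        feedback.append(e)
    return feedback, weights


# Synthetic signals: the voice leaks into the ambient channel with gain 0.8.
n = 3000
voice = [math.sin(0.3 * i) for i in range(n)]
external = [0.2 * math.sin(1.1 * i) for i in range(n)]
entire = [0.8 * v + x for v, x in zip(voice, external)]

feedback, weights = lms_cancel(voice, entire)

# Mean squared deviation from the true external sound over the last 500 samples.
tail = range(n - 500, n)
residual = sum((feedback[i] - external[i]) ** 2 for i in tail) / 500
baseline = sum((entire[i] - external[i]) ** 2 for i in tail) / 500
print(residual, baseline)
```

After the parameters settle, the feedback signal tracks the external sound much more closely than the unprocessed entire sound signal does, which is the behavior the description attributes to the difference operation.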

The sound signal processing apparatus may perform a function corresponding to the user voice signal and a function corresponding to the external sound signal, and may display the result of each function in a different area of a display.

As described above, the sound signal processing apparatus can generate the user voice signal without a separate calculation process, and can generate the external sound signal in which the user voice is attenuated through only a simple operation on the user voice signal and the entire sound signal. The sound signal processing apparatus may perform various functions using each of the generated user voice signal and external sound signal.

Meanwhile, the method of FIG. 24 described above may be recorded on a computer-readable recording medium on which one or more programs including instructions for executing the method are recorded. Examples of the computer-readable recording medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like.

Although embodiments have been described in detail above, the scope of the present disclosure is not limited thereto, and various modifications and improvements made by those skilled in the art using the basic concepts of the present disclosure as defined in the following claims also fall within the scope of the present invention.

Claims

A sound signal processing apparatus comprising:
a user microphone configured to receive an entire sound including a user voice and an external sound generated outside the user, and to generate a user voice signal in which the external sound is attenuated from the received entire sound, one surface that receives the entire sound being disposed in a direction corresponding to an utterance point of the user voice;
an ambient microphone configured to receive the entire sound and generate an entire sound signal from the received entire sound; and
a processor configured to generate an external sound signal in which the user voice is attenuated by performing a difference operation that subtracts the user voice signal from the entire sound signal.

The sound signal processing apparatus of claim 1,
wherein the user microphone includes a plurality of vibration structures that sense sounds of different frequency bands, and
each of the plurality of vibration structures includes a vibration unit that forms the one surface receiving the entire sound and that, as the entire sound is received, vibrates in a direction orthogonal to the one surface based on a frequency of the entire sound.

The sound signal processing apparatus of claim 2,
wherein the vibration unit vibrates with a vibration intensity based on an angle between a propagation direction of the received sound and the one surface formed by the vibration unit.

The sound signal processing apparatus of claim 3,
wherein the vibration unit vibrates with a larger vibration intensity as the angle approaches 90° and with a smaller vibration intensity as the angle approaches 0°.

The sound signal processing apparatus of claim 2,
wherein the user microphone is arranged such that the one surface and a direction from the utterance point of the user voice toward the one surface form an angle of 60° to 120°.

The sound signal processing apparatus of claim 2,
wherein each of the plurality of vibration structures further includes a vibration detection unit that receives the vibration of the vibration unit and generates an electrical signal corresponding to the received vibration.

The sound signal processing apparatus of claim 6,
wherein the user microphone further includes a sound control unit that determines, based on a threshold value, an electrical signal to be attenuated among the electrical signals generated by the vibration structures, and attenuates the determined electrical signal.

The sound signal processing apparatus of claim 7,
wherein the sound control unit determines the threshold value based on an average magnitude of the electrical signals generated by the vibration structures.

The sound signal processing apparatus of claim 1,
further comprising an adaptive filter that adjusts parameters, based on a feedback signal, such that the user voice is attenuated from the entire sound signal,
wherein the processor generates the feedback signal by subtracting, from the entire sound signal, a signal output from the adaptive filter as the user voice signal is input to the adaptive filter, and controls the adaptive filter to adjust the parameters by inputting the feedback signal to the adaptive filter.

The sound signal processing apparatus of claim 1,
further comprising a display that outputs visual information,
wherein the processor performs a function corresponding to the user voice signal and a function corresponding to the external sound signal, and controls the display such that a result of performing each of the functions is displayed in a different area of the display.

The sound signal processing apparatus of claim 1,
wherein the sound signal processing apparatus is a glasses-type wearable device,
the user microphone and the ambient microphone are disposed on a glasses frame of the glasses-type wearable device, and
the user microphone is arranged such that the one surface receiving the entire sound faces the utterance point of the user voice.

The sound signal processing apparatus of claim 11,
wherein the user microphone is disposed on a glasses bridge or a glasses frame of the glasses-type wearable device, and
the ambient microphone is spaced apart from the user microphone and is disposed on a glasses frame or a temple of the glasses-type wearable device.

A sound signal processing apparatus comprising:
the user microphone according to claim 1;
an ambient microphone configured to receive the entire sound and to generate a first external sound signal in which the user voice is attenuated from the received entire sound, one surface that receives the entire sound being disposed in a direction that differs from the direction in which the user microphone is disposed and that corresponds to a generation point of the external sound; and
a processor configured to generate a second external sound signal in which the user voice is attenuated more than in the first external sound signal by performing a difference operation that subtracts the user voice signal from the first external sound signal.

A sound signal processing apparatus comprising:
a directional microphone configured to receive an entire sound including an output sound from a sound output device and an external sound generated outside the sound output device, and to generate an output sound signal in which the external sound is attenuated from the received entire sound, one surface that receives the entire sound being disposed in a direction corresponding to a point where the output sound is generated;
an ambient microphone configured to receive the entire sound and generate an entire sound signal from the received entire sound; and
a processor configured to generate an external sound signal in which the output sound is attenuated by performing a difference operation that subtracts the output sound signal from the entire sound signal.

A sound signal processing method comprising:
receiving an entire sound including a user voice and an external sound generated outside the user;
generating an entire sound signal from the received entire sound;
generating a user voice signal in which the external sound is attenuated from the received entire sound; and
generating an external sound signal in which the user voice is attenuated by performing a difference operation that subtracts the user voice signal from the entire sound signal.

The sound signal processing method of claim 15,
wherein the receiving of the entire sound includes vibrating each of a plurality of vibration structures that sense sounds of different frequency bands, in a direction orthogonal to one surface formed to receive the entire sound, based on a frequency of the received entire sound.

The sound signal processing method of claim 16,
wherein the receiving of the entire sound further includes vibrating with a vibration intensity based on an angle between a propagation direction of the received sound and the one surface.

The sound signal processing method of claim 17,
wherein the receiving of the entire sound further includes vibrating with a larger vibration intensity as the angle approaches 90° and with a smaller vibration intensity as the angle approaches 0°.

The sound signal processing method of claim 16,
wherein the receiving of the entire sound further includes generating an electrical signal corresponding to the vibration of each of the plurality of vibration structures.

The sound signal processing method of claim 19,
wherein the generating of the user voice signal includes determining, based on a threshold value, an electrical signal to be attenuated among the electrical signals, and attenuating the determined electrical signal.

The sound signal processing method of claim 20,
wherein the generating of the user voice signal further includes determining the threshold value based on an average magnitude of the electrical signals.

The sound signal processing method of claim 15,
wherein the generating of the external sound signal includes:
inputting the user voice signal to an adaptive filter;
generating a feedback signal by subtracting a signal output from the adaptive filter from the entire sound signal; and
controlling the adaptive filter to adjust parameters by inputting the feedback signal to the adaptive filter.

The sound signal processing method of claim 15,
wherein the generating of the external sound signal includes:
performing a function corresponding to the user voice signal and a function corresponding to the external sound signal; and
displaying a result of performing each of the functions in a different area of a display.

A computer-readable recording medium having recorded thereon a program for executing the method of claim 15 on a computer.