KR20110072363A - Method for processing plurality of external voices in sound processing apparatus and sound processing apparatus thereof

Info

Publication number: KR20110072363A
Application number: KR1020090129252A
Authority: KR
Inventors: 박종세
Original assignee: 엘지전자 주식회사
Priority date: 2009-12-22
Filing date: 2009-12-22
Publication date: 2011-06-29

Abstract

PURPOSE: A method for processing a plurality of external voices in a sound processing apparatus, and a sound processing apparatus using the method, are provided to offer convenient functions to a user by clearly separating mixed voices. CONSTITUTION: A sound processing apparatus obtains a plurality of external voices (S1). A controller analyzes the spectral characteristics of the obtained external voices (S2). The sound processing apparatus distinguishes the external voices in consideration of both the spectral characteristics and location information (S3, S4). The controller clusters the distinguished external voices (S5). When a selection signal for one of the distinguished voices is generated, the controller reproduces only that voice through a sound output module (S6).

Description

METHOD FOR PROCESSING PLURALITY OF EXTERNAL VOICES IN SOUND PROCESSING APPARATUS AND SOUND PROCESSING APPARATUS THEREOF

The present invention relates to a method of processing a plurality of external voices in a sound processing apparatus, and to a sound processing apparatus to which the method is applied.

Terminals can be divided into mobile/portable terminals and stationary terminals according to whether they can be moved. Mobile terminals can be further divided into handheld terminals and vehicle-mounted terminals according to whether the user can carry them directly.

Such a mobile terminal basically has functions for acquiring and processing voice. For example, it can play music or video files and serve as a voice recorder, so it corresponds to a sound processing apparatus.

To support and extend the functions of such a mobile terminal, improving the structural and/or software parts of the terminal may be considered.

An object of the present invention is to provide a method and an apparatus for separating voices more accurately when voices are received from a plurality of sound sources.

According to one embodiment of the present invention, a method of processing a plurality of external voices in a sound processing apparatus may include: analyzing spectral characteristics of the plurality of external voices; distinguishing each of the plurality of external voices based on the spectral characteristics; and clustering the distinguished voices.

According to an aspect of this embodiment, the external voice processing method may further include presetting the number of external voices to be distinguished.

According to an aspect of this embodiment, the sound processing apparatus includes at least two microphones, and the external voice processing method may further include obtaining location information of the plurality of external voices acquired through the at least two microphones. In that case, the step of distinguishing each of the plurality of external voices based on the spectral characteristics may include distinguishing each of the plurality of external voices based on both the spectral information and the location information.

According to an aspect of this embodiment, the external voice processing method may further include obtaining the plurality of external voices through at least one of a microphone and a wireless communication unit of the sound processing apparatus.

According to an aspect of this embodiment, the external voice processing method may further include providing a menu for entering a name for each distinguished voice.

According to an aspect of this embodiment, the external voice processing method may further include reproducing only a distinguished voice when a selection signal for that voice is generated.

According to another embodiment of the present invention, a mobile communication terminal may include: a microphone configured to obtain a plurality of external voices; and a controller that analyzes spectral characteristics of the plurality of external voices, distinguishes each of the plurality of external voices based on the spectral characteristics, and clusters the distinguished voices.

According to an aspect of this other embodiment, the mobile communication terminal may further include a user input unit configured to set the number of external voices to be distinguished.

According to an aspect of this other embodiment, at least two microphones are provided, and the controller may obtain location information of the plurality of external voices acquired through the at least two microphones and distinguish each of the plurality of external voices based on both the spectral information and the location information.

According to an aspect of this other embodiment, the mobile communication terminal may further include a wireless communication unit for obtaining the plurality of external voices.

According to an aspect of this other embodiment, the mobile communication terminal further includes a voice output module, and the controller may control the voice output module to reproduce only a distinguished voice when a selection signal for that voice is generated.

According to the present invention configured as described above, mixed voices obtained from a plurality of sound sources can be separated more clearly, and this separation can be used to provide convenient functions to the user.

Hereinafter, a sound processing apparatus related to the present invention is described in more detail with reference to the drawings. The suffixes "module" and "unit" attached to component names in the following description are assigned or used interchangeably only for ease of drafting the specification, and do not by themselves carry distinct meanings or roles.

The sound processing apparatus described in this specification may include a mobile phone, a smart phone, a laptop computer, a digital broadcasting terminal, a PDA (Personal Digital Assistant), a PMP (Portable Multimedia Player), a navigation device, and the like. However, those skilled in the art will readily understand that, except for cases applicable only to a sound processing apparatus, the configurations according to the embodiments described in this specification may also be applied to digital TVs, desktop computers, and the like.

FIG. 1 is a block diagram of a sound processing apparatus according to an embodiment of the present invention.

The sound processing apparatus 100 may include a wireless communication unit 110, a microphone 120, a user input unit 130, a display unit 140, a sound output module 150, a memory 160, an interface unit 170, a controller 180, and the like. The components shown in FIG. 1 are not essential, so a sound processing apparatus with more or fewer components may be implemented.

The components are described in turn below.

The wireless communication unit 110 may include one or more modules that enable wireless communication between the sound processing apparatus 100 and a wireless communication system, or between the sound processing apparatus 100 and the network in which the sound processing apparatus 100 is located. For example, the wireless communication unit 110 may include a broadcast receiving module, a mobile communication module, a wireless Internet module, a short-range communication module, and a location information module.

The broadcast receiving module receives broadcast signals and/or broadcast-related information from an external broadcast management server through a broadcast channel.

The broadcast channel may include a satellite channel and a terrestrial channel. The broadcast management server may be a server that generates and transmits broadcast signals and/or broadcast-related information, or a server that receives previously generated broadcast signals and/or broadcast-related information and transmits them to a terminal. The broadcast signals may include not only TV broadcast signals, radio broadcast signals, and data broadcast signals, but also broadcast signals in which a data broadcast signal is combined with a TV or radio broadcast signal.

The broadcast-related information may be information about a broadcast channel, a broadcast program, or a broadcast service provider. The broadcast-related information may also be provided through a mobile communication network, in which case it may be received by the mobile communication module.

The broadcast-related information may exist in various forms, for example as an EPG (Electronic Program Guide) of DMB (Digital Multimedia Broadcasting) or an ESG (Electronic Service Guide) of DVB-H (Digital Video Broadcasting-Handheld).

The broadcast receiving module may receive digital broadcast signals using digital broadcasting systems such as DMB-T (Digital Multimedia Broadcasting-Terrestrial), DMB-S (Digital Multimedia Broadcasting-Satellite), MediaFLO (Media Forward Link Only), DVB-H (Digital Video Broadcasting-Handheld), and ISDB-T (Integrated Services Digital Broadcasting-Terrestrial). The broadcast receiving module may of course be configured for other broadcasting systems as well as the digital broadcasting systems listed above.

Broadcast signals and/or broadcast-related information received through the broadcast receiving module may be stored in the memory 160.

The mobile communication module transmits and receives radio signals to and from at least one of a base station, an external terminal, and a server on a mobile communication network. The radio signals may include voice call signals, video call signals, or various forms of data associated with sending and receiving text/multimedia messages.

The wireless Internet module is a module for wireless Internet access and may be built into or externally attached to the sound processing apparatus 100. Wireless Internet technologies that may be used include WLAN (Wireless LAN, Wi-Fi), WiBro (Wireless Broadband), WiMAX (Worldwide Interoperability for Microwave Access), and HSDPA (High Speed Downlink Packet Access).

The short-range communication module is a module for short-range communication. Short-range communication technologies that may be used include Bluetooth, RFID (Radio Frequency Identification), infrared communication (IrDA, Infrared Data Association), UWB (Ultra Wideband), and ZigBee.

The location information module is a module for obtaining the location of the sound processing apparatus; a representative example is a GPS (Global Positioning System) module.

The microphone 120 receives external sound signals in a call mode, a recording mode, a voice recognition mode, and the like, and processes them into electrical voice data. In the call mode, the processed voice data may be converted into a form that can be transmitted to a mobile communication base station through the mobile communication module and then output. Various noise removal algorithms may be implemented in the microphone 120 to remove noise generated while external sound signals are being received.

In addition, at least two microphones may be mounted in the sound processing apparatus 100. The sound received from the at least two microphones may be processed by a sound program stored in the memory 160 to obtain sound of higher quality.

The user input unit 130 generates input data with which the user controls the operation of the sound processing apparatus. The user input unit 130 may consist of a keypad, a dome switch, a touch pad (pressure-sensitive/capacitive), a jog wheel, a jog switch, and the like.

The display unit 140 displays (outputs) information processed by the sound processing apparatus 100. For example, when the sound processing apparatus is in a call mode, it displays a UI (User Interface) or GUI (Graphic User Interface) related to the call. When the sound processing apparatus 100 is in a video call mode or a photographing mode, it displays the captured and/or received images, or a UI or GUI.

The display unit 140 may include at least one of a liquid crystal display (LCD), a thin film transistor-liquid crystal display (TFT LCD), an organic light-emitting diode (OLED) display, a flexible display, and a 3D display.

Depending on the implementation of the sound processing apparatus 100, two or more display units 140 may be present. For example, a plurality of display units may be arranged on one surface of the sound processing apparatus 100, either spaced apart or integrated, or may be arranged on different surfaces.

When the display unit 140 and a sensor that detects touch operations (hereinafter, a "touch sensor") form a mutual layer structure (hereinafter, a "touch screen"), the display unit 140 can be used as an input device as well as an output device. The touch sensor may take the form of, for example, a touch film, a touch sheet, or a touch pad.

The touch sensor may be configured to convert a change in the pressure applied to a specific part of the display unit 140, or in the capacitance occurring at a specific part of the display unit 140, into an electrical input signal. The touch sensor may be configured to detect not only the position and area of a touch but also the pressure of the touch.

When there is a touch input to the touch sensor, the corresponding signal or signals are sent to a touch controller. The touch controller processes the signals and then transmits the corresponding data to the controller 180. The controller 180 can thereby determine which area of the display unit 140 has been touched.

The sound output module 150 may output audio data received from the wireless communication unit 110 or stored in the memory 160 during call signal reception, in a call mode, a recording mode, a voice recognition mode, a broadcast reception mode, and the like. The sound output module 150 also outputs sound signals related to functions performed by the sound processing apparatus 100. The sound output module 150 may include a receiver, a speaker, a buzzer, and the like.

The memory 160 may store programs for the operation of the controller 180 and may temporarily store input/output data (for example, a phone book, messages, still images, and videos). The memory 160 may also store data on the various patterns of vibration and sound that are output when a touch is input on the touch screen.

The memory 160 may include at least one type of storage medium among a flash memory type, a hard disk type, a multimedia card micro type, a card-type memory (for example, SD or XD memory), RAM (Random Access Memory), SRAM (Static Random Access Memory), ROM (Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), PROM (Programmable Read-Only Memory), a magnetic memory, a magnetic disk, and an optical disk. The sound processing apparatus 100 may also operate in connection with web storage that performs the storage function of the memory 160 over the Internet.

The interface unit 170 serves as a path to all external devices connected to the sound processing apparatus 100. The interface unit 170 receives data from an external device, receives power and delivers it to each component inside the sound processing apparatus 100, or allows data inside the sound processing apparatus 100 to be transmitted to an external device. For example, the interface unit 170 may include a wired/wireless headset port, an external charger port, a wired/wireless data port, a memory card port, a port for connecting a device equipped with an identification module, an audio I/O (Input/Output) port, a video I/O (Input/Output) port, and an earphone port.

When the sound processing apparatus 100 is connected to an external cradle, the interface unit may serve as a path through which power from the cradle is supplied to the sound processing apparatus 100, or as a path through which various command signals entered by the user at the cradle are delivered to the mobile terminal. The command signals or the power input from the cradle may also act as a signal for recognizing that the mobile terminal has been correctly mounted on the cradle.

The controller 180 typically controls the overall operation of the sound processing apparatus. The controller 180 may perform pattern recognition processing that recognizes handwriting input and drawing input made on the touch screen as characters and images, respectively.

The various embodiments described here may be implemented in a recording medium readable by a computer or a similar device, using, for example, software, hardware, or a combination of the two.

In a hardware implementation, the embodiments described here may be implemented using at least one of ASICs (application specific integrated circuits), DSPs (digital signal processors), DSPDs (digital signal processing devices), PLDs (programmable logic devices), FPGAs (field programmable gate arrays), processors, controllers, micro-controllers, microprocessors, and other electrical units for performing functions. In some cases, the embodiments described in this specification may be implemented by the controller 180 itself.

In a software implementation, embodiments such as the procedures and functions described in this specification may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described in this specification. The software code may be implemented as a software application written in a suitable programming language. The software code may be stored in the memory 160 and executed by the controller 180.

FIG. 2 is a diagram for explaining the concept of a sound processing apparatus according to an embodiment of the present invention.

As shown, three speakers (A, B, and C) are positioned around the sound processing apparatus 100. If at least two microphones are installed in the sound processing apparatus 100, location information for the speakers can be obtained, and the speakers can be distinguished based on this location information. However, because speaker B and speaker C are at almost the same position, the difference between their location information may fall within the margin of error. In this case, the controller 180 of the sound processing apparatus 100 cannot distinguish speaker B from speaker C using the location information alone.
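As a rough illustration of how two microphones can yield location information, the difference in arrival time of a sound at the two microphones can be estimated by cross-correlation. This is a minimal sketch under stated assumptions: the specification does not say which localization technique is used, and the function name `estimate_delay` and the test signals are hypothetical.

```python
def estimate_delay(mic_a, mic_b, max_lag):
    """Return the lag (in samples) at which mic_b best matches mic_a.

    A positive result means the sound reached mic_b later than mic_a.
    Cross-correlation is an assumed technique, not one named in the source.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(mic_a):
            j = i + lag
            if 0 <= j < len(mic_b):
                score += a * mic_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical signals: a short pulse that arrives 3 samples later at mic_b.
pulse = [0.0, 1.0, 0.5, -0.5, 0.0]
mic_a = pulse + [0.0] * 8
mic_b = [0.0] * 3 + pulse + [0.0] * 5
delay = estimate_delay(mic_a, mic_b, max_lag=5)
```

Two speakers standing at nearly the same position, such as speakers B and C, would produce nearly the same delay, which is why a delay-based cue alone cannot separate them.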

FIG. 3 is a conceptual diagram for explaining a method of processing a plurality of external voices in a sound processing apparatus according to an embodiment of the present invention.

When the voices of the speakers in FIG. 2 are received and subjected to spectrum analysis, a spectrum such as that of FIG. 3 is obtained. Segment 210 is the spectrum of speaker A, 220 is the spectrum of speaker B, 230 is the spectrum of speaker C, and 240 is again the spectrum of speaker A. As shown, the spectral characteristics differ from speaker to speaker. Using this difference, the controller 180 can distinguish and recognize the voice of each speaker. Meanwhile, among the segments divided by speaker, segments belonging to the same speaker have similar spectra (210 and 240 in FIG. 3). In this case, segments with similar spectral characteristics may be clustered together.
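The idea that segments from the same speaker have similar spectra can be sketched as follows. This is an illustrative assumption, not the method of the specification: the magnitude spectrum, the cosine similarity measure, and the synthetic tones standing in for segments 210 to 240 are all hypothetical choices.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for the lower half of the bins (O(n^2), fine for a sketch)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame)))
        for k in range(n // 2)
    ]


def cosine_similarity(u, v):
    """Similarity in [0, 1] for non-negative vectors; 1 means identical shape."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def tone(bin_index, n=64, phase=0.0):
    """Synthetic stand-in for a voice segment: one sinusoid at a DFT bin."""
    return [math.sin(2 * math.pi * bin_index * t / n + phase) for t in range(n)]


# Hypothetical segments named after the reference numerals of FIG. 3.
segments = {
    "210": tone(3),             # speaker A
    "220": tone(7),             # speaker B
    "230": tone(11),            # speaker C
    "240": tone(3, phase=1.0),  # speaker A again, different phase
}
spectra = {name: magnitude_spectrum(sig) for name, sig in segments.items()}
same_speaker = cosine_similarity(spectra["210"], spectra["240"])
different_speaker = cosine_similarity(spectra["210"], spectra["220"])
```

With these synthetic tones, `same_speaker` is close to 1 and `different_speaker` is close to 0. Real voices would need richer features than a single frame's magnitude spectrum.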

FIG. 4 is a flowchart for explaining a method of processing a plurality of external voices in a sound processing apparatus according to an embodiment of the present invention.

As shown, the sound processing apparatus 100 first obtains a plurality of external voices (S1). The plurality of external voices may be obtained through the microphone 120 or through the wireless communication unit 110 (that is, they include radio audio, mobile broadcast audio, and the like). The controller 180 analyzes the spectral characteristics of the obtained external voices (S2).

The sound processing apparatus may include at least two microphones. When location information for the plurality of external voices acquired through the two microphones is obtained, the plurality of external voices can be distinguished and recognized by considering both the spectral characteristics and the location information (S3, S4). The controller 180 then clusters the distinguished external voices by voice (S5). Here, the number of external voices to be distinguished may be set in advance through the user input unit 130.
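Steps S3 to S5 can be sketched as clustering per-segment feature vectors that combine a spectral cue with a location cue, with the number of clusters preset by the user. The sketch assumes a simple k-means with farthest-point initialization; the specification does not name a clustering algorithm, and the two-dimensional features and their values are hypothetical.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def cluster(points, k, iterations=10):
    """Assign each point to one of k clusters (assumed k-means, not from the source)."""
    # Farthest-point initialization keeps the sketch deterministic.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centers))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centers[c])) for p in points
        ]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels


# Hypothetical (spectral cue, location cue) per segment, in time order A, B, C, A, B, C.
# B and C share almost the same location cue and differ only in the spectral cue.
features = [
    (0.20, -0.80), (0.50, 0.60), (0.90, 0.62),
    (0.22, -0.79), (0.52, 0.61), (0.88, 0.60),
]
preset_voice_count = 3  # set in advance by the user, as in the text
labels = cluster(features, preset_voice_count)
```

Segments from the same speaker end up with the same label, and speakers B and C are separated by the spectral cue even though their location cues nearly coincide.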

Next, when a selection signal for one of the distinguished voices is generated, the controller 180 controls the sound output module 150 to reproduce only that voice (S6). A menu for entering a name for each distinguished voice may additionally be displayed on the display unit 140.
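Selective playback can be sketched as filtering the clustered segments by the label that corresponds to the user-entered name. The function name `select_voice`, the data layout, and the sample values below are hypothetical and only illustrate the idea.

```python
def select_voice(segments, labels, names, chosen_name):
    """Return the concatenated samples of every segment assigned to chosen_name.

    segments: list of sample lists, one per segment, in time order
    labels:   cluster label of each segment
    names:    mapping from cluster label to the name entered by the user
    Raises StopIteration if chosen_name was never entered.
    """
    chosen_label = next(label for label, name in names.items() if name == chosen_name)
    selected = []
    for samples, label in zip(segments, labels):
        if label == chosen_label:
            selected.extend(samples)
    return selected


# Hypothetical data: four segments, the first and last belonging to speaker "A".
segments = [[1, 2], [3, 4], [5, 6], [7, 8]]
labels = [0, 2, 1, 0]
names = {0: "A", 1: "C", 2: "B"}
playback = select_voice(segments, labels, names, "A")
```

In a device, the resulting samples would be handed to the sound output module rather than kept in a list.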

상술한 구성을 가진 본 발명의 일실시예에 따르면, 동일 유사한 위치에서 발생되는 음성을 식별할 수 있고. 이 식별된 음성을 선택적으로 재생할 수 있게 된다. According to an embodiment of the present invention having the above-described configuration, it is possible to identify the voice generated from the same similar position. This identified voice can be selectively reproduced.

이하에서는 상술한 향 처리장치에서의 복수의 외부 음성을 처리하는 방법이 적용된 음향 처리장치의 예를 도 5를 참조하여 상세하게 설명하도록 한다.Hereinafter, an example of a sound processing apparatus to which the method for processing a plurality of external voices in the aforementioned flavor processing apparatus is applied will be described in detail with reference to FIG. 5.

도 5는 본 발명의 일실시예인 음향 처리장치에서의 복수의 외부 음성을 처리하는 방법이 적용된 음향 처리장치의 적용예에 관한 이미지도이다.5 is an image diagram illustrating an application example of a sound processing apparatus to which a method of processing a plurality of external voices is applied in a sound processing apparatus according to an embodiment of the present invention.

This example describes the case of reproducing a voice stored in the memory 160. The present invention is not limited thereto, and it is apparent that it may also be applied to the case of playing a video.

FIG. 5A shows a voice storage screen 200. The voice storage screen 200 includes a voice file list 201-203, a menu icon 204, a confirmation icon 205, and a previous icon 206.

When the user selects one (202) of the entries in the voice file list through the user input unit 130, a voice playback screen 210 is displayed on the display unit 140, as shown in FIG. 5B.

The voice playback screen 210 includes a progress bar 211, a menu icon 212, and a playback control icon 213.

In this state, when the menu icon 212 is selected, a menu window 220 is displayed, as shown in FIG. 5C. The menu window 220 displays a transmission icon 221, a deletion icon 222, and a per-voice playback icon 223.

When the transmission icon 221 is selected, a menu screen for transmitting the selected voice file is displayed on the display unit 140.

When the deletion icon 222 is selected, the selected voice file is deleted.

When the per-voice playback icon 223 is selected, a screen 230 for determining the number of voices to be distinguished may be displayed on the display unit 140, as shown in FIG. 5D. The user may then set the number of distinguishable speakers using the user input unit 130. In this embodiment, the number is limited to three. Limiting the maximum number of speakers in this way reduces the computational burden on the controller 180. Alternatively, the number of distinct voices may be determined automatically by analyzing the voice file.
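The text does not say how the number of voices would be determined automatically. One simple possibility, shown here purely as an assumption, is a threshold-based "leader" count: a frame starts a new voice only if it lies far from every voice found so far, and the count is capped at a maximum (three, as in this embodiment). The function name `estimate_voice_count`, the threshold, and the feature values are hypothetical.

```python
import math

def estimate_voice_count(features, threshold, max_voices=3):
    """Count distinct voices, never exceeding max_voices.

    A feature vector becomes the 'leader' of a new voice when it is farther
    than `threshold` from every existing leader and the cap is not yet reached.
    """
    leaders = []
    for feature in features:
        is_new = all(math.dist(feature, leader) > threshold for leader in leaders)
        if is_new and len(leaders) < max_voices:
            leaders.append(feature)
    return len(leaders)

# Hypothetical per-frame features: (spectral value, location value).
features = [
    (0.20, 0.10), (0.22, 0.12),  # first voice
    (0.70, 0.11),                # second voice
    (0.45, 0.90),                # third voice
]
auto_count = estimate_voice_count(features, threshold=0.2)
capped_count = estimate_voice_count(features, threshold=0.2, max_voices=2)
```

The cap bounds the number of leaders that every later frame must be compared against, which is one way to read the remark that limiting the number of speakers reduces the controller's computational burden.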

Next, once the selected voice file has been analyzed, a per-voice menu screen 240 covering each distinguished voice may be displayed on the display unit 140, as shown in FIG. 5E.

The per-voice menu screen 240 includes a first speaker menu 241, a second speaker menu 242, and a third speaker menu 243.

The first speaker menu 241 includes a first playback control icon 241-1 and a first progress bar 241-2.

The second speaker menu 242 includes a second playback control icon 242-1 and a second progress bar 242-2.

The third speaker menu 243 includes a third playback control icon 243-1 and a third progress bar 243-2.

The user can listen to the desired speaker's voice through the corresponding playback control icon, and can check the current playback position and the state of the entire voice data through the progress bar.

In addition, the user may input title information for each speaker menu through the user input unit 130. The spectral characteristics of the analyzed voice file are also stored in the memory 160. When the user subsequently acquires a voice and that voice contains a speaker whose spectrum has already been analyzed, that speaker's name may be assigned automatically.
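Automatic naming amounts to looking up a newly analyzed voice among the stored spectral profiles. The sketch below assumes that each profile is a short numeric vector and that a match means lying within a distance threshold of a stored profile; the function name `match_speaker`, the threshold, and the names and values in the table are hypothetical.

```python
import math

def match_speaker(profile, stored_profiles, max_distance):
    """Return the name of the closest stored profile within max_distance, else None."""
    best_name, best_distance = None, max_distance
    for name, stored in stored_profiles.items():
        distance = math.dist(profile, stored)
        if distance <= best_distance:
            best_name, best_distance = name, distance
    return best_name

# Hypothetical table of previously analyzed voices (standing in for the memory 160).
stored_profiles = {
    "Alice": (0.20, 0.50, 0.30),
    "Bob": (0.60, 0.30, 0.10),
}

known = match_speaker((0.22, 0.48, 0.30), stored_profiles, max_distance=0.1)
unknown = match_speaker((0.90, 0.05, 0.05), stored_profiles, max_distance=0.1)
```

Returning `None` for an unmatched voice leaves room for the manual path described above, in which the user enters a title for that speaker menu.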

In addition, according to an embodiment of the present invention, the above-described method can be implemented as processor-readable code on a medium on which a program is recorded. Examples of processor-readable media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include implementations in the form of a carrier wave (for example, transmission over the Internet).

The method of processing a plurality of external voices in a sound processing apparatus described above, and the sound processing apparatus to which it is applied, are not limited to the configurations and methods of the embodiments described above. Rather, all or part of the embodiments may be selectively combined so that various modifications can be made.

FIG. 1 is a block diagram of a sound processing apparatus according to an embodiment of the present invention.

FIG. 2 is a view for explaining the concept of a sound processing apparatus according to an embodiment of the present invention.

FIG. 3 is a conceptual diagram for explaining a method of processing a plurality of external voices in a sound processing apparatus according to an embodiment of the present invention.

FIG. 4 is a flowchart for explaining a method of processing a plurality of external voices in a sound processing apparatus according to an embodiment of the present invention.

FIG. 5 is an image diagram of an application example of a sound processing apparatus to which the method of processing a plurality of external voices according to an embodiment of the present invention is applied.

Claims

A method of processing a plurality of external voices in a sound processing apparatus, the method comprising:

analyzing spectral characteristics of the plurality of external voices;

distinguishing each of the plurality of external voices based on the spectral characteristics; and

clustering the distinguished voices.

The method of claim 1,

further comprising presetting the number of external voices to be distinguished.

The method of claim 1,

wherein the sound processing apparatus includes at least two microphones,

the method further comprising acquiring location information of the plurality of external voices acquired through the at least two microphones,

wherein the step of distinguishing each of the plurality of external voices based on the spectral characteristics

comprises distinguishing each of the plurality of external voices based on the spectrum information and the location information.

The method of claim 1,

wherein the plurality of external voices are acquired through at least one of a microphone and a wireless communication unit of the sound processing apparatus.

The method of claim 1,

further comprising providing a menu for inputting a name for each distinguished voice.

The method of claim 1,

further comprising reproducing only the distinguished voice when a selection signal for the distinguished voice is generated.

A sound processing apparatus comprising:

a microphone configured to acquire a plurality of external voices; and

a controller configured to analyze spectral characteristics of the plurality of external voices, distinguish each of the plurality of external voices based on the spectral characteristics, and cluster the distinguished voices.

The sound processing apparatus of claim 7,

further comprising a user input unit configured to set the number of external voices to be distinguished.

The sound processing apparatus of claim 7,

wherein the microphone comprises at least two microphones, and

wherein the controller

acquires location information of the plurality of external voices acquired through the at least two microphones, and distinguishes each of the plurality of external voices based on the spectrum information and the location information.

The sound processing apparatus of claim 7,

further comprising a wireless communication unit configured to acquire the plurality of external voices.

The sound processing apparatus of claim 7,

further comprising a sound output module,

wherein the controller,

when a selection signal for a distinguished voice is generated, controls the sound output module to reproduce only that distinguished voice.