KR100798955B1 - Method and apparatus to control sending a waiting sound on the phone using speech recognition

Info

Publication number: KR100798955B1
Application number: KR1020070026738A
Authority: KR
Inventors: 김정호
Original assignee: (주)케이티에프테크놀로지스
Priority date: 2007-03-19
Filing date: 2007-03-19
Publication date: 2008-01-30

Abstract

A method and an apparatus for controlling transmission of an alternative sound source during a call through voice recognition are provided. They address the conventional problem that the other party has to wait without an alternative sound source when the user cannot operate keys or is very busy. The method comprises the following steps. A communication terminal transmits a preselected alternative sound source when a preset alternative sound source transmission start command is input during a call (330, 340). The communication terminal monitors whether a preset transmission end command is input while the alternative sound source is being transmitted (360). If the end command is input, the communication terminal ends transmission of the alternative sound source and switches to call mode (370).

Description

Method and apparatus for controlling alternative sound transmission during voice call {Method and apparatus to control sending a waiting sound on the phone using speech recognition}

FIG. 1 is a schematic configuration diagram showing the components of a communication terminal related to voice recognition and alternative sound source transmission according to an embodiment of the present invention.

FIG. 2 is a configuration diagram showing the components of the central processing unit inside the communication terminal related to voice recognition and alternative sound source transmission according to an embodiment of the present invention.

FIG. 3 is a flowchart illustrating a method of controlling alternative sound source transmission during a call through voice recognition according to an embodiment of the present invention.

The present invention relates to a method of controlling transmission of an alternative sound source during a call, and more particularly to a method of controlling transmission of an alternative sound source during a call through voice recognition.

Various methods have been proposed for transmitting an alternative sound source to the other party when, during a call on a communication terminal, the other party has to be kept waiting without conversation.

In this context, not only the method of transmitting the alternative sound source during a call but also the method of controlling that transmission may be considered. That is, consideration may be given to how to start transmission so that the other party receives the alternative sound source, and how to end transmission and return to the call state.

In the prior art, a method of operating keys of the external key input unit of the communication terminal has been proposed for controlling alternative sound source transmission during a call. However, this method has the problem that the external keys of the communication terminal must be operated manually.

In addition, when the user is in a situation where the external keys cannot be operated, or is very busy, the keys cannot be used and the other party may be left waiting without an alternative sound source.

An object of the present invention is to provide a method by which a communication terminal can control alternative sound source transmission during a call by voice alone, without any separate key operation.

Other objects of the present invention will be readily understood from the following description of the embodiments.

To achieve the above object, according to one aspect of the present invention, there is provided a method of controlling alternative sound source transmission during a call through voice recognition in a communication terminal.

According to an embodiment of the present invention, there may be provided a method of controlling alternative sound source transmission during a call through voice recognition in a communication terminal, the method including: transmitting a preselected alternative sound source when a predetermined alternative sound source transmission start command is input during a call; monitoring whether a predetermined alternative sound source transmission end command is input while the alternative sound source is being transmitted; and, when the end command is input, ending transmission of the alternative sound source and switching to call mode.

Here, a step of receiving the user's voice that is to be recognized as one or more of the alternative sound source transmission start command and end command, converting it into digital data, and storing it may be performed before the step of transmitting the alternative sound source.

In addition, when the voice stored as corresponding to the start command is input among the voices input to the communication terminal in call mode, the terminal may recognize that the start command has been input.

In addition, when the voice stored as corresponding to the end command is input to the communication terminal while the alternative sound source is being transmitted, the terminal may recognize that the end command has been input.

According to another aspect of the present invention, there is provided an apparatus for controlling alternative sound source transmission during a call through voice recognition.

According to an embodiment of the present invention, there may be provided an apparatus for controlling alternative sound source transmission during a call through voice recognition in a communication terminal, the apparatus including: a voice processing unit that receives an audio signal containing a voice corresponding to an alternative sound source transmission start command or end command and converts it into digital data; a sound source processing unit that converts between sound sources of one or more formats and digital data, or between sound sources and audio signals; and a central control unit that monitors the digital data converted by the voice processing unit to determine whether the start command or the end command has been input to the communication terminal.

Here, when the central control unit determines that the start command has been input to the communication terminal, the sound source processing unit may convert the preselected alternative sound source into digital data or an audio signal.

In addition, the apparatus may further include a sound source storage unit that stores the alternative sound source transmission start command or end command converted by the voice processing unit.

The present invention may be modified in various ways and may have several embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. However, this is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope. In describing the present invention, detailed descriptions of related known technology are omitted where they could obscure the gist of the invention.

Terms such as "first" and "second" may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be called a second component, and similarly a second component may be called a first component. The term "and/or" includes a combination of a plurality of related listed items or any one of the plurality of related listed items.

When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is said to be "directly connected" or "directly coupled" to another component, it should be understood that no other component exists in between.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. A singular expression includes the plural unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to indicate that a feature, number, step, operation, component, part, or combination thereof described in the specification exists, and should be understood not to exclude in advance the existence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and are not to be interpreted in an idealized or excessively formal sense unless expressly so defined in this application.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings. To facilitate overall understanding, the same reference numerals are used for the same elements regardless of the figure number.

FIG. 1 is a schematic configuration diagram showing the components of a communication terminal related to voice recognition and alternative sound source transmission according to an embodiment of the present invention.

Referring to FIG. 1, a communication terminal according to an embodiment of the present invention may include a receiver 120, a microphone 130, a headset 140, an external input/output port 150, a sound source storage unit 160, a communication unit 170, and a central processing unit 110.

The receiver 120 is a device that can output the received call audio during a call or output an audio signal such as an alternative sound source.

The microphone 130 is a device that can receive an audio signal, including voice, from outside the communication terminal.

The headset 140 is a device that includes a microphone and an ear receiver.

The external input/output port 150 is a device that can connect the communication terminal, through a cable, to an external audio device such as a hands-free kit.

Here, the communication terminal may receive voice input from the microphone 130 included in the terminal, from the microphone of the headset 140, or from a microphone included in an audio device connected to the external input/output port 150.

The sound source storage unit 160 is a device in which sound sources of various formats can be stored. Here, the sound sources that can be stored in the sound source storage unit 160 include the result of converting an audio signal containing voice into digital data (hereinafter "audio data") and sound sources in audio formats such as MP3 (Moving Picture Experts Group Audio Layer 3) or AAC (Advanced Audio Coding) (hereinafter "sound source data").
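The distinction between the two kinds of stored sound can be sketched as a small data model. This is a minimal illustration only: the names `SoundKind`, `StoredSound`, and `SoundStore` are hypothetical and do not appear in the specification, and an in-memory dictionary stands in for the sound source storage unit 160.

```python
from dataclasses import dataclass
from enum import Enum


class SoundKind(Enum):
    # "Audio data": an audio signal (for example a recorded voice) converted to digital data.
    AUDIO_DATA = "audio_data"
    # "Sound source data": a sound source in an audio format such as MP3 or AAC.
    SOUND_SOURCE_DATA = "sound_source_data"


@dataclass(frozen=True)
class StoredSound:
    name: str
    kind: SoundKind
    payload: bytes


class SoundStore:
    """Hypothetical in-memory stand-in for the sound source storage unit 160."""

    def __init__(self) -> None:
        self._items: dict[str, StoredSound] = {}

    def put(self, sound: StoredSound) -> None:
        self._items[sound.name] = sound

    def get(self, name: str) -> StoredSound:
        return self._items[name]


store = SoundStore()
store.put(StoredSound("start_command", SoundKind.AUDIO_DATA, b"\x00\x01"))
store.put(StoredSound("hold_music", SoundKind.SOUND_SOURCE_DATA, b"ID3"))
```

Keeping the kind next to the payload lets later processing decide whether a conversion step is needed before the sound is encoded for transmission.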

The communication unit 170 up-converts a baseband communication signal to the RF (Radio Frequency) band and transmits it through the antenna, or down-converts an RF-band communication signal received from the antenna to baseband and passes it to the communication signal processing section inside the communication terminal.

The central processing unit 110 controls the overall operation of the communication terminal.

The internal configuration of the communication terminal according to an embodiment of the present invention has been described above with reference to FIG. 1. The central processing unit 110 according to an embodiment of the present invention is described in detail below with reference to FIG. 2.

FIG. 2 is a configuration diagram showing the components of the central processing unit 110 inside the communication terminal related to voice recognition and alternative sound source transmission according to an embodiment of the present invention.

Referring to FIG. 2, the central processing unit 110 according to an embodiment of the present invention may include a voice processing unit 210, a sound source processing unit 220, a central control unit 230, and a communication control unit 240.

The voice processing unit 210 converts between audio signals containing voice and digital data, including PCM data. The voice processing unit 210 also handles the decoding and encoding of the digital data.
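As a rough illustration of the conversion between an audio signal and PCM data, the sketch below quantizes floating-point samples in the range -1.0 to 1.0 into 16-bit little-endian PCM bytes and back. The 16-bit sample width, the byte order, and the function names `to_pcm16` and `from_pcm16` are assumptions made for this example; the specification does not state a PCM format.

```python
import struct


def to_pcm16(samples: list[float]) -> bytes:
    """Quantize samples in [-1.0, 1.0] to 16-bit little-endian PCM (assumed format)."""
    clipped = [max(-1.0, min(1.0, s)) for s in samples]
    ints = [int(round(s * 32767)) for s in clipped]
    return struct.pack(f"<{len(ints)}h", *ints)


def from_pcm16(data: bytes) -> list[float]:
    """Convert 16-bit little-endian PCM bytes back to samples in [-1.0, 1.0]."""
    count = len(data) // 2
    return [value / 32767 for value in struct.unpack(f"<{count}h", data)]


pcm = to_pcm16([0.0, 1.0, -1.0, 2.5])  # 2.5 is clipped to 1.0
restored = from_pcm16(pcm)
```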

Here, decoding and encoding are processes generally included in wired and wireless communication systems. For example, QCELP (Qualcomm Code Excited Linear Prediction) may be applied as the voice coding scheme in a CDMA (Code Division Multiple Access) communication system.

The communication control unit 240 converts between the data handled in the decoding and encoding process of the voice processing unit 210 and baseband communication signals. Here, the communication signal may be generated through processes such as modulation by a scheme such as QPSK (Quadrature Phase Shift Keying) and spectrum spreading.

The communication control unit 240 also controls the operation of the communication unit 170.

The processes that a communication terminal may perform for communication are known to those skilled in the art, so a detailed description of them is omitted from this specification to keep the gist of the present invention clear.

The sound source processing unit 220 converts between sound source data and audio data, or between sound source data and audio signals. Sound source data is generally converted into an audio signal inside the communication terminal and output through an output device such as a speaker included in the terminal. Accordingly, the sound source processing unit 220 may in some cases convert sound source data directly into an audio signal without first converting it into audio data.

The central control unit 230 controls the operation of each component inside and outside the central processing unit 110 and the input and output of the data processed by each component. In the following description, explanation of the ordinary processing and control performed by a control unit is omitted.

The components of the central processing unit 110 according to an embodiment of the present invention have been described above with reference to FIG. 2. The method of controlling alternative sound source transmission during a call according to an embodiment of the present invention is described in detail below with reference to FIG. 3.

Before describing FIG. 3, the speech recognition technology applied to the present invention is briefly described to clarify the gist of the invention.

Speech recognition can be classified into two types by speaker dependency. The first is speaker-independent speech recognition, which is designed to recognize any voice regardless of who is speaking. The second is speaker-dependent speech recognition, which is designed to recognize the voice of a specific speaker. Speaker-dependent speech recognition is used by first having the specific speaker's voice input to and stored in the speech recognizer.

Speech recognition can also be classified into four types by utterance mode. The first is isolated word recognition, which recognizes a single spoken word. The second is connected word recognition, which recognizes several words spoken with short pauses between them. The third is continuous speech recognition, which recognizes several words spoken continuously. The fourth is keyword spotting, which extracts only predetermined keywords from continuously spoken speech. Keyword spotting is a stage short of full continuous speech recognition and can offer the user more flexibility than isolated word recognition.
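The difference between isolated word recognition and keyword spotting can be illustrated with a deliberately simplified sketch. It assumes that an utterance has already been turned into a list of word tokens, which is not how an acoustic recognizer works internally; the function names and the example words are hypothetical.

```python
def isolated_word_match(utterance: list[str], command: str) -> bool:
    """Isolated word recognition: the utterance must consist of the command word alone."""
    return len(utterance) == 1 and utterance[0] == command


def keyword_spot(utterance: list[str], keyword: str) -> bool:
    """Keyword spotting: the keyword may appear anywhere in continuous speech."""
    return keyword in utterance


sentence = ["please", "hold", "for", "a", "moment"]
spotted = keyword_spot(sentence, "hold")            # keyword found inside the sentence
isolated = isolated_word_match(sentence, "hold")    # fails: other words are present
```

The sketch shows why keyword spotting suits a live conversation, where the command is embedded in other speech, while isolated word recognition suits a quiet period in which a single word is expected.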

In the prior art, the speech recognition method generally used in communication terminals is speaker-dependent isolated word recognition. In this specification, speaker-dependent keyword spotting is described as an embodiment alongside speaker-dependent isolated word recognition. However, the present invention is not limited thereto, and it will be apparent to those skilled in the art that other speech recognition techniques, such as the connected word recognition and continuous speech recognition described above, may also be applied.

The speech recognition techniques described above are known to those skilled in the art, so explanations unrelated to the gist of the present invention are omitted from this specification.

FIG. 3 is a flowchart illustrating a method of controlling alternative sound source transmission during a call through voice recognition according to an embodiment of the present invention. For ease of understanding and explanation, the method of FIG. 3 is described below with reference to the components of the communication terminal described with FIGS. 1 and 2.

When the user's communication terminal is in call mode (step 310), the terminal starts a voice recognition mode for starting alternative sound source transmission (step 320).

In step 320, speaker-dependent keyword spotting may be applied as the speech recognition method. The purpose is to recognize, among the voices continuously input to the communication terminal during the call, the voice corresponding to the predetermined alternative sound source transmission start command (hereinafter "start command").

Here, as is usual for speaker-dependent speech recognition, a step of receiving and storing the start command in advance may be included before step 310. In that step, the voice processing unit 210 receives the user's voice corresponding to the start command and converts it into audio data, and the converted audio data is stored in the sound source storage unit 160.

By applying keyword spotting, the terminal can compare and determine whether a voice corresponding to the start command is present among the voices continuously input to the communication terminal during the call (step 330).

If it is determined in step 330 that none of the voices input to the communication terminal during the call corresponds to the start command, the terminal continues in the voice recognition mode for starting alternative sound source transmission (step 320).

If it is determined in step 330 that a voice corresponding to the start command has been input during the call, the terminal starts transmitting the alternative sound source (step 340).

Here, the alternative sound source may be sound source data or audio data created and stored by the user.

According to an embodiment of the present invention, when the alternative sound source is sound source data, the sound source processing unit 220 may convert it into audio data or into an audio signal. If the sound source processing unit 220 converts the sound source data into audio data, the voice processing unit 210 can encode it directly. If the sound source processing unit 220 converts the sound source data into an audio signal, the voice processing unit 210 first converts the audio signal into audio data and then encodes it.

When the alternative sound source is audio data, the voice processing unit 210 can encode the audio data directly.

The encoded data is sent out through the communication control unit 240 and the communication unit 170, so that the alternative sound source is transmitted to the communication terminal of the other party on the call.
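The processing routes described for step 340 can be summarized in a short sketch that returns the ordered list of conversions for a given kind of alternative sound source. The function name, the string labels, and the `via_audio_signal` flag are hypothetical and serve only to make the branching explicit; the numbers in the labels refer to the units named in the description.

```python
def transmission_steps(kind: str, via_audio_signal: bool = False) -> list[str]:
    """List the conversions applied before an alternative sound source is sent.

    kind is "sound_source_data" (e.g. MP3, AAC) or "audio_data" (digitized audio).
    via_audio_signal selects the route in which unit 220 outputs an audio signal.
    """
    steps: list[str] = []
    if kind == "sound_source_data":
        if via_audio_signal:
            steps.append("220: sound source data -> audio signal")
            steps.append("210: audio signal -> audio data")
        else:
            steps.append("220: sound source data -> audio data")
    elif kind != "audio_data":
        raise ValueError(f"unknown kind: {kind}")
    # Every route ends with encoding, baseband conversion, and RF transmission.
    steps.append("210: encode audio data")
    steps.append("240: encoded data -> baseband signal")
    steps.append("170: baseband -> RF, transmit")
    return steps


direct = transmission_steps("audio_data")
indirect = transmission_steps("sound_source_data", via_audio_signal=True)
```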

Once transmission of the alternative sound source begins, the communication terminal switches to a voice recognition mode for ending the transmission (step 350).

In step 350, speaker-dependent isolated word recognition may be applied as the speech recognition method. While the alternative sound source is being transmitted, voice is relatively unlikely to be input to the communication terminal, so the purpose is to compare any voice that is input against the alternative sound source transmission end command (hereinafter "end command").

Here, as is usual for speaker-dependent speech recognition, a step of receiving and storing the end command in advance may be included before step 310. In that step, the voice processing unit 210 receives the user's voice corresponding to the end command and converts it into audio data, and the converted audio data is stored in the sound source storage unit 160.

By applying isolated word recognition, the terminal can compare and determine whether a voice input to the communication terminal during transmission of the alternative sound source corresponds to the end command (step 360).

If it is determined in step 360 that the voice input during transmission does not correspond to the end command, the terminal continues in the voice recognition mode for ending the transmission (step 350).

If it is determined in step 360 that the voice input during transmission corresponds to the end command, the terminal stops transmitting the alternative sound source (step 370) and switches to call mode (step 310).
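The flow of FIG. 3 amounts to a two-state loop, which can be sketched as follows. This is a simplified illustration, not the terminal's implementation: utterances are assumed to arrive as lists of word tokens, the start command is matched by keyword spotting and the end command by isolated word recognition, and the class name `HoldSoundController` and the command words "hold" and "resume" are hypothetical.

```python
from enum import Enum


class Mode(Enum):
    CALL = "call"        # steps 310/320: in call, listening for the start command
    SENDING = "sending"  # steps 340/350: sending the sound, listening for the end command


class HoldSoundController:
    def __init__(self, start_command: str, end_command: str) -> None:
        self.start_command = start_command
        self.end_command = end_command
        self.mode = Mode.CALL

    def on_utterance(self, words: list[str]) -> Mode:
        if self.mode is Mode.CALL:
            # Step 330: keyword spotting within continuous speech.
            if self.start_command in words:
                self.mode = Mode.SENDING  # step 340: start transmission
        else:
            # Step 360: isolated word recognition, the command must stand alone.
            if words == [self.end_command]:
                self.mode = Mode.CALL  # step 370: stop transmission, back to call mode
        return self.mode


controller = HoldSoundController("hold", "resume")
after_start = controller.on_utterance(["please", "hold", "on"])
after_noise = controller.on_utterance(["resume", "now"])
after_end = controller.on_utterance(["resume"])
```

In this sketch an utterance that merely contains the end word does not stop transmission, which mirrors the choice of isolated word recognition for step 360.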

The above embodiments of the present invention are disclosed for purposes of illustration. Those of ordinary skill in the art may make various modifications, changes, and additions within the spirit and scope of the present invention, and such modifications, changes, and additions should be regarded as falling within the scope of the following claims.

As described above, according to the present invention, a communication terminal can start or end transmission of an alternative sound source during a call by voice alone, without any separate key operation.

Claims

A method of controlling, through voice recognition, transmission of an alternative sound source during a call in a communication terminal, the method comprising:

transmitting a preselected alternative sound source when a predetermined alternative sound source transmission start command is input during a call;

monitoring whether a predetermined alternative sound source transmission end command is input during transmission of the alternative sound source; and

when the alternative sound source transmission end command is input, ending transmission of the alternative sound source and switching to a call mode.

The method of claim 1,

further comprising, before the step of transmitting the alternative sound source, receiving the user's voice to be recognized as at least one of the alternative sound source transmission start command and the end command, and converting each received voice into digital data.

The method of claim 2,

wherein the alternative sound source transmission start command is recognized as input when, among the voices input to the communication terminal in the call mode, the voice stored in correspondence with the alternative sound source transmission start command is input.

The method of claim 2,

wherein the alternative sound source transmission end command is recognized as input when the voice stored in correspondence with the alternative sound source transmission end command is input to the communication terminal while the alternative sound source is being transmitted.

An apparatus for controlling transmission of an alternative sound source during a call in a communication terminal, the apparatus comprising:

a voice processing unit that receives an audio signal including a voice corresponding to an alternative sound source transmission start command or end command, and converts the audio signal into digital data;

a sound source processing unit that converts between a sound source of at least one format and the digital data, or between the sound source and the audio signal; and

a central control unit that monitors the digital data converted by the voice processing unit to determine whether the alternative sound source transmission start command or the end command is input to the communication terminal.

The apparatus of claim 5,

wherein, when the central control unit determines that the alternative sound source transmission start command is input to the communication terminal, the sound source processing unit converts the preselected alternative sound source into the digital data or the audio signal.

The apparatus of claim 5,

further comprising a sound source storage unit that stores the alternative sound source transmission start command or the end command converted by the voice processing unit.