KR101317941B1

KR101317941B1 - Device and method for performing connection for telecommunication service based on voice recognition, and method for generating audio data for the voice recognition

Info

Publication number: KR101317941B1
Application number: KR1020120021786A
Authority: KR
Inventors: 윤정욱; 이지훈
Original assignee: (주)파워보이스
Priority date: 2012-03-02
Filing date: 2012-03-02
Publication date: 2013-10-18
Also published as: KR20130100525A

Abstract

The present invention relates to a terminal that performs a communication service connection based on voice recognition. The terminal includes: a connection request receiving unit that receives a connection request for a communication service from a transmitting terminal through a network; a sound data selection unit that selects voice recognition sound data from among a plurality of sound data, each corresponding to one of the methods of responding to the connection request; a sound data reproducing unit that reproduces the selected voice recognition sound data through a sound reproducing apparatus; a voice data input unit that receives the user's voice data from the user based on voice recognition; and a service connection performing unit that performs the connection of the communication service based on the input voice data.

Description

Terminal and method for performing communication service connection based on voice recognition, and method for generating sound data for voice recognition

The present invention relates to a terminal and a method for performing a communication service connection based on voice recognition and, more particularly, to a terminal and a method for performing a communication service connection based on voice recognition during playback of sound data for voice recognition, and to a method for generating sound data for voice recognition.

In general, when a message requesting a telephone connection arrives at a communication terminal, the user must press a predetermined button on the terminal or touch its display in order to connect or reject the call. Controlling an incoming call is therefore difficult in situations where the user cannot easily touch the terminal. One approach to this problem is to control incoming calls by voice recognition, which lets the user connect a call without touching the terminal. In this regard, Korean Patent Laid-Open Publication No. 2008-0021882 discloses a method for controlling an incoming call through voice recognition and a portable terminal that performs the method.

Meanwhile, the use of voice recognition in a communication terminal is subject to constraints that other kinds of electronic products do not face. In a mobile communication terminal, for example, the distance between the microphone and the speaker is short, so the ringtone emitted by the speaker can act as loud noise that hinders voice recognition.

One object is to control the connection of a call through voice recognition when a message requesting the call connection is received by a communication terminal. Another object is to provide a voice recognition ringtone that does not interfere with voice recognition when a ringtone announcing receipt of a message related to a communication service is played. A further object is to provide a smoother communication service by improving the recognition rate of voice recognition for the communication service. However, the technical problems to be solved by the present embodiment are not limited to those described above, and other technical problems may exist.

As a technical means for achieving the above objects, an embodiment of the present invention may provide a terminal including: a connection request receiving unit that receives a connection request for a communication service from a transmitting terminal through a network; a sound data selection unit that selects voice recognition sound data from among a plurality of sound data, each corresponding to one of the methods of responding to the connection request; a sound data reproducing unit that reproduces the selected voice recognition sound data through a sound reproducing apparatus; a voice data input unit that receives the user's voice data from a user interface based on voice recognition; and a service connection performing unit that performs the connection of the communication service based on the input voice data.

Another embodiment of the present invention may provide a method of performing a communication service connection, the method including: receiving a connection request for a communication service; selecting voice recognition sound data from among a plurality of sound data, each corresponding to one of the methods of responding to the received connection request; reproducing the selected voice recognition sound data; receiving the user's voice data based on voice recognition; and performing the connection of the communication service based on the input voice data.

Still another embodiment of the present invention may provide a method of generating voice recognition sound data, the method including: extracting any one of a plurality of sound data from a database; dividing the extracted sound data into a plurality of sections at predetermined time intervals; generating voice recognition sound data from the extracted sound data by inserting mute sections between the divided sections; and storing the generated voice recognition sound data in the database.

By performing a communication service connection in response to voice data input from a user interface, a terminal and a method can be provided that control the connection of a call through voice recognition when a message requesting the call connection is received by a communication terminal. By inserting mute sections into sound data to generate sound data for voice recognition, a terminal and a method can be provided that play a voice recognition ringtone that does not interfere with voice recognition. By using the voice recognition ringtone to improve the recognition rate of voice recognition for a communication service, a terminal and a method can be provided that offer a smoother communication service.

FIG. 1 is a block diagram of a communication service system according to an embodiment of the present invention.
FIG. 2 is a block diagram of the terminal 10 shown in FIG. 1.
FIG. 3 is a block diagram of a terminal 10 according to another embodiment of the present invention.
FIG. 4 is a diagram illustrating an example of a process in which the sound data generation unit 17 of FIG. 3 generates voice recognition sound data 44 from sound data 41.
FIG. 5 is a flowchart illustrating a method of generating voice recognition sound data according to an embodiment of the present invention.
FIG. 6 is a flowchart illustrating a method of performing a communication service connection according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings so that those of ordinary skill in the art to which the present invention pertains can readily practice it. The present invention may, however, be embodied in many different forms and is not limited to the embodiments described herein. In the drawings, parts irrelevant to the description are omitted in order to describe the present invention clearly, and like reference numerals designate like parts throughout the specification.

Throughout the specification, when a part is said to be "connected" to another part, this includes not only the case where it is "directly connected" but also the case where it is "indirectly connected" with another element in between. In addition, when a part is said to "include" a component, this means that it may further include other components, rather than excluding them, unless specifically stated otherwise.

FIG. 1 is a block diagram of a communication service system according to an embodiment of the present invention. Referring to FIG. 1, the communication service system includes a transmitting terminal 30, a network, base stations, and a terminal 10. In general, a network is a connection structure that allows information to be exchanged among a plurality of nodes. Examples of such a network include, but are not limited to, the Internet, a LAN (Local Area Network), a Wireless LAN (Wireless Local Area Network), a WAN (Wide Area Network), a PAN (Personal Area Network), and a mobile radio communication network.

The terminal 10 receives a connection request for a predetermined communication service from the transmitting terminal 30 through the network. The network may include various components, such as one or more base stations and a central control server, and the connection request may be delivered from the transmitting terminal 30 to the terminal 10 via these components.

According to various embodiments of the present invention, each of the terminal 10 and the transmitting terminal 30 may be any of various types of terminal. For example, a terminal may be a portable terminal, a TV device, or a computer capable of connecting to a remote server or another terminal through a network. Here, the portable terminal is a wireless communication device that guarantees portability and mobility, and examples include all kinds of handheld wireless communication devices such as PCS (Personal Communication System), GSM (Global System for Mobile communications), PDC (Personal Digital Cellular), PHS (Personal Handyphone System), PDA (Personal Digital Assistant), IMT (International Mobile Telecommunication)-2000, CDMA (Code Division Multiple Access)-2000, W-CDMA (W-Code Division Multiple Access), and Wibro (Wireless Broadband Internet) terminals, smartphones, and tablet PCs. Examples of the TV device include a smart TV and an IPTV set-top box, and examples of the computer include a notebook, desktop, or laptop equipped with a web browser. Accordingly, the form of the terminals shown in FIG. 1 is merely illustrative for convenience of description, and the type and form of the terminal 10 or the transmitting terminal 30 described herein are not to be construed as limited to what is shown in FIG. 1.

The terminal 10 receives a connection request for a predetermined communication service from the transmitting terminal 30. Examples of the communication service include a wired/wireless telephone service; a message service such as MMS (Multimedia Message Service) or SMS (Short Message Service); a broadcast service such as real-time broadcast streaming or VoD (Video on Demand); and a data transmission service such as an application-based message service. The connection request may be a predetermined message requesting the start or connection of such a communication service. For example, when the communication service is a wired/wireless telephone service, the connection request may be a call connection message; as another example, when the communication service is a message service, the connection request may be the MMS message itself.

The terminal 10 reproduces sound data upon receiving the connection request. For example, upon receiving a connection request for a telephone service, the terminal 10 plays, through the speaker, a ringtone announcing that the connection request has been received. As another example, the terminal 10 may play a ringtone announcing the arrival of an SMS (Short Message Service) message.

The terminal 10 may store a plurality of sound data and reproduce any one of the stored sound data. The terminal 10 may reproduce different sound data depending on the type of communication service. For example, the terminal 10 may reproduce first sound data for a telephone service and second sound data for a text message service. The terminal 10 may also reproduce different sound data depending on the response method. For example, the terminal 10 may reproduce first sound data for a response method using button input, second sound data for a response method using screen touch, third sound data for a response method using motion recognition, and fourth sound data for a response method using voice recognition.
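The per-response-method selection described above can be sketched as a simple lookup. This is only an illustrative sketch: the `ResponseMethod` enum, the `select_sound` function, and the file names are hypothetical and do not appear in the specification.

```python
from enum import Enum


class ResponseMethod(Enum):
    """Response methods named in the description."""
    BUTTON = "button"
    TOUCH = "touch"
    MOTION = "motion"
    VOICE = "voice"


# Hypothetical file names standing in for the first to fourth sound data.
SOUND_BY_RESPONSE_METHOD = {
    ResponseMethod.BUTTON: "sound_1.wav",
    ResponseMethod.TOUCH: "sound_2.wav",
    ResponseMethod.MOTION: "sound_3.wav",
    ResponseMethod.VOICE: "sound_4_voice_recognition.wav",
}


def select_sound(method: ResponseMethod) -> str:
    """Return the sound data associated with the configured response method."""
    return SOUND_BY_RESPONSE_METHOD[method]
```

A second table keyed by service type (telephone, text message) could be added in the same way if sound data is to be distinguished by communication service instead.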

The terminal 10 may receive a setting signal for the response method from the user interface and announce the arrival of a connection request for a communication service by using the sound data corresponding to that response method. For example, the terminal 10 may receive, from the user interface, a setting signal that selects the voice recognition response method, and may then reproduce the voice recognition sound data each time a connection request for a communication service is received.

The terminal 10 may generate voice recognition sound data. In general, voice recognition sound data may be generated from ordinary sound data so that the voice of the user 20 can be recognized without loss. One example of such voice recognition sound data is sound data in which at least one mute section has been inserted into basic sound data. As described again below, a mute section of the voice recognition sound data may be used as a time window in which the voice of the user 20 is input to the user interface. In this way, the recognition rate for the voice of the user 20 can be greatly improved.

The terminal 10 may select voice recognition sound data from among a plurality of sound data, each corresponding to one of the methods of responding to the connection request, and may reproduce the selected voice recognition sound data through the sound reproducing apparatus. For example, the terminal 10 may receive an incoming telephone call from the transmitting terminal, select a ringtone 101 generated as voice recognition sound data, and play the selected ringtone 101.

The terminal 10 performs the connection of the communication service in response to the connection request, based on voice recognition. Here, connecting the communication service means establishing a session with the transmitting terminal 30 for the predetermined communication service, and this connection may be made through a connection process associated with the communication service.

The terminal 10 may perform the connection of the communication service based on the voice data of the user 20 input from the user interface. For example, the terminal 10 may receive an incoming telephone call, play a ringtone generated as voice recognition sound data while at the same time recognizing the voice 201 of the user 20 in order to connect the telephone service, and connect the telephone service based on the recognized voice 201 of the user 20. As described above, a mute section of the voice recognition sound data may be used as a time window for recognizing the voice 201 of the user 20.
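One way to picture the use of mute sections as recognition windows is sketched below. It assumes, purely for illustration, that the ringtone starts with a sound section and alternates uniformly between sound sections of `segment_seconds` and mute sections of `mute_seconds`; the function names and the timestamped-frame representation are hypothetical and are not taken from the specification.

```python
def in_mute_section(t: float, segment_seconds: float, mute_seconds: float) -> bool:
    """Return True if playback time t (seconds, t >= 0) falls in a mute section.

    Assumes a uniform pattern: sound section, mute section, sound section, ...
    """
    period = segment_seconds + mute_seconds
    return (t % period) >= segment_seconds


def frames_for_recognition(frames, segment_seconds: float, mute_seconds: float):
    """Keep only the microphone frames captured while the ringtone is muted.

    `frames` is an iterable of (timestamp_seconds, frame) pairs.
    """
    return [
        frame
        for t, frame in frames
        if in_mute_section(t, segment_seconds, mute_seconds)
    ]
```

With 1.0 s sound sections and 0.5 s mute sections, a frame captured at 1.2 s would be passed to the recognizer, while frames at 0.5 s or 1.6 s would be skipped because the ringtone is audible at those times.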

The operations described above may be performed by the terminal 10. According to various embodiments of the present invention, however, they may be performed by hardware or software embedded in the terminal 10, or by an application installed on the terminal 10. For example, the operations described above may be performed by a separate device included in the terminal 10, by software installed on the platform of the terminal 10, or by an application installed on the terminal 10.

The operation of the terminal 10 is described in detail below.

FIG. 2 is a block diagram of the terminal 10 shown in FIG. 1. Referring to FIG. 2, the terminal 10 includes a connection request receiving unit 11, a sound data selection unit 12, a sound data reproducing unit 13, a voice data input unit 14, a service connection performing unit 15, and a database 16. However, the terminal 10 shown in FIG. 2 is only one implementation example of the present invention, and various modifications are possible based on the components shown in FIG. 2. For example, the terminal 10 may further include a user interface 141 for receiving commands or information from the user, or a sound reproducing apparatus 131 for outputting sound data. In this case, the user interface 141 may be an input device such as a keyboard or a mouse, a graphical user interface (GUI) presented on an image display device, or a voice user interface (VUI).

The connection request receiving unit 11 receives a connection request for a communication service from the transmitting terminal 30 through the network. The communication service may be any one of a telephone service, a message service, a broadcast service, and a data transmission service. For example, the connection request receiving unit 11 may receive from the transmitting terminal 30 connection requests for communication services such as a telephone service connection request, an SMS or MMS connection request, a mobile messenger service connection request, or a video telephone service connection request.

The sound data selection unit 12 selects voice recognition sound data from among a plurality of sound data, each corresponding to one of the methods of responding to the connection request. The plurality of sound data may include sound data classified by type of response method, such as button input sound data, screen touch sound data, motion recognition sound data, and voice recognition sound data. According to another embodiment of the present invention, however, the plurality of sound data may include sound data classified by type of communication service, such as sound data announcing receipt of an SMS message, sound data announcing receipt of a mobile messenger message, ringtone sound data announcing receipt of an incoming telephone call, and sound data for a guidance message.

When the method of responding to the connection request is set to the voice recognition method, the sound data selection unit 12 may select the voice recognition sound data from among the plurality of sound data. The voice recognition method may be set based on a setting signal input from the user interface.

The sound data selection unit 12 may also determine the level of noise around the terminal 10 and select the voice recognition sound data according to the result. For example, when the method of responding to the connection request is set to the voice recognition method, the sound data selection unit 12 may determine the ambient noise level, select the voice recognition sound data if the ambient noise is greater than or equal to a threshold, and select general sound data if the ambient noise is less than the threshold.
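The threshold rule above can be sketched as follows. The specification does not say how the noise level is measured; this sketch assumes an RMS level over normalized microphone samples, and the function names, labels, and default threshold are hypothetical.

```python
import math


def rms_level(samples):
    """Root-mean-square level of a sequence of samples (0.0 for empty input)."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def choose_sound(voice_mode_enabled, ambient_samples, threshold=0.1):
    """Pick which sound data to play, following the rule in the description.

    Voice recognition sound data is chosen only when the voice recognition
    response method is set AND ambient noise is at or above the threshold.
    The threshold value here is an arbitrary placeholder.
    """
    if not voice_mode_enabled:
        return "general"
    if rms_level(ambient_samples) >= threshold:
        return "voice_recognition"
    return "general"
```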

The sound data reproducing unit 13 reproduces the selected voice recognition sound data through the sound reproducing apparatus 131. An example of the sound reproducing apparatus 131 is a speaker. In general, the voice recognition sound data is stored in the database 16, and the sound data reproducing unit 13 may extract the voice recognition sound data from the database 16 and reproduce it.

The voice data input unit 14 receives the user's voice data from the user interface 141 based on voice recognition. One example of the user interface 141 is a VUI that recognizes the user's voice and generates voice data; in this case, the voice data may be text-based data converted from the user's voice. As an example of generating voice data from the user's voice, the user interface may receive the utterance "answer the call" ('전화 받아') from the user, convert the input voice into text-based voice data, and pass the converted voice data to the voice data input unit 14. According to another embodiment of the present invention, however, the user interface may be a microphone device that picks up the user's voice, and the voice data input unit 14 may perform the conversion between voice and text.

The service connection performing unit 15 performs the connection of the communication service based on the input voice data. Here, connecting the communication service means establishing a session with the transmitting terminal 30 for the predetermined communication service, and this connection may be made through a connection process associated with the communication service.
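A minimal sketch of how text-based voice data might be mapped to a connection decision is shown below. Only the utterance '전화 받아' ("answer the call") appears in the specification; the reject phrase, the returned action labels, and the `interpret` function are hypothetical additions for illustration.

```python
from typing import Optional

# '전화 받아' is the example utterance in the description; the English
# phrases and the reject phrase are assumptions added for illustration.
ACCEPT_PHRASES = {"전화 받아", "answer the call"}
REJECT_PHRASES = {"reject the call"}


def interpret(voice_text: str) -> Optional[str]:
    """Map text-based voice data to an action, or None if unrecognized."""
    # Normalize case and collapse repeated whitespace before matching.
    normalized = " ".join(voice_text.strip().lower().split())
    if normalized in ACCEPT_PHRASES:
        return "connect"
    if normalized in REJECT_PHRASES:
        return "reject"
    return None
```

Returning `None` for unrecognized text leaves the ringtone playing so that the user can speak again in a later mute section.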

The database 16 stores data. The data includes data input and output between the components inside the terminal 10, as well as data input and output between the terminal 10 and components outside the terminal 10. For example, the database 16 may store the connection request received from the transmitting terminal 30 and the selection information for the voice recognition sound data generated by the sound data selection unit 12. The database 16 may also store a plurality of sound data including the voice recognition sound data. Examples of the database 16 include a hard disk drive, ROM (Read Only Memory), RAM (Random Access Memory), flash memory, and a memory card located inside or outside the terminal 10.

FIG. 3 is a block diagram of a terminal 10 according to another embodiment of the present invention. Referring to FIG. 3, the terminal 10 includes a connection request receiving unit 11, a sound data selection unit 12, a sound data reproducing unit 13, a voice data input unit 14, a service connection performing unit 15, a database 16, a sound data generation unit 17, and a voice recognition setting unit 18. However, the terminal 10 shown in FIG. 3 is only one implementation example of the present invention, and various modifications are possible based on the components shown in FIG. 3. For example, the terminal 10 may further include a user interface 141 for receiving commands or information from the user. In this case, the user interface 141 may be an input device such as a keyboard or a mouse, a graphical user interface (GUI) presented on an image display device, or a voice user interface (VUI).

The connection request receiving unit 11 receives a connection request for a communication service from the transmitting terminal 30 through a network. The sound data selection unit 12 selects voice recognition sound data from among a plurality of sound data, each of which corresponds to one of the methods of responding to the connection request. The sound data reproducing unit 13 reproduces the selected voice recognition sound data through a sound reproducing apparatus. The voice data input unit 14 receives the user's voice data from the user interface 141 based on voice recognition. The service connection performing unit 15 performs the connection of the communication service based on the input voice data. The database 16 stores data. Thus, like the terminal according to the embodiment shown in FIG. 2, the terminal 10 according to the other embodiment shown in FIG. 3 includes the connection request receiving unit 11, the sound data selection unit 12, the sound data reproducing unit 13, the voice data input unit 14, the service connection performing unit 15, and the database 16. Accordingly, for matters concerning these components that are not described with reference to FIG. 3, the description given above with reference to FIG. 2 applies.

The sound data generation unit 17 generates voice recognition sound data from any one of the plurality of sound data. The sound data generation unit 17 may generate the voice recognition sound data by extracting one of the plurality of sound data and inserting a mute section into the extracted sound data. For example, the sound data generation unit 17 may extract one of the plurality of sound data, divide the extracted sound data into a plurality of sections at predetermined time intervals, and insert a mute section between the divided sections, thereby generating the voice recognition sound data from the extracted sound data.
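The divide-and-insert operation described above may be sketched as follows. This is a minimal illustration, not the claimed implementation: it assumes the sound data is available as a list of integer PCM samples, that a mute section is a run of zero-valued samples, and that lengths are given in samples. The function name `insert_mute_sections` and its parameters are hypothetical.

```python
def insert_mute_sections(samples, section_len, mute_len):
    """Split samples into fixed-length sections and place silence between them.

    Assumptions (not from the specification): samples is a list of PCM
    values, silence is represented by zeros, lengths are counted in samples.
    """
    if section_len <= 0 or mute_len < 0:
        raise ValueError("section_len must be positive and mute_len non-negative")

    # Divide the extracted sound data into sections of section_len samples;
    # the last section may be shorter.
    sections = [samples[i:i + section_len]
                for i in range(0, len(samples), section_len)]

    result = []
    for index, section in enumerate(sections):
        if index > 0:
            result.extend([0] * mute_len)  # mute section between neighbours
        result.extend(section)
    return result


if __name__ == "__main__":
    ringtone = [5, 6, 7, 8, 9, 10, 11]
    print(insert_mute_sections(ringtone, section_len=3, mute_len=2))
```

During each mute section the sound reproducing apparatus emits nothing, which gives the terminal a window in which the user's voice is not masked by the sound being played.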

According to another embodiment of the present invention, the sound data generation unit 17 may generate voice recognition sound data from the extracted sound data by dividing that sound data into data of a plurality of frequencies and muting at least one of the divided data. In this case, the frequency band that is muted may be the same as or similar to the frequency band of the human voice.
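The frequency-based variant may be sketched as follows. The specification does not say how the sound data is divided into frequencies, so this example assumes a discrete Fourier transform: every frequency bin inside a chosen band is set to zero and the signal is transformed back. The function name `mute_band` and the band limits are hypothetical; 300 Hz to 3400 Hz is used in the usage note only because it is the conventional telephony voice band.

```python
import cmath
import math


def mute_band(samples, sample_rate, low_hz, high_hz):
    """Zero every frequency component of samples between low_hz and high_hz.

    Uses a direct O(n^2) DFT so the sketch needs only the standard library;
    it is meant for short illustrative signals, not production audio.
    """
    n = len(samples)

    # Forward DFT: divide the sound data into per-frequency components.
    spectrum = [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

    for k in range(n):
        # Bin k and bin n - k describe the same physical frequency.
        freq = min(k, n - k) * sample_rate / n
        if low_hz <= freq <= high_hz:
            spectrum[k] = 0  # mute this component

    # Inverse DFT: rebuild the time-domain sound data.
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]
```

For example, calling `mute_band(samples, 8000, 300, 3400)` on audio sampled at 8 kHz would leave only the components below 300 Hz and above 3400 Hz, so the reproduced sound overlaps less with the band in which the user's speech is expected.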

The sound data generation unit 17 may generate the voice recognition sound data based on a setting signal input from the user interface. For example, when the sound data generation unit 17 receives, from the user or from another application, a setting signal requesting generation of voice recognition sound data, it may generate the voice recognition sound data based on that setting signal.

The sound data generation unit 17 may determine the interval of the mute section and the position at which the mute section is inserted into the sound data based on setting information input from the user interface 141, and may generate the voice recognition sound data according to the result of that determination. For example, the sound data generation unit 17 may receive from the user interface 141 setting information that includes the interval of the mute section to be inserted and the position at which the mute section is inserted into the sound data, and may generate the voice recognition sound data based on this setting information. In this way, the sound data generation unit 17 can generate voice recognition sound data customized to the user.
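A user-driven configuration of this kind may be sketched as follows. The example assumes that the setting information carries two values: the length of each mute section and the sample offsets at which a mute section is inserted. The `MuteSettings` type, its field names, and `apply_settings` are hypothetical and only illustrate how such setting information could drive generation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MuteSettings:
    """Hypothetical setting information received from the user interface."""
    mute_len: int      # length of each mute section, in samples
    positions: tuple   # offsets in the original sound data where silence goes


def apply_settings(samples, settings):
    """Insert a mute section at every configured position."""
    if settings.mute_len < 0:
        raise ValueError("mute_len must be non-negative")

    result = []
    previous = 0
    for position in sorted(set(settings.positions)):
        if not 0 <= position <= len(samples):
            raise ValueError(f"position {position} is outside the sound data")
        result.extend(samples[previous:position])  # untouched sound
        result.extend([0] * settings.mute_len)     # configured mute section
        previous = position
    result.extend(samples[previous:])              # remainder after last mute
    return result


if __name__ == "__main__":
    user_settings = MuteSettings(mute_len=2, positions=(2, 5))
    print(apply_settings([1, 2, 3, 4, 5, 6], user_settings))
```

Because positions are given explicitly, the sections between mute sections need not have equal length, which matches the user-customized generation described above.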

Although the process by which the sound data generation unit 17 generates voice recognition sound data has been described above with reference to FIG. 3, such voice recognition sound data may instead be generated by a server or terminal outside the terminal 10 and then input to the terminal 10.

FIG. 4 is a diagram illustrating an example of a process in which the sound data generation unit 17 of FIG. 3 generates voice recognition sound data 44 from sound data 41. Referring to FIG. 4, the sound data generation unit 17 extracts one sound data 41 of the plurality of sound data from the database 16. This sound data consists of continuous data. The sound data generation unit 17 divides the extracted sound data 41 into a plurality of sections including the section 42. The sections may all have the same size or time interval, or they may have different sizes or time intervals. The sound data generation unit 17 generates the voice recognition sound data 44 by inserting mute sections between the divided sections. The mute section 43 may have the same size or time interval as the other mute sections, or a different one. The sound data generation unit 17 stores the generated voice recognition sound data 44 in the database 16. However, the process described with reference to FIG. 4 is only one embodiment of the present invention, and according to various embodiments the voice recognition sound data may be generated from the sound data in various ways. For example, the sound data generation unit 17 may determine, based on setting information input from the user interface, the size or time interval of the section 42, the size or time interval of the mute section 43, the number of divided sections, or the number of mute sections.

The voice recognition setting unit 18 may set the method of responding to the connection request to the voice recognition method based on a setting signal input from the user interface 141. When the method of responding to the connection request is set to the voice recognition method, the sound data selection unit 12 may select the voice recognition sound data from among the plurality of sound data. The voice recognition setting unit 18 may also set a different response method in accordance with the setting signal input from the user interface. For example, based on the setting signal input from the user interface, the voice recognition setting unit 18 may set a response method using button input, a response method using screen touch, or a response method using motion recognition.
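The relationship between the configured response method and the selected sound data may be sketched as follows. The enumeration members mirror the response methods named above; the signal strings, the sound names in `SOUND_FOR_MODE`, and the function names are hypothetical placeholders rather than anything defined in the specification.

```python
from enum import Enum


class ResponseMode(Enum):
    """Methods of responding to a connection request named in the description."""
    VOICE = "voice"
    BUTTON = "button"
    TOUCH = "touch"
    MOTION = "motion"


# Hypothetical mapping from each response method to its sound data.
SOUND_FOR_MODE = {
    ResponseMode.VOICE: "voice_recognition_sound",
    ResponseMode.BUTTON: "default_sound",
    ResponseMode.TOUCH: "default_sound",
    ResponseMode.MOTION: "default_sound",
}


def set_response_mode(setting_signal):
    """Role of the voice recognition setting unit: turn a signal into a mode."""
    return ResponseMode(setting_signal)  # raises ValueError for unknown signals


def select_sound(mode, sound_for_mode=SOUND_FOR_MODE):
    """Role of the sound data selection unit: pick the sound for the mode."""
    return sound_for_mode[mode]


if __name__ == "__main__":
    mode = set_response_mode("voice")
    print(mode, select_sound(mode))
```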

FIG. 5 is a flowchart illustrating a method of generating voice recognition sound data according to an embodiment of the present invention. The method of generating voice recognition sound data according to the embodiment shown in FIG. 5 includes the steps by which voice recognition sound data is generated by the terminal 10 shown in FIGS. 1 and 2, the terminal 10 shown in FIG. 3, or the sound data generation unit 17 shown in FIG. 3. Accordingly, even where omitted below, the description given for any of the terminal 10 shown in FIGS. 1 and 2, the terminal 10 shown in FIG. 3, or the sound data generation unit 17 shown in FIG. 3 also applies to the method of generating voice recognition sound data according to the embodiment shown in FIG. 5.

In step S501, the terminal 10 extracts one of the plurality of sound data from the database 16. In step S502, the terminal 10 divides the extracted sound data into a plurality of sections at predetermined time intervals. In step S503, the terminal 10 generates voice recognition sound data from the extracted sound data by inserting a mute section between the divided sections. In step S504, the terminal 10 stores the generated voice recognition sound data in the database 16. The generated voice recognition sound data may be mapped to the voice recognition method among the plurality of methods of responding to a connection request for a communication service, in which case the database 16 may further store mapping information describing that mapping.
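Steps S501 to S504, together with the optional mapping information, may be sketched as one routine. The example assumes that the database is an in-memory dictionary with a `"sounds"` table and a `"mode_to_sound"` table, and that sound data is a list of samples with zeros as silence. These structures, the key names, and the `_voice` suffix are hypothetical.

```python
def generate_and_store(database, source_key, section_len, mute_len):
    """Run steps S501 to S504 against a dictionary standing in for database 16."""
    if section_len <= 0 or mute_len < 0:
        raise ValueError("section_len must be positive and mute_len non-negative")

    # S501: extract one sound data from the database.
    source = database["sounds"][source_key]

    # S502: divide it into sections of a predetermined length.
    sections = [source[i:i + section_len]
                for i in range(0, len(source), section_len)]

    # S503: insert a mute section between consecutive sections.
    silence = [0] * mute_len
    generated = []
    for section in sections:
        if generated:
            generated += silence
        generated += section

    # S504: store the result, leaving the original sound data untouched.
    target_key = source_key + "_voice"
    database["sounds"][target_key] = generated

    # Optional mapping information: voice recognition method -> generated sound.
    database["mode_to_sound"]["voice"] = target_key
    return target_key


if __name__ == "__main__":
    db = {"sounds": {"bell": [1, 2, 3, 4]}, "mode_to_sound": {}}
    key = generate_and_store(db, "bell", section_len=2, mute_len=1)
    print(key, db["sounds"][key], db["mode_to_sound"])
```

Storing the mapping alongside the generated data lets a later selection step look up the sound for the voice recognition method without knowing how it was produced.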

FIG. 6 is a flowchart illustrating a method of performing a communication service connection according to an embodiment of the present invention. The method shown in FIG. 6 includes steps processed in time series by either the terminal 10 according to the embodiment shown in FIGS. 1 and 2 or the terminal 10 according to the other embodiment shown in FIG. 3. Accordingly, even where omitted below, the description given for either the terminal 10 according to the embodiment shown in FIG. 2 or the terminal 10 according to the other embodiment shown in FIG. 3 also applies to the method of performing a communication service connection according to the embodiment shown in FIG. 6.

In step S601, the connection request receiving unit 11 receives a connection request for a communication service. In step S602, the sound data selection unit 12 selects voice recognition sound data from among a plurality of sound data, each of which corresponds to one of the methods of responding to the received connection request.

In step S603, the sound data reproducing unit 13 reproduces the selected voice recognition sound data. In step S604, the voice data input unit 14 receives the user's voice data based on voice recognition. In step S605, the service connection performing unit 15 performs the connection of the communication service based on the input voice data.
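The sequence S601 to S605 may be sketched as a single handler. The example assumes that playback, voice input, and connection are supplied as callables, and that the recognized voice data arrives as text that is compared against a small set of accepted words. The description does not specify the recognition vocabulary, so `ACCEPT_WORDS`, the callable names, and the boolean return value are all hypothetical.

```python
# Hypothetical vocabulary; the specification does not define the commands.
ACCEPT_WORDS = {"answer", "accept", "yes"}


def handle_connection_request(request, sounds, play, listen, connect):
    """Process one incoming connection request (S601 has delivered `request`).

    sounds  -- mapping from response method name to sound data
    play    -- callable that reproduces sound data (S603)
    listen  -- callable that returns the recognized user utterance (S604)
    connect -- callable that establishes the communication service (S605)
    """
    # S602: select the sound data mapped to the voice recognition method.
    sound = sounds["voice"]

    # S603: reproduce the selected voice recognition sound data.
    play(sound)

    # S604: receive the user's voice data, here already recognized as text.
    spoken = listen()

    # S605: perform the connection based on the input voice data.
    if spoken.strip().lower() in ACCEPT_WORDS:
        connect(request)
        return True
    return False


if __name__ == "__main__":
    connected = handle_connection_request(
        request={"caller": "transmitting terminal 30"},
        sounds={"voice": [1, 2, 0, 3, 4]},
        play=lambda sound: print("playing", sound),
        listen=lambda: "Answer",
        connect=lambda req: print("connected to", req["caller"]),
    )
    print(connected)
```

Passing the device operations as callables keeps the control flow testable without audio hardware; a real terminal would bind them to its sound reproducing apparatus and recognizer.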

In addition, although not shown in FIG. 6, according to another embodiment of the present invention, the method of performing a communication service connection further includes a step (not shown) of generating the voice recognition sound data from any one of the plurality of sound data. In this case, the step of reproducing the sound data (S603) reproduces the generated voice recognition sound data. The voice recognition sound data may be generated from the extracted sound data by inserting a mute section into that sound data.

Each of the method of generating voice recognition sound data according to the embodiment described with reference to FIG. 5 and the method of performing a communication service connection according to the embodiment described with reference to FIG. 6 may also be implemented in the form of a recording medium containing computer-executable instructions, such as a program module executed by a computer. Computer-readable media can be any available media that can be accessed by a computer, and include volatile and nonvolatile media as well as removable and non-removable media. Computer-readable media may also include both computer storage media and communication media. Computer storage media include volatile and nonvolatile, removable and non-removable media implemented by any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Communication media typically include computer-readable instructions, data structures, program modules, or other data in a modulated data signal such as a carrier wave, or another transport mechanism, and include any information delivery media.

The foregoing description of the present invention is provided for illustration, and those of ordinary skill in the art to which the present invention pertains will understand that it can easily be modified into other specific forms without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. For example, each component described as a single unit may be implemented in a distributed manner, and components described as distributed may likewise be implemented in a combined form.

The scope of the present invention is defined by the claims that follow rather than by the detailed description above, and all changes or modifications derived from the meaning and scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

10: terminal
12: sound data selection unit
13: sound data reproducing unit
14: voice data input unit
15: service connection performing unit
17: sound data generation unit

Claims

A terminal for performing a connection of a communication service based on voice recognition, the terminal comprising:
a sound data generation unit configured to generate voice recognition sound data from any one of a plurality of sound data;
a connection request receiving unit configured to receive a connection request for the communication service from a transmitting terminal through a network;
a sound data selection unit configured to select the voice recognition sound data from among the plurality of sound data, each corresponding to one of the methods of responding to the connection request;
a sound data reproducing unit configured to reproduce the selected voice recognition sound data through a sound reproducing apparatus;
a voice data input unit configured to receive voice data of a user based on the voice recognition; and
a service connection performing unit configured to perform the connection of the communication service based on the input voice data,
wherein the sound data generation unit determines an interval of a mute section and a position at which the mute section is inserted into the sound data based on setting information input from a user interface, and generates the voice recognition sound data according to a result of the determination.

delete

The terminal of claim 1,
wherein the sound data generation unit generates the voice recognition sound data from the sound data by inserting the mute section into the sound data.

The terminal of claim 1,
wherein the sound data generation unit divides the one sound data into data of a plurality of frequencies and mutes at least one of the divided data, thereby generating the voice recognition sound data from the one sound data.

The terminal of claim 1,
further comprising a database including the plurality of sound data.

The terminal of claim 6,
wherein the sound data generation unit extracts one of the plurality of sound data from the database,
divides the extracted sound data into a plurality of sections at predetermined time intervals,
generates the voice recognition sound data from the extracted sound data by inserting a mute section between the divided sections,
and stores the generated voice recognition sound data in the database.

delete

The terminal of claim 1,
further comprising a voice recognition setting unit configured to set the method of responding to the connection request to a voice recognition method based on a setting signal input from a user interface,
wherein the sound data selection unit selects the voice recognition sound data from among the plurality of sound data when the method of responding to the connection request is set to the voice recognition method.

The terminal of claim 1,
wherein the communication service is any one of a telephone service, a message service, a broadcast service, or a data transmission service.

The terminal of claim 1,
wherein the sound data selection unit determines the degree of noise around the terminal and selects the voice recognition sound data according to a result of the determination.

A method by which a terminal performs a connection of a communication service based on voice recognition, the method comprising:
generating voice recognition sound data from any one of a plurality of sound data;
receiving a connection request for the communication service;
selecting the voice recognition sound data from among the plurality of sound data, each corresponding to one of the methods of responding to the received connection request;
reproducing the selected voice recognition sound data;
receiving voice data of a user based on the voice recognition; and
performing the connection of the communication service based on the input voice data,
wherein the generating of the voice recognition sound data comprises determining an interval of a mute section and a position at which the mute section is inserted into the one sound data based on setting information input from a user interface, and generating the voice recognition sound data according to a result of the determination.

delete

The method of claim 12,
wherein the voice recognition sound data is generated from the one sound data by inserting a mute section into the one sound data.

A method of generating voice recognition sound data, the method comprising:
extracting one of a plurality of sound data from a database;
dividing the extracted sound data into a plurality of sections at predetermined time intervals;
generating voice recognition sound data from the extracted sound data by inserting a mute section between the divided sections; and
storing the generated voice recognition sound data in the database,
wherein the generating of the voice recognition sound data comprises determining an interval of the mute section and a position at which the mute section is inserted into the one sound data based on setting information input from a user interface, and generating the voice recognition sound data according to a result of the determination, and
wherein the voice recognition sound data is mapped to a voice recognition method among a plurality of methods of responding to a connection request for a communication service.

delete