KR20130125064A

KR20130125064A - Method of processing voice communication and mobile terminal performing the same

Info

Publication number: KR20130125064A
Application number: KR1020120048515A
Authority: KR
Inventors: 김경서
Original assignee: 김경서
Priority date: 2012-05-08
Filing date: 2012-05-08
Publication date: 2013-11-18
Also published as: KR101379405B1

Abstract

A method capable of processing the voice call of a mobile terminal comprises the steps of: hooking the start of the voice call; generating conversation context including an application identifier, an application API identifier, and an application API-related parameter by clustering the voice transmitted and received in the execution process of the voice call; checking the installation condition of the application related to the application identifier; and, if the related application is installed, prompting the execution of the corresponding application API by displaying one or more parameters related to the application API. Therefore, the present invention processes the voice transmitted and received in the execution process of the voice call. [Reference numerals] (210) Voice call hooking unit;(220) Conversation context generating unit;(230) Application executing unit;(240) Application installation checking unit;(250) Application storage unit;(260) Keyword database;(270) Application API database;(280) Display unit;(290) Voice storage unit;(300) Control unit

Description

Method of processing voice call to execute related application through keyword speech recognition and mobile terminal executing the same {METHOD OF PROCESSING VOICE COMMUNICATION AND MOBILE TERMINAL PERFORMING THE SAME}

The present invention relates to voice call technology and, more particularly, to a voice call processing method that executes a related application through keyword speech recognition applied to the voice transmitted and received during a voice call, and to a mobile terminal that performs the method.

A mobile terminal may be configured to perform various functions. Examples include voice and video calling, storing phone numbers, managing schedules, and taking photos or videos with a camera. In addition, some recent mobile terminals allow the user to install an application having a specific function and run it.

Such an application can be run by the user even while the user is on a call with another party through the mobile terminal. However, to run a specific application with the keypad during a call, the user must always leave call mode, search for the application, and launch it. Because the user has to look at the terminal while doing so, conversation is practically impossible during that time.

Korean Patent Laid-Open Publication No. 10-2009-0112899 relates to a mobile terminal and a method for executing an application on the mobile terminal. It recognizes a face region in a displayed image and can execute a specific application corresponding to the recognized face region.

Korean Patent Laid-Open Publication No. 10-2003-0057922 relates to a method of providing voice communication and voice content over the Internet and to a computer-readable recording medium storing a program for executing the method. By implementing the gateway that provides ARS content services and call services over the Internet as a personal gateway, it removes the need to build a separate system for providing those services.

Korean Patent Laid-Open Publication No. 10-2009-0112899
Korean Patent Laid-Open Publication No. 10-2003-0057922

An embodiment of the present invention seeks to provide a voice call processing method that executes a related application through keyword speech recognition applied to the voice transmitted and received during a voice call, and a mobile terminal that performs the method.

The present invention seeks to provide a voice call processing method, and a mobile terminal performing it, that automatically executes an associated application when call content matching a pre-entered keyword is recognized during a call on the mobile terminal, thereby removing the inconvenience of having to interrupt the call and search for the application.

Among the embodiments, a method of processing a voice call of a mobile terminal includes: hooking the start of the voice call; generating a conversation context by clustering the voice transmitted and received during the voice call, the conversation context including an application identifier, an application API identifier, and application API-related parameters; checking whether an application associated with the application identifier is installed; and, if the associated application is installed, displaying at least some of the application API-related parameters to prompt execution of the corresponding application API.

In one embodiment, hooking the start of the voice call may further include presetting, in association with a specific keyword, the application to be executed when that keyword is recognized in the hooked voice.

In one embodiment, generating the conversation context by clustering the transmitted and received voice may include extracting, from the clustered voice, an extracted term used in a specific application, and replacing at least some of the corresponding application API-related parameters with an inferred term derived from the extracted term.

In one embodiment, prompting execution of the application API may further include allowing the displayed application API-related parameters to be modified under the user's control.

Among the embodiments, a mobile terminal that processes a voice call includes: a voice call hooking unit that hooks the start of the voice call; a conversation context generating unit that generates a conversation context by clustering the voice transmitted and received during the voice call; an application installation checking unit that checks whether an application associated with the application identifier is installed; and an application executing unit that, if the associated application is installed, displays at least some of the application API-related parameters to prompt execution of the corresponding application API. Here, the conversation context includes an application identifier, an application API identifier, and application API-related parameters.

In one embodiment, the conversation context generating unit may extract, from the clustered voice, an extracted term used in a specific application, and replace at least some of the corresponding application API-related parameters with an inferred term derived from the extracted term.

In one embodiment, the conversation context generating unit may search the transmitted and received voice for the root node of each of at least one context tree.

A voice call processing method that executes a related application through keyword speech recognition according to an embodiment of the present invention, and a mobile terminal performing it, can improve user convenience by recognizing a predesignated keyword during a voice call and automatically executing the application linked to that keyword.

A voice call processing method that executes a related application through keyword speech recognition according to an embodiment of the present invention, and a mobile terminal performing it, can display at least some of the API-related parameters of the target application determined from the clustered voice.

FIG. 1 is a block diagram illustrating a voice call processing system according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating the mobile terminal of FIG. 1.
FIG. 3 is a flowchart illustrating the execution process of the mobile terminal of FIG. 1.
FIGS. 4 to 7 are diagrams illustrating the execution process of FIG. 3.
FIG. 8 is a diagram illustrating the processing of a recorded voice call after the call has ended in the mobile terminal of FIG. 1.

The description of the present invention is merely a set of embodiments for structural or functional explanation, so the scope of the present invention should not be construed as limited by the embodiments described herein. That is, because the embodiments may be modified in various ways and may take various forms, the scope of the present invention should be understood to include equivalents capable of realizing its technical idea. Likewise, the objects or effects presented in the present invention do not mean that a specific embodiment must include all of them or only such effects, so the scope of the present invention should not be understood as limited by them.

Meanwhile, the terms used in this application should be understood as follows.

Terms such as "first" and "second" are intended to distinguish one component from another, and the scope of rights should not be limited by these terms. For example, a first component may be termed a second component, and similarly a second component may be termed a first component.

When a component is referred to as being "connected" to another component, it may be directly connected to that other component, but it should be understood that other components may be present in between. In contrast, when a component is referred to as being "directly connected" to another component, it should be understood that no other component is present in between. Other expressions describing relationships between components, such as "between" and "directly between" or "adjacent to" and "directly adjacent to", should be interpreted in the same way.

A singular expression should be understood to include the plural unless the context clearly indicates otherwise. Terms such as "include" or "have" are intended to specify the presence of the stated features, numbers, steps, operations, components, parts, or combinations thereof, and should be understood not to preclude the presence or addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

In each step, identification codes (for example, a, b, c) are used for convenience of description and do not describe the order of the steps. Unless a specific order is clearly stated in context, the steps may occur in an order different from the one given. That is, the steps may occur in the stated order, may be performed substantially simultaneously, or may be performed in the reverse order.

The present invention can be implemented as computer-readable code on a computer-readable recording medium, and the computer-readable recording medium includes all kinds of recording devices that store data readable by a computer system. Examples of computer-readable recording media include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, as well as implementations in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium may also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed manner.

Unless otherwise defined, all terms used herein have the same meaning as commonly understood by a person of ordinary skill in the art to which the present invention belongs. Terms defined in commonly used dictionaries should be interpreted as consistent with their meaning in the context of the related art, and should not be interpreted as having an ideal or excessively formal meaning unless clearly defined in this application.

FIG. 1 is a block diagram illustrating a voice call processing system according to an embodiment of the present invention.

Referring to FIG. 1, the voice call processing system 100 may include a mobile terminal 110, a communication network 120, and at least one user terminal 130. Here, the mobile terminal 110 is connected to the at least one user terminal 130 through the communication network 120.

The mobile terminal 110 may be a portable computing device, for example a smartphone or a PDA (Personal Digital Assistant). In one embodiment, the mobile terminal 110 may be either the sender terminal or the receiver terminal. The mobile terminal 110 can perform the functions described below whether it acts as the sender terminal or the receiver terminal.

The mobile terminal 110 may generate a conversation context by clustering the voice transmitted and received during a voice call with the at least one user terminal 130, and may display at least part of that conversation context.

The at least one user terminal 130 may be a computing device that can connect to the mobile terminal 110 through the communication network 120, for example a PC, a wired communication terminal, a smartphone, or a PDA. In one embodiment, the at least one user terminal 130 may be either the sender terminal or the receiver terminal. Whether it acts as the sender terminal or the receiver terminal, the at least one user terminal 130 can perform the same functions as the mobile terminal 110, provided that it is a terminal capable of storing a specific application.

FIG. 2 is a block diagram illustrating the mobile terminal of FIG. 1.

Referring to FIG. 2, the mobile terminal 110 includes a voice call hooking unit 210, a conversation context generating unit 220, and an application executing unit 230, and may further include an application installation checking unit 240, an application storage unit 250, a keyword database 260, an application API database 270, a display unit 280, a voice storage unit 290, and a control unit 300.

The voice call hooking unit 210 hooks the start of a voice call. In one embodiment, the voice call hooking unit 210 may detect a voice call session with the at least one user terminal 130 and capture the voice transmitted and received in the detected session. In another embodiment, the voice call hooking unit 210 may capture the transmitted and received voice when the user touches the call button on the home screen.

The conversation context generating unit 220 may generate a conversation context by clustering the voice transmitted and received during a voice call or, when a voice call recorded by the user is played back after the call has ended, by clustering the played-back voice.

Here, the conversation context may include an application identifier, an application API identifier, and application API-related parameters. For example, the application identifier may be "Schedule", the application API identifier may be "SEND()", and the application API-related parameters may be "Title, Year, Month, Day, Time".
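The three fields of the conversation context can be pictured as a small record. The following is a minimal sketch rather than the patent's implementation: the class name `ConversationContext` and its field names are hypothetical, and only the example values ("Schedule", "SEND()", and the five parameter names) come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationContext:
    """Hypothetical container for the three fields named in the description."""
    app_id: str                                  # application identifier
    api_id: str                                  # application API identifier
    params: dict = field(default_factory=dict)   # application API-related parameters


# Example values taken from the description; the parameters start out unfilled.
ctx = ConversationContext(
    app_id="Schedule",
    api_id="SEND()",
    params={name: None for name in ("Title", "Year", "Month", "Day", "Time")},
)
```

Keeping unfilled parameters as `None` makes it easy to tell which values still have to be inferred from the call or entered by the user.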

In one embodiment, the conversation context generating unit 220 may collect the voice transmitted and received during a preset period extending back from the current time. This preset period may be set dynamically based on the conversation interval of the user of the mobile terminal 110. The conversation interval may correspond to the interval from the point at which one speaker begins speaking to the point at which the other party's utterance ends.
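Collecting only the voice from a recent period resembles a sliding time window. The sketch below assumes the voice has already been recognized into timestamped text, which the description does not specify. The class `UtteranceWindow` and its methods are hypothetical, and the rule for deriving the window length from the conversation interval is left to the caller because the source does not give one.

```python
from collections import deque


class UtteranceWindow:
    """Keeps only utterances that fall within the last `window_seconds`."""

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self._items = deque()  # (timestamp_in_seconds, recognized_text)

    def add(self, timestamp, text):
        """Store a new utterance and drop everything older than the window."""
        self._items.append((timestamp, text))
        self._evict(timestamp)

    def resize(self, window_seconds, now):
        """Change the window length, e.g. after re-estimating the conversation interval."""
        self.window_seconds = window_seconds
        self._evict(now)

    def _evict(self, now):
        while self._items and now - self._items[0][0] > self.window_seconds:
            self._items.popleft()

    def texts(self):
        return [text for _, text in self._items]
```

A caller would invoke `resize` whenever its estimate of the user's conversation interval changes, so the window tracks the pace of the call.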

In another embodiment, the conversation context generating unit 220 may search the transmitted and received voice for the root node of each of at least one context tree. If a root node is found, it searches the voice that follows for a child node of that root node, and once a leaf node has been visited, it may collect the keywords of all the visited nodes. For example, the conversation context generating unit 220 may find "phone number" as a root node in the transmitted and received voice, find "Park Kwang-woo" in the voice that follows, and, having reached the leaf node, collect "phone number" and "Park Kwang-woo".
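The root-to-leaf search can be sketched as a walk over successive utterances. This is an illustration under assumptions, not the patented algorithm: it treats the voice as a list of already-recognized text utterances, matches keywords by plain substring search, and uses hypothetical names (`Node`, `collect_keywords`).

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One keyword in a context tree; a node without children is a leaf."""
    keyword: str
    children: list = field(default_factory=list)


def collect_keywords(root, utterances):
    """Match the root, then its descendants, in successive utterances.

    Returns the keywords along the visited root-to-leaf path, or None if
    the utterances end before a leaf node is reached.
    """
    collected = []
    candidates = [root]
    for text in utterances:
        match = next((n for n in candidates if n.keyword in text), None)
        if match is None:
            continue  # this utterance carries no keyword at the current depth
        collected.append(match.keyword)
        if not match.children:
            return collected  # leaf reached: the path is complete
        candidates = match.children  # descend; later utterances match children
    return None


tree = Node("phone number", [Node("Park Kwang-woo")])
call = ["Do you know the phone number?", "Whose?", "Park Kwang-woo's"]
keywords = collect_keywords(tree, call)
```

With the example from the description, `keywords` holds "phone number" followed by "Park Kwang-woo". Because a child is only searched for in utterances after the one that matched its parent, the sketch follows the "voice that follows" ordering in the text.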

In yet another embodiment, the conversation context generating unit 220 may search the transmitted and received voice for a keyword. If the keyword is found, it searches the voice that follows for dependent keywords, and once the last dependent keyword has been found, it may collect those keywords.

Alternatively, unlike the embodiments above, the conversation context generating unit 220 may extract from the clustered voice an extracted term used in a specific application, and replace at least some of the corresponding application API-related parameters with an inferred term derived from the extracted term. For example, the conversation context generating unit 220 may extract "tomorrow" from the clustered voice as an extracted term used in the schedule application and, among the schedule application API-related parameters "Title, Year, Month, Day, Time", replace "Year, Month, Day" with "2012, 04, 28", computed relative to today's date.
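Replacing a relative term such as "tomorrow" with concrete date parameters can be sketched with ordinary date arithmetic. The function name `fill_date_params` and the `RELATIVE_DAYS` table are hypothetical. The example assumes the call takes place on 2012-04-27, a date the source does not state but which is consistent with the "2012, 04, 28" result in the description.

```python
from datetime import date, timedelta

# Hypothetical table of extracted terms and their offsets in days.
RELATIVE_DAYS = {"today": 0, "tomorrow": 1}


def fill_date_params(params, term, today):
    """Return a copy of params with Year/Month/Day inferred from a relative term."""
    filled = dict(params)
    offset = RELATIVE_DAYS.get(term)
    if offset is None:
        return filled  # unknown term: leave the parameters untouched
    target = today + timedelta(days=offset)
    filled.update(
        Year=f"{target.year:04d}",
        Month=f"{target.month:02d}",
        Day=f"{target.day:02d}",
    )
    return filled


params = {"Title": None, "Year": None, "Month": None, "Day": None, "Time": None}
inferred = fill_date_params(params, "tomorrow", date(2012, 4, 27))
```

Using `timedelta` rather than incrementing the day field directly keeps month and year rollovers correct, for example when "tomorrow" is said on the last day of a month.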

The application executing unit 230 may execute an application selected by the user. In one embodiment, the application executing unit 230 may receive the application API identifier of the conversation context and execute the corresponding application. In another embodiment, the application executing unit 230 may execute the application corresponding to a preset keyword if that keyword is present in the hooked voice. For example, it may execute the schedule application because the preset keyword "Tuesday" is present in the clustered voice "How about Tuesday?", and it may execute the contacts application because the preset keyword "phone number" is present in the clustered voice "Do you know the phone number?".
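The preset keyword-to-application association amounts to a lookup table. The sketch below is illustrative only: the table `KEYWORD_TO_APP`, the function `apps_for_utterance`, and the case-insensitive substring matching are assumptions, while the two keyword and application pairs come from the example in the description.

```python
# Hypothetical preset table; the two entries mirror the example in the text.
KEYWORD_TO_APP = {
    "Tuesday": "Schedule",
    "phone number": "Contacts",
}


def apps_for_utterance(text, mapping=KEYWORD_TO_APP):
    """Return the applications whose preset keyword occurs in the recognized text."""
    lowered = text.lower()
    return [app for keyword, app in mapping.items() if keyword.lower() in lowered]
```

Calling `apps_for_utterance("How about Tuesday?")` selects the schedule application, and an utterance without any preset keyword yields an empty list, so nothing is launched.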

The application installation checking unit 240 may check whether the application associated with the application identifier determined from the clustered voice is installed. In one embodiment, the application installation checking unit 240 may check whether the application exists in the application storage unit 250.

The application storage unit 250 is used to store applications executable on the mobile terminal 110. In one embodiment, the application storage unit 250 may store an application when an application download command is received from the user.

The keyword database 260 may be used to store keywords for applications. In one embodiment, the keyword database 260 may store keywords in the form of a context tree, which consists of a root node and at least one child node. In another embodiment, the keyword database 260 may store important keywords. In yet another embodiment, the keyword database 260 may store keywords in association with applications, so that when a preset keyword is present in the hooked voice the application associated with that keyword is executed.

The application API database 270 is used to store the APIs of the applications stored in the mobile terminal 110. For example, the application API database 270 may store, for each application identifier, an application API identifier and application API-related parameters. The application identifier may be the unique number of the application to be executed, the application API identifier may be the API used when executing the application, and the application API-related parameters may be the parameters of that API.
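The association stored in the application API database can be sketched as a mapping from application identifier to an API entry. The names `ApiEntry`, `API_DATABASE`, and `lookup_api` are hypothetical. Although the description says the application identifier may be a unique number, this sketch uses the name "Schedule" as the key for readability, following the earlier example.

```python
from typing import NamedTuple, Optional


class ApiEntry(NamedTuple):
    """Hypothetical record: one API of an application and its parameter names."""
    api_id: str
    param_names: tuple


# In-memory stand-in for the database; the entry reuses the document's example.
API_DATABASE = {
    "Schedule": ApiEntry("SEND()", ("Title", "Year", "Month", "Day", "Time")),
}


def lookup_api(app_id: str) -> Optional[ApiEntry]:
    """Return the API entry registered for an application, or None if absent."""
    return API_DATABASE.get(app_id)
```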

The display unit 280 displays the execution of the application. In one embodiment, when a voice call is requested from the at least one user terminal 130, the display unit 280 may visually display information about that user terminal. In one embodiment, if information about that user terminal exists, the display unit 280 may display the information stored in the mobile terminal 110 (for example, the caller's name and phone number). In another embodiment, if no such information exists, the display unit 280 may display information about the user terminal (for example, the caller's phone number) provided by a carrier terminal (not shown).

In one embodiment, when the user modifies a specific application API-related parameter, the display unit 280 may update and display the content of that parameter.

The voice storage unit 290 may be used to store the voice transmitted to and received from the at least one user terminal 130. In one embodiment, the voice storage unit 290 may record the voice when the user selects the record button.

The control unit 300 includes the voice call hooking unit 210, the conversation context generating unit 220, and the application executing unit 230, and controls the application installation checking unit 240, the application storage unit 250, the keyword database 260, the application API database 270, the display unit 280, and the voice storage unit 290.

FIG. 3 is a flowchart illustrating the execution process of the mobile terminal of FIG. 1, and FIGS. 4 to 7 are diagrams illustrating the execution process of FIG. 3.

In FIGS. 3 to 7, the voice call hooking unit 210 hooks the start of a voice call (step S310). In one embodiment, the voice call hooking unit 210 may receive an incoming call from a specific user terminal among the at least one user terminal 130 (for example, the user terminal owned by Kang Hyun-shin) and hook the start of the voice call when the user touches the call button 410.

The conversation context generating unit 220 generates a conversation context by clustering the voice transmitted and received during the voice call (step S320). In one embodiment, the conversation context generating unit 220 may generate the conversation context by clustering the voice transmitted and received during the voice call until the user touches the end-call button 420.

The application installation checking unit 240 checks whether the application associated with the application identifier determined from the clustered voice is installed (step S330).

If the corresponding application is installed (step S340), the display unit 280 displays at least a portion of the application API-related parameters to prompt execution of the corresponding application API (step S350). For example, the display unit 280 may prompt execution of the contact application API by displaying the name 520a and the phone number 530a, which are contact application API-related parameters. As another example, the display unit 280 may prompt execution of the schedule application API by displaying the schedule content 610a and the schedule location 620a, which are schedule application API-related parameters.

In one embodiment, when the user modifies an application API-related parameter, the display unit 280 may update and display that parameter. For example, among the contact application API-related parameters, the display unit 280 may display the added image 510b when the user adds an image 510a, the modified name 520b when the name 520a is modified, and the modified phone number information 530b when the phone number information 530a is modified. As another example, among the schedule application API-related parameters, the display unit 280 displays the modified title 610b when the schedule title 610a is modified, and the modified schedule location 620b when the schedule location 620a is modified.
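The update-and-redisplay behavior can be sketched as a small view object that holds the parameters currently shown and re-renders them after each user edit. The class name `PromptView`, its methods, and the sample values are hypothetical and serve only as an illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PromptView:
    """Holds the API-related parameters currently shown to the user."""
    parameters: Dict[str, str]
    rendered: List[str] = field(default_factory=list)  # history of displayed lines

    def render(self) -> str:
        # Build one display line from the current parameters.
        line = ", ".join(f"{k}={v}" for k, v in self.parameters.items())
        self.rendered.append(line)
        return line

    def modify(self, name: str, value: str) -> str:
        """Apply a user edit (or addition) and redisplay the parameters."""
        self.parameters[name] = value
        return self.render()
```

Adding a parameter that was not displayed before (such as an image) and changing an existing one both go through `modify`, which mirrors the way added and modified items are redisplayed in the same manner.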

If the corresponding application is not installed (step S340), the display unit 280 displays the fact that the application is not installed (step S360). When an installation request for the application is received from the user after the call has ended, the display unit 280 displays a screen (not shown) from which the application can be downloaded.
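The branch between prompting and reporting a missing application can be sketched as follows. The function name, the string-based prompt, and the use of a set of installed application identifiers are assumptions for illustration, not the terminal's actual interface.

```python
from typing import Dict, Set


def prompt_for_context(app_id: str, api_id: str,
                       parameters: Dict[str, str],
                       installed_apps: Set[str]) -> str:
    # Steps S330/S340: check whether the associated application is installed.
    if app_id in installed_apps:
        # Step S350: display the parameters and prompt for API execution.
        shown = ", ".join(f"{k}: {v}" for k, v in parameters.items())
        return f"Run {app_id}.{api_id} with {shown}?"
    # Step S360: report that the application is not installed.
    return f"Application '{app_id}' is not installed."
```

For example, `prompt_for_context("contacts", "add_contact", {"name": "Kang Hyunshin"}, {"contacts"})` yields a prompt, whereas the same call with an empty set of installed applications yields the not-installed message.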

FIG. 8 is a diagram illustrating a process of processing a recorded voice call after the call has ended in the mobile terminal of FIG. 1.

In FIG. 8, the voice storage unit 290 is used to record and store the transmitted and received voice when the user selects the record button during a voice call with the at least one user terminal 130. When the user selects the recording list view (not shown), the display unit 280 may display a list of the voice recordings made by the user. The recording list may be sorted and displayed in order of recording date. When a specific recording is selected from the recording list, the display unit 280 may display the recorded voice call.

The conversation context generating unit 220 generates a conversation context by clustering the recorded voice call. In one embodiment, the conversation context generating unit 220 may collect the voice transmitted and received during a preset period of time going back from the current time point. Here, the preset period may be set dynamically based on the conversation interval of the user of the mobile terminal 110. The conversation interval may correspond to the interval from the point at which a speaker begins an utterance to the point at which the counterpart's utterance ends.
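One possible reading of the dynamically set collection period is sketched below: conversation intervals are measured from the start of one party's utterance to the end of the other party's next utterance, and the look-back window is a multiple of their mean. The `Utterance` record, the `multiplier`, and the `default` fallback are assumptions for illustration; no formula is prescribed here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Utterance:
    speaker: str
    start: float  # seconds from call start
    end: float
    text: str


def conversation_intervals(utterances: List[Utterance]) -> List[float]:
    """Interval from a speaker's utterance start to the end of the counterpart's reply."""
    intervals = []
    for first, second in zip(utterances, utterances[1:]):
        if first.speaker != second.speaker:
            intervals.append(second.end - first.start)
    return intervals


def collection_window(utterances: List[Utterance],
                      multiplier: float = 2.0, default: float = 10.0) -> float:
    """Look-back period derived from the mean conversation interval (assumed rule)."""
    intervals = conversation_intervals(utterances)
    if not intervals:
        return default
    return multiplier * sum(intervals) / len(intervals)


def collect_recent(utterances: List[Utterance], now: float, window: float) -> List[Utterance]:
    """Utterances that overlap the period [now - window, now]."""
    return [u for u in utterances if u.end >= now - window and u.start <= now]
```

With this rule, a call with short, quick exchanges yields a short window, while slower exchanges widen it, so the collected voice tends to cover roughly the same number of turns.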

In another embodiment, the conversation context generating unit 220 searches the transmitted and received voice for the root node of each of at least one context tree. If a root node is found, it searches the voice that follows for child nodes of that root node, and once a leaf node has been reached, it may collect the keywords for all of the visited nodes.
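The context tree search can be sketched as a walk that matches a root keyword first and then looks for child keywords only in later utterances, returning the collected keywords once a leaf is reached. The `ContextNode` type, the case-sensitive substring match on transcribed text, and the sample keywords are assumptions for illustration; the sketch also assumes that speech has already been converted to text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ContextNode:
    keyword: str
    children: List["ContextNode"] = field(default_factory=list)


def collect_keywords(trees: List[ContextNode],
                     utterances: List[str]) -> Optional[List[str]]:
    """Return the keywords on the first root-to-leaf path matched in order, else None."""
    for tree in trees:
        path: List[str] = []
        candidates = [tree]  # nodes that may match next, starting at the root
        for text in utterances:
            match = next((n for n in candidates if n.keyword in text), None)
            if match is None:
                continue
            path.append(match.keyword)
            if not match.children:  # leaf node reached: keywords are complete
                return path
            candidates = match.children  # look for children in later utterances
    return None
```

For a schedule tree `meet -> tomorrow -> Gangnam`, the utterances "shall we meet", "how about tomorrow", "okay, at Gangnam station" would yield `["meet", "tomorrow", "Gangnam"]`, while a conversation that never reaches a leaf yields `None`.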

Although the above has been described with reference to preferred embodiments of the present application, those skilled in the art will understand that the present invention may be variously modified and changed without departing from the spirit and scope of the present invention as set forth in the following claims.

100: voice call processing system 110: mobile terminal
120: communication network 130: at least one user terminal
210: voice call hooking unit 220: conversation context generating unit
230: application executing unit 240: application installation checking unit
250: application storage unit 260: keyword database
270: application API database
280: display unit 290: voice storage unit
300: control unit

Claims

A method for processing a voice call of a mobile terminal, the method comprising:
hooking the start of the voice call;
generating a conversation context by clustering voices transmitted and received in the course of performing the voice call, wherein the conversation context includes an application identifier, an application API identifier, and an application API association parameter;
checking whether an application associated with the application identifier is installed; and
if the associated application is installed, displaying at least a portion of the application API association parameters to prompt execution of the corresponding application API.

The method of claim 1, wherein hooking the start of the voice call comprises:
presetting, in correspondence with a specific keyword, an application to be executed when the specific keyword is recognized in the hooked voice.

The method of claim 1, wherein generating the conversation context by clustering the transmitted and received voices comprises:
extracting, from the clustered voice, an extraction term used in a specific application; and
replacing at least a portion of the corresponding application API association parameter with an inferred term inferred from the extraction term.

The method of claim 1, wherein prompting execution of the application API comprises:
allowing the displayed application API association parameters to be modified under user control.

A mobile terminal for processing a voice call, the mobile terminal comprising:
a voice call hooking unit that hooks the start of the voice call;
a conversation context generating unit that generates a conversation context by clustering voices transmitted and received in the course of performing the voice call;
an application installation checking unit that checks whether an application associated with the application identifier is installed; and
an application executing unit that, if the associated application is installed, prompts execution of the application API by displaying at least a portion of the application API association parameters,
wherein the conversation context includes an application identifier, an application API identifier, and an application API association parameter.

The mobile terminal of claim 5, wherein the conversation context generating unit
extracts, from the clustered voice, an extraction term used in a specific application and replaces at least a portion of the application API association parameter with an inferred term inferred from the extraction term.

The mobile terminal of claim 5, wherein the conversation context generating unit
searches the transmitted and received voice for the root node of each of at least one context tree.