KR101968669B1

KR101968669B1 - Method for providing call service and computer program for executing the method

Info

Publication number: KR101968669B1
Application number: KR1020170106257A
Authority: KR
Inventors: 이명복; 김은정; 이소명
Original assignee: 네이버 주식회사
Priority date: 2017-08-22
Filing date: 2017-08-22
Publication date: 2019-04-12
Also published as: KR20190021103A

Abstract

An embodiment of the present invention discloses a method of providing a call service at a first terminal, the method comprising: receiving a sound source from a second terminal; converting the received sound source into text using a speech recognition module; displaying the converted text on a display unit of the first terminal; receiving text input from a user; converting the input text into a sound source using a speech synthesis module; and transmitting the converted sound source to the second terminal.

Description

The present invention relates to a method of providing a call service and to a computer program.

Telephony refers to technology that converts voice into an electrical signal, transmits it to another device, and receives an electrical signal and reproduces it as voice, thereby enabling a call between two remote devices. Since the development of commercial mobile phones, it has been possible to make calls anytime and anywhere, and calls between computer devices over the Internet later became possible as well. Here, a computer device means any device that can connect to the Internet, including a smartphone.

However, because a telephone fundamentally conveys voice, its use is limited for a user who is in a situation where voice cannot be played back, or for a user who has difficulty perceiving voice because of a hearing impairment.

The background art described above is technical information that the inventors possessed for, or acquired in the course of, deriving the present invention, and is not necessarily known art disclosed to the general public before the filing of the present application.

An object of the present invention is to provide a call service providing method and a computer program that include a sound source-to-text conversion function. The technical problems to be solved by the embodiments are not limited to those described above, and other technical problems may be inferred from the following embodiments.

An embodiment of the present invention discloses a method of providing a call service at a first terminal, the method comprising: receiving a sound source from a second terminal; converting the received sound source into text using a speech recognition module; displaying the converted text on a display unit of the first terminal; receiving text input from a user; converting the input text into a sound source using a speech synthesis module; and transmitting the converted sound source to the second terminal.

According to an embodiment, the displaying may display the text in a format corresponding to the amplitude of the received sound source, and the relationship between the amplitude and the format may be preset so that the text is displayed larger as the amplitude increases.
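
The preset amplitude-to-format relationship could be sketched as follows. This is only an illustration: the function name, the normalized amplitude range of 0.0 to 1.0, the linear mapping, and the 12 to 36 point size range are all assumptions, since the specification states only that larger amplitudes yield larger text.

```python
def font_size_for_amplitude(amplitude: float,
                            min_size: int = 12,
                            max_size: int = 36) -> int:
    """Map a normalized amplitude (assumed range 0.0 to 1.0) to a font size in points."""
    # Clamp so that out-of-range input cannot produce an unusable size.
    clamped = max(0.0, min(1.0, amplitude))
    # Linear and non-decreasing: a louder sound source never yields smaller text.
    return round(min_size + (max_size - min_size) * clamped)
```

Any non-decreasing mapping (stepwise, logarithmic, and so on) would satisfy the same stated condition; the linear form is chosen here only for brevity.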

According to an embodiment, the receiving and the converting into text may be performed a plurality of times, and the displaying may display the converted texts cumulatively.

According to an embodiment, the method may further include scrolling the accumulated text in response to a first input entered into the first terminal by the user.

According to an embodiment, the screen displayed on the display unit of the terminal may include a first area that displays the converted text and a second area that receives text input from the user; the scrolling may be performed within the first area, and the second area may remain fixed on the screen even while the scrolling is performed.
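
A minimal sketch of the first area, holding accumulated text and scrolling independently of the input area, might look like this. The class name `TranscriptArea`, the line-based scrolling, and the choice to jump back to the newest text when a new line arrives are assumptions made for illustration, not details from the specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TranscriptArea:
    """Hypothetical model of the first area: accumulated text plus a scroll offset."""
    visible_lines: int = 3
    lines: List[str] = field(default_factory=list)
    offset: int = 0  # number of lines scrolled back from the newest line

    def append(self, text: str) -> None:
        # Converted texts are accumulated rather than replaced.
        self.lines.append(text)
        self.offset = 0  # assumption: new text brings the view back to the bottom

    def scroll(self, delta: int) -> None:
        # Positive delta scrolls toward older text; the offset is kept in range.
        max_offset = max(0, len(self.lines) - self.visible_lines)
        self.offset = max(0, min(max_offset, self.offset + delta))

    def visible(self) -> List[str]:
        end = len(self.lines) - self.offset
        start = max(0, end - self.visible_lines)
        return self.lines[start:end]
```

Because the scroll state lives entirely inside this object, the second (input) area needs no update when the user scrolls, which matches the fixed-input-area variant described above.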

According to an embodiment, the screen displayed on the display unit of the terminal may include a first area that displays the converted text and a second area that receives text input from the user; when the scrolling is performed, the first area may be expanded to the position where the second area was displayed, so that only the first area is displayed on the display unit.

According to an embodiment, the method may further include selecting and displaying one or more of the words included in the accumulated text, in consideration of at least one of the part of speech of each word, its frequency, and whether the word matches a preset format.

According to an embodiment, the selecting and displaying may determine one or more of the font, size, color, and display position of each word, in consideration of at least one of the part of speech of each word, its frequency, and whether the word matches a preset format.
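
The word selection could be approximated as below, using only two of the three listed criteria: frequency and matching against preset formats. Part of speech is left out because it would require a tagger. The patterns (a phone-number-like form and a time-like form), the whitespace tokenization, and the `min_count` threshold are hypothetical choices for the example.

```python
import re
from collections import Counter
from typing import List

# Hypothetical preset formats: a phone-number-like pattern and a time-like pattern.
PRESET_FORMATS = [
    re.compile(r"^\d{2,4}-\d{3,4}-\d{4}$"),
    re.compile(r"^\d{1,2}:\d{2}$"),
]


def select_keywords(transcript: str, min_count: int = 2) -> List[str]:
    """Pick words that match a preset format or occur at least min_count times."""
    words = transcript.split()
    counts = Counter(words)
    selected: List[str] = []
    for word in words:
        if word in selected:
            continue  # keep each selected word once, in order of first appearance
        matches_format = any(p.match(word) for p in PRESET_FORMATS)
        if matches_format or counts[word] >= min_count:
            selected.append(word)
    return selected
```

The same per-word signals could then drive the font, size, color, or position of each selected word.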

According to an embodiment, the first terminal may selectively provide a text mode, in which data received during a call is output to the display unit in text form, and a sound source mode, in which the data is output through a speaker in sound source form. The method may further include the text mode being selected by the user, with the receiving through the transmitting performed after that selection. When the sound source mode is selected by the user, the first terminal may output the sound source received from the second terminal through the speaker and transmit a sound source input through the microphone of the first terminal to the second terminal.
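
The choice between the two modes for incoming data can be sketched as a simple dispatch. The names `CallMode` and `route_incoming` and the injected `recognize` callable (standing in for the speech recognition module) are assumptions for illustration.

```python
from enum import Enum
from typing import Callable, Tuple, Union


class CallMode(Enum):
    TEXT = "text"
    SOUND = "sound"


def route_incoming(mode: CallMode,
                   audio: bytes,
                   recognize: Callable[[bytes], str]) -> Tuple[str, Union[str, bytes]]:
    """Return (output device, payload) for a sound source received from the other party."""
    if mode is CallMode.TEXT:
        # Text mode: convert first, then show the result on the display unit.
        return ("display", recognize(audio))
    # Sound source mode: play the received sound source unchanged.
    return ("speaker", audio)
```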

According to an embodiment, the receiving of text input may receive the user's selection of text stored in the first terminal or a server, and the converting into a sound source may convert the selected text into a sound source using the speech synthesis module.

According to an embodiment, the receiving of text input may convert characters entered on a touchscreen of the first terminal by one or more drag gestures into text using a character recognition module, and the converting into a sound source may convert the text produced by the character recognition module into a sound source using the speech synthesis module.

According to an embodiment, the displaying may translate the converted text from a first language into a second language designated by the user using a translation module, and display the text translated into the second language on the display unit.

According to an embodiment, the transmitting may transmit the converted sound source and information on a sound source filter selected by the user to a server, so that the server can apply the sound source filter to the converted sound source and transmit the result to the second terminal.

According to an embodiment of the present invention, there is disclosed a method of providing a call service at a first terminal, the method comprising: receiving text from a second terminal; displaying the text on a display unit of the first terminal; receiving text input from a user; converting the input text into a sound source using a speech synthesis module; and transmitting the converted sound source to the second terminal.

According to an embodiment, the method may further include: receiving an input of a request to initiate a call with the second terminal; and initiating the call by transmitting the call initiation request to the second terminal and receiving a call initiation approval from the second terminal. The call initiation request may include information on the output format at the first terminal, and the output format may be designated as either text or sound source.
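
One way to picture a call initiation request that carries the output format is the small message type below. The field names, the string values `"text"` and `"sound"`, and the JSON encoding are all hypothetical; the specification does not define a wire format.

```python
import json
from dataclasses import asdict, dataclass

ALLOWED_FORMATS = ("text", "sound")


@dataclass(frozen=True)
class CallInitiationRequest:
    """Hypothetical call initiation request carrying the caller's output format."""
    caller_id: str
    callee_id: str
    output_format: str  # how the first terminal wants received data output

    def __post_init__(self) -> None:
        # The output format is designated as exactly one of the two options.
        if self.output_format not in ALLOWED_FORMATS:
            raise ValueError(f"output_format must be one of {ALLOWED_FORMATS}")

    def to_json(self) -> str:
        return json.dumps(asdict(self), sort_keys=True)
```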

According to an embodiment of the present invention, there is disclosed a method of providing a call service, the method comprising: receiving a sound source from a second terminal; converting the sound source into text using a speech recognition module and transmitting the converted text to a first terminal; receiving text from the first terminal; and converting the received text into a sound source using a speech synthesis module and transmitting the converted sound source to the second terminal.
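
When the conversion sits between the two terminals, as in the method just described, the relaying side might be sketched as follows. `RelayServer` and its method names are hypothetical, and the `recognize` and `synthesize` callables are placeholders for the speech recognition and speech synthesis modules rather than real STT or TTS engines.

```python
from typing import Callable


class RelayServer:
    """Hypothetical relay that converts data passing between the two terminals."""

    def __init__(self,
                 recognize: Callable[[bytes], str],
                 synthesize: Callable[[str], bytes]) -> None:
        self.recognize = recognize    # stands in for the speech recognition module
        self.synthesize = synthesize  # stands in for the speech synthesis module

    def from_second_terminal(self, audio: bytes) -> str:
        # Sound source from the second terminal becomes text for the first terminal.
        return self.recognize(audio)

    def from_first_terminal(self, text: str) -> bytes:
        # Text from the first terminal becomes a sound source for the second terminal.
        return self.synthesize(text)
```

With this split, the first terminal only ever handles text and the second terminal only ever handles sound.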

According to an embodiment of the present invention, there is disclosed a computer program stored on a medium for executing, using a computer, the method according to any one of the embodiments described above.

The call service providing method and computer program according to embodiments of the present invention provide a text-based call service through a sound source-to-text conversion function, so that users who are in a situation where voice cannot be played back, or who have difficulty perceiving voice because of a hearing impairment, can also use the call service smoothly.

FIG. 1 schematically illustrates an example of a call system according to an embodiment of the present invention.
FIG. 2 is a block diagram schematically illustrating a call service providing apparatus according to an embodiment of the present invention.
FIG. 3 is a flowchart schematically illustrating a call service providing method according to the first embodiment.
FIG. 4 is a flowchart schematically illustrating a modification of the call service providing method according to the first embodiment.
FIG. 5 is a flowchart schematically illustrating a call service providing method according to the second embodiment.
FIG. 6 is a flowchart schematically illustrating a call service providing method for the case in which, in the first embodiment, the first terminal and the second terminal share the processing of the sound source-to-text conversion function.
FIGS. 7 to 13 show various examples of screens displayed on a terminal.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail in the detailed description. This is not intended to limit the present invention to particular embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes falling within its spirit and technical scope. In describing the present invention, detailed descriptions of related known art are omitted where they are judged likely to obscure the gist of the invention.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

The terms used in this application are used only to describe particular embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" are intended to indicate the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to preclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Hereinafter, the present invention is described in more detail with reference to preferred embodiments illustrated in the accompanying drawings.

FIG. 1 schematically illustrates an example of a call system according to an embodiment of the present invention.

Referring to FIG. 1, the call system includes terminals 10, which are the parties to a call, and a server 20 that communicates with the terminals 10 to mediate the call service between them. The server 20 and the terminals 10 may communicate through a communication network 30. The communication network 30 may be the Internet and/or a wired telephone network, but is not limited thereto, and may encompass wired networks such as LANs (Local Area Networks), WANs (Wide Area Networks), MANs (Metropolitan Area Networks), and ISDNs (Integrated Service Digital Networks), as well as wireless networks such as wireless LANs, CDMA, Bluetooth, satellite communication, 3G, 4G, 5G, and LTE.

The terminal 10 may be a terminal device capable of making calls over a wired telephone network, or any communication terminal device capable of accessing an Internet site associated with the server 20 or of installing and running a dedicated service application (hereinafter, the "service app"). For example, the terminal 10 may be a telephone, smartphone, tablet, PC, or laptop. Under the control of the server 20 and/or the service app, the terminal 10 may perform the overall operations of the service, such as composing service screens, data input, data transmission and reception, and data storage. FIG. 1 shows a first terminal 11 and a second terminal 12 as examples to illustrate that a plurality of terminals 10 exist.

The server 20 may be a communication server that communicates with the first terminal 11 and the second terminal 12 over the Internet, a base station that communicates with them over a telephone network, or may include both a communication server and a base station. The specific configuration of the server 20 may vary by embodiment and is described below, where necessary, along with each embodiment.

In the following description of embodiments, data transmission from the first terminal 11 to the second terminal 12, and from the second terminal 12 to the first terminal 11, may cover not only the case in which the two terminals exchange data directly but also the case in which they exchange data via the server 20.

The call service provided by the call system according to an embodiment of the present invention provides a function for converting between text and sound source, thereby extending a call service previously limited to voice-based calls to text-based calls.

In the following description of embodiments, the first terminal 11 is the terminal that intends to use the text-based call service, for example because it is in a place such as a movie theater where voice cannot be played back, or because voice is hard to perceive owing to a hearing impairment; the second terminal 12 is the terminal of the other party to the call. The labels "first" and "second" merely distinguish the devices for convenience of description. Because the first terminal 11 is the terminal that intends to use the text-based call service, it may include a display unit capable of displaying text.

FIG. 2 is a block diagram schematically illustrating a call service providing apparatus according to an embodiment of the present invention.

Depending on the design of the embodiment, the components shown in FIG. 2 may be provided in the terminal 10 or in the server 20. Alternatively, some of the components shown in FIG. 2 may be provided in the terminal 10 and the rest in the server 20.

According to the first embodiment, all of the components of FIG. 2 may be provided in the terminal 10, and the server 20 may serve only to relay data between the first terminal 11 and the second terminal 12. That is, the sound source-to-text conversion function used in the call service may be processed in the terminal 10. In this example, the conversion function may be processed in only one of the two parties to the call, the first terminal 11 or the second terminal 12, or may be divided between the two terminals.

According to the second embodiment, among the components of FIG. 2, the display control unit 122 is provided in the terminal 10 and the remaining components are provided in the server 20, so that the server 20 can process and forward, in the middle, the data exchanged between the first terminal 11 and the second terminal 12. In the second embodiment, the call control unit 121 may be provided in the server 20 and the display control unit 122 in the terminal 10. In this case, the role of the call control unit 121 may be performed by the communication server, by the base station, or by the communication server and the base station in cooperation. The descriptions of the components of FIG. 2 below basically apply to both the first and second embodiments; where special circumstances apply, an explanation is added.

Referring to FIG. 2, the call service providing apparatus 100 may include a memory 110 and a processor 120. To avoid obscuring the features of the embodiments, FIG. 2 shows only the components related to the embodiments of the present invention. Accordingly, other general-purpose components, such as an input/output unit, a display unit, and a communication unit, may be further included in addition to those shown in FIG. 2.

The memory 110 may store various modules needed to provide the call service. According to an embodiment, the memory 110 includes a speech recognition module 111 and a speech synthesis module 112. The memory 110 may further include a character recognition module 113 and/or a translation module 114 to provide additional functions.

The processor 120 may perform the overall data processing for providing the call service. The processor 120 may include a call control unit 121 that controls overall call-related data processing, and a display control unit 122 that controls the screen displayed on the display unit of the terminal 10 and provides a user interface.

The speech recognition module 111 according to an embodiment provides a speech recognition function. Speech recognition may encompass interpreting spoken language and converting its content into text, extracting text from a sound source, converting a sound source into text, and the like. Any of the various speech-to-text (STT) techniques may be adopted without limitation as the specific method; a representative example algorithm is the hidden Markov model (HMM).

The speech synthesis module 112 according to an embodiment provides a speech synthesis function. Speech synthesis may refer to synthesizing, from text, a voice that reads that text aloud, converting text into a sound source, and the like. Any of the various text-to-speech (TTS) techniques may be adopted without limitation as the specific method.

The character recognition module 113 according to an embodiment recognizes handwritten characters and converts them into text. For example, the character recognition module 113 may read handwritten characters entered directly by the user through a touchscreen or the like and convert them into text. A handwritten character may consist of one or more drag inputs, one or more continuous inputs made with an electronic pen, or a combination of these. Taking into account the coordinates of each input, the stroke order, and the like, the character recognition module 113 determines the closest of the pre-stored model patterns as the recognition result, and thereby finally identifies the character.
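
Choosing the closest pre-stored model pattern can be illustrated with a very small nearest-template matcher. This is a simplified sketch: it assumes the input and every model pattern are already resampled to the same number of points in stroke order, and it uses the summed Euclidean distance between corresponding points as the closeness measure, neither of which is specified in the source. The function names and the model patterns are hypothetical.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def pattern_distance(a: List[Point], b: List[Point]) -> float:
    """Sum of distances between corresponding points, compared in input order."""
    if len(a) != len(b):
        return math.inf  # assumption: patterns of different length are not comparable
    return sum(math.dist(p, q) for p, q in zip(a, b))


def recognize_character(points: List[Point],
                        models: Dict[str, List[Point]]) -> str:
    """Return the label of the stored model pattern closest to the input."""
    return min(models, key=lambda label: pattern_distance(points, models[label]))
```

Because points are compared in order, two shapes drawn with different stroke orders score as different patterns, which reflects the role of stroke order mentioned above.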

The translation module 114 according to an embodiment may provide a translation function between a plurality of languages, for example translating a first language into a second language different from the first.

The call control unit 121 according to an embodiment may perform all call-related processing. The call control unit 121 may convert a sound source into text using the speech recognition module 111 in the memory 110, convert text into a sound source using the speech synthesis module 112, and receive a sound source and/or text from an external device or transmit them to an external device. In addition, the call control unit 121 may process the sound source and/or text data exchanged during a call to extract meaningful data to provide to the user. The call control unit 121 may also determine the format of the text to be displayed on the display unit of the terminal 10.

The display control unit 122 according to an embodiment may control the screen displayed on the display unit of the terminal 10 and may provide a user interface related to the call service.

FIG. 3 is a flowchart schematically illustrating a call service providing method according to the first embodiment.

In the flowchart of FIG. 3, the first terminal 11 and the second terminal 12 are shown exchanging data directly, but the embodiment is not limited thereto. For example, communication between the first terminal 11 and the second terminal 12 may pass through the server 20 as described above. The server 20 in this example may include only a communication server, only a base station, or both. That is, a call between the first terminal 11 and the second terminal 12 according to the first embodiment may be an Internet call through an Internet-based server or a landline call through a base station.

The flowchart of FIG. 3 illustrates the example, within the first embodiment, in which the first terminal 11 handles all of the sound source-to-text conversion.

도 3을 참조하면, 단계 31에서 제2 단말(12)은 마이크 등의 장치를 통해 사용자로부터 음원(사용자의 목소리 등)을 입력받는다. Referring to FIG. 3, in step 31, the second terminal 12 receives a sound source (user's voice, etc.) from a user through a microphone or the like.

단계 32에서 제1 단말(11)은 제2 단말(12)로부터 음원을 수신한다.In step 32, the first terminal 11 receives the sound source from the second terminal 12.

단계 33에서 제1 단말(11)의 통화 제어부(121)는 음성인식 모듈(111)을 이용하여 단계 32에서 수신한 음원을 텍스트로 변환한다.In step 33, the call control unit 121 of the first terminal 11 converts the sound source received in step 32 into text using the speech recognition module 111.

단계 34에서 제1 단말(11)의 표시 제어부(122)는 단계 32에서 변환된 텍스트를 제1 단말(11)의 표시부에 표시한다.In step 34, the display control unit 122 of the first terminal 11 displays the converted text in step 32 on the display unit of the first terminal 11.

단계 35에서 제1 단말(11)의 표시 제어부(122)는 사용자가 텍스트를 입력할 수 있는 사용자 인터페이스를 표시부에 표시하고, 사용자 인터페이스를 통해 사용자로부터 사용자로부터 텍스트를 입력받는다.In step 35, the display control unit 122 of the first terminal 11 displays on the display unit a user interface through which the user can input text, and receives text from the user through the user interface.

단계 36에서 제1 단말(11)의 통화 제어부(121)는 단계 35에서 입력받은 텍스트를 음성합성 모듈(112)을 이용하여 음원으로 변환한다.In step 36, the call control unit 121 of the first terminal 11 converts the text input in step 35 into a sound source using the speech synthesis module 112. [

In step 37, the call control unit 121 of the first terminal 11 transmits the sound source produced in step 36 to the second terminal 12.

In step 38, the second terminal 12 outputs the sound source received in step 37.

According to the first embodiment, the sound source-text conversion is performed in the first terminal 11. The first terminal 11 therefore installs a service app that provides the call service according to an embodiment of the present invention, and the service app installed on the first terminal 11 can provide the functions required for the call service.

In the first embodiment, the second terminal 12 does not perform the sound source-text conversion. It therefore need not install the service app described above and can talk with the first terminal 11 by an ordinary telephone call.

According to the first embodiment, the first terminal 11 outputs the sound source received from the second terminal 12 as text, and converts text entered by the user into a sound source that it provides to the second terminal 12. The user of the first terminal 11 can thus hold a telephone call with the second terminal 12 simply by reading the text shown on the display unit and entering text.

The call in the first embodiment can be any of various call types, such as an Internet call, a landline call, or a video call.

The steps shown in FIG. 3 may be performed repeatedly over the course of a call.
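As a non-limiting illustration, the processing performed by the first terminal 11 in steps 32 to 37 could be sketched as follows. The function names, the list used as a stand-in for the display unit, and the toy recognizer and synthesizer are all hypothetical; the specification does not prescribe any particular speech recognition or speech synthesis engine.

```python
from typing import Callable, List


def handle_incoming_sound(sound: bytes,
                          recognize: Callable[[bytes], str],
                          display: List[str]) -> str:
    """Steps 32-34: convert a received sound source to text and display it."""
    text = recognize(sound)   # step 33: speech recognition module (assumed callable)
    display.append(text)      # step 34: a list stands in for the display unit
    return text


def handle_outgoing_text(text: str,
                         synthesize: Callable[[str], bytes],
                         send: Callable[[bytes], None]) -> bytes:
    """Steps 35-37: convert user-entered text to a sound source and send it."""
    sound = synthesize(text)  # step 36: speech synthesis module (assumed callable)
    send(sound)               # step 37: transmit to the second terminal
    return sound


# Toy stand-ins so the sketch runs without any speech engine.
def fake_recognize(sound: bytes) -> str:
    return sound.decode("utf-8")


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


screen: List[str] = []
sent: List[bytes] = []
handle_incoming_sound(b"hello", fake_recognize, screen)
handle_outgoing_text("hi there", fake_synthesize, sent.append)
```

Because both functions can be called any number of times, the sketch also reflects the repetition of the steps during a call.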

FIG. 4 is a flowchart schematically showing a modification of the call service providing method according to the first embodiment. Specifically, FIG. 4 shows an example in which the call between the first terminal 11 and the second terminal 12 in the embodiment of FIG. 3 is a video call.

In step 41, the second terminal 12 receives video from its user. The video may include a sound source component and an image component.

In step 42, the first terminal 11 receives, from the second terminal 12, the video including the sound source component and the image component.

In step 43, the call control unit 121 of the first terminal 11 converts the sound source component of the video into text using the speech recognition module 111.

In step 44, the display control unit 122 of the first terminal 11 displays the image component of the video on the display unit, together with the text converted in step 43.

According to one embodiment, steps 43 and 44 may be performed at separate times or in parallel. For example, the first terminal 11 can display the image component of the video on the display unit while simultaneously converting the sound source component of the video into text using the speech recognition module 111, and can display the converted text on the display unit together with the image component.

According to one embodiment, in step 44 the first terminal 11 may also output the sound source component of the video through a speaker; whether the sound source is output can be set in advance by the user of the first terminal 11.

In step 45, the first terminal 11 captures video through its camera and receives text that the user enters through the user interface shown on the display unit of the first terminal 11.

In step 46, the first terminal 11 converts the text entered in step 45 into a sound source using the speech synthesis module 112.

In step 47, the first terminal 11 can transmit video to the second terminal 12. The video transmitted to the second terminal 12 may include, as its sound source component, the sound source produced in step 46, and, as its image component, the image component of the video captured in step 45.

According to one embodiment, the sound source component of the video transmitted in step 47 may include both the sound source component of the video captured in step 45 and the sound source produced in step 46.

In step 48, the second terminal 12 outputs the video received in step 47. The image component of the video can be output through the display unit, and the sound source component through the speaker.
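The composition of the outgoing video in step 47 can be illustrated with the following minimal sketch. The `Video` container and the `include_captured_sound` flag are assumptions made for illustration only; an actual implementation would mix or multiplex real audio and video streams.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Video:
    image: bytes              # image component
    sound_tracks: List[bytes]  # sound source component(s)


def compose_outgoing_video(captured: Video,
                           synthesized_sound: bytes,
                           include_captured_sound: bool = False) -> Video:
    """Step 47: build the video sent to the second terminal.

    The image component comes from the video captured in step 45. The sound
    source component is the sound synthesized in step 46, optionally together
    with the sound captured alongside the video.
    """
    tracks = [synthesized_sound]
    if include_captured_sound:
        tracks = list(captured.sound_tracks) + tracks
    return Video(image=captured.image, sound_tracks=tracks)


captured = Video(image=b"frame", sound_tracks=[b"ambient"])
tts_only = compose_outgoing_video(captured, b"tts")
both = compose_outgoing_video(captured, b"tts", include_captured_sound=True)
```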

FIG. 5 is a flowchart schematically showing a call service providing method according to the second embodiment.

In the flowchart of FIG. 5, communication between the first terminal 11 and the second terminal 12 may pass through the server 20, as described above. The server 20 in this example includes a communication server and may additionally include a base station. That is, a call between the first terminal 11 and the second terminal 12 according to the second embodiment may be an Internet call through an Internet-based server, or a combination of an Internet call and a landline call. In the embodiment of FIG. 5, the memory 110 and the call control unit 121 of FIG. 2 may be provided in the server 20, and the display control unit 122 may be provided in the first terminal 11. The second terminal 12, meanwhile, carries out an ordinary telephone call in which it receives and transmits a sound source and receives and outputs a sound source, so it need not install a separate service app.

Referring to FIG. 5, in step 51 the second terminal 12 receives a sound source (for example, the user's voice) from its user through a device such as a microphone.

In step 52, the server 20 receives the sound source from the second terminal 12.

In step 53, the call control unit 121 of the server 20 converts the sound source received in step 52 into text using the speech recognition module 111.

In step 54, the server 20 transmits the converted text to the first terminal 11.

In step 551, the display control unit 122 of the first terminal 11 displays the text received in step 54 on the display unit of the first terminal 11.

In step 552, the display control unit 122 of the first terminal 11 displays on the display unit a user interface through which the user can enter text, and receives text from the user through that user interface.

In step 56, the first terminal 11 transmits the text entered in step 552 to the server 20.

In step 57, the call control unit 121 of the server 20 converts the text received in step 56 into a sound source using the speech synthesis module 112.

In step 58, the call control unit 121 of the server 20 transmits the sound source produced in step 57 to the second terminal 12.

In step 59, the second terminal 12 outputs the sound source received in step 58 using a device such as a speaker.
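The division of work in the second embodiment, where the server 20 performs both conversions and the first terminal 11 only displays and enters text, might be sketched as below. The `RelayServer` class, its method names, and the inbox lists that stand in for network delivery are hypothetical.

```python
from typing import Callable, List


class RelayServer:
    """Stand-in for server 20 holding the recognition and synthesis modules."""

    def __init__(self,
                 recognize: Callable[[bytes], str],
                 synthesize: Callable[[str], bytes]) -> None:
        self._recognize = recognize
        self._synthesize = synthesize

    def forward_from_second(self, sound: bytes, first_inbox: List[str]) -> None:
        # Steps 52-54: receive sound, convert it to text, send text to terminal 11.
        first_inbox.append(self._recognize(sound))

    def forward_from_first(self, text: str, second_inbox: List[bytes]) -> None:
        # Steps 56-58: receive text, convert it to sound, send sound to terminal 12.
        second_inbox.append(self._synthesize(text))


# Toy conversions so the sketch runs without a speech engine.
server = RelayServer(recognize=lambda s: s.decode("utf-8"),
                     synthesize=lambda t: t.encode("utf-8"))
first_terminal_inbox: List[str] = []     # text shown in step 551
second_terminal_inbox: List[bytes] = []  # sound played in step 59

server.forward_from_second(b"where are you", first_terminal_inbox)
server.forward_from_first("on my way", second_terminal_inbox)
```

In this arrangement the second terminal 12 only ever sends and receives sound, which is why it needs no service app.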

Just as the flowchart of FIG. 3 was modified and applied to a video call as shown in FIG. 4, the flowchart of FIG. 5 can be modified in the same way and applied to a video call.

FIG. 6 is a flowchart schematically showing a call service providing method for the case in which, in the first embodiment, the first terminal 11 and the second terminal 12 share the sound source-text conversion.

In the embodiment of FIG. 6, the first terminal 11 and the second terminal 12 may each include the components of the call service providing apparatus 100, for example one or more of the modules in the memory 110, the call control unit 121, and the display control unit 122, and may each perform sound source-text conversion. To this end, a service app provided by the server 20 according to an embodiment of the present invention may be installed on each of the first terminal 11 and the second terminal 12.

Referring to FIG. 6, in step 61 the second terminal 12 receives a sound source from its user through a microphone.

In step 62, the second terminal 12 converts the sound source into text using the speech recognition module 111.

In step 63, the second terminal 12 transmits the converted text to the first terminal 11, and the first terminal 11 receives the text from the second terminal 12.

In step 64, the first terminal 11 displays the text received in step 63 on the display unit of the first terminal 11.

In step 65, the first terminal 11 receives text from its user.

In step 66, the first terminal 11 converts the text entered in step 65 into a sound source using the speech synthesis module 112.

In step 67, the first terminal 11 transmits the sound source produced in step 66 to the second terminal 12.

In step 68, the second terminal 12 outputs the sound source received in step 67.

In the embodiment of FIG. 6, the data input/output format handled by the user at the second terminal 12 is set to sound source, and the data input/output format handled by the user at the first terminal 11 is set to text. That is, the user of the second terminal 12 can speak into the microphone and hear the other party through the speaker in the usual way, while the user of the first terminal 11 can read what the other party says as text and enter what he or she wants to say as text. These settings can be specified by the user when the call is initiated or through a preset in the service app.

For example, the method may further include, before step 61, a step in which the first terminal 11 receives from its user a request to initiate a call with the second terminal 12, a step in which the first terminal 11 transmits the call initiation request to the second terminal 12, and a step in which the first terminal 11 receives a call initiation approval from the second terminal 12 and starts the call. The call initiation request may include information on the data input/output format at the first terminal 11, and this information may be specified as either text or sound source. The call initiation approval may include information on the data input/output format at the second terminal 12. In the example of FIG. 6, the first terminal 11 can transmit to the second terminal 12 a call initiation request that specifies text as its data input/output format, and the second terminal 12 can transmit to the first terminal 11 a call initiation approval that specifies sound source as its data input/output format. Hereinafter, a call in which the data input/output format is specified as text is referred to as text mode, and a call in which the data input/output format is specified as sound source is referred to as sound source mode.
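One possible way to represent the format information carried by the call initiation request and the call initiation approval is sketched below. The message classes and field names are hypothetical, and the rule in `needs_conversion` (conversion is required only when the two sides chose different formats) is an assumption inferred from the example of FIG. 6 rather than something the specification states.

```python
from dataclasses import dataclass
from enum import Enum


class IOFormat(Enum):
    TEXT = "text"    # text mode
    SOUND = "sound"  # sound source mode


@dataclass(frozen=True)
class CallInitiationRequest:
    caller: str
    callee: str
    io_format: IOFormat  # format used at the requesting terminal


@dataclass(frozen=True)
class CallInitiationApproval:
    io_format: IOFormat  # format used at the approving terminal


def needs_conversion(request: CallInitiationRequest,
                     approval: CallInitiationApproval) -> bool:
    """Assumed rule: convert between sound and text only if formats differ."""
    return request.io_format is not approval.io_format


# The FIG. 6 example: terminal 11 asks for text, terminal 12 answers with sound.
request = CallInitiationRequest("terminal-11", "terminal-12", IOFormat.TEXT)
approval = CallInitiationApproval(IOFormat.SOUND)
```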

FIG. 6 shows an example in which the side transmitting the data converts its format before transmission, but the present invention can also provide an embodiment in which the side receiving the data converts its format.

For example, step 62 may be performed in the first terminal 11 and step 66 may be performed in the second terminal 12. According to this embodiment, the first terminal 11 can receive a sound source from the second terminal 12, convert it into text, and display it on the display unit, and the second terminal 12 can receive text from the first terminal 11, convert it into a sound source, and output it through the speaker.

FIGS. 7 to 13 show various examples of screens displayed on the terminal 10. The functions of the call system of FIG. 1 and the call service providing apparatus of FIG. 2 are described in more detail below with reference to FIGS. 7 to 13.

FIG. 7 is an example of a screen displayed on the first terminal 11.

Referring to FIG. 7, the first terminal 11 is in a video call with the second terminal 12. The image component 71 of the video captured by the second terminal 12 is displayed, along with the text 72 converted from the sound source component of that video.

Because the text 72 is displayed in FIG. 7, it can be seen that the first terminal 11 is using the call service in text mode. The call control unit 121 can provide a toggle function for selecting or releasing text mode, and the display control unit 122 can display a toggle icon 73 on the screen. Through the icon 73, the first terminal 11 can selectively provide either text mode, in which data received during the call is output on the display unit as text, or sound source mode, in which it is output through the speaker as sound.

In FIG. 7, the user has clicked the icon 73 to select text mode, and the call is proceeding in text mode. If the user selects the icon 73 again to release text mode, the first terminal 11 is set to sound source mode: it does not display the text 72, and it can output the sound source received from the second terminal 12 (or the sound source component of the video captured by the second terminal 12) through the speaker of the first terminal 11. In sound source mode, the first terminal 11 can also receive a sound source through its microphone and transmit it to the second terminal 12.
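The toggle between text mode and sound source mode can be sketched as a small piece of session state. The class and method names here are hypothetical, and the recognizer is passed in as an assumed callable.

```python
from typing import Callable, Tuple, Union


class CallSession:
    """Tracks whether the terminal is in text mode or sound source mode."""

    def __init__(self, text_mode: bool = False) -> None:
        self.text_mode = text_mode

    def toggle_text_mode(self) -> bool:
        # Corresponds to the user selecting the toggle icon 73.
        self.text_mode = not self.text_mode
        return self.text_mode

    def route_incoming(self,
                       sound: bytes,
                       recognize: Callable[[bytes], str],
                       ) -> Tuple[str, Union[str, bytes]]:
        # Text mode: convert to text for the display unit.
        # Sound source mode: pass the sound through to the speaker unchanged.
        if self.text_mode:
            return ("display", recognize(sound))
        return ("speaker", sound)


def toy_recognize(sound: bytes) -> str:
    return sound.decode("utf-8")


session = CallSession()
before = session.route_incoming(b"hi", toy_recognize)
session.toggle_text_mode()
after = session.route_incoming(b"hi", toy_recognize)
```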

The image 74 captured by the camera of the first terminal 11 can be displayed in one region of the screen.

The first terminal 11 can further display icons that provide various functions. For example, it can provide an icon 75 for a handwriting mode in which text is entered by writing characters directly with a touch screen, an electronic pen, or the like; an icon 76 for a typing mode in which text is typed on a keyboard (either a physical keyboard or a keyboard shown on the touch screen); and a call end icon 77. It can also provide an icon 78 for a voice filter function, an icon 79 for loading stored text, an icon 710 for a translation function, and a flash icon 711.

In FIG. 7, the icon 710 is labeled "가", which can indicate that the language in use is currently set to Korean, so that the sound source received from the second terminal 12 is recognized as Korean and shown on the screen. When the user selects the icon 710, the language in use can be changed. For example, the user can select the icon 710 and change the language in use to Japanese. In this case, the call control unit 121 can use the translation module 114 to translate the text that was converted, using the speech recognition module 111, from the sound source received from the second terminal 12, from the first language of the converted text into the second language that the user designated as the language in use. The display control unit 122 can display the text translated into the second language on the display unit.
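A minimal sketch of this optional translation step follows. The `translate` callable stands in for the translation module 114 and is an assumption, as are the language codes and the toy dictionary; skipping translation when the two languages match is also an assumption made for the sketch.

```python
from typing import Callable


def text_for_display(recognized_text: str,
                     source_lang: str,
                     display_lang: str,
                     translate: Callable[[str, str, str], str]) -> str:
    """Return the text to show, translating only when the languages differ."""
    if source_lang == display_lang:
        return recognized_text
    return translate(recognized_text, source_lang, display_lang)


# Toy translator backed by a tiny lookup table (illustration only).
_TOY_DICTIONARY = {("ko", "en", "안녕"): "hello"}


def toy_translate(text: str, src: str, dst: str) -> str:
    return _TOY_DICTIONARY.get((src, dst, text), text)


same_language = text_for_display("안녕", "ko", "ko", toy_translate)
translated = text_for_display("안녕", "ko", "en", toy_translate)
```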

When the icon 711 is selected, the first terminal 11 can transmit a flash-on command to the second terminal 12, and the second terminal 12 can turn on its flash in a preset pattern in response to the command. The user of the first terminal 11 can use the icon 711 to draw the attention of the user of the second terminal 12.

FIG. 8 is another example of a screen displayed on the first terminal 11. It shows the screen displayed on the first terminal 11 during an ordinary landline call rather than a video call. In this example, the second terminal 12, the other party to the call, can carry on an ordinary landline call.

Referring to FIG. 8, typing mode is active. When typing mode is active, the screen 80 shown on the display unit of the first terminal 11 can include a first region 81 that displays the text T2 converted from the sound source input at the second terminal 12 and a second region 82 that receives text from the user of the first terminal 11.

The first region 81 can also display the text T1 most recently entered by the user of the first terminal 11. The text T1 may already have been converted into a sound source and transmitted to the second terminal 12.

The first region 81 can accumulate the text converted from the sound source input at the second terminal 12 and the text entered at the first terminal 11, and list them in chronological order.

Within the first region 81, the text converted from the sound source input at the second terminal 12 and the text entered at the first terminal 11 can be displayed in different formats.

FIG. 9A is another example of a screen displayed on the first terminal 11.

Referring to FIG. 9A, the call control unit 121 can accumulate the text T2 converted from the sound source input at the second terminal 12 and the text T1 entered at the first terminal 11, and list them in chronological order.

In FIG. 9A, the text T1 and the text T2 are displayed in different formats to distinguish the speakers. In the example of FIG. 9A, the text T1 is shown in italics and the text T2 is shown in bold. The present invention is not limited to this, however. For example, formatting such as text color, shading, font, display position, or alignment (right-aligned or left-aligned) can be set differently for each.
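The speaker-dependent formatting of the accumulated conversation could be sketched as follows. Italic and bold are rendered here with Markdown-style asterisks purely for illustration; the speaker labels and function names are hypothetical, and an actual screen would apply the formats through its own UI toolkit.

```python
from typing import List, Tuple

LOCAL = "local"    # text T1, typed at the first terminal
REMOTE = "remote"  # text T2, converted from the second terminal's sound source


def render_line(speaker: str, text: str) -> str:
    """Italic for locally typed text, bold for text converted from sound."""
    if speaker == LOCAL:
        return f"*{text}*"
    return f"**{text}**"


def render_log(entries: List[Tuple[str, str]]) -> List[str]:
    """Render entries in the order given, assumed to be chronological."""
    return [render_line(speaker, text) for speaker, text in entries]


log = [(REMOTE, "Are you there?"), (LOCAL, "Yes"), (REMOTE, "Great")]
rendered = render_log(log)
```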

According to an embodiment of the present invention, the call control unit 121 can determine the format in which the text T2 converted from a sound source is displayed according to the amplitude of that sound source. For example, the call control unit 121 can determine the formats for displaying a first text T21 converted from a first sound source and a second text T22 converted from a second sound source, determining the format of the first text T21 according to the amplitude of the first sound source and the format of the second text T22 according to the amplitude of the second sound source. The display control unit 122 can display each text in the format determined according to the amplitude of its sound source.

Determining the format according to the amplitude of the sound source may mean choosing the format so that the larger the amplitude, the larger the text is displayed. For example, when the amplitude of the sound source exceeds a preset threshold, the text size can be set larger than the default. Alternatively, one or more amplitude ranges and their corresponding formats can be set in advance, and the call control unit 121 can set the format of the text converted from a sound source to the format corresponding to the range in which the amplitude of that sound source falls.

FIG. 9A shows an example in which the amplitude of the second sound source exceeds the preset threshold, so that the second text T22 is displayed larger than the other text.
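The range-based mapping from amplitude to text size can be illustrated with the sketch below. The boundary values, the font sizes, and the normalized amplitude scale of 0.0 to 1.0 are all assumptions chosen for the example; the specification gives no concrete values.

```python
import bisect
from typing import List

# Hypothetical amplitude boundaries (normalized 0.0-1.0) and font sizes.
# An amplitude must strictly exceed a boundary to move to the next size.
AMPLITUDE_BOUNDARIES: List[float] = [0.3, 0.7]
FONT_SIZES: List[int] = [12, 16, 22]  # one more entry than boundaries


def font_size_for_amplitude(amplitude: float) -> int:
    """Pick the font size of the range in which the amplitude falls."""
    # bisect_left counts the boundaries strictly below the amplitude,
    # so an amplitude equal to a boundary stays in the lower range.
    index = bisect.bisect_left(AMPLITUDE_BOUNDARIES, amplitude)
    return FONT_SIZES[index]


quiet = font_size_for_amplitude(0.1)
normal = font_size_for_amplitude(0.5)
loud = font_size_for_amplitude(0.9)
```

With a single boundary and two sizes, the same function reduces to the simple threshold case in which text louder than the threshold is shown larger than the default.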

Referring to FIG. 9A, even without hearing the sound source directly, the user of the first terminal 11 can infer information carried by the sound source, such as the loudness of the speaker's voice or the speaker's emotion, from the format of the text.

FIGS. 9B and 9C are further examples of screens displayed on the first terminal 11. They show screens in which the accumulated conversation text has been scrolled in response to a first input entered by the user at the first terminal 11 while the screen of FIG. 9A is displayed.

The first input may be a scroll input using a scroll bar (not shown), or a drag or flick input in which the user touches a point on the screen of FIG. 9A and moves the touch point up or down.

According to one example, when a touch on a point of the screen of FIG. 9A is held for at least a preset time, the first terminal 11 determines that a scroll operation has begun and can additionally display the past conversation history 91, as shown in FIG. 9B.

According to one example, when the user, while keeping a touch on a point of the screen of FIG. 9A, enters a drag 92 or a flick that moves the touch point up or down, the first terminal 11 can scroll the conversation history in the direction of the drag 92 (or flick). In FIG. 9C, the drag 92 was entered in the downward direction, so the conversation history was scrolled down and additional past conversation history appears at the top.

When, as shown in FIG. 8, the display unit of the first terminal 11 shows on one screen both the first region 81, which displays the text converted from the sound source received from the second terminal 12, and the second region 82, which displays the text entered by the user of the first terminal 11, the first terminal 11 can scroll only within the first region 81 when a drag is entered in the first region 81. In this case, the second region 82 can remain fixed on the screen even while the first region 81 is scrolled.

According to another example, when a drag is entered in the first region 81, the display control unit 122 of the first terminal 11 can expand the first region 81 to cover the position where the second region 82 was displayed, and scroll within the expanded first region 81. In this case, only the first region 81 may be displayed on the screen after the drag is entered. The user can bring back the second region 82 using an on-screen icon (such as the handwriting input icon or the typing input icon).

FIG. 10 is another example of a screen displayed on the first terminal 11. It shows a screen that can be displayed when the icon 76 and the icon 78 are selected on the screen of FIG. 7.

In response to selection of the typing function icon 76, the first terminal 11 can display on the touch screen a keyboard 1001 and a second region 1002 in which the typed text is shown. In response to selection of the icon 78, it can also display a voice filter list 1003.

When a particular filter is selected in the list 1003, the first terminal 11 can transmit information on the sound source filter selected by the user to the server 20. The server 20 can then apply the sound source filter corresponding to that information to the sound source synthesized from the text entered at the first terminal 11 (which may correspond to "오~" in FIG. 10), and transmit the result to the second terminal 12. The synthesis of the sound source can be performed in either the first terminal 11 or the server 20. In this example, the first terminal 11 may store only the list of sound source filters, while the server 20 stores the actual filter data needed to apply each filter to a sound source.
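The split between the filter list held by the first terminal 11 and the filter data held by the server 20 might look like the sketch below. The filter identifiers, the representation of a sound source as a list of samples in the range -1.0 to 1.0, and the "louder" filter itself are hypothetical; the specification does not describe what any particular voice filter does.

```python
from typing import Callable, Dict, List

Samples = List[float]


def _louder(samples: Samples) -> Samples:
    # Double each sample and clamp it to the valid range [-1.0, 1.0].
    return [max(-1.0, min(1.0, s * 2)) for s in samples]


# Server side: the actual filter data, keyed by filter identifier.
SERVER_FILTERS: Dict[str, Callable[[Samples], Samples]] = {
    "none": lambda samples: list(samples),
    "louder": _louder,
}

# Terminal side: only the list of identifiers shown in the filter list 1003.
TERMINAL_FILTER_LIST: List[str] = sorted(SERVER_FILTERS)


def apply_selected_filter(filter_id: str, synthesized: Samples) -> Samples:
    """Server-side step: apply the filter the terminal selected by identifier."""
    if filter_id not in SERVER_FILTERS:
        raise KeyError(f"unknown sound source filter: {filter_id}")
    return SERVER_FILTERS[filter_id](synthesized)


filtered = apply_selected_filter("louder", [0.1, 0.6, -0.8])
```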

FIG. 11 is another example of a screen displayed on the first terminal 11. FIG. 11 is an example of a screen that may be displayed when the icon 79 is selected on the screen of FIG. 7. The user may not only type the text to be input directly, but may also select the icon 79 to retrieve previously stored text.

When the icon 79 is selected, the call control unit 121 of the first terminal 11 may provide text previously stored in a mailbox. The text in the mailbox may be stored in the first terminal 11's own memory or in the server 20. In FIG. 11, one text 1102 stored in the mailbox is displayed on the screen, and the user may browse other texts stored in the mailbox by using the scroll bar 1101 or by dragging the text 1102 left or right. The user may select the text 1102 by touching or clicking it, or by dragging and dropping it onto the upper part of the screen where the video is displayed.

In this way, the first terminal 11 may receive from the user a selection of text stored in the first terminal 11 or the server 20. When text is selected, the call control unit 121 converts the selected text into a sound source using the speech synthesis module 112 and provides it to the second terminal 12.

FIG. 12 is another example of a screen displayed on the first terminal 11. FIG. 12 is an example of a screen that may be displayed when the icon 75 is selected on the screen of FIG. 7. In addition to typing the text to be input, the user may write it by hand using an electronic pen or the touch screen. Referring to FIG. 12, the user has handwritten the characters "그래" directly on the touch screen with one or more drag operations, and the handwritten characters 1201 are shown. The first terminal 11 may convert the handwritten characters 1201 into text using the character recognition module 113 and display the converted text 1202 at the bottom of the screen. When the user selects send 1203, the call control unit 121 may convert the converted text 1202 into a sound source using the speech synthesis module 112 and transmit it to the second terminal 12. The user may exit the handwriting mode by selecting the close icon at the upper right.

FIG. 13 is another example of a screen displayed on the first terminal 11.

The call control unit 121 may store the call record after the call ends, and may provide call summary content. FIG. 13 is an example of a screen on which the call summary content is displayed.

The call summary content may include one or more keywords selected from among the words contained in the text accumulated during the call. The call control unit 121 may select the one or more keywords from the words in the accumulated text in consideration of at least one of each word's part of speech, its frequency, and whether it matches a predetermined format. The call control unit 121 may assign a score or priority to each keyword according to the conditions listed above, and may select a predetermined number of keywords in descending order of score or priority. For example, a word whose part of speech is a noun may be given a higher score or priority. The more frequently a word occurs, the higher the score or priority it may be given. The call control unit 121 may also determine one or more of the font, size, color, and display position of each keyword in consideration of at least one of the conditions listed above. For example, nouns may be rendered in a predetermined first format, and the keyword with the highest frequency in a second format. Alternatively, the size of each keyword may be varied according to its frequency. The first format may be, for example, applying "bold", displaying in a specific color, or applying a specific display effect (for example, shading).
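One possible way to combine the three criteria above (part of speech, frequency, and format matching) into a single score is sketched below. The weights, the `"NOUN"` tag name, the time-of-day pattern used as the "predetermined format", and the assumption that words arrive already tagged with their part of speech are all illustrative choices, not details given in the specification.

```python
import re
from collections import Counter

NOUN_BONUS = 2.0     # hypothetical weight for nouns
FORMAT_BONUS = 3.0   # hypothetical weight for format matches
# Hypothetical "predetermined format": a time of day such as 14:30.
FORMAT_PATTERN = re.compile(r"^\d{1,2}:\d{2}$")


def select_keywords(tagged_words, top_n=3):
    """Score each distinct word and return the top_n (word, score) pairs.

    tagged_words is a list of (word, part_of_speech) tuples; tagging itself
    is assumed to have been done elsewhere.
    """
    freq = Counter(word for word, _ in tagged_words)
    pos_of = dict(tagged_words)  # if a word has several tags, the last wins
    scores = {}
    for word, count in freq.items():
        score = float(count)              # more frequent words score higher
        if pos_of[word] == "NOUN":
            score += NOUN_BONUS           # nouns score higher
        if FORMAT_PATTERN.match(word):
            score += FORMAT_BONUS         # format matches score higher
        scores[word] = score
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return [(w, scores[w]) for w in ranked[:top_n]]
```

The same scores could drive the display format, for example by mapping a higher score to a larger font size, although the description leaves the exact mapping open.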

The call summary content may provide a summary video that summarizes the call video. The summary video may include part of the entire call video. The summary video may be generated so as to include only the sections of the entire call video in which words selected as keywords appear. Alternatively, the summary video may be generated so as to include only the sections in which a predetermined number of the keywords given the highest scores or priorities appear. The method of generating the summary video is not limited to these.
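As a rough sketch of the keyword-based section selection described above, the function below picks the time ranges of the call video in which any keyword appears. It assumes each recognized utterance carries start and end timestamps in seconds, and it adds an optional `padding` around each hit and merges overlapping ranges; the timestamps, the padding, and the merging step are assumptions made for this illustration.

```python
def summary_segments(utterances, keywords, padding=1.0):
    """Return merged (start, end) ranges of the call in which a keyword appears.

    utterances is a list of (start_sec, end_sec, text) tuples.
    """
    hits = []
    for start, end, text in utterances:
        if any(keyword in text for keyword in keywords):
            # Keep a little context around the utterance, clamped at 0.
            hits.append((max(0.0, start - padding), end + padding))
    hits.sort()
    merged = []
    for start, end in hits:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous range: extend it instead of adding a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

Passing only the highest-scoring keywords as `keywords` corresponds to the second variant described above.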

According to an example, the call control unit 121 may automatically provide the summary content after the call ends. According to an example, the call control unit 121 may provide a list of stored call records, and the user may select a desired call record from the list and play it back. Meanwhile, when a specific call record is selected from the list, the call control unit 121 may provide the call summary content for that record, so that the user can check the call summary content first and then choose whether to play back the entire call record.

According to an embodiment of the present invention, the entire content of a call with the other party is input or output at least once in the form of text, so that various services making use of that text may additionally be provided. The call summary content is one example of such an additional service.

The medium may continuously store a computer-executable program, or may temporarily store it for execution or download. The medium may be any of various recording or storage means in the form of a single piece of hardware or a combination of several pieces of hardware; it is not limited to a medium directly connected to a particular computer system, and may be distributed over a network. Examples of the medium include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and the like, configured to store program instructions. Other examples of the medium include recording or storage media managed by app stores that distribute applications, or by sites, servers, and the like that supply or distribute various other software.

The present invention has been described above with a focus on its preferred embodiments. Although the present invention has been described with reference to the embodiments shown in the drawings, these are merely illustrative, and those of ordinary skill in the art to which the present invention pertains will understand that the invention may be implemented in modified forms without departing from its essential characteristics, and that other equivalent embodiments are possible. Therefore, the disclosed embodiments should be considered from an illustrative rather than a restrictive point of view. The scope of the present invention is set forth in the claims rather than in the foregoing description, and all differences within the scope of equivalents thereof should be construed as being included in the present invention.

10: Terminal
11: First terminal
12: Second terminal
20: Server
100: Call service providing device
110: Memory
120: Processor
111: Speech recognition module
112: Speech synthesis module
113: Character recognition module
114: Translation module
121: Call control unit
122: Display control unit

Claims

A method for providing a call service in a first terminal,
Receiving a sound source from a second terminal;
Converting the received sound source into text using a speech recognition module;
Performing the receiving of the sound source and the converting into text a plurality of times;
Accumulating and displaying the converted texts on a display unit of the first terminal;
Receiving text from a user;
Converting the received text into a sound source using a speech synthesis module; And
And transmitting the converted sound source to the second terminal,
And scrolling the accumulated text corresponding to a first input to the first terminal by the user.

The method according to claim 1,
Wherein the displaying comprises:
Displaying the text in a format corresponding to the amplitude of the received sound source,
Wherein the relationship between the amplitude and the format is such that the larger the amplitude is,
A method of providing a call service.

delete

The method according to claim 1,
Wherein the screen displayed on the display unit of the terminal includes a first area for displaying the converted text and a second area for receiving text from the user,
Wherein the scrolling is performed in the first area and the second area is fixedly displayed even if the scrolling is performed,
A method of providing a call service.

The method according to claim 1,
Wherein the screen displayed on the display unit of the terminal includes a first area for displaying the converted text and a second area for receiving text from the user,
When the scrolling is performed, the first area extends to a position where the second area is displayed, and only the first area is displayed on the display unit
A method of providing a call service.

The method according to claim 1,
Further comprising selecting and displaying at least one of the words included in the accumulated text, in consideration of at least one of the part of speech of each word, the frequency of each word, and whether each word matches a predetermined format,
A method of providing a call service.

8. The method of claim 7,
Wherein said selecting and displaying comprises:
Determining at least one of a font, a size, a color, and a display position of each word in consideration of at least one of the part of speech of the word, the frequency of the word, and whether the word matches the predetermined format
A method of providing a call service.

The method according to claim 1,
Wherein the first terminal supports a text mode in which data received during a call is output in text form on the display unit, and a sound source mode in which the data is output in sound source form through a speaker,
The method further comprising selecting the text mode according to an input by the user,
Wherein the receiving step or the transmitting step is performed after the selecting step, and
When the sound source mode is selected by the user, the sound source received from the second terminal is output through the speaker, and a sound source input through the microphone of the first terminal is transmitted to the second terminal
A method of providing a call service.

The method according to claim 1,
Wherein the receiving of the text comprises receiving, from the user, a selection of text stored in the first terminal or a server,
Wherein the converting into the sound source comprises converting the selected text into a sound source using the speech synthesis module
A method of providing a call service.

A method for providing a call service in a first terminal,
Receiving a sound source from a second terminal;
Converting the received sound source into text using a speech recognition module;
Displaying the converted text on a display unit of the first terminal;
Converting a character, input on the touch screen of the first terminal through at least one drag operation, into text using a character recognition module;
Converting the text converted using the character recognition module into a sound source using a speech synthesis module; And
And transmitting the converted sound source to the second terminal.

The method according to claim 1,
Wherein the displaying comprises:
Translating the converted text from a first language into a second language specified by the user using a translation module, and displaying the text translated into the second language on the display unit
A method of providing a call service.

The method according to claim 1,
Wherein the transmitting comprises:
Transmitting the converted sound source and the sound source filter information selected by the user to a server, so that the server applies the sound source filter to the converted sound source and transmits the filtered sound source to the second terminal,
A method of providing a call service.

A method for providing a call service in a first terminal,
Receiving a plurality of texts from a second terminal;
Accumulating and displaying the text on a display unit of the first terminal;
Receiving text from a user;
Converting the received text into a sound source using a speech synthesis module; And
And transmitting the converted sound source to the second terminal,
And scrolling the accumulated text corresponding to a first input to the first terminal by the user.

15. The method of claim 14,
Further comprising: receiving a request to initiate a call with the second terminal; And
Transmitting a call initiation request to the second terminal, receiving a call initiation acknowledgment from the second terminal, and initiating the call,
Wherein the call initiation request includes the output format information of the first terminal and the output format information is specified as either a text or a sound source,
A method of providing a call service.

delete

A computer program stored on a medium for performing the method of any one of claims 1, 2, and 5 to 15 using a computer.