KR20210114328A

KR20210114328A - Method for managing information of voice call recording and computer program for the same

Info

Publication number: KR20210114328A
Application number: KR1020200184847A
Authority: KR
Inventors: 황귀만; 양경훈; 서문교; 옥영진
Original assignee: 주식회사 엘에이치랩
Priority date: 2020-03-10
Filing date: 2020-12-28
Publication date: 2021-09-23
Also published as: KR102198424B1

Abstract

The present invention relates to a method for managing voice call recording information and a computer program for the same. According to an embodiment of the present invention, a voice call recording file generated by recording the call audio of a mobile terminal is transmitted to a server, and the server performs speech-to-text (STT) conversion on the file to generate call recording text information and transmits it to the mobile terminal, so that the user of the mobile terminal can easily search and use the content of recorded calls in the form of text information.

Description

Method for managing information of voice call recording and computer program for the same

The present invention relates to a method for managing call recording information and a computer program therefor. More specifically, it relates to a method, and a computer program therefor, in which a call recording file generated by recording the call audio of a mobile terminal is transmitted to a server, and the server performs speech-to-text (STT) conversion on the file to generate call recording text information and transmits it to the mobile terminal, so that the user of the mobile terminal can easily search and use recorded call content in the form of text information.

Techniques have been proposed for quickly and easily searching the content of calls recorded on a mobile terminal such as a smartphone.

As one example of the prior art, Korean Patent Laid-Open Publication No. 10-2011-0053397 (published May 23, 2011) relates to a method of searching multimedia files using search keywords. It proposes a configuration in which a specific multimedia file, or content within it, is easily found using various forms of search keyword entered by the user for multimedia files that contain keywords. In particular, for voice files, this prior art makes search possible by generating a recording file in which text keywords are inserted alongside the voice data, keyed to the recording time.

However, although this prior art can find a recording file by text keyword, it has the limitation that the content of the found file can be checked only by launching a recording playback program.

As another example of the prior art, Korean Patent Registration No. 10-2036721 (registered October 21, 2019) relates to a terminal device that supports fast search of recorded voice and an operating method thereof. It proposes a configuration in which, when a user of a terminal device that supports playback of voice data enters a specific word as a search term and requests a search for the portions in which that word is spoken, the device quickly finds those portions in the entire voice data and provides them as the search result.

However, this prior art requires application programs that consume substantial computing resources, such as a word information storage unit, a voice data fragment generation unit, a text conversion unit, and a vector generation unit, to run on the portable device. This makes it difficult to apply to smartphones, tablet PCs, and other devices whose computing resources are relatively limited.

In addition, although this prior art includes a speech-to-text conversion function, it provides the search result as audio output obtained by playing back the voice data, which makes it difficult for the user to check the search result quickly.

Korean Patent Laid-Open Publication No. 10-2011-0053397 (published May 23, 2011)
Korean Patent Registration No. 10-2036721 (registered October 21, 2019)

The present invention has been devised in view of the above problems. Its object is to provide a method for managing call recording information, and a computer program therefor, in which a call recording file generated by recording the call audio of a mobile terminal is transmitted to a server, and the server performs speech-to-text (STT) conversion on the file to generate call recording text information and transmits it to the mobile terminal, so that the user of the mobile terminal can easily search and use recorded call content in the form of text information.

According to one aspect of the present invention, in view of the above object, there is disclosed a call recording information management method executed on a mobile terminal on which a call recording information management application is installed and which interworks with a server through a network, the method comprising:
1) recording call identification information about a call recording file in a call recording information management DB;
2) transmitting the call recording file, including the call identification information, to the server;
3) receiving from the server STT identification information assigned by the server to identify the transmitted call recording file as a target of speech-to-text (STT) conversion, and recording the STT identification information in the call recording information management DB, matched to the call identification information;
4) receiving call recording text information from the server and recording it in the call recording information management DB, matched to the STT identification information, wherein the call recording text information separately contains the pieces of text information that the server generates by separating the voices of the mobile terminal user and the call counterpart in the call recording file by speaker and performing STT conversion on each of the entire call voice, the user's voice, and the counterpart's voice; and
5) providing a search mode for the call recording text information based on the call recording information management DB.

Preferably, the present invention further comprises, after step 2), a step 21) of recording transmission status information, which describes the status of the transmission of the call recording file to the server, in the call recording information management DB, matched to the call identification information.

Preferably, the call identification information includes call unique identification information, call date and time information, call counterpart number information, and call type information that distinguishes incoming from outgoing calls.

Preferably, the search mode is provided in the form of a search screen for finding call identification information or call recording text information that contains a keyword entered by the user.
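The keyword search described here can be sketched as a query over the call recording information management DB. This is a minimal illustration in Python with SQLite, not the patent's implementation: the table name `call_records`, the function `search_calls`, and the choice of a `LIKE` substring match are assumptions; the column names follow the fields of Table 1.

```python
import sqlite3


def search_calls(conn, keyword):
    """Return calls whose counterpart number or full transcript contains keyword.

    Hypothetical helper: a plain substring match stands in for the search mode.
    """
    pattern = f"%{keyword}%"
    cur = conn.execute(
        "SELECT CallNo, Calldate, Phonenumber, Stt_all FROM call_records "
        "WHERE Phonenumber LIKE ? OR Stt_all LIKE ? ORDER BY Calldate",
        (pattern, pattern),
    )
    return cur.fetchall()


# Tiny in-memory DB with made-up sample rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_records "
    "(CallNo INTEGER, Calldate TEXT, Phonenumber TEXT, Stt_all TEXT)"
)
conn.executemany(
    "INSERT INTO call_records VALUES (?, ?, ?, ?)",
    [
        (1, "2020-03-10 09:00", "010-1234-5678", "Let us meet at the station on Friday"),
        (2, "2020-03-11 14:30", "010-9999-0000", "The invoice was sent yesterday"),
    ],
)
hits = search_calls(conn, "invoice")
```

Because the transcript is stored as a column rather than in separate text files, one query covers both the identification fields and the text.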

Preferably, step 4) further comprises a step 41) of receiving emotional state analysis information from the server and recording it in the call recording information management DB, matched to the STT identification information, wherein the emotional state analysis information is obtained by the server analyzing the emotional state of the user and of the call counterpart, respectively, based on the text information generated by performing speech-to-text (STT) conversion on the user's voice and on the counterpart's voice.
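The emotional state analysis information is stored as a numeric code in the example of Table 1 (fields Stt_me_em and Stt_other_em). A minimal Python sketch of decoding such a code follows; the `Emotion` class and `decode_emotion` function are illustrative names, and the values simply mirror the example mapping given in Table 1.

```python
from enum import IntEnum


class Emotion(IntEnum):
    """Example emotion codes from Table 1 (illustrative mapping only)."""
    JOY = 1
    FUN = 2
    PRIDE = 3
    DISSATISFACTION = 4
    FEAR = 5
    SADNESS = 6
    DISGUST = 7
    ANGER = 8
    SATISFACTION = 9
    RELIEF = 10


def decode_emotion(code):
    """Turn a stored integer code into a readable label; raises ValueError if unknown."""
    return Emotion(code).name.lower()


label_me = decode_emotion(9)
label_other = decode_emotion(4)
```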

Preferably, the present invention further comprises a step P1) of detecting an incoming or outgoing call and starting call recording, and detecting the end of the call and generating a call recording file. When the end of the call is detected in step P1), step 1) is executed for the call recording file generated as a result of step P1).

According to another aspect of the present invention, there is disclosed a computer program stored on a computer-readable medium so as to execute a call recording information management method in combination with hardware comprising a memory that stores one or more instructions and a processor that executes the one or more instructions stored in the memory, the call recording information management method comprising:
1) recording call identification information about a call recording file in a call recording information management DB;
2) transmitting the call recording file, including the call identification information, to the server;
3) receiving from the server STT identification information assigned by the server to identify the transmitted call recording file as a target of speech-to-text (STT) conversion, and recording the STT identification information in the call recording information management DB, matched to the call identification information;
4) receiving call recording text information from the server and recording it in the call recording information management DB, matched to the STT identification information, wherein the call recording text information separately contains the pieces of text information that the server generates by separating the voices of the mobile terminal user and the call counterpart in the call recording file by speaker and performing STT conversion on each of the entire call voice, the user's voice, and the counterpart's voice; and
5) providing a search mode for the call recording text information based on the call recording information management DB.

In the present invention, the call recording file is transmitted from the mobile terminal to the server, and the server performs speech-to-text (STT) conversion to generate the call recording text information and transmits it to the mobile terminal. The advantage is that the mobile terminal, whose computing resources are relatively limited, does not need to run an STT program.

In particular, the mobile terminal records the call recording text information in the call recording information management DB rather than storing and managing it as text files, which has the advantage that the text information can be searched and used simply and quickly.

FIG. 1 is an overall system configuration diagram for executing the call recording information management method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the mobile terminal from a hardware perspective according to an embodiment of the present invention.
FIG. 3 is a flowchart of the call recording information management method according to an embodiment of the present invention.
FIG. 4 is an exemplary configuration diagram of the call recording information management DB according to an embodiment of the present invention.

The present invention may be embodied in various other forms without departing from its technical spirit or essential characteristics. Accordingly, the embodiments of the present invention are merely illustrative in all respects and should not be construed restrictively.

Terms such as first and second are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be named a second component, and similarly a second component may be named a first component.

When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but intervening components may also be present.

Singular expressions used in this application include the plural unless the context clearly indicates otherwise. In this application, terms such as "include", "comprise", and "have" are intended to express that the components described in the specification, or combinations thereof, are present; they do not exclude in advance the possibility that other components or features are present or may be added.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is an overall system configuration diagram for executing the call recording information management method according to an embodiment of the present invention, and FIG. 2 is a schematic diagram of the mobile terminal from a hardware perspective according to an embodiment of the present invention.

The mobile terminal 100 of this embodiment is a mobile terminal on which the call recording information management application is installed and which interworks with the server 200 through the network 10, and it executes the call recording information management method of this embodiment.

The mobile terminal 100 of this embodiment is an ordinary mobile communication terminal (e.g., a smartphone) having a voice call function and a computing function.

From a functional point of view, the mobile terminal 100 of this embodiment includes: a voice recording unit 110 that records a voice call to generate a call recording file; a voice storage/management unit 120 that stores the call recording file and records and manages call identification information about the file in the call recording information management DB 130; a voice transmission unit 140 that transmits the call recording file, including the call identification information, to the server 200; a text information receiving unit 110 that receives the STT identification information assigned by the server 200 and records it in the call recording information management DB 130, and that receives the call recording text information from the server 200 and records it in the call recording information management DB 130; a search unit 150 that provides a search mode for the call recording text information based on the call recording information management DB 130; and the call recording information management DB 130, which matches the call identification information, the STT identification information, and the call recording text information to one another, manages them as a database table, and provides them to the user through the search mode.

Each of the above functions of the mobile terminal 100 may be provided by the call recording information management application.

The above functional breakdown is only an example: several functions may be provided together by one functional element, or one function may be provided by several functional elements working together. In addition to the above functional configuration, the mobile terminal 100 of this embodiment may provide the functions of an ordinary mobile communication terminal (e.g., a smartphone).

As one example, the voice call recording function may be executed by the call recording information management application of this embodiment using a call recording API (Application Programming Interface) provided by the OS (operating system) of the mobile terminal.

As one example, if the OS of the mobile terminal supports the recording function, the call recording information management application may generate a call recording file that contains metadata and media data in a single file covering the period from the start of the voice call to its end.

As another example, if the OS of the mobile terminal does not support recording by application programs but recording can be performed by a function of the OS itself, the call recording information management application may use that OS function to generate a call recording file that contains metadata and media data in a single file.

Referring to FIG. 2, from a hardware point of view, the mobile terminal 100 of this embodiment is a computing device that includes a memory 1 storing one or more instructions and a processor 4 executing the one or more instructions stored in the memory 1, and on which the call recording information management application program, stored on a medium so as to execute the call recording information management method, runs. The mobile terminal 100 of this embodiment may include a data input/output interface 6, a communication interface 8, a data display means 3, and a data storage means 5.

The server 200 of this embodiment is connected, through a carrier server (not shown) and the network 10, to one or more mobile terminals 100 on which the call recording information management application is installed, and it executes the call recording information management method of this embodiment in cooperation with the mobile terminal 100 on which a voice call takes place. The network 10 of this embodiment is an ordinary communication network that allows the server 200 and the mobile terminal 100 to interwork over a wired and/or wireless network.

The voice call of this embodiment may be a voice call made through a mobile communication service using a mobile switching center, or a voice call made through Internet telephony over an IP network.

From a functional point of view, the server 200 of this embodiment includes: a speaker separation unit 210 that separates, by speaker, the voice of the user of the mobile terminal 100 and the voice of the call counterpart in the call recording file received from the mobile terminal 100; a speech recognition and conversion unit 220 that performs speech-to-text (STT) conversion on each of the entire call voice, the user's voice, and the counterpart's voice and generates separate text information for each; and a natural language processing unit 230 that performs natural language processing, such as morphological analysis and named entity recognition, on the generated text information so that key words in the text, such as free morphemes, are clearly recognized and extracted.
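The server-side flow of speaker separation followed by three STT passes can be sketched as a small orchestration function. This is a hedged Python outline only: `process_recording` is a hypothetical name, and the `separate` and `stt` callables are placeholders for whatever speaker separation and speech recognition engines a real server would use; the toy stand-ins at the bottom exist purely so the sketch runs.

```python
from typing import Callable, Dict, Tuple


def process_recording(
    audio: bytes,
    separate: Callable[[bytes], Tuple[bytes, bytes]],
    stt: Callable[[bytes], str],
) -> Dict[str, str]:
    """Run STT on the whole call, the user's voice, and the counterpart's voice.

    `separate` must return (user_audio, counterpart_audio).
    The result keys follow the Stt_all / Stt_me / Stt_other fields of Table 1.
    """
    me_audio, other_audio = separate(audio)
    return {
        "Stt_all": stt(audio),
        "Stt_me": stt(me_audio),
        "Stt_other": stt(other_audio),
    }


# Toy stand-ins (NOT real engines): "audio" is text bytes split on "|".
def fake_separate(audio: bytes) -> Tuple[bytes, bytes]:
    me, other = audio.split(b"|", 1)
    return me, other


def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


result = process_recording(b"hello|hi there", fake_separate, fake_stt)
```

Passing the engines in as callables keeps the three-way split of the text explicit while leaving the choice of engine open.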

In addition, the server 200 of this embodiment may further include an operation processing unit (not shown) that handles member registration and login for the call recording information management application on a per-user basis, an information management unit (not shown) that records and manages user registration information and call recording management information, and the like.

From a hardware point of view, the server 200 of this embodiment may include the configuration of an ordinary computing means comprising a memory, a processor, a data input/output interface, a communication interface, a data display means, and a data storage means.

FIG. 4 is an exemplary configuration diagram of the call recording information management DB according to an embodiment of the present invention.

As one example, the call recording information management DB 130 of this embodiment may be configured as the table shown in FIG. 4, and each data field may record and store the data described in Table 1 below.

Field | Data content
No | Data serial number
CallNo | Call unique identification information (a number generated uniquely for every call and MMS on the mobile terminal; it does not change unless the mobile terminal is reset)
Calldate | Call date and time information
Phonenumber | Call counterpart number information (the counterpart's phone number obtained from the incoming/outgoing call information)
Callstate | Status information about the call, such as connected/rejected/missed
Calltype | Call type information distinguishing incoming from outgoing calls
Favorites | Flag set when the favorites function is used
Sttstate | Transmission status information (status of the STT request to the server; e.g., 1 = STT request succeeded, 2 = request failed, 3 = Internet connection failed, 4 = re-request, 5 = request in progress)
SttNo | STT identification information (a number the server uses to distinguish the STT of each call; STT data is entered and/or processed by this number)
SttTime | STT time information to be sent to the server (total duration of the call recording file to be STT-processed)
Stt_all | Text information generated by STT conversion of the entire call voice (user and counterpart)
Stt_me | Text information generated by STT conversion of the speaker-separated user's voice
Stt_other | Text information generated by STT conversion of the speaker-separated counterpart's voice
Stt_me_em | Emotional state analysis information for the speaker-separated user's voice (e.g., 1/2/3/4/5/6/7/8/9/10 = joy/fun/pride/dissatisfaction/fear/sadness/disgust/anger/satisfaction/relief)
Stt_other_em | Emotional state analysis information for the speaker-separated counterpart's voice (e.g., 1/2/3/4/5/6/7/8/9/10 = joy/fun/pride/dissatisfaction/fear/sadness/disgust/anger/satisfaction/relief)
Filepath | Location within the user terminal where the actual voice recording file is stored
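For illustration, the fields of Table 1 could be laid out as a single SQLite table. The sketch below is an assumption-laden example in Python: the patent names the fields but not the table name, column types, or storage engine, so `call_records`, the `INTEGER`/`TEXT` types, and the use of SQLite are all choices made here.

```python
import sqlite3

# Column names follow Table 1; the types and the table name are assumptions.
# "No" is quoted because NO is an SQL keyword.
SCHEMA = """
CREATE TABLE call_records (
    "No"          INTEGER PRIMARY KEY,  -- data serial number
    CallNo        INTEGER NOT NULL,     -- call unique identification information
    Calldate      TEXT,                 -- call date and time
    Phonenumber   TEXT,                 -- counterpart phone number
    Callstate     TEXT,                 -- connected / rejected / missed
    Calltype      TEXT,                 -- incoming / outgoing
    Favorites     INTEGER DEFAULT 0,    -- favorites flag
    Sttstate      INTEGER,              -- transmission status code
    SttNo         INTEGER,              -- STT identification information from the server
    SttTime       INTEGER,              -- total recording duration
    Stt_all       TEXT,                 -- transcript of the entire call
    Stt_me        TEXT,                 -- transcript of the user's voice
    Stt_other     TEXT,                 -- transcript of the counterpart's voice
    Stt_me_em     INTEGER,              -- emotion code for the user
    Stt_other_em  INTEGER,              -- emotion code for the counterpart
    Filepath      TEXT                  -- location of the recording file on the terminal
)
"""

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# List the column names back out of the created table.
columns = [row[1] for row in conn.execute("PRAGMA table_info(call_records)")]
```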

FIG. 3 is a flowchart of the call recording information management method according to an embodiment of the present invention.

In step P1), the mobile terminal 100 detects an incoming or outgoing call and starts call recording, then detects the end of the call and generates a call recording file.

When the end of the call is detected in step P1), step 1) is executed for the call recording file generated as a result of step P1).

In step 1), the mobile terminal 100 records call identification information about the call recording file in the call recording information management DB 130. The call identification information is information for identifying one call as distinct from another.

As one example, the call identification information includes call unique identification information, call date and time information, call counterpart number information, and call type information that distinguishes incoming from outgoing calls.

The call unique identification information is a number generated uniquely for every call and MMS on the mobile terminal 100, and it does not change unless the mobile terminal 100 is reset.

In step 2), the mobile terminal 100 transmits the call recording file, including the call identification information, to the server 200. As one example, to improve STT processing speed and quality, the call recording file may be divided into units of a preset duration (e.g., 1 minute) before being transmitted to the server 200.
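The division into fixed-duration units can be sketched as computing the time boundaries of each piece. This minimal Python example assumes durations in whole seconds and a default unit of 60 seconds, matching the 1-minute example; the function name `chunk_boundaries` is illustrative, and actually cutting the audio is left to whatever audio library the application uses.

```python
def chunk_boundaries(total_seconds, chunk_seconds=60):
    """Return (start, end) second offsets covering a recording of total_seconds.

    The last chunk is shorter when the duration is not a multiple of chunk_seconds.
    """
    if total_seconds < 0 or chunk_seconds <= 0:
        raise ValueError("durations must be non-negative and chunk size positive")
    return [
        (start, min(start + chunk_seconds, total_seconds))
        for start in range(0, total_seconds, chunk_seconds)
    ]


# A 2 min 30 s call split into 1-minute units.
bounds = chunk_boundaries(150)
```

For the 150-second call this yields two full 60-second chunks and a final 30-second chunk.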

Preferably, after step 2), in step 21) the mobile terminal 100 may record transmission status information, which describes the status of the transmission of the call recording file to the server 200, in the call recording information management DB 130, matched to the call identification information.

The transmission status information describes the status of the STT (speech-to-text) request to the server 200 and may be classified, for example, as 1 (STT request succeeded), 2 (request failed), 3 (Internet connection failed), 4 (re-request), and 5 (request in progress).

When the transmission status information is 2 (request failed), 3 (Internet connection failed), or the like, the call recording information management application may issue a re-request according to a preset time setting.
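The retry rule can be sketched as a small decision function over the example status codes. In this Python sketch the names `SttState`, `RETRYABLE`, and `due_for_retry` are illustrative, the 300-second interval is an arbitrary stand-in for the "preset time setting", and only codes 2 and 3 are treated as retryable even though the text leaves room for others.

```python
from enum import IntEnum


class SttState(IntEnum):
    """Example transmission status codes (Sttstate)."""
    SUCCESS = 1
    REQUEST_FAILED = 2
    NO_INTERNET = 3
    RE_REQUEST = 4
    REQUESTING = 5


# Assumption: only these two states trigger a re-request.
RETRYABLE = {SttState.REQUEST_FAILED, SttState.NO_INTERNET}


def due_for_retry(state, last_attempt, now, retry_interval=300):
    """True when a failed request is old enough to be sent again.

    last_attempt and now are timestamps in seconds; retry_interval is assumed.
    """
    return state in RETRYABLE and (now - last_attempt) >= retry_interval


retry_now = due_for_retry(SttState.NO_INTERNET, last_attempt=0, now=300)
too_soon = due_for_retry(SttState.REQUEST_FAILED, last_attempt=0, now=100)
```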

In step 3), the mobile terminal 100 receives from the server 200 the STT identification information that the server 200 assigned in order to identify the transmitted call recording file as a voice-to-text conversion (STT) target, and records the STT identification information in the call recording information management DB 130, matched to the call identification information. The STT identification information is a value that distinguishes the STT processing of one call recording file from that of another.

As described above, the unique call identification information recorded in the call recording information management DB 130 when a call recording file is created is a number generated uniquely for every call and MMS message in the mobile terminal 100. When the mobile terminal 100 is initialized, however, the numbering state may change (e.g., numbering may restart from the initial number).

If, because the mobile terminal 100 has been initialized, new unique call identification information is generated with the same value as previously generated unique call identification information and is newly recorded in the call recording information management DB 130, the same unique call identification information may end up assigned to different calls.

In that situation, if the call recording text information generated by the server 200 were recorded and managed in the call recording information management DB 130 on the basis of the unique call identification information generated by the mobile terminal 100, the call recording contents and the call recording text information could become mismatched.

In consideration of this point, in the call recording information management method of this embodiment, the server 200 assigns STT identification information to identify the call recording file transmitted from the mobile terminal 100 to the server 200 as a voice-to-text conversion (STT) target and transmits it to the mobile terminal 100, and the STT identification information is recorded in the call recording information management DB 130, matched to the call identification information. The STT identification information is also used as the reference data for recording the call recording text information generated by the server 200. With this configuration, each item of STT identification information in the call recording information management DB 130 always has a unique value distinct from every other item of STT identification information, so errors are prevented when the call recording text information is managed and searched.
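
The benefit of keying the text information on the STT identification information can be illustrated with a small SQLite sketch. The table layout, the column names, the local `row_id` used to attach the STT identification information, and the sample values are all assumptions; the document does not define a schema for the call recording information management DB 130.

```python
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory stand-in for the call recording information DB."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE call_recording ("
        " row_id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " call_id INTEGER NOT NULL,"      # may repeat after initialization
        " phone_number TEXT NOT NULL,"
        " stt_id TEXT UNIQUE,"            # assigned later by the server
        " stt_text TEXT)"
    )
    return conn


def record_call(conn: sqlite3.Connection, call_id: int, phone_number: str) -> int:
    """Step 1: record call identification information; return the local row."""
    cur = conn.execute(
        "INSERT INTO call_recording (call_id, phone_number) VALUES (?, ?)",
        (call_id, phone_number),
    )
    return cur.lastrowid


def attach_stt_id(conn: sqlite3.Connection, row_id: int, stt_id: str) -> None:
    """Step 3: record the server-assigned STT identification information."""
    conn.execute("UPDATE call_recording SET stt_id = ? WHERE row_id = ?",
                 (stt_id, row_id))


def store_text(conn: sqlite3.Connection, stt_id: str, text: str) -> None:
    """Step 4: record the text information, matched by STT identification."""
    conn.execute("UPDATE call_recording SET stt_text = ? WHERE stt_id = ?",
                 (text, stt_id))


conn = open_db()
first = record_call(conn, 1, "010-1111-2222")
second = record_call(conn, 1, "010-3333-4444")  # same call_id after initialization
attach_stt_id(conn, first, "stt-a")
attach_stt_id(conn, second, "stt-b")
store_text(conn, "stt-b", "second call transcript")
rows = conn.execute(
    "SELECT call_id, stt_id, stt_text FROM call_recording ORDER BY row_id"
).fetchall()
```

Both rows carry the same `call_id`, yet the transcript lands only on the row whose `stt_id` matches, which is the mismatch the embodiment aims to avoid.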

Meanwhile, in step 2), the file name of the call recording file may include at least one item of the information included in the call identification information. In this case, in step 3), the server 200 may distinguish the files by their file names and generate the STT identification information as a unique value for each call recording file.

For example, the file name of the call recording file may be generated to include the call type information distinguishing incoming from outgoing calls (Calltype) and the call counterpart number information (Phonenumber). However, the file name is not necessarily limited to this form.
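
A file name carrying Calltype and Phonenumber could be built and parsed as in the sketch below. The underscore separator, the `m4a` extension, the digit-only normalization of the number, and the function names are assumptions; the document states only which items the file name may include.

```python
from typing import Tuple


def build_file_name(call_type: str, phone_number: str, ext: str = "m4a") -> str:
    """Compose '<Calltype>_<Phonenumber>.<ext>' with digits only in the number."""
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    return f"{call_type}_{digits}.{ext}"


def parse_file_name(name: str) -> Tuple[str, str]:
    """Recover (Calltype, Phonenumber) from a name made by build_file_name."""
    stem, _, _ext = name.rpartition(".")
    call_type, _, digits = stem.partition("_")
    return call_type, digits


# Hypothetical outgoing call to an invented number.
file_name = build_file_name("OUT", "010-1234-5678")
parsed = parse_file_name(file_name)
```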

In step 4), the mobile terminal 100 receives the call recording text information from the server 200 and records it in the call recording information management DB 130, matched to the STT identification information.

For example, the call recording text information separately includes the pieces of text information that the server 200 generates by separating the voice of the user of the mobile terminal 100 and the voice of the call counterpart from the call recording file (speaker separation) and performing voice-to-text conversion (STT) on each of the entire call voice, the user's voice, and the call counterpart's voice.

Speaker separation may be implemented by various known techniques. For example, through voiceprint (聲紋) comparison, the voices of the user of the mobile terminal 100 and of the call counterpart may each be extracted from the call recording file to generate a user voice file and a call counterpart voice file, and each file may be subjected to voice-to-text conversion (STT) to generate the respective text information. The voiceprint comparison may be performed, for example, by taking a pre-registered voice of the user of the mobile terminal 100 as the reference, extracting voice with the same voiceprint as the user's voice, and extracting voice with a different voiceprint as the call counterpart's voice. However, the method is not necessarily limited to this approach.

A voiceprint is a visualization of a human voice as a fingerprint-like pattern, obtained by analyzing its length, pitch, intensity, and the like with a voice analyzer (sonagraph). Because it has characteristics peculiar to each individual, it is used as an important clue for personal identification, like a fingerprint or blood type. Voiceprint comparison may be performed with a known voiceprint comparison algorithm or a pre-trained artificial neural network, and a detailed description is omitted.
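
Voiceprint comparison against a pre-registered user voice could be sketched as follows. The sketch assumes that each audio segment has already been reduced to a fixed-length feature vector by some voiceprint extractor, which this document does not specify; the cosine-similarity measure, the threshold of 0.8, the function names, and the sample vectors are likewise assumptions and do not represent a particular known algorithm.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def label_segments(enrolled: Sequence[float],
                   segments: Sequence[Sequence[float]],
                   threshold: float = 0.8) -> List[str]:
    """Label each segment 'user' if it matches the enrolled voiceprint, else 'other'."""
    return [
        "user" if cosine_similarity(enrolled, seg) >= threshold else "other"
        for seg in segments
    ]


# Hypothetical three-dimensional voiceprint vectors.
enrolled_user = [1.0, 0.0, 0.0]
segment_vectors = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0]]
labels = label_segments(enrolled_user, segment_vectors)
```

Segments labeled `user` would feed the user voice file and the rest the call counterpart voice file, each of which is then converted to text separately.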

In step 5), the mobile terminal 100 provides a search mode for the call recording text information based on the call recording information management DB 130.

For example, the search mode is provided as a search screen for finding call identification information or call recording text information that contains a keyword entered by the user.

For example, the keyword entered by the user may be a specific word contained in the call recording text information. As another example, the keyword may be an individual item of the call identification information (e.g., the unique call identification information, the call date and time information, the call counterpart number information, or the call type information distinguishing incoming from outgoing calls).

When the user enters a keyword in the search mode, the call recording information management DB 130 is searched for data records containing the keyword, and the data fields containing the keyword are provided as the search result.

In the call recording information management method of this embodiment, the call recording text information generated for each call recording file is not stored and managed individually as text files in the mobile terminal 100 but is recorded and managed as records in the call recording information management DB 130, so that the text information can be searched and used simply and quickly.

Preferably, the search mode is configured so that, as a search condition for the call recording text information, the entire call voice, the user's voice, and the call counterpart's voice can be searched separately.

That is, the search mode provides a selection function so that, when a search keyword is entered, the search can distinguish whether the keyword occurs in the entire call voice, in the user's voice, or in the call counterpart's voice.

Through this selective search function, the user can distinguish speakers and search for and view, in the form of text information, the history and contents of calls that contain a specific keyword.
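
A speaker-selective keyword search over database records might look like the sketch below. The table name `transcript`, the columns `stt_all`, `stt_me`, and `stt_other`, the scope keys, and the sample rows are assumptions made for illustration. Matching uses SQLite's `instr` function, so the keyword is treated as a literal, case-sensitive substring.

```python
import sqlite3
from typing import List, Tuple

# Fixed mapping from the selected speaker scope to a column name, so that
# user input never reaches the SQL text itself.
SCOPE_COLUMNS = {"all": "stt_all", "me": "stt_me", "other": "stt_other"}


def search(conn: sqlite3.Connection, keyword: str,
           scope: str = "all") -> List[Tuple[str, str]]:
    """Return (stt_id, text) rows whose selected text column contains keyword."""
    column = SCOPE_COLUMNS[scope]
    query = (f"SELECT stt_id, {column} FROM transcript "
             f"WHERE instr({column}, ?) > 0 ORDER BY stt_id")
    return conn.execute(query, (keyword,)).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transcript ("
             " stt_id TEXT PRIMARY KEY, stt_all TEXT, stt_me TEXT, stt_other TEXT)")
conn.executemany(
    "INSERT INTO transcript VALUES (?, ?, ?, ?)",
    [
        ("stt-a", "please send the invoice sure, by Friday",
         "please send the invoice", "sure, by Friday"),
        ("stt-b", "see you Friday the invoice arrived",
         "see you Friday", "the invoice arrived"),
    ],
)

said_by_me = search(conn, "invoice", scope="me")
said_by_other = search(conn, "invoice", scope="other")
said_by_anyone = search(conn, "invoice", scope="all")
```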

Preferably, at least steps 2) to 4) may be executed while the call recording information management application maintains a login session with the server 200.

In this case, in step 3), the server 200 may generate the STT identification information based on the file name of the call recording file and the login information of the call recording information management application.

When steps 2) to 4) are executed while the call recording information management application maintains a login session with the server 200 in this way, the server 200 can identify, by the login session, the mobile terminal 100 that transmitted the call recording file and assign the STT identification information accordingly.
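
One way a server could derive STT identification information from the file name and the login information is sketched below. The document does not specify the derivation; the per-(login, file name) counter, the SHA-256 digest truncated to 16 hexadecimal characters, and the names `SttIdIssuer` and `issue` are assumptions. The counter is included because two recordings of the same counterpart and call type could otherwise share a file name.

```python
import hashlib
from collections import defaultdict
from typing import DefaultDict, Tuple


class SttIdIssuer:
    """Hypothetical server-side issuer of STT identification information."""

    def __init__(self) -> None:
        # Number of files already received per (login_id, file_name) pair.
        self._counters: DefaultDict[Tuple[str, str], int] = defaultdict(int)

    def issue(self, login_id: str, file_name: str) -> str:
        """Return an identifier derived from login, file name, and a sequence."""
        key = (login_id, file_name)
        self._counters[key] += 1
        raw = f"{login_id}|{file_name}|{self._counters[key]}"
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:16]


issuer = SttIdIssuer()
id_first = issuer.issue("user-1", "OUT_01012345678.m4a")
id_repeat = issuer.issue("user-1", "OUT_01012345678.m4a")   # same name again
id_other_user = issuer.issue("user-2", "OUT_01012345678.m4a")
```

A truncated digest makes collisions very unlikely rather than impossible, so a production design would need a uniqueness check or a longer identifier.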

Meanwhile, as a modification, in step 4) the server 200 may additionally generate and provide emotional state analysis information.

Using the speaker separation function described above, the server 200 performs voice-to-text conversion (STT) on the user's voice and on the call counterpart's voice to generate the respective text information, analyzes the emotional state of the user and of the call counterpart based on the user voice text information (Stt_me) and the call counterpart voice text information (Stt_other), respectively, and generates emotional state analysis information as the result. Various techniques for analyzing an emotional state from text information are known, for example, techniques that analyze the morphemes contained in the text information and classify them into emotional states by a statistical classification method or machine learning, so a detailed description is omitted.

The emotional state analysis information for each speaker-separated voice (Stt_me_em or Stt_other_em) may be classified, for example, as 1 (joy), 2 (fun), 3 (pride), 4 (dissatisfaction), 5 (fear), 6 (sadness), 7 (disgust), 8 (anger), 9 (satisfaction), and 10 (relief), but is not necessarily limited thereto.

In step 41), the mobile terminal 100 receives the emotional state analysis information from the server 200 and records it in the call recording information management DB 130, matched to the STT identification information.

In this case, the search mode is configured to search for the call identification information and call recording text information corresponding to an emotional state entered by the user, with the emotional state entered separately for the user and for the call counterpart.

That is, the search mode provides a selection function so that, when a specific emotional state is entered as the search keyword, the search can distinguish whether that emotional state occurs in the user's voice or in the call counterpart's voice.

Through this selective search function, the user can distinguish speakers and search for and view, in the form of text information, the history and contents of calls conducted in a specific emotional state.
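
The emotional state codes and a speaker-selective search over them could be sketched as follows. The numeric codes follow the example classification in this description, and the column names mirror Stt_me_em and Stt_other_em; the table name `emotion`, the enum and function names, and the sample rows are assumptions.

```python
import sqlite3
from enum import IntEnum
from typing import List


class Emotion(IntEnum):
    JOY = 1
    FUN = 2
    PRIDE = 3
    DISSATISFACTION = 4
    FEAR = 5
    SADNESS = 6
    DISGUST = 7
    ANGER = 8
    SATISFACTION = 9
    RELIEF = 10


# Fixed mapping from the selected speaker to the column holding that
# speaker's emotional state code.
EMOTION_COLUMNS = {"me": "stt_me_em", "other": "stt_other_em"}


def search_by_emotion(conn: sqlite3.Connection, emotion: Emotion,
                      speaker: str) -> List[str]:
    """Return the stt_id of every call where the speaker showed the emotion."""
    column = EMOTION_COLUMNS[speaker]
    query = f"SELECT stt_id FROM emotion WHERE {column} = ? ORDER BY stt_id"
    return [row[0] for row in conn.execute(query, (int(emotion),))]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emotion ("
             " stt_id TEXT PRIMARY KEY, stt_me_em INTEGER, stt_other_em INTEGER)")
conn.executemany(
    "INSERT INTO emotion VALUES (?, ?, ?)",
    [
        ("stt-a", int(Emotion.SATISFACTION), int(Emotion.ANGER)),
        ("stt-b", int(Emotion.ANGER), int(Emotion.RELIEF)),
    ],
)

angry_user_calls = search_by_emotion(conn, Emotion.ANGER, "me")
angry_counterpart_calls = search_by_emotion(conn, Emotion.ANGER, "other")
```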

Embodiments of the present invention include a program for performing various computer-implemented operations and a computer-readable recording medium on which the program is recorded. The computer-readable recording medium may include program instructions, data files, data structures, and the like, alone or in combination. The medium may be one specially designed and configured for the present invention, or one known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical recording media such as CD-ROMs, DVDs, and USB drives; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine language code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like.

100: mobile terminal
200: server

Claims

A method for managing call recording information, executed in a mobile terminal in which a call recording information management application is installed and which interworks with a server through a network, the method comprising:
1) recording call identification information about a call recording file in a call recording information management DB;
2) transmitting the call recording file, including the call identification information, to the server;
3) receiving from the server STT identification information assigned by the server to identify the call recording file transmitted to the server as a voice-to-text conversion (STT) target, and recording the STT identification information in the call recording information management DB, matched to the call identification information;
4) receiving call recording text information from the server and recording the call recording text information in the call recording information management DB, matched to the STT identification information, wherein the call recording text information separately includes the pieces of text information generated by the server separating the voice of the user of the mobile terminal and the voice of the call counterpart from the call recording file and performing voice-to-text conversion (STT) on each of the entire call voice, the user's voice, and the call counterpart's voice; and
5) providing a search mode for the call recording text information based on the call recording information management DB, wherein the search mode is configured to be searchable by distinguishing speakers.

A computer program stored in a computer-readable medium to execute a method for managing call recording information in combination with hardware comprising a memory storing one or more instructions and a processor executing the one or more instructions stored in the memory,
the method for managing call recording information comprising:
1) recording call identification information about a call recording file in a call recording information management DB;
2) transmitting the call recording file, including the call identification information, to the server;
3) receiving from the server STT identification information assigned by the server to identify the call recording file transmitted to the server as a voice-to-text conversion (STT) target, and recording the STT identification information in the call recording information management DB, matched to the call identification information;
4) receiving call recording text information from the server and recording the call recording text information in the call recording information management DB, matched to the STT identification information, wherein the call recording text information separately includes the pieces of text information generated by the server separating the voice of the user of the mobile terminal and the voice of the call counterpart from the call recording file and performing voice-to-text conversion (STT) on each of the entire call voice, the user's voice, and the call counterpart's voice; and
5) providing a search mode for the call recording text information based on the call recording information management DB, wherein the search mode is configured to be searchable by distinguishing speakers.