KR102011036B1 - Method and system for voice control of notifications

Info

Publication number: KR102011036B1
Application number: KR1020170036814A
Authority: KR
Inventors: 김석태; 김대성; 정충인; 차상옥; 박승현
Original assignee: 네이버랩스 주식회사
Priority date: 2017-03-23
Filing date: 2017-03-23
Publication date: 2019-08-14
Also published as: KR20180107909A

Abstract

A voice manipulation method and system for notifications are disclosed. A notification voice manipulation method of a computer-implemented notification voice manipulation system may include: outputting a notification screen including notification content when a notification is received; and, when the notification is one that requires a user response, activating a voice recognition function for receiving the user response as a command in voice form while the notification screen is being output.

Description

METHOD AND SYSTEM FOR VOICE CONTROL OF NOTIFICATIONS

The description below relates to a technique for processing user operations on notifications.

Recently, with advances in data processing speed, improvements in wireless data communication speed, and the continued development of the global positioning system (GPS), navigation systems that make intensive use of these technologies have come into widespread use.

In general, a navigation system is a technology that helps a user find the way. It stores geographic or map information for destination guidance as digital data in a storage means; recognizes the current location through a location recognition means such as GPS, that is, a satellite navigation system that receives signals from a plurality of satellites and calculates the user's current location; and provides route guidance to the destination by matching the recognized current location against the stored geographic/map information.

A navigation system is typically mounted in a vehicle and used to provide the user with information such as the current location and a map of the surrounding area, and to inform the user of the route or the shortest distance to a destination.

Recently, navigation systems that include MP3 and video players and even terrestrial DMB receivers have been developed. Furthermore, navigation systems are now implemented on portable terminals such as smartphones, based on the devices provided in those terminals.

Navigation systems support various voice guidance functions to ensure the driver's safety and convenience. For example, Korean Patent Laid-Open Publication No. 10-1999-0047060 (published July 5, 1999) discloses a vehicle problem voice notification device, and a method of controlling the same, that outputs the details of an abnormal state of a vehicle to the driver by voice when such a state occurs.

Provided are a method and system capable of triggering a voice command for a notification that requires a user response.

Provided are a method and system that automatically activate a voice-recognition-ready state while a notification is displayed, so that the notification can be operated by voice command.

Provided are a method and system that can guide the user on the voice commands that may be input to the voice recognition function activated while a notification is displayed.

Provided are a method and system that limit the voice recognition category using the phrases of the options included in a notification, so that voice commands are recognized within that category.

Provided is a notification voice manipulation method of a computer-implemented notification voice manipulation system, the method including: outputting a notification screen including notification content when a notification is received; and, when the notification is one that requires a user response, activating a voice recognition function for receiving the user response as a command in voice form while the notification screen is being output.

According to one aspect, the outputting may include displaying the notification screen and, at the same time, outputting the notification content as speech through text to speech (TTS).

According to another aspect, the outputting may include displaying the notification screen and, at the same time, outputting as speech the option phrases of the action buttons that are included in the notification screen and can be input as the user response.

According to another aspect, the outputting may include outputting the notification in a form that differs according to the type of the notification.

According to another aspect, the outputting may include dividing notifications into those that require a user response and all others, and varying at least one of a visual element and an auditory element of the output according to the type of the notification.

According to another aspect, the activating may include recognizing, through the voice recognition function, a voice command that matches an option phrase of an action button included in the notification screen.

According to another aspect, the activating may include: setting a voice recognition category using the option phrases of the action buttons included in the notification screen; and recognizing, through the voice recognition function, a voice command that falls within the voice recognition category.

According to another aspect, the setting may include setting, as the voice recognition category, a set of sentences having semantic similarity to the option phrase.

According to another aspect, the notification screen may include at least one action button that can be input as the user response, and the method may further include displaying the action button corresponding to the voice command recognized through the voice recognition function so that it is distinguished from the remaining action buttons.

According to another aspect, the method may further include canceling a previously recognized voice command when a voice command corresponding to a cancel command is recognized through the voice recognition function while the notification screen is being output.

Provided is a computer program recorded on a computer-readable recording medium for executing a notification voice manipulation method, the notification voice manipulation method including: outputting a notification screen including notification content when a notification is received; and, when the notification is one that requires a user response, activating a voice recognition function for receiving the user response as a command in voice form while the notification screen is being output.

Provided is a computer-implemented notification voice manipulation system including at least one processor implemented to execute computer-readable instructions, the at least one processor including: a notification output controller that outputs a notification screen including notification content when a notification is received; and a voice recognition controller that, when the notification is one that requires a user response, activates a voice recognition function for receiving the user response as a command in voice form while the notification screen is being output.

According to embodiments of the present invention, a voice command can be triggered for a specific notification, such as a notification that requires a user response.

According to embodiments of the present invention, a voice-recognition-ready state is automatically activated while a notification is displayed, so that the notification can be operated immediately by voice command.

According to embodiments of the present invention, the voice commands that may be input to the voice recognition function activated while a notification is displayed are presented to the user, so that the user can easily issue a voice command even without knowing the available commands in advance.

According to embodiments of the present invention, limiting the voice recognition category using the phrases of the options included in the notification allows voice commands to be recognized within that category, which reduces the voice misrecognition rate.

FIG. 1 is a block diagram illustrating an example of the internal configuration of a computer system according to an embodiment of the present invention.
FIG. 2 illustrates an example of components that may be included in a processor of a computer system according to an embodiment of the present invention.
FIG. 3 is a flowchart illustrating an example of a notification voice manipulation method that may be performed by a computer system according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating an example of a voice recognition process according to an embodiment of the present invention.
FIGS. 5 to 9 are exemplary views for explaining a notification screen according to an embodiment of the present invention.

Hereinafter, embodiments of the present invention are described in detail with reference to the accompanying drawings.

Embodiments of the present invention relate to a technique for processing user operations on notifications and, more specifically, to a method and system that, for a notification requiring a user response, automatically activate a voice recognition function while the notification is displayed so that the notification can be operated by voice command.

Embodiments including those specifically disclosed herein can implement a technique for operating on a notification that requires a user response directly by voice, thereby achieving significant advantages in terms of convenience, efficiency, accuracy, cost reduction, and the like.

FIG. 1 is a block diagram illustrating an example of the internal configuration of a computer system according to an embodiment of the present invention. For example, a notification voice manipulation system according to embodiments of the present invention may be implemented through the computer system 100 of FIG. 1. As shown in FIG. 1, the computer system 100 may include, as components for executing the notification voice manipulation method, a processor 110, a memory 120, a persistent storage device 130, a bus 140, an input/output interface 150, and a network interface 160.

The processor 110 may include, or be part of, any device capable of processing a sequence of instructions. The processor 110 may include, for example, a computer processor, a processor within a mobile device or other electronic device, and/or a digital processor. The processor 110 may be included in, for example, a server computing device, a server computer, a series of server computers, a server farm, a cloud computer, a content platform, a mobile computing device, a smartphone, a tablet, a set-top box, a media player, and the like. The processor 110 may be connected to the memory 120 through the bus 140.

The memory 120 may include volatile, permanent, virtual, or other memory for storing information used or output by the computer system 100. The memory 120 may include, for example, random access memory (RAM) and/or dynamic RAM (DRAM). The memory 120 may be used to store any information, such as state information of the computer system 100. The memory 120 may also be used to store instructions of the computer system 100, including, for example, instructions for controlling notification voice manipulation. The computer system 100 may include one or more processors 110 as needed or where appropriate.

The bus 140 may include a communication infrastructure that enables interaction among the various components of the computer system 100. The bus 140 may carry data between components of the computer system 100, for example between the processor 110 and the memory 120. The bus 140 may include wireless and/or wired communication media between components of the computer system 100 and may include parallel, serial, or other topology arrangements.

The persistent storage device 130 may include components, such as memory or other persistent storage, used by the computer system 100 to store data for an extended period (e.g., compared to the memory 120). The persistent storage device 130 may include non-volatile main memory as used by the processor 110 in the computer system 100. The persistent storage device 130 may include, for example, flash memory, a hard disk, an optical disc, or another computer-readable medium.

The input/output interface 150 may include interfaces for a keyboard, a mouse, voice command input, a display, or other input or output devices. Configuration commands and/or input related to notification voice manipulation may be received through the input/output interface 150.

The network interface 160 may include one or more interfaces to networks such as a local area network or the Internet. The network interface 160 may include interfaces for wired or wireless connections. Configuration commands may be received through the network interface 160. In addition, information related to notifications may be received or transmitted through the network interface 160.

In other embodiments, the computer system 100 may include more components than those shown in FIG. 1. However, most conventional components need not be explicitly illustrated. For example, the computer system 100 may be implemented to include at least some of the input/output devices connected to the input/output interface 150 described above, or may further include other components such as a transceiver, a Global Positioning System (GPS) module, a camera, various sensors, and a database. As a more specific example, when the computer system 100 is implemented as a mobile device such as a smartphone, it may further include the various components that a smartphone generally contains, such as an acceleration sensor or gyro sensor, a camera, various physical buttons, touch-panel buttons, input/output ports, and a vibrator.

In today's driving environments, a voice agent typically outputs a received notification by voice, and the user then operates a voice input button (e.g., a microphone button) to enter a command by voice for a follow-up action. However, when all interaction takes place by voice, the user has the inconvenience of needing to know the available commands in advance, and the approach is difficult to use in vehicle models that lack a microphone button for in-vehicle voice commands.

In the present embodiment, so that the user can act by voice immediately on a notification that occurs while driving, the system automatically switches to a voice-recognition-ready state only for the time during which the notification screen is output, and displays the available voice commands on the notification screen, guiding the user to issue a voice command easily even without knowing the commands in advance.

Although this specification describes notifications provided in a driving environment, the invention is not limited thereto and is of course applicable to any system or environment in which a user needs to be notified.

FIG. 2 illustrates an example of components that may be included in a processor of a computer system according to an embodiment of the present invention, and FIG. 3 is a flowchart illustrating an example of a notification voice manipulation method that may be performed by a computer system according to an embodiment of the present invention.

As shown in FIG. 2, the processor 110 may include a notification output controller 210, a voice recognition controller 220, and a function execution unit 230. These components of the processor 110 may be representations of different functions performed by the processor 110 according to control instructions provided by at least one program code. For example, the notification output controller 210 may be used as a functional representation of the processor 110 operating to control the computer system 100 to output a notification. The processor 110 and its components may perform steps S310 to S340 included in the notification voice manipulation method of FIG. 3. For example, the processor 110 and its components may be implemented to execute instructions according to the code of the operating system included in the memory 120 and the at least one program code described above. Here, the at least one program code may correspond to the code of a program implemented to process the notification voice manipulation method.

The steps of the notification voice manipulation method need not occur in the order shown; some steps may be omitted, or additional steps may be included.

In step S310, the processor 110 may load, into the memory 120, program code stored in a program file for the notification voice manipulation method. For example, the program file may be stored in the persistent storage device 130 described with reference to FIG. 1, and the processor 110 may control the computer system 100 so that the program code is loaded via the bus from the program file stored in the persistent storage device 130 into the memory 120. Here, the processor 110 and the notification output controller 210, voice recognition controller 220, and function execution unit 230 included in the processor 110 may each be a different functional representation of the processor 110 that executes the corresponding portion of the program code loaded in the memory 120 to carry out the subsequent steps S320 to S340. To execute steps S320 to S340, the processor 110 and its components may directly process operations according to control instructions or may control the computer system 100.

In step S320, when a notification is received, the notification output controller 210 may output a notification screen including the notification content and, at the same time, output the notification content by voice. For example, the notification output controller 210 may display the notification screen as a pop-up at the time the notification is received and read the notification content aloud using text to speech (TTS). The notification output controller 210 may keep the notification pop-up on screen for a predetermined set time. The set time may include the time required to output the notification content by voice; alternatively, the notification pop-up may be maintained for the set time starting from the point at which voice output of the notification content ends. In addition to received notifications, when an event requiring notification is detected within the computer system 100, a notification screen related to that event may also be provided.
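The two ways of counting the set time described for step S320 can be sketched as follows. This is a minimal illustration, not part of the disclosed embodiments; the names `TimerStart` and `popup_visible_seconds` and the use of seconds as the unit are assumptions introduced here.

```python
from enum import Enum


class TimerStart(Enum):
    """When the set time starts counting (hypothetical names for the two options)."""
    AT_POPUP = "at_popup"          # the set time includes the time spent reading aloud
    AFTER_SPEECH = "after_speech"  # the set time starts once reading aloud has ended


def popup_visible_seconds(set_time: float, speech_time: float, start: TimerStart) -> float:
    """Return how long the pop-up stays on screen, measured from when it appears."""
    if set_time < 0 or speech_time < 0:
        raise ValueError("durations must be non-negative")
    if start is TimerStart.AT_POPUP:
        # The speech output is counted as part of the set time.
        return set_time
    # The set time is appended after the speech output finishes.
    return speech_time + set_time
```

With a set time of 10 seconds and 4 seconds of speech, the first option keeps the pop-up visible for 10 seconds and the second for 14 seconds.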

The notification output controller 210 may also present notifications in a form that differs according to the type of the notification. For example, the notification output controller 210 may distinguish between a first notification that requires a user response and a second notification that does not. Here, the first notification may be a notification that includes one or more action buttons for interaction with the user, and the second notification may be a notification that simply presents its content without requiring a response from the user. Depending on the type of notification, the notification output controller 210 may vary display elements such as the color and size of the notification screen, or vary the notification sound in its type, pattern, length, and so on. In this way, visual and/or auditory elements may be used to distinguish the types of notification.
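One possible way to express the first/second notification distinction in code is shown below. It is only a sketch under stated assumptions: the `Notification` and `Presentation` types, the rule that a notification with at least one action button requires a response, and the specific colors and sound names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Notification:
    content: str
    action_buttons: List[str] = field(default_factory=list)

    @property
    def requires_response(self) -> bool:
        # Assumption: a "first notification" is one that carries action buttons.
        return len(self.action_buttons) > 0


@dataclass(frozen=True)
class Presentation:
    color: str  # visual element
    sound: str  # auditory element


# Hypothetical styles; the source only says that such elements may differ by type.
RESPONSE_STYLE = Presentation(color="blue", sound="double_chime")
INFO_STYLE = Presentation(color="gray", sound="single_chime")


def presentation_for(notification: Notification) -> Presentation:
    """Pick visual and auditory elements according to the notification type."""
    return RESPONSE_STYLE if notification.requires_response else INFO_STYLE
```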

In step S330, the voice recognition controller 220 may automatically activate the voice recognition function at the time the notification screen is output, depending on the type of notification. In particular, for a notification that requires a user response, the voice recognition controller 220 may activate the voice recognition function for receiving the user response as a command in voice form while the notification screen is being output. In other words, the voice recognition controller 220 may turn on the microphone immediately upon receipt of the notification and automatically switch to a state in which voice commands can be recognized. When the voice recognition function is activated, the notification output controller 210 may display on the notification screen the time during which voice recognition is available or, as another example, may announce by voice a countdown of that time. Here, the time during which voice recognition is available may correspond to the set time, that is, the time for which the notification pop-up is maintained.
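The listening window of step S330 can be illustrated with a small state holder. This is a sketch only: the class and method names are hypothetical, timestamps are passed in explicitly rather than read from a clock, and no real microphone or speech engine is involved.

```python
from typing import Optional


class VoiceRecognitionWindow:
    """Tracks whether voice recognition is active for the notification on screen."""

    def __init__(self) -> None:
        self._listen_until: Optional[float] = None

    def on_notification_shown(self, requires_response: bool, now: float, set_time: float) -> None:
        # Only a notification that requires a user response opens the window.
        self._listen_until = now + set_time if requires_response else None

    def on_notification_dismissed(self) -> None:
        self._listen_until = None

    def is_listening(self, now: float) -> bool:
        return self._listen_until is not None and now < self._listen_until

    def remaining_seconds(self, now: float) -> float:
        """Time left in the window, e.g. for an on-screen or spoken countdown."""
        if self._listen_until is None or now >= self._listen_until:
            return 0.0
        return self._listen_until - now
```

Here the window length equals the pop-up set time, matching the statement that the two may correspond.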

In step S340, the function execution unit 230 may perform the function corresponding to the voice command recognized through the voice recognition function. For a notification that requires a user response, the function execution unit 230 may perform the function of the command according to the voice command recognized as the user response through the voice recognition function.

FIG. 4 is a flowchart illustrating an example of the voice recognition process of step S330 according to an embodiment of the present invention.

The voice recognition controller 220 may recognize a voice command from the user's utterance while the voice recognition function is active. A user interface (UI) for receiving a user response is provided while the notification is being received, and the options for commands the user can input may be presented as action buttons on the notification screen. While the voice recognition function is active, the voice recognition controller 220 may use the option phrases of the action buttons as the phrases recognizable as voice commands. In other words, the voice recognition controller 220 may recognize speech that matches or resembles the option phrase of an action button and treat the recognized speech as the command of that action button. Furthermore, the voice recognition controller 220 may set a voice recognition category for each action button on the notification screen so that speech that does not exactly match the option phrase but is semantically similar to it is also recognized, thereby improving user convenience and minimizing the voice misrecognition rate.

Referring to FIG. 4, in step S401 the voice recognition control unit 220 may set a voice recognition category based on the option phrases of the action buttons on the notification screen. A voice recognition category is the set of sentences that can be recognized as the actual command for an option phrase. For example, for the option phrase 'Directions', a corpus of semantically similar sentences that can be recognized as that command, such as "directions", "please give me directions", "please guide me", and "please show me the way", may be set as the voice recognition category. Likewise, for the option phrase 'Cancel', a corpus such as "cancel", "please cancel", "cancel it", and "I want to cancel" may be set as the voice recognition category. To set such categories, for example, a corpus related to each option phrase that can appear on an action button of a notification screen may be built in advance as a database and stored in the memory 120. The voice recognition control unit 220 may then use the corpus stored in the database to set the voice recognition category for a given notification according to the action buttons included in its notification screen.
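The category setup of step S401 can be illustrated with a minimal sketch. The `CORPUS_DB` dictionary, its English sentences, and the function name `set_recognition_category` are hypothetical stand-ins for the pre-built corpus database in the memory 120; the patent does not specify a data layout.

```python
# Hypothetical corpus database: option phrase -> sentences accepted as that command.
# The entries are English renderings of the examples given in the description.
CORPUS_DB = {
    "Directions": ["directions", "give me directions", "guide me", "show me the way"],
    "Cancel": ["cancel", "please cancel", "cancel it", "i want to cancel"],
}


def set_recognition_category(action_buttons, corpus_db=CORPUS_DB):
    """Build the recognition category for one notification (step S401).

    Returns a dict mapping each accepted sentence (lowercased) to the
    option phrase of the action button it selects.
    """
    category = {}
    for phrase in action_buttons:
        # Assumption: the option phrase itself is always accepted,
        # even when the database has no corpus entry for it.
        sentences = [phrase] + list(corpus_db.get(phrase, []))
        for sentence in sentences:
            category[sentence.strip().lower()] = phrase
    return category


category = set_recognition_category(["Directions", "Cancel"])
```

Only the buttons actually shown on the current notification contribute sentences, which is what keeps the recognizable vocabulary small.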

In step S402, the voice recognition control unit 220 may determine whether the speech recognized through the voice recognition function is a voice command that falls within the voice recognition category. If so, it may pass the voice command to the notification output control unit 210 and/or the function execution unit 230. The notification output control unit 210 may then display the action button corresponding to the recognized voice command so that it is distinguished from the remaining action buttons on the notification screen, for example by highlighting it.
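As a rough sketch of step S402, the check can be modeled as a lookup followed by a display update. The `CATEGORY` table and both function names are illustrative assumptions, and the exact-match lookup is a simplification: the description also allows sentences that are merely similar to an option phrase.

```python
# Hypothetical category for one notification: accepted sentence -> option phrase.
CATEGORY = {
    "directions": "Directions",
    "guide me": "Directions",
    "cancel": "Cancel",
    "please cancel": "Cancel",
}


def match_command(recognized_text, category=CATEGORY):
    """Return the option phrase selected by the utterance, or None (step S402)."""
    return category.get(recognized_text.strip().lower())


def button_states(action_buttons, selected):
    """Mark the selected button as highlighted and every other button as normal."""
    return {
        button: "highlighted" if button == selected else "normal"
        for button in action_buttons
    }


selected = match_command("  Guide me ")
states = button_states(["Directions", "Cancel"], selected)
```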

In step S403, if the recognized speech is not a voice command within the voice recognition category, the voice recognition control unit 220 may judge that recognition has failed, reset the recognition result, and retry recognition. In other words, when speech outside the voice recognition category for the option phrases of the action buttons on the notification screen is recognized, it may ask the user to speak again. If no speech is detected for a certain time after the voice recognition function starts, or after the recognition result is reset, the voice recognition control unit 220 may end the voice recognition function, at which point the notification popup disappears.
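The retry and timeout behavior of steps S402 and S403 might be sketched as follows. To keep the example deterministic, recognizer results are passed in as a list, and `None` stands for a listening window that elapsed with no voice input; this encoding and the name `run_recognition` are assumptions for illustration only.

```python
def run_recognition(utterances, category):
    """Process recognizer results in order (steps S402 and S403).

    `utterances` holds recognized strings; None marks a listening window
    that elapsed with no voice input.
    Returns (command, reprompts): the matched option phrase, or None if
    recognition ended on a timeout, plus how often the user was asked
    to speak again.
    """
    reprompts = 0
    for text in utterances:
        if text is None:
            # Timeout: recognition ends and the popup would be dismissed.
            return None, reprompts
        command = category.get(text.strip().lower())
        if command is not None:
            return command, reprompts
        # Out-of-category speech: reset the result and ask again.
        reprompts += 1
    return None, reprompts


category = {"directions": "Directions", "cancel": "Cancel"}
result = run_recognition(["what time is it", "Directions"], category)
```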

FIGS. 5 to 9 are exemplary views illustrating notification screens according to an embodiment of the present invention.

When a notification is received or an event requiring a notification is detected, the processor 110 may pop up a notification screen and read the notification content aloud using TTS. The processor 110 may present the notification in a different form depending on its type.

FIG. 5 shows an example of a notification screen 500 for a first notification, which includes action buttons 510 for receiving a user response related to the notification content. FIG. 6 shows an example of a notification screen 600 for a second notification, which contains only notification content and requires no user response.

To indicate that they are different kinds of notifications, the notification screen 500 of the first notification and the notification screen 600 of the second notification may differ in visual elements, such as screen color or size, or in auditory elements, such as the type or pattern of the notification sound.

For a notification that requires a user response, the processor 110 may automatically turn on the microphone and activate the voice recognition function, in order to receive the user response as a voice command, either immediately after the notification screen 500 pops up or immediately after the notification screen 500 has popped up and the voice output of the notification content has finished.

When the voice recognition function is activated, the processor 110 may display, on the notification screen 500 as shown in FIG. 5, voice recognition standby state information 501 indicating that voice recognition is available, and may also announce this standby state by voice.

The processor 110 may also read aloud the option phrases of the action buttons 510 on the notification screen 500, so that the user can easily tell which voice commands are available without looking at the screen. Because the voice commands accepted for the notification screen 500 are announced, the user can issue a voice command easily even without knowing the commands in advance.

For a notification that does not require a user response, on the other hand, the processor 110 displays only the notification content on the notification screen 600, as shown in FIG. 6. Because the screen contains no action button for receiving a user response, the processor 110 neither turns on the microphone nor activates the voice recognition function.
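The difference between the two notification types can be summarized in a short sketch. The `Notification` class, the step labels, and the ordering of steps after the content is read aloud are all hypothetical; the description permits activating recognition either right after the popup appears or after the spoken content ends, and this sketch picks the latter.

```python
from dataclasses import dataclass, field


@dataclass
class Notification:
    content: str
    action_buttons: list = field(default_factory=list)

    @property
    def requires_response(self):
        # Assumption: a notification requires a response exactly when
        # it carries at least one action button.
        return bool(self.action_buttons)


def plan_presentation(notification):
    """List the output steps for a notification, in order."""
    steps = ["show_popup", "tts:" + notification.content]
    if notification.requires_response:
        steps.append("microphone_on")
        steps.append("show_standby_indicator")
        # Announce each option phrase so the user need not look at the screen.
        steps.extend("tts_option:" + b for b in notification.action_buttons)
    return steps


first = Notification("Route to work found", ["Directions", "Cancel"])
second = Notification("Battery is low")
```

For `second`, the plan stops after the spoken content, so the microphone is never opened.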

When the voice recognition function is activated, the processor 110 may display on the notification screen 500 how long voice recognition remains available. For example, as shown in FIG. 7, it may display this time as a progress bar 702. The progress bar 702 may count down the time remaining until the popup disappears, that is, until the voice recognition available time ends. The processor 110 may also announce the countdown by voice.

The progress bar 702 may take shapes other than an ordinary bar; for example, it may be rendered as a ring surrounding the microphone icon that represents the voice recognition function.

The user may utter a voice command with reference to the action buttons 510 on the notification screen 500. When the voice recognition function detects the user's speech, the processor 110 may, as shown in FIG. 8, temporarily pause the countdown and display on the notification screen 500 voice recognition state information 803 indicating that speech is being recognized. If no voice input is detected for a certain time while in the recognizing state, the notification screen 500 disappears.
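A countdown that pauses while speech is being recognized could be modeled as below. The class `RecognitionCountdown`, its method names, and the choice to restart the full window after a reset are assumptions; the description does not say whether the countdown resumes or restarts.

```python
class RecognitionCountdown:
    """Tracks the time left in the voice recognition window."""

    def __init__(self, total_seconds):
        self.total = float(total_seconds)
        self.remaining = float(total_seconds)
        self.paused = False

    def on_speech_detected(self):
        # Pause the countdown while the utterance is being recognized.
        self.paused = True

    def on_recognition_reset(self):
        # Assumption: a failed attempt restarts the full window.
        self.paused = False
        self.remaining = self.total

    def tick(self, dt):
        """Advance the clock by dt seconds and return the time left."""
        if not self.paused:
            self.remaining = max(0.0, self.remaining - dt)
        return self.remaining

    @property
    def fraction_left(self):
        # Value a progress bar (or ring) would render, from 1.0 down to 0.0.
        return self.remaining / self.total

    @property
    def expired(self):
        return self.remaining == 0.0


countdown = RecognitionCountdown(10)
countdown.tick(4)               # 6 seconds left
countdown.on_speech_detected()
countdown.tick(3)               # still 6 seconds left while paused
```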

When voice recognition succeeds, that is, when a voice command matching or resembling the option phrase of one of the action buttons 510 on the notification screen 500 is recognized, the processor 110 may display the action button 920 of the recognized command so that it is distinguished from the remaining action buttons 930, as shown in FIG. 9. For example, the option phrase of the action button 920 may be highlighted relative to those of the remaining action buttons 930. As another example, only the action button 920 selected by the user's voice may be displayed on the notification screen 500, with the remaining action buttons 930 hidden. The processor 110 may also announce the recognized voice command by voice, so that the user hears which action button was selected.

When voice recognition succeeds, the processor 110 may provide a way to cancel the recognized voice command and may display on the notification screen 500 cancelable state information 904 indicating that the command can be canceled. If speech matching or resembling 'Cancel' (a cancel voice command) is recognized from the user while the cancelable state information 904 is displayed, the processor 110 may cancel the previously recognized voice command and ask the user to speak again. The recognized voice command can be canceled at any time while the popup of the notification screen 500 is maintained.
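The cancellation window can be sketched as a small decision function. `CANCEL_SENTENCES` and `handle_followup` are hypothetical names, and the sketch ignores the case where 'Cancel' is itself one of the notification's action buttons.

```python
# Hypothetical set of sentences treated as the cancel voice command.
CANCEL_SENTENCES = {"cancel", "please cancel", "cancel it"}


def handle_followup(pending_command, utterance, popup_visible):
    """Decide what happens to an already recognized command.

    Returns (command_to_run, reprompt). A cancel utterance heard while the
    popup is still visible discards the pending command and asks the user
    to speak again; otherwise the pending command is kept.
    """
    heard_cancel = (
        utterance is not None
        and utterance.strip().lower() in CANCEL_SENTENCES
    )
    if popup_visible and heard_cancel:
        return None, True
    return pending_command, False


outcome = handle_followup("Directions", "Please cancel", popup_visible=True)
```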

If the recognition result for the user's utterance neither matches nor resembles the option phrase of any action button 510 on the notification screen 500, the processor 110 may judge that recognition has failed and ask the user, through the notification screen 500 and/or voice guidance, to speak again. If no speech is then detected for a certain time, the notification screen 500 disappears.

The notification content and related information may be presented visually on the screen, and audible output by voice may accompany the visual output either optionally or as a requirement.

The notification screens described with reference to FIGS. 5 to 9 are only examples; the notification content and screen layout may be changed freely.

Thus, according to embodiments of the present invention, a voice command can be triggered for specific notifications, such as notifications that require a user response. The voice recognition state is activated automatically while the notification is displayed, so the user can act on the notification directly by voice. Because the accepted voice commands are announced while the voice recognition function is active, the user can issue a voice command easily even without knowing the commands in advance. Moreover, limiting the voice recognition category to the option phrases included in the notification confines recognition to that category, which can reduce the misrecognition rate.

The apparatus described above may be implemented with hardware components, software components, and/or a combination of the two. For example, the devices and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications executed on that operating system. It may also access, store, manipulate, process, and generate data in response to the execution of software. Although a single processing device is sometimes described for ease of understanding, a person of ordinary skill in the art will recognize that a processing device may include multiple processing elements and/or multiple types of processing elements. For example, a processing device may include multiple processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

Software may include a computer program, code, instructions, or a combination of one or more of these, and may configure a processing device to operate as desired or may instruct the processing device independently or collectively. Software and/or data may be embodied in any type of machine, component, physical device, virtual equipment, or computer storage medium or device, in order to be interpreted by the processing device or to provide instructions or data to it. Software may also be distributed over networked computer systems and stored or executed in a distributed manner. Software and data may be stored on one or more computer-readable recording media.

The method according to an embodiment may be implemented as program instructions executable by various computer means and recorded on a computer-readable medium. The medium may store a computer-executable program persistently or hold it temporarily for execution or download. The medium may be any of various recording or storage means, in the form of a single piece of hardware or several pieces combined; it is not limited to a medium directly attached to a particular computer system and may be distributed across a network. Examples include magnetic media such as hard disks, floppy disks, and magnetic tape; optical recording media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and ROM, RAM, flash memory, and other devices configured to store program instructions. Further examples include recording or storage media managed by app stores that distribute applications, or by sites and servers that supply or distribute other software.

Although the embodiments have been described with reference to a limited set of embodiments and drawings, a person of ordinary skill in the art can make various modifications and variations from the above description. For example, appropriate results may be achieved even if the described techniques are performed in a different order from the described method, and/or the components of the described systems, structures, devices, circuits, and the like are combined in a different form from the described method, or are replaced or substituted by other components or equivalents.

Therefore, other implementations, other embodiments, and equivalents of the claims also fall within the scope of the claims that follow.

Claims

A notification voice operation method performed by a computer-implemented notification voice operation system, the method comprising:
outputting a notification screen including notification content when a notification is received; and
when the notification requires a user response, activating a voice recognition function for receiving the user response as a voice command while the notification screen is output,
wherein the outputting comprises
displaying the notification screen and simultaneously outputting the notification content by voice through text-to-speech (TTS),
the output of the notification screen being maintained for a set time after the voice output of the notification content ends,
wherein the activating comprises
automatically switching to a state in which a voice command can be recognized only while the notification screen is output,
wherein, when the voice recognition function is activated, a voice recognition available time corresponding to the set time is displayed on the notification screen and the time remaining until the notification screen disappears is counted down,
wherein, if no voice input is detected for a certain time after the voice recognition function is activated, the voice recognition function ends and the notification screen disappears,
wherein the activating further comprises
setting a voice recognition category by using the option phrase of an action button included in the notification screen; and
recognizing, through the voice recognition function, a voice command that falls within the voice recognition category,
and wherein the voice recognition category is a set of sentences recognizable as the actual command for the option phrase, a corpus of sentences semantically similar to the option phrase being set as the voice recognition category.

delete

The method of claim 1,
wherein the outputting comprises
displaying the notification screen and simultaneously outputting the option phrase of an action button, included in the notification screen, that can be input as the user response.

The method of claim 1,
wherein the outputting comprises
outputting the notification in a form that differs according to the type of the notification.

The method of claim 1,
wherein the outputting comprises
classifying the notification as either a notification that requires the user response or any other notification, and outputting at least one of a visual element and an auditory element according to the type of the notification.

delete

The method of claim 1,
wherein the notification screen includes at least one action button that can be input as the user response,
the method further comprising
displaying the action button corresponding to a voice command recognized through the voice recognition function so that it is distinguished from the remaining action buttons.

The method of claim 9,
further comprising
canceling a previously recognized voice command when a voice command corresponding to a cancellation command is recognized through the voice recognition function while the notification screen is displayed.

A computer program recorded on a computer-readable recording medium for executing a notification voice operation method,
the notification voice operation method comprising:
outputting a notification screen including notification content when a notification is received; and
when the notification requires a user response, activating a voice recognition function for receiving the user response as a voice command while the notification screen is output,
wherein the outputting comprises
displaying the notification screen and simultaneously outputting the notification content by voice through text-to-speech (TTS),
the output of the notification screen being maintained for a set time after the voice output of the notification content ends,
wherein the activating comprises
automatically switching to a state in which a voice command can be recognized only while the notification screen is output,
wherein, when the voice recognition function is activated, a voice recognition available time corresponding to the set time is displayed on the notification screen and the time remaining until the notification screen disappears is counted down,
wherein, if no voice input is detected for a certain time after the voice recognition function is activated, the voice recognition function ends and the notification screen disappears,
wherein the activating further comprises
setting a voice recognition category by using the option phrase of an action button included in the notification screen; and
recognizing, through the voice recognition function, a voice command that falls within the voice recognition category,
and wherein the voice recognition category is a set of sentences recognizable as the actual command for the option phrase, a corpus of sentences semantically similar to the option phrase being set as the voice recognition category.

A computer-implemented notification voice operation system, comprising
at least one processor implemented to execute computer-readable instructions,
wherein the at least one processor comprises:
a notification output control unit that outputs a notification screen including notification content when a notification is received; and
a voice recognition control unit that, when the notification requires a user response, activates a voice recognition function for receiving the user response as a voice command while the notification screen is output,
wherein the notification output control unit
displays the notification screen and simultaneously outputs the notification content by voice through text-to-speech (TTS),
the output of the notification screen being maintained for a set time after the voice output of the notification content ends,
wherein the voice recognition control unit
automatically switches to a state in which a voice command can be recognized only while the notification screen is output,
wherein the notification output control unit,
when the voice recognition function is activated, displays on the notification screen a voice recognition available time corresponding to the set time and counts down the time remaining until the notification screen disappears,
wherein, if no voice input is detected for a certain time after the voice recognition function is activated, the voice recognition function ends and the notification screen disappears,
wherein the voice recognition control unit
sets a voice recognition category by using the option phrase of an action button included in the notification screen, and
recognizes, through the voice recognition function, a voice command that falls within the voice recognition category,
and wherein the voice recognition category is a set of sentences recognizable as the actual command for the option phrase, a corpus of sentences semantically similar to the option phrase being set as the voice recognition category.

The system of claim 12,
wherein the notification output control unit
classifies the notification as either a notification that requires the user response or any other notification, and outputs at least one of a visual element and an auditory element according to the type of the notification.

delete

The system of claim 12,
wherein the notification screen includes at least one action button that can be input as the user response, and
wherein the notification output control unit
displays the action button corresponding to a voice command recognized through the voice recognition function so that it is distinguished from the remaining action buttons.