KR20130102702A

KR20130102702A - Multi-modal input device using handwriting and voice recognition and control method thereof

Info

Publication number: KR20130102702A
Application number: KR1020120023725A
Authority: KR
Inventors: 도정인
Original assignee: 주식회사 디오텍
Priority date: 2012-03-08
Filing date: 2012-03-08
Publication date: 2013-09-23
Also published as: KR101385012B1

Abstract

PURPOSE: To provide a multi-modal input device using handwriting and voice recognition, and a control method thereof, that deliver exactly the input the user intends by automatically identifying and correcting the misrecognized part of the initial handwriting when voice information is supplied after the handwriting recognition result has been checked. CONSTITUTION: In the device and its control method, a handwriting recognition module (110) recognizes handwriting information and converts it into first conversion information including first result information; a voice recognition module (120) is automatically activated for a predetermined time immediately after the conversion operation, recognizes voice information, and converts it into second conversion information including second result information; a determination unit (132) identifies the misrecognized part of the first result information using the second conversion information; a correction unit (134) obtains third result information by correcting the misrecognized part; and a display device (140) displays the first and third result information. [Reference numerals] (110) Handwriting recognition module; (120) Voice recognition module; (132) Determination unit; (134) Correction unit; (136) Language option unit; (138) Notification unit; (140) Display device; (150) Dictionary DB; (AA, EE) Handwriting information; (BB, GG, KK) Word information; (CC, DD) First conversion information; (FF) Conversion information determined by a user signal; (HH) Language model information; (II) First and second conversion information; (JJ) Third result information; (LL, MM) Second conversion information; (NN) Voice information

Description

Multi-modal input device using handwriting and voice recognition and control method

The present invention relates to a multi-modal input device and a method of controlling the input device. More particularly, it relates to a multi-modal input device using handwriting and voice recognition, and a control method thereof, that use voice information received from a voice recognition module, which is automatically activated for a predetermined time immediately after a conversion operation of a handwriting recognition module, to identify and correct the part misrecognized by the handwriting recognition module and display the result.

Recently, as a means of communication between humans and electromechanical devices such as computers and mobile terminals, text input interfaces of the multi-modal (multi-mode) type, which integrate a plurality of inputs such as a keyboard, a pen, and voice, have been attracting attention.

Among approaches using this multi-modal technology, methods have previously been disclosed in which a misrecognized part of the result information (text) obtained through a first input method is corrected through a second input method.

US Patent No. 7,506,271, prior art related to the present invention, discloses a method in which, once the initial result information obtained by handwriting recognition is displayed, the user selects a misrecognized part and corrects it by voice recognition, by keyboard, or by choosing from a provided list of replacement words.

US Patent No. 7,941,316, another piece of prior art related to the present invention, discloses a method in which, once the initial result information obtained through a user-activated voice recognition module is displayed, a keyboard or a list of replacement words is provided so that misrecognition can be checked one word at a time and the misrecognized part corrected.

However, in the previously disclosed multi-modal technologies, correcting a misrecognized part is inconvenient: the user must select the misrecognized part directly and then either type with a keyboard or choose from a replacement list. Even when the correction is made by voice recognition, a separate user operation is needed to activate the voice recognition engine.

To address these problems, an object of the present invention is to provide a multi-modal input device using handwriting and voice recognition, and a control method thereof, in which the voice recognition module is automatically activated for a predetermined time immediately after the conversion operation of the handwriting recognition module without any separate user operation, and in which, if voice information is received, the misrecognized part of the initial handwriting information is automatically identified and corrected on that basis, so that the user can obtain the desired input accurately and conveniently.

The multi-modal input device using handwriting and voice recognition comprises: a handwriting recognition module that recognizes handwriting information and converts it into a plurality of first conversion information including first result information; a voice recognition module that is automatically activated for a predetermined time immediately after the conversion operation of the handwriting recognition module, recognizes voice information, and converts it into a plurality of second conversion information including second result information; a controller including a determination unit that identifies a misrecognized part of the first result information received from the handwriting recognition module using the plurality of second conversion information received from the voice recognition module, and a correction unit that corrects the misrecognized part of the first result information to obtain third result information; and a display device that displays at least one of the first result information and the third result information.

In another aspect of the present invention, the control method of the multi-modal input device using handwriting and voice recognition comprises: a first step of displaying on a display device the first result information from among a plurality of first conversion information obtained when a handwriting recognition module recognizes and converts handwriting information; a second step in which a voice recognition module is automatically activated for a predetermined time immediately after the conversion into the first conversion information and, if it recognizes voice information during that time, converts it into a plurality of second conversion information including second result information; a third step of identifying a misrecognized part of the first result information using the plurality of second conversion information; and a fourth step of correcting the identified misrecognized part and either displaying third result information on the display device in place of the first result information or displaying the first and third result information on the display device at the same time.

With the multi-modal input device using handwriting and voice recognition and its control method according to the present invention, once the user has checked the handwriting recognition result, simply supplying voice information, without any separate operation, causes the misrecognized part of the initial handwriting information to be identified and corrected automatically, so that the user obtains exactly the intended input as the result.

In addition, whereas the prior art requires a separate user operation to activate the voice recognition module or switch it into a state in which voice recognition is possible, the voice recognition module of the present invention is automatically activated for a predetermined time immediately after the conversion operation of the handwriting recognition module. Because the timing at which voice input is accepted is fixed in advance, no user operation is needed, which is convenient, and the misrecognition rate can be lowered while unnecessary power consumption is reduced.

Furthermore, the multi-modal input device and its control method according to the present invention provide an option for selecting the language to be recognized, so that speakers of any language can use the device easily, and display a notification window on the display device, so that even a first-time user learns that the misrecognized part can be corrected by voice recognition during the predetermined time after handwriting recognition and conversion.

Moreover, the present invention automatically identifies the misrecognized part of the initial handwriting recognition result and provides a variety of correction methods, such as a matching method, a language model method, and a direct selection method, chosen according to an accuracy value for the identification of the misrecognized part, so that the misrecognition rate can be reduced markedly.

FIG. 1 is an internal schematic diagram of a multi-modal input device using handwriting and voice recognition according to an embodiment of the present invention.
FIG. 2 is an external schematic diagram of a multi-modal input device using handwriting and voice recognition according to an embodiment of the present invention.
FIG. 3 is a flowchart of a control method of a multi-modal input device using handwriting and voice recognition according to an embodiment of the present invention.
FIG. 4 is a flowchart of a control method of a multi-modal input device using handwriting and voice recognition according to another embodiment of the present invention.
FIG. 5A is a schematic diagram showing the comparison of the first and second conversion information in the process of identifying the misrecognized part of the first result information.
FIG. 5B is a schematic diagram showing the search, in FIG. 5A, for second result information that matches between the first and second conversion information.
FIG. 6A is a diagram showing one example of a result displayed on the display device of FIG. 2.
FIG. 6B is a diagram showing another example of a result displayed on the display device of FIG. 2.

Hereinafter, the present invention is described in more detail with reference to the accompanying drawings. Terms and words used in this specification and the claims should be interpreted with meanings and concepts consistent with the technical idea of the present invention, on the principle that an inventor may appropriately define the concepts of terms in order to describe the invention in the best way.

FIG. 1 is an internal schematic diagram of a multi-modal input device using handwriting and voice recognition according to an embodiment of the present invention.

The multi-modal input device (100) using handwriting and voice recognition according to the present invention is technically characterized in that it comprises: a handwriting recognition module (110) that recognizes handwriting information and converts it into a plurality of first conversion information including first result information; a voice recognition module (120) that is automatically activated for a predetermined time immediately after the conversion operation of the handwriting recognition module (110), recognizes voice information, and converts it into a plurality of second conversion information including second result information; a controller (130) including a determination unit (132) that identifies a misrecognized part of the first result information received from the handwriting recognition module (110) using the plurality of second conversion information received from the voice recognition module (120), and a correction unit (134) that corrects the misrecognized part of the first result information to obtain third result information; and a display device (140) that displays at least one of the first result information and the third result information.

The main feature of the multi-modal input device (100) described below is that it first receives handwriting information and displays the result, then uses voice information to identify and correct the misrecognized part of the displayed result and displays the result again. The handwriting or voice information entered by the user is preferably one or more words, or one or more sentences each made up of a plurality of words.

In the present invention, the handwriting recognition module (110) recognizes handwriting information and converts it into a plurality of first conversion information. It recognizes handwriting that the user enters with an electronic pen, a finger, or the like, either on a separate device using a digitizer (an input device that reads coordinates, which are analog input data, and passes them on in digital form) or on a touch-screen display device (140), and converts that handwriting information into a plurality of first conversion information according to a predetermined algorithm.

The plurality of first conversion information includes the first result information that is displayed on the display device (140) described below. The criterion for selecting the first result information may vary with the algorithm, and any conventionally known algorithm may be used. For example, the handwriting recognition module (110) may select as the first result information the first conversion information that is probabilistically most similar to the handwriting information.

The voice recognition module (120) recognizes voice information and converts it into a plurality of second conversion information. It is characterized in that it is automatically activated for a predetermined time immediately after the conversion operation of the handwriting recognition module (110), without any special user operation. It recognizes voice information that the user supplies through a microphone or the like and converts it into a plurality of second conversion information according to a predetermined algorithm.

The length of time for which the voice recognition module (120) is activated can be adjusted by the user. The plurality of second conversion information includes second result information; the search for the second result information is described later.
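
The timed, user-adjustable activation described above can be sketched as follows. This is only an illustrative Python sketch, not the disclosed implementation: the class name VoiceActivationWindow, its methods, and the injectable clock are all hypothetical names introduced here.

```python
import time
from typing import Callable, Optional


class VoiceActivationWindow:
    """Tracks whether the voice recognizer should currently be listening.

    Hypothetical helper: the specification only states that the voice
    recognition module is active for a predetermined, adjustable time.
    """

    def __init__(self, duration_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.duration_s = duration_s          # user-adjustable window length
        self._clock = clock                   # injectable so it can be tested
        self._opened_at: Optional[float] = None

    def open(self) -> None:
        """Call immediately after the handwriting conversion finishes."""
        self._opened_at = self._clock()

    def close(self) -> None:
        """Deactivate early, e.g. when new handwriting input arrives."""
        self._opened_at = None

    def is_active(self) -> bool:
        """True while the predetermined time has not yet elapsed."""
        if self._opened_at is None:
            return False
        if self._clock() - self._opened_at >= self.duration_s:
            self._opened_at = None            # expired: deactivate automatically
            return False
        return True
```

A caller would invoke open() when conversion completes and poll is_active() before passing microphone input to the recognizer.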

The controller (130) uses the plurality of second conversion information received from the voice recognition module (120) to identify and correct the misrecognized part of the first result information received from the handwriting recognition module (110), converting it into third result information. In particular, it includes a determination unit (132) that identifies the misrecognized part of the first result information and a correction unit (134) that corrects the identified misrecognized part to obtain the third result information.

The determination unit (132) identifies the misrecognized part. It is preferably programmed or controlled to identify the misrecognized part automatically by searching for, and then using, the second result information that matches between the plurality of first conversion information received from the handwriting recognition module (110) and the plurality of second conversion information received from the voice recognition module (120).
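
One way to picture this matching is the following Python sketch. It rests on assumptions that the text above does not state: each piece of conversion information is treated as a whole candidate string, the speech candidates are ordered best first, and differing words are located by position. The function names are hypothetical.

```python
from typing import List, Optional


def find_second_result(first_candidates: List[str],
                       second_candidates: List[str]) -> Optional[str]:
    """Return the first speech candidate that also appears among the
    handwriting candidates, or None if no candidate matches."""
    handwriting = set(first_candidates)
    for candidate in second_candidates:
        if candidate in handwriting:
            return candidate
    return None


def misrecognized_positions(first_result: str, second_result: str) -> List[int]:
    """Word indices at which the displayed handwriting result differs
    from the matched second result."""
    first_words = first_result.split()
    second_words = second_result.split()
    if len(first_words) != len(second_words):
        # Assumption: without a one-to-one word alignment, flag every word.
        return list(range(len(first_words)))
    return [i for i, (a, b) in enumerate(zip(first_words, second_words))
            if a != b]


# Hypothetical candidates; the first handwriting candidate is the one displayed.
handwriting = ["I love clogs", "I love dogs", "I love doge"]
speech = ["I love docks", "I love dogs"]
matched = find_second_result(handwriting, speech)
if matched is not None:
    print(misrecognized_positions(handwriting[0], matched))
```

Here the only candidate common to both lists is "I love dogs", so the third word of the displayed result is flagged.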

The correction unit (134) is programmed or controlled to receive information on the misrecognized part identified by the determination unit (132) and to correct that part automatically. Various correction methods or algorithms may be stored in it in advance, and it preferably corrects the misrecognized part of the first result information using a correction method selected by comparing an accuracy value for the identification of the misrecognized part with a predetermined reference value.
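
The selection of a correction method by comparing an accuracy value with a reference value might look like the Python sketch below. The three method names come from the specification (matching, language model, direct selection), but the use of two reference values, their numeric defaults, and the order in which methods are assigned to accuracy ranges are assumptions made purely for illustration.

```python
from enum import Enum


class CorrectionMethod(Enum):
    MATCHING = "matching"
    LANGUAGE_MODEL = "language_model"
    DIRECT_SELECTION = "direct_selection"


def choose_correction_method(accuracy: float,
                             high_ref: float = 0.8,
                             low_ref: float = 0.5) -> CorrectionMethod:
    """Pick a correction method from an accuracy value in [0, 1].

    Assumed mapping (not taken from the specification):
      accuracy >= high_ref            -> matching
      low_ref <= accuracy < high_ref  -> language model
      accuracy < low_ref              -> direct selection by the user
    """
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie between 0 and 1")
    if accuracy >= high_ref:
        return CorrectionMethod.MATCHING
    if accuracy >= low_ref:
        return CorrectionMethod.LANGUAGE_MODEL
    return CorrectionMethod.DIRECT_SELECTION
```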

The method by which the determination unit (132) identifies the misrecognized part and the method by which the correction unit (134) corrects it are described in detail later.

In addition, the determination unit (132) and the correction unit (134) can receive information, including word information and language model information, stored in a dictionary database (150). In particular, the determination unit (132) can receive language model information and use it, independently or as an aid, to identify the misrecognized part, and the correction unit (134) can receive language model information and use it, independently or as an aid, to correct the misrecognized part.

The dictionary database (150) may be built into the multi-modal input device (100) or may be external, linked to an external server over a wired or wireless network. It can be updated continuously, and it can also send information including word information to the handwriting recognition module (110) and the voice recognition module (120).

Here, language model information is information that can be used, independently or as an aid, to identify and correct the misrecognized part of the first result information. For example, when identifying the misrecognized part of first result information that is an English sentence, each word of the first result information can be analyzed to see whether it fits its position as subject, verb, object, complement, and so on. When correcting the identified misrecognized part, the plurality of second conversion information can be analyzed to find which conversion information can grammatically replace the misrecognized part.

In addition to the determination unit (132) and the correction unit (134), the controller (130) may further include a language option unit (136) and a notification unit (138). The language option unit (136) can control the display device (140), described below, to show a notification window at the same time as the voice recognition module (120) is activated, for example a window stating that the misrecognized part can be corrected by voice and/or showing a microphone symbol.

The display format of the notification window can, of course, be varied.

The notification unit (138) preferably controls the display device (140) to show the selectable language options for handwriting information and voice information while the multi-modal input device (100) is operating.

The language option unit (136) preferably operates on receiving a user signal, described later, before handwriting recognition or voice recognition takes place, and the notification unit (138) preferably operates immediately after handwriting recognition, although the timing of operation is adjustable.

In the present invention, the display device (140) displays at least one of the first result information and the third result information, and may be used without particular limitation as to form, type, size, material, and the like. When a touch screen is used as the display device (140), it is not merely a one-way output device: it also operates as an input interface and can send handwriting information to the handwriting recognition module (110).

Furthermore, depending on the specific correction method used by the correction unit (134), described later, the display device (140) can display the first and second conversion information corresponding to the misrecognized part, can display another input method such as a virtual keyboard, and can pass to the correction unit (134) of the controller (130) the conversion information selected by a user signal applied through the display device (140).

That is, depending on the case, the display device (140) can receive the first and second conversion information directly from the handwriting recognition module (110) and the voice recognition module (120), respectively, and display it, and can also receive the first and second conversion information and the third result information from the controller (130) and display it.

Referring to FIG. 2, an external schematic diagram of a multi-modal input device (200) using handwriting and voice recognition according to an embodiment of the present invention, the user is entering handwriting information (250) in cursive on a digitizer (210) with an electronic pen (220), and the display device (230) is showing the first result information (260) converted from the handwriting information (250).

However, the user can see that a part (252) of the entered handwriting information (250) has been misrecognized and displayed on the display device (230), and the user is speaking voice information (272) near the microphone (240) to correct the misrecognized part (262).

As described above, the voice recognition module of the present invention is automatically activated only for a predetermined time immediately after the conversion operation of the handwriting recognition module, so the user must supply the correcting voice information (272) within a certain time after checking the handwriting recognition result.

With the multi-modal input device (100) using handwriting and voice recognition, supplying voice information after checking the handwriting recognition result allows the misrecognized part of the initial handwriting information to be identified and corrected automatically. In addition, because the timing at which voice recognition is possible is fixed in advance, separate user operations and unnecessary power consumption can be reduced.

The control method of the multi-modal input device using handwriting and voice recognition, which is another aspect of the present invention, is described with reference to FIGS. 3 and 4. In particular, the methods of identifying and correcting the misrecognized part of the first result information are described with reference to FIGS. 5A, 5B, 6A, and 6B.

FIG. 3 is a flowchart of a control method of a multi-modal input device using handwriting and voice recognition according to an embodiment of the present invention.

The control method of the multi-modal input device using handwriting and voice recognition according to the present invention is technically characterized in that it comprises: a first step of displaying on a display device the first result information from among a plurality of first conversion information obtained when a handwriting recognition module recognizes and converts handwriting information; a second step in which a voice recognition module is automatically activated for a predetermined time immediately after the conversion into the first conversion information and, if it recognizes voice information during that time, converts it into a plurality of second conversion information including second result information; a third step of identifying a misrecognized part of the first result information using the plurality of second conversion information; and a fourth step of correcting the identified misrecognized part and either displaying third result information on the display device in place of the first result information or displaying the first and third result information on the display device at the same time.

In other words, as shown in FIG. 3, when control of the multi-modal input device using handwriting and voice recognition according to the present invention starts (S300), the handwriting recognition module first recognizes the handwriting information and converts the recognized information according to a built-in algorithm (S302). Any conventionally known algorithm may be used for this purpose.

Of the converted information, the handwriting conversion result information determined by the above algorithm is displayed on the display device, and immediately after the conversion operation of the handwriting recognition module the voice recognition module is automatically activated for a predetermined time without any special user operation (S304). The activation time is preferably variable, for example according to a user setting.

Here, it is also preferable to further provide, after voice recognition has been automatically activated, a step of waiting for voice information to be recognized and a step of waiting for handwriting input.

This is because the voice recognition function is activated once handwriting recognition is complete, but if handwriting is input while it is active, the voice recognition function needs to be deactivated and the process needs to return to the handwriting information recognition and conversion step (S302).

Next, it is determined whether voice information is recognized during the activation time (S306). Regardless of whether voice information is recognized, the voice recognition module automatically switches to the inactive state once the predetermined time has elapsed.
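The timed activation window described above can be sketched as follows. This is only an illustration under stated assumptions: the `Event` type, the function name `run_activation_window`, the outcome labels, and the 3-second default are all hypothetical, since the source says only that the time is predetermined and preferably user-adjustable.

```python
from dataclasses import dataclass


@dataclass
class Event:
    t: float           # seconds since the handwriting conversion finished
    kind: str          # "voice" or "handwriting" (hypothetical labels)
    payload: str = ""


def run_activation_window(events, window_s=3.0):
    """Decide what happens during the voice window opened after S302/S304.

    window_s is an assumed default; the source only says "predetermined".
    """
    for ev in sorted(events, key=lambda e: e.t):
        if ev.t > window_s:
            break  # the module has already gone inactive
        if ev.kind == "handwriting":
            # New handwriting during the window: deactivate voice, go back to S302.
            return ("restart_handwriting", ev.payload)
        if ev.kind == "voice":
            return ("voice_recognized", ev.payload)
    return ("timeout", "")  # no voice input: treated as "no misrecognition"


print(run_activation_window([Event(1.2, "voice", "boy")]))
```

Returning a single outcome keeps the sketch small; a real device would presumably run this as an event loop tied to the recognizer's own callbacks.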

If voice information from the user is recognized, the recognized information is converted according to the applied algorithm, and the resulting voice conversion information is used to identify the misrecognized portion of the handwriting conversion result information displayed on the display device (S308).

The misrecognized portion identified by the controller is corrected using one of various correction methods, and the corrected result information is displayed on the display device (S310), either in place of the existing handwriting conversion result information or together with it.

If the voice recognition module does not recognize any voice information during the activation time (S312), it is determined that there is no misrecognized portion, only the handwriting conversion result information is displayed, and control ends (S314).

In addition, a language model is preferably also applied when identifying and correcting the misrecognized portion, and the correction method for the misrecognized portion is preferably decided after comparing an accuracy value for the identification of the misrecognized portion with a predetermined reference value.

FIG. 4 is a flowchart of a control method of a multi-modal input device using handwriting and voice recognition according to another embodiment of the present invention, showing in particular the methods of identifying and correcting a misrecognized portion of the first result information.

When control starts under the control method of a multi-modal input device using handwriting and voice recognition according to the present invention (S400), the language option unit of the controller may first operate to display, on the display device, the language options selectable for the handwriting information and voice information to be input (S402).

The user can invoke the language option unit at any time during the overall process. When the user selects one of the languages offered through the language option unit, the conversion operations of the handwriting recognition module and the voice recognition module are performed according to the selected language.

Because the user can select one of the languages offered as language options, such as Korean, English, Chinese, and Japanese, convenience is increased. Because continuous updates can be made through the dictionary database, the number of available languages can grow and the recognition rate can improve.

The handwriting recognition module then recognizes the handwriting information and converts the recognized information into a plurality of pieces of first conversion information according to a built-in algorithm (S404). The first result information determined by the algorithm from among the plurality of pieces of first conversion information is displayed on the display device, and immediately after the conversion operation of the handwriting recognition module the voice recognition module is automatically activated for a predetermined time without any special user operation (S406).

At this time, the notification unit of the controller preferably operates to display, on the display device, a notification window informing the user that the misrecognized portion can be corrected by voice (S408). With this notification window, even a first-time user or a user unfamiliar with the device can check the misrecognized portion of the first result information and correct it by voice.

Next, it is determined whether voice information is recognized during the activation time (S410). Regardless of whether voice information is recognized, the voice recognition module automatically switches to the inactive state once the predetermined time has elapsed.

If voice information is recognized, the activated voice recognition module converts it into a plurality of pieces of second conversion information according to the applied algorithm (S412). The determination unit of the controller, having received the first and second conversion information, may then perform a step of comparing the two sets of conversion information with each other (S414), a step of searching for second result information that matches between the first and second conversion information (S416), and a step of identifying the misrecognized portion using the second result information (S418).

In this case, it is preferable to apply a language model as well as to use the plurality of pieces of second conversion information when identifying the misrecognized portion.

Hereinafter, the process of identifying the misrecognized portion of the first result information when the user inputs the handwriting information 250 'I like that boy' is described in detail with reference to FIGS. 5A and 5B.

FIG. 5A is a schematic diagram showing the comparison of the first and second conversion information during the process in which the determination unit of the controller identifies the misrecognized portion of the first result information. The handwriting recognition module converts the handwriting information 250 'I like that boy' into a plurality of pieces of first conversion information, and the voice recognition module converts the voice information 272 'boy' into a plurality of pieces of second conversion information. The determination unit, having received the first and second conversion information, may compare each piece of second conversion information against the first conversion information in a predetermined order of priority.

That is, in the process of obtaining the first result information 260 'I like that bag', first conversion information exists as conversion values for each of 'I', 'like', 'that', and 'bag'. The determination unit may rank the second conversion information 'boi', 'boy', 'voy', 'voit', and so on as first, second, third, fourth, and so on, and compare them in rank order against the plurality of pieces of first conversion information.

For example, comparing the first-ranked 'boi' with the first conversion information finds no matching value, whereas comparing the second-ranked 'boy' with the first conversion information finds a matching value, as shown in FIG. 5B. This matching value is defined as the second result information 270. The determination unit can identify the misrecognized portion using the second result information 270, that is, the piece of second conversion information that matches the first conversion information.

Even if first conversion information matching the first-ranked 'boi' were found, it would most likely be one of the conversion values corresponding to the 'bag' portion of the first result information 260, such as 'bay', 'bey', 'bog', and 'boy'. The determination unit can therefore identify the misrecognized portion using the second conversion information in this case as well, and can further increase the identification accuracy by also applying the language model information received from the dictionary database.

Here, it is preferable to treat only exactly identical conversion information as a match when selecting the second result information 270. Depending on the internal settings of the determination unit, however, conversion information whose similarity is at or above a certain probability may also be judged a match and selected as the second result information 270.
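The rank-ordered comparison of FIGS. 5A and 5B can be illustrated with a minimal sketch. The function name `find_match`, the data layout (one candidate list per handwritten word), and the use of `difflib.SequenceMatcher` as the similarity measure are assumptions made for illustration; the source does not specify how similarity is computed. The extra candidates 'lake' and 'than' are invented filler.

```python
from difflib import SequenceMatcher


def find_match(first_candidates, second_ranked, min_similarity=1.0):
    """Search for a voice candidate that matches a handwriting candidate.

    first_candidates: one list of handwriting candidates per written word.
    second_ranked: voice candidates, highest priority first.
    min_similarity=1.0 means exact match only; lower values relax the match.
    Returns (word_index, matched_handwriting_candidate) or None.
    """
    for spoken in second_ranked:
        for idx, candidates in enumerate(first_candidates):
            for cand in candidates:
                score = SequenceMatcher(None, spoken.lower(), cand.lower()).ratio()
                if score >= min_similarity:
                    return idx, cand
    return None


first = [["I"], ["like", "lake"], ["that", "than"],
         ["bag", "bog", "bey", "bay", "bou", "boy"]]
second = ["boi", "boy", "voy", "voit"]

print(find_match(first, second))  # (3, 'boy')
```

With exact matching, 'boi' finds nothing and 'boy' matches a candidate of the fourth word, so the displayed 'bag' at that position is flagged as the misrecognized portion.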

Next, a series of procedures for correcting the identified misrecognized portion of the first result information is carried out by the correction unit of the controller, in which various correction methods may be stored.

In deciding how to correct the misrecognized portion, the correction unit preferably does so through a step (S420) of comparing the accuracy value for the identification of the misrecognized portion with a predetermined reference value.

The accuracy value for the identification of the misrecognized portion is determined by the algorithm or program applied to the correction unit. Because this value, together with the reference value, determines the reliability of the multi-modal input device using handwriting and voice recognition according to the present invention, it is preferably determined on the basis of repeated test results and continuously updated information for each language.

As one example, if the comparison shows that the accuracy value is equal to or higher than the predetermined reference value, the correction unit may replace the misrecognized portion with the second result information that matches between the first and second conversion information (S422) and control the display device to display third result information reflecting the second result information (S424).

That is, as shown in FIG. 5B, the misrecognized portion 'bag' is corrected using the matching second result information 270 'boy', yielding the third result information 'I like that boy'.

As another example, if the comparison shows that the accuracy value is equal to or higher than the predetermined reference value, the correction unit may replace the misrecognized portion with language model conversion information determined by applying a language model to each piece of second conversion information (S426) and control the display device to display third result information reflecting the language model conversion information (S428).

The misrecognized portion can thus be corrected by applying the language model information on its own, but it is also possible, in the first example above, to correct the misrecognized portion by applying the language model information in a supplementary role.
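As a rough illustration of applying a language model to the second conversion information, the sketch below picks the voice candidate that most often follows the preceding word. The source does not describe the language model itself, so the bigram-count table, its numbers, and the function name `pick_with_language_model` are all hypothetical.

```python
def pick_with_language_model(prev_word, second_candidates, bigram_counts):
    """Return the voice candidate that most often follows prev_word.

    bigram_counts maps (previous_word, word) to a count; unseen pairs score 0.
    Ties are resolved in favor of the higher-ranked (earlier) candidate.
    """
    return max(second_candidates,
               key=lambda w: bigram_counts.get((prev_word, w), 0))


# Toy counts invented for this example only.
bigram_counts = {("that", "boy"): 120, ("that", "bag"): 45}

choice = pick_with_language_model("that", ["boi", "boy", "voy", "voit"], bigram_counts)
print(choice)  # boy
```

In the supplementary role mentioned above, the same score could instead be used to break ties among candidates that already match the handwriting candidates.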

As yet another example, if the comparison shows that the accuracy value is lower than the predetermined reference value, the first and second conversion information corresponding to the misrecognized portion is displayed on the display device (S430), the correction unit replaces the misrecognized portion with the conversion information determined by a user signal from among that information (S432), and the display device is controlled to display third result information reflecting the determined conversion information (S424).

That is, if the accuracy value is lower than the predetermined reference value, the correction unit may control the display device to display a list of the first and second conversion information corresponding to the misrecognized portion, or a virtual keyboard. The user can then select from the list the conversion information originally intended in the handwriting, or enter it directly using the virtual keyboard.

The user signal generated by the user's selection or input is transmitted to the correction unit of the controller, and the correction unit can correct the misrecognized portion with the conversion information determined by the user signal.

Cases in which the accuracy value is lower than the predetermined reference value may include proper nouns, such as personal names or place names, and words not registered in the dictionary database.
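The three correction paths above amount to a threshold dispatch at step S420, which might be sketched as follows. The names `Method` and `choose_method` are hypothetical. The source applies the same condition (accuracy equal to or higher than the reference) to both automatic paths without saying how one is chosen over the other, so the `prefer_language_model` flag is purely an assumption of this sketch.

```python
from enum import Enum


class Method(Enum):
    AUTO_MATCH = "replace with matched second result (S422)"
    LANGUAGE_MODEL = "replace with language-model choice (S426)"
    ASK_USER = "show candidate list or virtual keyboard (S430)"


def choose_method(accuracy, reference, prefer_language_model=False):
    """Pick a correction path by comparing accuracy with the reference (S420)."""
    if accuracy >= reference:
        # Both automatic paths share this condition in the source; the flag
        # that selects between them is an assumption.
        return Method.LANGUAGE_MODEL if prefer_language_model else Method.AUTO_MATCH
    # Low confidence, e.g. proper nouns or words missing from the dictionary.
    return Method.ASK_USER


print(choose_method(0.92, 0.80).name)  # AUTO_MATCH
print(choose_method(0.40, 0.80).name)  # ASK_USER
```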

After the identified misrecognized portion has been corrected by one of the various methods described above, so that the first result information is converted into the third result information, the display device displays the third result information in place of the first result information, or displays the first and third result information simultaneously (S424), and control ends (S436).

In addition, when the first and third result information are displayed simultaneously and no particular input is received, the controller may determine that the displayed third result information is what the user wanted to input and control the display device to show only the third result information. If further correction is needed, the controller may control one of the correction methods described above to be offered through the display device.

Alternatively, if the check for voice information (S410) finds that the voice recognition module did not recognize any voice information during the activation time (S434), it is determined that there is no misrecognized portion, only the first result information is displayed (S438), and control ends (S436).

Hereinafter, the results displayed on the display device of FIG. 2 when the control method of the multi-modal input device described so far is used are examined with reference to FIGS. 6A and 6B.

Referring to the embodiment shown in FIG. 6A, the user has entered the handwriting information 250 'I like that boy' in cursive on the digitizer 210 using the electronic pen 220. From the first result information 260 produced by handwriting recognition and displayed on the display device 230, 'I like that bag', the user can see that 'boy' 252 was misrecognized as 'bag' 262.

That is, the handwriting recognition module according to the present invention, judging by the applied algorithm or program, determined that among the plurality of pieces of first conversion information 'bag' was more similar to the handwriting information than 'bog', 'bey', 'bay', 'bou', 'boy', and so on, and therefore displayed the first result information 260 containing 'bag'.

The notification unit then controls the display device 230 to display, at the same time as the voice recognition module is activated, a notification window 232 stating that the misrecognized portion 262 can be corrected by voice within a certain time, and the user can speak the voice information 272 'boy' that was originally intended.

The determination unit receives from the voice recognition module the plurality of pieces of second conversion information, including the second result information 'boy', and identifies 'bag' in the first result information 260 as the misrecognized portion 262. The correction unit corrects 'bag' 262 to 'boy' 282 and controls the display device 230 to display the resulting third result information 280.

Here, as shown in FIGS. 5A and 5B described above, the determination unit preferably compares the plurality of pieces of first and second conversion information with each other and identifies the misrecognized portion of the first result information using the matching second result information 'boy'. It may also identify the misrecognized portion by referring to the language model information stored in the dictionary database.

Referring to the other embodiment shown in FIG. 6B, the user has entered the handwriting information 250 'I like that boy, Luna' in cursive on the digitizer 210 using the electronic pen 220. From the first result information 260 produced by handwriting recognition and displayed on the display device 230, 'I like that bag, Lune', the user can see that 'boy' 252 was misrecognized as 'bag' 262 and 'Luna' 254 as 'Lune' 264.

That is, the handwriting recognition module according to the present invention, judging by the applied algorithm or program, determined that among the first conversion information related to the handwriting information 'boy' 252, 'bag' was the most similar, rather than 'bog', 'bey', 'bay', 'bou', 'boy', and so on, and that among the first conversion information related to the handwriting information 'Luna' 254, 'Lune' was the most similar, rather than 'Lana', 'Lane', 'Luna', and so on. It therefore displayed the first result information 260 containing the misrecognized portions 262 and 264.

The user then checks the misrecognized portions 262 and 264 in the first result information 260 and can speak, in order and within the given time, the voice information 272 and 274, 'boy' and 'Luna', that was originally intended to be input through handwriting recognition.

The determination unit receives from the voice recognition module the plurality of pieces of second conversion information, including the second result information 'boy' and 'Luna', and identifies 'bag' and 'Lune' in the first result information 260 as the misrecognized portions 262 and 264. The correction unit corrects 'bag' 262 to 'boy' 282 and controls the display device 230 to display the resulting third result information 280.

However, because 'Luna' is a proper noun corresponding to a personal name, the correction unit may judge the accuracy value for the identification of the misrecognized portion to be lower than the predetermined reference value.

In this case, the display device can display a list containing the first conversion information 292 corresponding to 'Lune', which was identified as a misrecognized portion, namely 'Lana', 'Lane', and 'Luna', and the second conversion information 294, namely 'runa', 'lunar', 'luna', and so on. The user can select (298) from this conversion information 290 the item originally intended in the handwriting, or enter it directly using the virtual keyboard 296.

The user signal generated by this user input carries a single piece of conversion information, and the conversion information determined by the user signal is transmitted to the correction unit of the controller.

Finally, the correction unit corrects 'Lune' 264 to 'Luna' 284 using the conversion information determined by the user signal and controls the display device 230 to display the resulting third result information 280.
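The user-driven correction of FIG. 6B can be sketched as building a choice list from both candidate sets and applying the selected item. The helper names `build_choice_list` and `apply_user_choice` are hypothetical, the word list omits punctuation for simplicity, and the example assumes 'bag' has already been corrected to 'boy'.

```python
def build_choice_list(first_candidates, second_candidates):
    """Merge handwriting and voice candidates in order, dropping exact duplicates."""
    seen, merged = set(), []
    for cand in [*first_candidates, *second_candidates]:
        if cand not in seen:
            seen.add(cand)
            merged.append(cand)
    return merged


def apply_user_choice(words, index, choice):
    """Return a copy of words with the misrecognized word replaced."""
    corrected = list(words)
    corrected[index] = choice
    return corrected


choices = build_choice_list(["Lana", "Lane", "Luna"], ["runa", "lunar", "luna"])
partly_corrected = ["I", "like", "that", "boy", "Lune"]

# Assume the user signal carries the third list entry, 'Luna'.
third_result = apply_user_choice(partly_corrected, 4, choices[2])
print(" ".join(third_result))  # I like that boy Luna
```

A virtual-keyboard entry would follow the same path, with the typed string passed as `choice` instead of a list item.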

With this control method of a multi-modal input device using handwriting and voice recognition, the user only needs to provide voice information after checking the handwriting recognition result for the misrecognized portion of the initial handwriting information to be identified and corrected automatically. In addition, because the timing at which voice input is accepted is determined in advance, separate user operations and unnecessary power consumption can be reduced. Furthermore, the misrecognition rate can be reduced by providing various correction methods selected according to the accuracy value for the identification of the misrecognized portion.

Specific preferred embodiments of the present invention have been described above. It should be understood, however, that the present invention is not limited to the embodiments described above, which represent only some of the various embodiments applying the principles of the present invention. Those of ordinary skill in the art to which the present invention pertains will be able to make various modifications without departing from the gist of the technical idea of the present invention as set forth in the claims below.

100: multi-modal input device 110: handwriting recognition module
120: voice recognition module 130: controller
132: determination unit 134: correction unit
140: display device 150: dictionary database

Claims

A multi-modal input device using handwriting and voice recognition, comprising:
a handwriting recognition module that recognizes handwriting information and converts it into a plurality of pieces of first conversion information including first result information;
a voice recognition module that is automatically activated for a predetermined time immediately after the conversion operation of the handwriting recognition module, recognizes voice information, and converts it into a plurality of pieces of second conversion information including second result information;
a controller including a determination unit that identifies a misrecognized portion of the first result information received from the handwriting recognition module using the plurality of pieces of second conversion information received from the voice recognition module, and a correction unit that obtains third result information by correcting the misrecognized portion of the first result information; and
a display device that displays at least one of the first result information and the third result information.

The multi-modal input device using handwriting and voice recognition of claim 1, wherein the determination unit searches for second result information that matches between the plurality of pieces of first conversion information received from the handwriting recognition module and the plurality of pieces of second conversion information received from the voice recognition module, and uses it to identify the misrecognized portion.

The multi-modal input device using handwriting and voice recognition of claim 1, wherein the correction unit corrects the misrecognized portion using a correction method determined by comparing an accuracy value for the identification of the misrecognized portion with a predetermined reference value. (newly open)

The multi-modal input device using handwriting and voice recognition of claim 1, wherein the determination unit and the correction unit receive language model information stored in a dictionary database and use it to identify and correct the misrecognized portion, respectively.

The multi-modal input device using handwriting and voice recognition of claim 1, wherein the controller further comprises a notification unit that displays on the display device, at the same time as the voice recognition module is activated, a notification window indicating that correction by voice is possible.

The multi-modal input device using handwriting and voice recognition of claim 1, wherein the controller further comprises a language option unit that displays on the display device, during operation of the input device, language options selectable for the handwriting information and the voice information.

A control method of a multi-modal input device using handwriting and voice recognition, comprising:
a first step of displaying on a display device first result information from among a plurality of pieces of first conversion information obtained when a handwriting recognition module recognizes and converts handwriting information;
a second step in which a voice recognition module is automatically activated for a predetermined time immediately after the conversion into the first conversion information and, if voice information is recognized during the predetermined time, converts it into a plurality of pieces of second conversion information including second result information;
a third step of identifying a misrecognized portion of the first result information using the plurality of pieces of second conversion information; and
a fourth step of correcting the identified misrecognized portion and displaying third result information on the display device in place of the first result information, or displaying the first and third result information on the display device simultaneously.

The method of claim 7, wherein, in each step, a selectable language option relating to the handwriting information and the voice information is displayed on the display device, and the handwriting recognition module and the voice recognition module perform conversion according to the selected language.

The method of claim 7, wherein, when the voice recognition module is activated in the second step, a notification window indicating that correction by voice is possible is displayed on the display device.

The method of claim 7, wherein the third step comprises comparing the plurality of pieces of first and second conversion information with each other, and identifying the misrecognition portion by using the second result information that matches between the first and second conversion information.

The method of claim 7 or 10, wherein, in the third step, the misrecognition portion is identified by applying a language model together with the plurality of pieces of second conversion information.

The method of claim 7 or 10, wherein, in the fourth step, a method of correcting the misrecognition portion is determined by comparing an accuracy value for identifying the misrecognition portion with a predetermined reference value. (newly added)

The method of claim 12, wherein, when the accuracy value is equal to or higher than the predetermined reference value, the misrecognition portion is corrected with the second result information that matches between the first and second conversion information, and third result information reflecting the second result information is displayed on the display device.

The method of claim 12, wherein, when the accuracy value is equal to or higher than the predetermined reference value, the misrecognition portion is corrected with language model conversion information determined by applying a language model to each piece of second conversion information, and third result information reflecting the language model conversion information is displayed on the display device.

The method of claim 12, wherein, when the accuracy value is lower than the predetermined reference value, the misrecognition portion is corrected with conversion information selected by a user signal from among the first and second conversion information corresponding to the misrecognition portion displayed on the display device, and third result information reflecting the selected conversion information is displayed on the display device.
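
The threshold-based choice of correction method in the preceding claims can be sketched as follows. This is only an illustration under stated assumptions: the function and parameter names are hypothetical, and the order in which the two high-accuracy options are tried (a candidate on which handwriting and voice agree first, then a language-model pick) is an assumption, since the claims present them as separate alternatives without ranking them.

```python
from typing import Callable, Optional, Sequence


def choose_correction(
    accuracy: float,
    reference: float,
    first_candidates: Sequence[str],
    second_candidates: Sequence[str],
    language_model_pick: Optional[Callable[[Sequence[str]], str]] = None,
    ask_user: Optional[Callable[[Sequence[str]], str]] = None,
) -> str:
    """Pick the replacement for one misrecognition portion (hypothetical helper)."""
    if accuracy >= reference:
        # High accuracy: correct automatically.
        # Option A: a voice candidate that also appears among the handwriting candidates.
        matches = [c for c in second_candidates if c in first_candidates]
        if matches:
            return matches[0]
        # Option B: let a language model choose among the voice candidates.
        if language_model_pick is not None:
            return language_model_pick(second_candidates)
        return second_candidates[0]

    # Low accuracy: show all candidates and let the user signal decide.
    if ask_user is None:
        raise ValueError("a user-selection callback is required below the reference value")
    options = list(dict.fromkeys([*first_candidates, *second_candidates]))
    return ask_user(options)
```

For example, `choose_correction(0.9, 0.5, ["cat", "cot"], ["cot", "cut"])` returns `"cot"`, the candidate on which both recognizers agree.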