KR20060133190A

KR20060133190A - Sign language phone system using sign recognition and sign generation

Info

Publication number: KR20060133190A
Application number: KR1020050052893A
Authority: KR
Inventors: 오영준; 정성훈; 장효영; 정진우; 변증남
Original assignee: 한국과학기술원
Priority date: 2005-06-20
Filing date: 2005-06-20
Publication date: 2006-12-26
Also published as: KR100730573B1

Abstract

A two-way sign language phone system implementing sign language recognition and sign language generation is provided so that hearing-impaired persons can enjoy leisure activities through a video phone system program or obtain information needed for social activities or education. The system consists of a system processing unit (100), a sign language input unit (110), a sign language output unit (120), a communication line unit (130), and a DB group (140). The system processing unit (100) comprises a sentence generation unit (200), a voice generation unit (400), a sentence creation unit (500), and a sign language generation unit (300). The sign language input unit (110) recognizes a sign language motion or sign shape through a camera and provides the corresponding sign language image data to the sentence generation unit (200). The sentence generation unit (200) searches the sign language DB of the DB group (140) for a sentence corresponding to the sign language image data provided from the sign language input unit (110). The voice generation unit (400) obtains the phonemes corresponding to the sentence provided from the sentence generation unit (200) from the voice DB of the DB group (140), combines them into a voice, and provides it to the communication line unit (130). The sentence creation unit (500) searches the voice recognition DB of the DB group (140) for a sentence corresponding to the voice provided from the communication line unit (130) and provides it to the sign language generation unit (300).

Description

Sign Language Phone System using Sign Recognition and Sign Generation

FIG. 1 is a block diagram showing an embodiment of a two-way sign language phone system implementing sign language recognition and sign language generation according to the present invention.

FIG. 2 is a block diagram showing an embodiment of the sign language generation unit shown in FIG. 1.

FIG. 3 is a block diagram showing an embodiment of the sentence generation unit shown in FIG. 1.

FIG. 4 is a block diagram showing an embodiment of the voice generation unit shown in FIG. 1.

FIG. 5 is a block diagram showing an embodiment of the sentence creation unit shown in FIG. 1.

FIG. 6 is a flowchart showing, step by step, the operation of the consonant converter shown in FIG. 2.

FIG. 7 shows the dependency relationships among the sign language morpheme/3D sign language animation DB, the sign language word DB, and the hand shape DB.

FIG. 8 shows the syllables before consonant conversion in the consonant converter of FIG. 2 and the vowels used to render lip shapes.

FIG. 9 shows the various lip shapes produced by the operation of the consonant converter shown in FIG. 2.

FIG. 10 shows the head motion DB.

FIG. 11 shows various facial expressions of the 3D signer model according to the facial expression value information in the sign language word DB.

FIG. 12 shows the hand movement data stored in the hand movement DB.

FIG. 13 shows the hand directions corresponding to the hand posture data stored in the hand posture DB.

FIG. 14 shows the hand shapes corresponding to the hand posture data stored in the hand posture DB.

FIG. 15 shows the subdivision of the signer's hand position image and the contents of the hand position data.

FIG. 16 shows, for the case where the hand is located within the signer's face, the segmented image of the signer's face and the hand position information values generated by segmenting the face region.

FIG. 17 shows the composition of the sign elements in the sign element DB.

FIG. 18 shows the composition of the consonant fingerspelling elements in the sign element DB.

FIG. 19 shows the composition of the vowel fingerspelling elements in the sign element DB.

FIG. 20 shows the composition of the character arrangement in the sign element DB.

FIG. 21 shows the screen layout in deaf-user mode.

FIG. 22 shows the screen layout in hearing-user mode.

The present invention relates to a two-way sign language phone system that implements sign language recognition and sign language generation. In particular, when a deaf person calls a hearing person, the system extracts features of the deaf person's signing and, with reference to a sign language image database (DB), expresses the signing as a sentence or as voice. When a hearing person calls a deaf person, the system extracts the hearing person's speech, converts it to text, converts the text into a sign-language-style expression through various preprocessing algorithms, and renders it as a 3D sign language animation with reference to a sign language animation DB, thereby improving deaf people's access to information.

Video phones, handwriting recognition phones, and hand gesture recognition video communication for the hearing impaired are well known from Korean Laid-open Patent Publication Nos. 2004-0010945 and 2004-0076907, Korean Registered Patent Publication No. 0397692, and Korean Registered Utility Model Publication No. 0234151. However, these give no consideration to converting a voice signal into a sign language animation display or to converting recognized sign language into synthesized speech. Nor do these documents present a 3D sign language animation technique, a sign language recognition technique, a hand position technique, a head motion technique, or a lip shape deformation technique for displaying sign language. They therefore fail to consider concrete approaches to developing a sign language phone system.

The present invention was devised to solve the above problems of the prior art. Its object is to provide a two-way sign language phone system that classifies image data corresponding to a deaf person's signing by feature extraction, hand motion change, and hand position starting point, converts it to text through a sign language DB search, synthesizes speech from that text, and transmits the speech over a communication line to the hearing party; and that extracts the speech uttered by the hearing person, converts it into a sentence with a speech recognizer, processes the sentence with a sign-language-style preprocessing algorithm, and generates a 3D sign language animation together with lip shapes, facial expressions and emotions, and head movements.

To achieve this object, the present invention comprises: a sentence generation unit that generates a Hangul sentence corresponding to provided sign language image data; a voice generation unit that synthesizes voice using the phonemes corresponding to the Hangul sentence provided from the sentence generation unit and provides the voice to a communication line unit; a sentence creation unit that generates a Hangul sentence corresponding to voice provided from the communication line unit; and a sign language generation unit that renders an avatar performing the sign language corresponding to the Hangul sentence provided from the sentence creation unit.

An embodiment of the present invention is described in detail below with reference to the accompanying drawings.

FIG. 1 is a block diagram showing an embodiment of a two-way sign language phone system implementing sign language recognition and sign language generation according to the present invention. The system consists of a system processing unit (100), a sign language input unit (110), a sign language output unit (120), a communication line unit (130), and a DB group (140). The system processing unit (100) includes a sentence generation unit (200), a voice generation unit (400), a sentence creation unit (500), and a sign language generation unit (300).

In FIG. 1, the sign language input unit (110) uses a camera to recognize sign language motions or sign shapes and provides the corresponding sign language image data to the sentence generation unit (200) in the system processing unit (100).

The sentence generation unit (200) searches the sign language DB in the DB group (140) for a sentence corresponding to the sign language image data provided from the sign language input unit (110) and provides the sentence to the voice generation unit (400).

The voice generation unit (400) searches the voice DB in the DB group (140) for the phonemes corresponding to the sentence provided from the sentence generation unit (200), synthesizes the corresponding voice, and provides it to the communication line unit (130), which is a telephone network or a data communication network.

The sentence creation unit (500) searches the voice recognition DB in the DB group (140) for a sentence corresponding to the voice provided from the communication line unit (130) and provides the sentence to the sign language generation unit (300).

The sign language generation unit (300) searches the sign element DB in the DB group (140) for the sign language corresponding to the sentence provided from the sentence creation unit (500), converts it into screen display data, and provides that data to the sign language output unit (120).

The sign language output unit (120) uses the screen display data provided from the sign language generation unit (300) to display, on a screen output device, an avatar performing the sign language so that the user can see it.
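The two call directions described for FIG. 1 can be sketched as function composition. This is only a minimal illustration: the class name `SystemProcessingUnit`, the method names, and the stub units are hypothetical and do not come from the patent, and the stubs stand in for the real recognition and synthesis steps.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class SystemProcessingUnit:
    """Hypothetical wiring of units 200, 300, 400 and 500 from FIG. 1."""
    sentence_generation: Callable[[bytes], str]  # unit 200: sign video -> sentence
    voice_generation: Callable[[str], bytes]     # unit 400: sentence -> voice
    sentence_creation: Callable[[bytes], str]    # unit 500: voice -> sentence
    sign_generation: Callable[[str], list]       # unit 300: sentence -> display data

    def deaf_to_hearing(self, sign_video: bytes) -> bytes:
        """Input unit 110 -> unit 200 -> unit 400 -> communication line unit 130."""
        return self.voice_generation(self.sentence_generation(sign_video))

    def hearing_to_deaf(self, voice: bytes) -> list:
        """Communication line unit 130 -> unit 500 -> unit 300 -> output unit 120."""
        return self.sign_generation(self.sentence_creation(voice))


# Stub units for illustration only; real units would query the DB group 140.
unit = SystemProcessingUnit(
    sentence_generation=lambda video: "안녕하세요",
    voice_generation=lambda sentence: sentence.encode("utf-8"),
    sentence_creation=lambda voice: voice.decode("utf-8"),
    sign_generation=lambda sentence: [f"frame:{ch}" for ch in sentence],
)
```

Keeping each unit behind a plain callable mirrors the block diagram: either direction can be exercised on its own with stubbed neighbours.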

FIG. 2 is a block diagram showing an embodiment of the sign language generation unit (300) shown in FIG. 1. Each of the DBs (211, 212, 213, 214, 215, 221, 222, 223) is part of the DB group (140) shown in FIG. 1.

In FIG. 2, the morpheme analysis unit (201) in the sign language generation unit (300) analyzes the sentence provided from the sentence creation unit (500) and decomposes it into morphemes. The sentences used in the present invention are Hangul sentences, so the morpheme analysis unit (201) analyzes a Hangul sentence and decomposes it into Hangul morphemes.

When the sign language DB search unit (203) finds, in the sign language morpheme/3D sign language animation DB (221) and in the sign language word DB (222) illustrated in FIG. 11, a sign language word corresponding to a decomposed Hangul morpheme provided from the morpheme analysis unit (201), it extracts the modeling information values corresponding to that sign language word from the sign language word DB (222) and the hand shape DB (223) and renders them as 3D sign language graphics. FIG. 11 shows various facial expressions of the 3D signer model according to the facial expression value information in the sign language word DB (222).

The facial expression addition unit (206) is connected to the sign language DB search unit (203). When the search unit (203) has found a sign language word, the facial expression addition unit retrieves the facial expression value of that word from the facial expression DB (212) and retrieves its head movement from the head motion DB (213) shown in FIG. 10, and uses the head movement to give the signing liveliness. For example, head movements such as bowing the head, or drawing the head slightly backward as if surprised, can be rendered in the 3D sign language animation.

If, on the other hand, the sign language DB search unit (203) finds no sign language word corresponding to a decomposed Hangul morpheme in the sign language morpheme/3D sign language animation DB (221) and the sign language word DB (222), it reports the miss to the unregistered-word processing unit (205). The unregistered-word processing unit (205) then searches the 3D fingerspelling animation DB (211) and renders the 3D fingerspelling animations corresponding to the initial, medial, and final sounds of the word.
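The lookup-with-fallback behaviour of the sign language DB search unit (203) and the unregistered-word processing unit (205) can be sketched as follows. The function names, the dictionary standing in for DBs 221 and 222, and the returned tuples are assumptions made for illustration. The decomposition into initial, medial, and final jamo uses the standard Unicode arithmetic for precomposed Hangul syllables, which the patent itself does not specify.

```python
CHOSEONG = tuple("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")                 # 19 initials
JUNGSEONG = tuple("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")              # 21 medials
JONGSEONG = ("",) + tuple("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals


def decompose(syllable):
    """Split one precomposed Hangul syllable into its initial/medial/final jamo."""
    index = ord(syllable) - 0xAC00
    if not 0 <= index < 11172:          # not a Hangul syllable: pass through
        return [syllable]
    cho, rest = divmod(index, 21 * 28)
    jung, jong = divmod(rest, 28)
    parts = [CHOSEONG[cho], JUNGSEONG[jung]]
    if JONGSEONG[jong]:
        parts.append(JONGSEONG[jong])
    return parts


def plan_animation(morphemes, sign_word_db):
    """Use a registered sign when one exists, otherwise fall back to fingerspelling."""
    plan = []
    for morpheme in morphemes:
        if morpheme in sign_word_db:
            plan.append(("sign", sign_word_db[morpheme]))
        else:
            jamo = [j for syllable in morpheme for j in decompose(syllable)]
            plan.append(("fingerspell", jamo))
    return plan


# Hypothetical one-entry word DB: "학교" is registered, "한" is not.
example_plan = plan_animation(["학교", "한"], {"학교": "SIGN_SCHOOL"})
```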

The consonant converter (202) is connected to the morpheme analysis unit (201). It changes the initial consonant of each syllable in a Hangul morpheme to "ㅇ" and removes the final consonant. The syllables before consonant conversion and the vowels used to render the lip shapes are shown in FIG. 8. The various lip shapes that result from the operation of the consonant converter (202) are shown in FIG. 9, and the sign language avatar model renders each vowel output by the consonant converter (202) as the corresponding lip shape. In other words, the avatar model uses mouth shapes to supplement the signing and so conveys a visual understanding of the sign language information.

To that end, the lip expression addition unit (204) retrieves from the lip shape DB (214) the lip shape corresponding to the output of the consonant converter (202) and generates the avatar's lip shape.

The 3D sign language graphic generation unit (207) is connected to the facial expression addition unit (206), the unregistered-word processing unit (205), the lip expression addition unit (204), and the human body model DB (215). It renders the sign language avatar as a 3D sign language animation so that the user can follow what the avatar is articulating.

The presentation unit (208) converts the 3D sign language animation output of the 3D sign language graphic generation unit (207) into screen display data and provides it to the sign language output unit (120).

FIG. 3 is a block diagram showing an embodiment of the sentence generation unit (200) shown in FIG. 1. Each of the DBs (311, 312, 313, 314) is part of the DB group (140) shown in FIG. 1.

In FIG. 3, the data acquisition unit (301) in the sentence generation unit (200) receives sign language image data from the sign language input unit (110), recognizes the signer's hand motions in real time, generates color image data, and provides it to the image preprocessing unit (302).

The image preprocessing unit (302) processes the hand image data provided from the data acquisition unit (301) and classifies it into hand trajectory data, hand posture data, and hand position data.

The hand movement classification unit (303) compares the hand trajectory data provided from the image preprocessing unit (302) with the hand movement data retrieved from the hand movement DB (311) and generates a hand movement information value.

The hand posture classification unit (304) compares the hand posture data provided from the image preprocessing unit (302) with the data in the hand posture DB (312) and generates a hand posture information value.

The hand position classification unit (305) aligns the image to the center of the signer and, taking the signer's face as the reference, divides the image of the hand position data provided from the image preprocessing unit (302) into 7×5 sub-images. It compares that data with the hand position image data in the hand position DB (313) and generates a hand position information value. If a finger in the hand position data touches the face region of the position data, the face image is subdivided again around the user's nose, and the subdivided position image data is compared with the position image data in the hand position DB (313) to generate the hand position information value.
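The grid-based position coding can be illustrated with a small sketch. Several points are assumptions rather than details from the patent: that "7×5" means 7 columns by 5 rows, that the face region is given as a bounding box roughly centered on the nose, that the face is re-divided into a 3×3 grid, and that the hand is reduced to a single pixel coordinate. The function names are hypothetical.

```python
def grid_cell(x, y, width, height, cols=7, rows=5):
    """Return (row, col) of the sub-image that contains pixel (x, y)."""
    if not (0 <= x < width and 0 <= y < height):
        raise ValueError("point lies outside the image")
    return (y * rows // height, x * cols // width)


def hand_position(x, y, frame_size, face_box, face_cols=3, face_rows=3):
    """Code a hand point on the coarse 7x5 grid, or on a finer face grid.

    face_box is (left, top, width, height) of the face region, assumed to be
    centered on the nose; the 3x3 face grid is an illustrative choice.
    """
    left, top, face_w, face_h = face_box
    if left <= x < left + face_w and top <= y < top + face_h:
        row, col = grid_cell(x - left, y - top, face_w, face_h, face_cols, face_rows)
        return ("face", row, col)
    row, col = grid_cell(x, y, frame_size[0], frame_size[1])
    return ("body", row, col)
```

The resulting tuple plays the role of the hand position information value that would then be compared with entries in the hand position DB (313).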

The sign language data composition unit (306) compares the hand movement information value provided from the hand movement classification unit (303), the hand posture information value provided from the hand posture classification unit (304), and the hand position information value provided from the hand position classification unit (305) with the hand movement, hand posture, and hand position information values stored in the sign element DB (314), and generates the text word name corresponding to the matching sign element word name in the sign element DB (314).

The sentence composition unit (307) recomposes a sentence from the text word names provided from the sign language data composition unit (306) and outputs it.

The hand movement classification unit (303) generates no hand movement information value when the hand is stationary. In that case the sign language data composition unit (306) treats the hand movement data as stationary. When the hand movement data is stationary and the hand posture data contains a distinct value, and the sign element data matches a consonant or vowel corresponding to a sign element name in the sign element DB (314), the unit generates that consonant or vowel and combines the generated consonants and vowels to produce a proper noun.
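One way to picture the matching step is a dictionary keyed by the three information values, with a separate posture-only table for fingerspelled jamo when the hand is stationary. This is a simplified sketch: the function name, the label strings, and the two-table layout are hypothetical, and the actual encoding of the sign element DB (314) is shown only in the figures.

```python
def compose_sign_data(movement, posture, position, sign_element_db, finger_db):
    """Map (movement, posture, position) to a text word, or to a jamo.

    movement is None when the hand movement classifier produced no value,
    which is treated here as the stationary state.
    """
    if movement is None:
        movement = "stationary"
        jamo = finger_db.get(posture)      # stationary hand: try fingerspelling
        if jamo is not None:
            return ("jamo", jamo)
    word = sign_element_db.get((movement, posture, position))
    if word is not None:
        return ("word", word)
    return None                            # no match in either table


# Hypothetical toy tables; real entries would come from the sign element DB.
SIGN_DB = {
    ("arc", "flat", "chest"): "감사",
    ("stationary", "fist", "chin"): "참다",
}
FINGER_DB = {"index_thumb_hook": "ㄱ"}
```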

FIG. 4 is a block diagram showing an embodiment of the voice generation unit (400) shown in FIG. 1.

In FIG. 4, the speech synthesis unit (401) searches the speech synthesis DB (411) for the voice data corresponding to the sentence provided from the sentence generation unit (200) and outputs it, so that the deaf person's sign language information can be conveyed to a non-signer.

FIG. 5 is a block diagram showing an embodiment of the sentence creation unit (500) shown in FIG. 1.

In FIG. 5, the speech recognition unit (501) compares the voice provided from the communication line unit (130) with the pre-trained voice data in the speech recognition DB (511), generates a sentence, and provides it to the sign language generation unit (300).

FIG. 6 is a flowchart showing, step by step, the operation of the consonant converter shown in FIG. 2.

First, the current word is split into individual syllables so that the sign language motion and the change in lip shape can be rendered (step S2).

Each separated syllable is decomposed into its initial, medial, and final sounds (step S4).

The initial consonant is converted to "ㅇ" (step S6).

The final consonant is removed from the syllable so that only the vowel remains for rendering the lip shape (step S8). If the vowel is a diphthong, it is split into its simple vowels.

If syllables remain, the process is repeated from step S4 (step S10).
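Steps S2 to S10 can be sketched in a few lines. Because the initial consonant becomes the silent "ㅇ" and the final consonant is dropped, only the medial vowel of each syllable matters for the lip shape. The function name and the table that splits compound vowels are assumptions made for this sketch (the vowel inventory actually used is the one in FIG. 8), and the syllable arithmetic is the standard Unicode layout for precomposed Hangul.

```python
JUNGSEONG = "ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ"  # 21 medial vowels in Unicode order

# Assumed split of compound vowels into simple vowels; FIG. 8 may differ.
SPLIT = {
    "ㅘ": "ㅗㅏ", "ㅙ": "ㅗㅐ", "ㅚ": "ㅗㅣ",
    "ㅝ": "ㅜㅓ", "ㅞ": "ㅜㅔ", "ㅟ": "ㅜㅣ", "ㅢ": "ㅡㅣ",
}


def lip_vowels(word):
    """Return the sequence of simple vowels that drive the avatar's lip shapes."""
    vowels = []
    for syllable in word:                         # S2: one syllable at a time
        index = ord(syllable) - 0xAC00
        if not 0 <= index < 11172:                # skip non-Hangul characters
            continue
        vowel = JUNGSEONG[(index % 588) // 28]    # S4: pick out the medial vowel
        # S6 and S8: the initial becomes silent and the final is dropped,
        # so nothing but the vowel is kept.
        vowels.extend(SPLIT.get(vowel, vowel))    # S8: split compound vowels
    return vowels                                 # S10: loop ends with the word
```

For example, `lip_vowels("학교")` keeps only the vowels of 학 and 교, and a syllable such as 과 contributes two lip shapes because its vowel is split.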

FIG. 7 shows the dependency relationships among the sign language morpheme/3D sign language animation DB (221), the sign language word DB (222), and the hand shape DB (223). Through the operation of the sign language DB search unit (203), each input morpheme word is separated into sign language words by searching the sign language morpheme/3D sign language animation DB (221), and the modeling value information needed to render the avatar is extracted from the sign language word DB (222) and the hand shape DB (223).

FIG. 12 shows the hand movement data stored in the hand movement DB (311).

FIG. 13 shows the hand directions corresponding to the hand posture data stored in the hand posture DB (312).

FIG. 14 shows the hand shapes corresponding to the hand posture data stored in the hand posture DB (312).

FIG. 15 shows the subdivision of the signer's hand position image and the contents of the hand position data. It shows, for the case where the hand is located outside the face, the segmented image of the signer and the hand position information values generated by segmenting the signer region.

FIG. 16 shows, for the case where the hand is located within the signer's face, the segmented image of the signer's face and the hand position information values generated by segmenting the face region.

FIG. 17 shows the composition of the sign elements in the sign element DB (314). The sign language data composition unit (306) receives the hand movement information value provided from the hand movement classification unit (303), the hand posture information value provided from the hand posture classification unit (304), and the hand position information value provided from the hand position classification unit (305), and generates the corresponding text word when these three values match data stored in advance in the sign element DB (314).

FIG. 18 shows the composition of the consonant fingerspelling elements in the sign element DB (314). The sign language data composition unit (306) can generate a consonant when all hand movement data is stationary.

FIG. 19 shows the composition of the vowel fingerspelling elements in the sign element DB (314). The sign language data composition unit (306) generates a vowel when all hand movement data is stationary.

FIG. 20 shows the composition of the character arrangement in the sign element DB (314). Vowel and consonant fingerspelling elements are combined in the order in which the signer fingerspells them to generate a character. When more elements are combined, two or more characters can be assembled into a proper noun.
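Combining fingerspelled consonants and vowels into characters can be sketched with the standard Unicode composition formula for Hangul syllables. The greedy rule used here (a consonant after a vowel is taken as a final unless the next element is a vowel) and the function name are assumptions for illustration; the patent only states that the elements are combined in signing order, and compound finals or compound vowels signed as two separate elements are not handled.

```python
CHOSEONG = tuple("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
JUNGSEONG = tuple("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
JONGSEONG = ("",) + tuple("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")


def compose(jamo):
    """Assemble a fingerspelled jamo sequence into Hangul syllables."""
    out = []
    i = 0
    while i < len(jamo):
        current = jamo[i]
        starts_syllable = (
            current in CHOSEONG and i + 1 < len(jamo) and jamo[i + 1] in JUNGSEONG
        )
        if not starts_syllable:
            out.append(current)           # stray element: keep it as typed
            i += 1
            continue
        cho = CHOSEONG.index(current)
        jung = JUNGSEONG.index(jamo[i + 1])
        jong = 0
        i += 2
        # A following consonant is a final only if it does not start the next syllable.
        next_is_vowel = i + 1 < len(jamo) and jamo[i + 1] in JUNGSEONG
        if i < len(jamo) and jamo[i] in JONGSEONG and not next_is_vowel:
            jong = JONGSEONG.index(jamo[i])
            i += 1
        out.append(chr(0xAC00 + (cho * 21 + jung) * 28 + jong))
    return "".join(out)
```

With this rule the sequence ㅎ, ㅏ, ㄴ, ㄱ, ㅡ, ㄹ yields two characters, which is the kind of multi-character result the description calls a proper noun.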

FIG. 21 shows the screen layout of the sign language phone system in deaf-user mode, and FIG. 22 shows the screen layout in hearing-user mode. The sign language phone system has a mode switch button, so that in deaf-user mode the sign language animation can be displayed in the screen area that shows the other, non-signing party's video.

The technical idea of the present invention has been described above with reference to the accompanying drawings, but this describes a preferred embodiment by way of example and does not limit the invention. It is also evident that anyone with ordinary skill in this technical field can make various modifications and imitations without departing from the scope of the technical idea of the present invention.

Through a two-way sign language phone system implementing sign language recognition and sign language generation, the present invention helps deaf people communicate and be understood, and it can be usefully applied to let them enjoy leisure activities through a video phone system program or to provide information needed for social activities and education.

Claims

A two-way sign language phone system implementing sign language recognition and sign language generation, comprising:

a sentence generation unit that generates a Hangul sentence corresponding to provided sign language image data;

a voice generation unit that synthesizes voice using the phonemes corresponding to the Hangul sentence provided by the sentence generation unit and provides the voice to a communication line unit;

a sentence creation unit that generates a Hangul sentence corresponding to voice provided by the communication line unit; and

a sign language generation unit that renders an avatar performing the sign language corresponding to the Hangul sentence provided by the sentence creation unit.

The system of claim 1,

wherein the sign language generation unit comprises a morpheme analysis unit that decomposes the sentence provided from the sentence creation unit into morphemes;

a sign language DB search unit that, when a sign language word corresponding to a morpheme provided from the morpheme analysis unit is found, extracts the modeling information values corresponding to that sign language word and renders them as 3D sign language graphics, and that, when no sign language word corresponding to the morpheme is found, causes an unregistered-word processing unit to render the 3D fingerspelling animation corresponding to the initial, medial, and final sounds of the word;

a facial expression addition unit that is connected to the sign language DB search unit and renders the facial expression and head movement of the sign language word when the search unit has found that word;

a consonant converter that is connected to the morpheme analysis unit and changes the initial consonant in the morpheme to "ㅇ";

a lip expression addition unit that generates the lip shape corresponding to the output of the consonant converter; and

a 3D sign language graphic generation unit that is connected to the facial expression addition unit, the unregistered-word processing unit, and the lip expression addition unit and renders the sign language avatar as a 3D sign language animation.

The system of claim 1,

wherein the sentence generation unit comprises a data acquisition unit that receives the provided sign language image data and generates image data corresponding to the signer's hand motions;

an image preprocessing unit that classifies the hand image data provided from the data acquisition unit into hand trajectory data, hand posture data, and hand position data;

a hand movement classification unit that compares the hand trajectory data provided from the image preprocessing unit with provided hand movement data and generates a hand movement information value;

a hand posture classification unit that compares the hand posture data provided from the image preprocessing unit with provided hand posture data and generates a hand posture information value;

a hand position classification unit that aligns to the center of the signer, divides the image of the hand position data provided from the image preprocessing unit with the signer's face as the reference, and compares that data with provided hand position image data to generate a hand position information value;

a sign language data composition unit that compares the hand movement information value provided by the hand movement classification unit, the hand posture information value provided by the hand posture classification unit, and the hand position information value provided by the hand position classification unit with provided hand movement, hand posture, and hand position information values and generates the text word name corresponding to the matching sign element word name; and

a sentence composition unit that composes a sentence using the text word names provided from the sign language data composition unit.

The system of claim 3,

wherein the hand position classification unit, when a finger in the hand position data touches the face region of the position data, subdivides the face image again around the user's nose and compares the subdivided position image data with the provided position image data to generate the hand position information value.

The system of claim 3,

wherein the hand movement classification unit generates no hand movement information value when the hand is stationary, and

wherein, when the hand movement classification unit generates no hand movement information value, the sign language data composition unit treats the hand movement data as stationary, and, when the hand movement data is stationary and the hand posture data contains a distinct value, generates a consonant or vowel if the data matches the consonant or vowel corresponding to a sign element name, and combines the generated consonants and vowels to produce a proper noun.