KR20160002081A - Apparatus and method for translating of sign language using speech recognition - Google Patents
Apparatus and method for translating of sign language using speech recognition
- Publication number: KR20160002081A
- Application number: KR1020140080858A
- Authority
- KR
- South Korea
- Prior art keywords
- sign language
- text data
- speech
- data
- speech recognition
- Prior art date
Classifications
- G—PHYSICS › G09B—EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE › G09B21/00—Teaching, or communicating with, the blind, deaf or mute
- G09B21/00 › G09B21/02—Devices for Braille writing
- G—PHYSICS › G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING › G10L15/00—Speech recognition › G10L15/26—Speech to text systems
Abstract
The present invention relates to a sign language translation apparatus and method using speech recognition. The apparatus comprises: a speech recognition processing unit that converts speech recognized from a video object into speech data and generates text data from the converted speech data; a text recognition processing unit that extracts text data embedded in the video object; a sign language translation processing unit that translates the generated and extracted text data into sign language words and fingerspelling information, and forms sign language sentences from them; and an output unit that outputs the formed sign language sentences.
Description
The present invention relates to a sign language translation apparatus and method using speech recognition, and more particularly, to a sign language translation apparatus and method using speech recognition that allow users with hearing impairments to easily utilize video content.
Hearing impairment refers to weakened hearing or the complete inability to hear. A person with a hearing impairment who cannot perceive the accent or pronunciation of words, and who has not received proper language training, cannot communicate adequately through speech. Hearing-impaired people therefore communicate using sign language, but since sign language is unfamiliar to the general public, communication between hearing and hearing-impaired people remains difficult.
A conventional technique for translating sign language expressions into text recognizes the movement of the entire arm and therefore requires data covering every arm motion. Although such systems use multiple sensors to recognize arm movement, similar sign language motions are still translated into two or more candidate characters, making accurate translation difficult.
Likewise, because the related art classifies finger movement only coarsely (as bent, straightened, or intermediate), it cannot accurately discriminate between sign language expressions whose finger movements are similar.
In this regard, Korean Patent Publication No. 2008-0010234 discloses a "multi-function fingerspelling system."
SUMMARY OF THE INVENTION The present invention has been made to solve the above problems, and an object of the present invention is to provide a sign language translation apparatus and method using speech recognition that convert speech data recognized from a video object into text data and translate that text into sign language.
Another object of the present invention is to provide a sign language translation apparatus and method using speech recognition that recognize the start time and end time of each sentence when recognizing speech data from a video object, so that the sign language output can be synchronized with the speech.
A further object of the present invention is to provide a sign language translation apparatus and method using speech recognition that determine whether a homonym exists when converting speech data recognized from a video object into text data, and resolve the exact meaning of a noun or verb on the basis of the accompanying particle (josa).
It is another object of the present invention to provide a sign language translation apparatus and method using speech recognition that performs sign language translation of a caption output from a video object.
According to an aspect of the present invention, there is provided a sign language translation apparatus using speech recognition, comprising: a speech recognition processing unit for converting speech recognized from a video object into speech data and generating text data from the converted speech data; a text recognition processing unit for extracting text data inserted into the video object; a sign language translation processing unit for translating the generated and extracted text data into sign language words and fingerspelling information, and forming sign language sentences using them; and an output unit for outputting the formed sign language sentences.
Further, the speech recognition processing unit may include: a speech extraction unit for recognizing and extracting speech from the video object; a data conversion unit for converting the extracted speech into speech data; and a text data generation unit for generating text data from the converted speech data.
The sign language translation processing unit may include: a morpheme analysis unit for analyzing the morphemes of the generated and extracted text data; a homonym processing unit for determining and handling homonyms in the generated and extracted text data; a sign language word extraction unit for extracting sign language words based on the morpheme analysis and homonym determination results; a fingerspelling information extraction unit for extracting fingerspelling information corresponding to the initial, medial, and final sounds of a morpheme when no sign language word exists for it; and a sign language sentence formation unit for forming a sign language sentence by combining the extracted sign language words and fingerspelling information.
In addition, the homonym processing unit determines whether a homonym exists in the morpheme analysis result. If the homonym is a noun, its meaning is resolved on the basis of the accompanying verb; if it is a verb, its meaning is resolved on the basis of the accompanying noun and particle.
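As a concrete illustration of this disambiguation rule, the sketch below resolves a noun homonym from its governing verb. The sense table, the romanized entry "bae", and the function name are invented for illustration; the patent specifies only the rule, and the verb-side case (resolved from the noun and its particle) would follow the same pattern.

```python
# Noun-homonym resolution keyed on the governing verb. The tiny sense
# table below is invented; the patent does not supply a lexicon.

NOUN_SENSES = {
    # Korean 'bae' can mean ship, pear, or belly; the verb selects a sense.
    "bae": {"ride": "ship", "eat": "pear", "ache": "belly"},
}

def resolve_noun(noun, verb):
    """Return the disambiguated sense of `noun`, or the noun itself
    when it is not a known homonym or the verb gives no evidence."""
    senses = NOUN_SENSES.get(noun)
    if senses is None:
        return noun
    return senses.get(verb, noun)

print(resolve_noun("bae", "eat"))   # pear
print(resolve_noun("bae", "ride"))  # ship
```

Only a noun known to be ambiguous is touched; everything else passes through unchanged, which mirrors the claim's "determine whether a homonym exists" precondition.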
The output unit divides the screen of the display unit, outputting the sign language sentence as text on one part of the divided screen and its sign language rendering on another part.
The apparatus further includes a time log recording unit for recording a time log of the start time at which speech extraction from the video begins and the end time at which it ends.
In addition, the output unit outputs the sign language sentence on the display screen in accordance with the recorded time log.
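A minimal sketch of this time-log mechanism, assuming each log entry is a (start, end, sentence) record; the record format, event dictionary, and function name are illustrative assumptions, not from the patent.

```python
# Time-log synchronization sketch: each sign sentence is scheduled to
# appear at the recorded start time of the speech it translates and to
# stay on screen until the recorded end time.

def schedule_sign_output(time_log):
    """time_log: iterable of (start_seconds, end_seconds, sentence)."""
    events = [{"at": start, "until": end, "show": sentence}
              for start, end, sentence in time_log]
    events.sort(key=lambda ev: ev["at"])   # play back in speech order
    return events

log = [(12.0, 15.5, "SIGN_SENTENCE_2"), (3.0, 6.2, "SIGN_SENTENCE_1")]
for ev in schedule_sign_output(log):
    print(ev["at"], ev["show"])
```

Sorting by start time is what lets the output unit make "the audio start time of the video object and the output time of the sign language sentence" coincide, as the description puts it.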
According to another aspect of the present invention, there is provided a sign language translation method using speech recognition, the method comprising: converting, by a speech recognition processing unit, speech recognized from a video object into speech data and generating text data from the converted speech data; extracting, by a text recognition processing unit, text data inserted into the video object; translating, by a sign language translation processing unit, the generated and extracted text data into sign language words and fingerspelling information and forming a sign language sentence using them; and outputting, by an output unit, the formed sign language sentence.
The step of converting the speech recognized from the video object into speech data and generating text data from it includes: recognizing and extracting speech from the video object; converting the extracted speech into speech data; and generating text data from the converted speech data.
The step of translating the generated and extracted text data into sign language words and fingerspelling information and forming a sign language sentence may include: analyzing the morphemes of the generated and extracted text data; determining and handling homonyms in the text data; extracting sign language words based on the morpheme analysis and homonym determination results; extracting fingerspelling information corresponding to the initial, medial, and final sounds of a morpheme when no sign language word exists for it; and forming a sign language sentence by combining the extracted sign language words and fingerspelling information.
In addition, the step of determining whether homonyms exist in the generated and extracted text data may include determining whether a homonym exists in the morpheme analysis result; if the homonym is a noun, its meaning is resolved on the basis of the accompanying verb, and if it is a verb, on the basis of the accompanying noun and particle.
In addition, the step of converting the speech recognized from the video object into speech data and generating text data may further include recording, by the time log recording unit, a time log of the start time at which speech extraction from the video begins and the end time at which it ends.
Also, in the outputting step, the screen of the display unit is divided so that the sign language sentence is output as text on one part of the screen and rendered in sign language on another part, and the output time of the sign language sentence is made to coincide with the audio start time of the video object.
The sign language translation apparatus and method using speech recognition according to the present invention, configured as described above, convert speech data recognized from video into text data and translate it into sign language, so that users with hearing impairments can easily use video content. This makes it possible to expand video content for hearing-impaired users and to use video as a medium for sharing culture in various fields.
Further, the present invention improves translation quality by recognizing the start time and end time of each sentence when speech data is recognized from a video object, and by adjusting the sign language output time to match the speech output time.
In addition, the present invention determines whether a homonym exists when converting the recognized speech data into text data and resolves the exact meaning of a verb on the basis of the noun and its particle, thereby eliminating the ambiguity caused by homonyms and improving translation accuracy.
In addition, the present invention has an effect of enabling easy and immediate communication between a user having a hearing impairment and a general user, and expanding contents for users with hearing impairment.
FIG. 1 is a diagram illustrating the configuration of a sign language translation apparatus using speech recognition according to the present invention.
FIG. 2 is a diagram illustrating the detailed configuration of the speech recognition processing unit employed in the sign language translation apparatus using speech recognition according to the present invention.
FIG. 3 is a diagram illustrating the detailed configuration of the sign language translation processing unit employed in the sign language translation apparatus using speech recognition according to the present invention.
FIG. 4 is a diagram illustrating the procedure of a sign language translation method using speech recognition according to the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, preferred embodiments of the present invention will be described in detail with reference to the accompanying drawings, so that a person skilled in the art can easily carry out the technical idea of the present invention. In adding reference numerals to the elements of the drawings, the same elements are denoted by the same reference numerals wherever possible, even when they appear in different drawings. In the following description, detailed descriptions of known functions and configurations are omitted where they would obscure the subject matter of the present invention.
Hereinafter, a sign language translation apparatus and method using speech recognition according to an embodiment of the present invention will be described in detail with reference to the accompanying drawings.
FIG. 1 is a diagram illustrating the configuration of a sign language translation apparatus using speech recognition according to the present invention.
Referring to FIG. 1, the sign language translation apparatus 100 using speech recognition includes a speech recognition processing unit 110, a time log recording unit 120, a text recognition processing unit 130, a sign language translation processing unit 140, and an output unit 150.
The speech recognition processing unit 110 converts speech recognized from a video object into speech data and generates text data from the converted speech data.
The time log recording unit 120 records a time log of the start time at which speech extraction from the video begins and the end time at which it ends.
The text recognition processing unit 130 extracts text data, such as subtitles, inserted into the video object.
The sign language translation processing unit 140 translates the generated and extracted text data into sign language words and fingerspelling information, and forms sign language sentences from them.
The output unit 150 outputs the formed sign language sentences.
FIG. 2 is a diagram illustrating the detailed configuration of the speech recognition processing unit employed in the sign language translation apparatus using speech recognition according to the present invention.
Referring to FIG. 2, the speech recognition processing unit 110 converts speech recognized from a video object into speech data and generates text data from the converted speech data.
To this end, the speech recognition processing unit 110 includes a speech extraction unit, a data conversion unit, and a text data generation unit.
The speech extraction unit recognizes and extracts speech from the video object.
The data conversion unit converts the extracted speech into speech data.
The text data generation unit generates text data from the converted speech data.
FIG. 3 is a diagram illustrating the detailed configuration of the sign language translation processing unit employed in the sign language translation apparatus using speech recognition according to the present invention.
Referring to FIG. 3, the sign language translation processing unit 140 translates the generated and extracted text data into sign language words and fingerspelling information and forms sign language sentences from them.
To this end, the sign language translation processing unit 140 includes a morpheme analysis unit, a homonym processing unit, a sign language word extraction unit, a fingerspelling information extraction unit, and a sign language sentence formation unit.
The morpheme analysis unit analyzes the morphemes of the generated text data and the extracted text data.
The homonym processing unit determines whether homonyms exist in the text data and resolves them.
The sign language word extraction unit extracts sign language words based on the morpheme analysis and homonym determination results.
If no sign language word exists for a morpheme as a result of the morpheme analysis, the fingerspelling information extraction unit extracts fingerspelling information corresponding to the initial, medial, and final sounds of that morpheme.
The sign language sentence formation unit forms a sign language sentence by combining the extracted sign language words and fingerspelling information.
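When no sign language word exists, the fallback just described spells the morpheme out by its initial, medial, and final sounds. For Korean, those three components can be recovered from a precomposed Hangul syllable with standard Unicode arithmetic; the arithmetic itself is standard, but treating it as the patent's exact method is an assumption.

```python
# Decompose a precomposed Hangul syllable (U+AC00..U+D7A3) into its
# initial (choseong), medial (jungseong), and final (jongseong) jamo,
# the three components the fingerspelling fallback would sign.

CHOSEONG = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
JUNGSEONG = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
JONGSEONG = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

def decompose(syllable):
    """Split one Hangul syllable into (initial, medial, final) jamo;
    the final element is '' when the syllable has no final consonant."""
    idx = ord(syllable) - 0xAC00
    if not 0 <= idx <= 0xD7A3 - 0xAC00:
        raise ValueError("not a precomposed Hangul syllable")
    return (CHOSEONG[idx // 588],
            JUNGSEONG[(idx % 588) // 28],
            JONGSEONG[idx % 28])

print(decompose("한"))  # ('ㅎ', 'ㅏ', 'ㄴ')
print(decompose("수"))  # ('ㅅ', 'ㅜ', '')
```

The constants 588 and 28 come from the Unicode Hangul composition formula (21 medials × 28 finals = 588 syllables per initial).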
FIG. 4 is a diagram for explaining a procedure of a sign language translation method using speech recognition according to the present invention, and FIG. 5 is a diagram illustrating an example of a screen implemented according to the present invention.
Referring to FIG. 4, a sign language translation method using speech recognition according to the present invention uses a sign language translation apparatus using speech recognition as described above, and a repeated description will be omitted.
First, speech and text data are recognized from a video object (S100).
Next, a time log is recorded of the start time at which audio extraction from the video begins and the end time at which it ends (S105).
Next, the recognized speech is converted into speech data, text data is generated from the converted speech data, and the text data inserted in the video is extracted (S110). In step S110, the speech recognized and extracted from the video is converted into speech data, and the converted speech data is generated as text data. In addition, general nouns, compound nouns, numbers, and alphabetic characters included in the subtitles embedded in the video are extracted, and unnecessary parts of speech are removed so that word roots are separated from endings.
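The caption-cleaning part of step S110 (keeping nouns, numbers, and alphabetic tokens while dropping other parts of speech) can be sketched as a filter over tagged tokens. The token format and tag names below are invented for illustration; a real system would take them from a Korean morphological analyzer.

```python
# Sketch of caption cleaning: keep general/compound nouns, numbers, and
# alphabetic tokens; drop other parts of speech such as particles.

KEEP_TAGS = {"NOUN", "NUM", "ALPHA"}

def clean_caption(tagged_tokens):
    """tagged_tokens: list of (surface, pos_tag) pairs produced by an
    assumed morphological analyzer; returns only the kept surfaces."""
    return [tok for tok, tag in tagged_tokens if tag in KEEP_TAGS]

tokens = [("오늘", "NOUN"), ("은", "PARTICLE"), ("3", "NUM"),
          ("시", "NOUN"), ("에", "PARTICLE"), ("OK", "ALPHA")]
print(clean_caption(tokens))  # ['오늘', '3', '시', 'OK']
```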
Next, the morpheme of the generated text data and the extracted text data is analyzed (S120).
Next, it is determined whether there is a homonym in the generated text data and the extracted text data (S130).
Next, it is determined whether a sign language word exists, based on the results of steps S120 and S130 (S140).
If no sign language word exists, fingerspelling information is extracted (S145).
If a sign language word exists, a sign language sentence is formed (S150).
Finally, the sign language sentence, formed by combining the extracted sign language words and fingerspelling information, is output (S160).
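The steps S100 to S160 above can be sketched as one driver function. All component callables are stubs standing in for the processing units described in the patent; their names and signatures are assumptions.

```python
# Driver sketch for steps S100 to S160; every callable passed in is an
# illustrative stub, not the patent's implementation.

def translate_video(video, recognize, extract_captions, analyze,
                    lookup_sign, fingerspell):
    # S100-S110: recognize speech and extract embedded caption text.
    text = recognize(video) + " " + extract_captions(video)
    # S120-S130: morpheme analysis (homonym handling would happen here).
    morphemes = analyze(text)
    sentence = []
    for m in morphemes:
        sign = lookup_sign(m)          # S140: does a sign word exist?
        if sign is None:
            sign = fingerspell(m)      # S145: fingerspelling fallback
        sentence.append(sign)          # S150: build the sign sentence
    return sentence                    # S160: output

result = translate_video(
    {"id": "clip-1"},                  # stand-in video object
    recognize=lambda v: "hello",
    extract_captions=lambda v: "world",
    analyze=lambda t: t.split(),
    lookup_sign={"hello": "SIGN_HELLO"}.get,
    fingerspell=lambda m: "FS:" + m,
)
print(result)  # ['SIGN_HELLO', 'FS:world']
```

Passing the units in as parameters mirrors the apparatus structure: each processing unit can be replaced independently without changing the step order.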
As described above, the sign language translation apparatus and method using speech recognition according to the present invention convert speech data recognized from a video object into text data and translate it into sign language, so that users with hearing impairments can easily use video content; this makes it possible to expand video content for hearing-impaired users and to use video as a medium for sharing culture in various fields.
In addition, the present invention improves translation quality by recognizing the start time and end time of each sentence when speech data is recognized from the video object, and by adjusting the sign language output time to match the speech output time.
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it will be understood by those skilled in the art that many alternatives, modifications, and variations may be made without departing from the scope of the appended claims.
100: Sign language translation apparatus using speech recognition
110: Speech recognition processing unit 120: Time log recording unit
130: Text recognition processing unit 140: Sign language translation processing unit
150: Output unit
Claims (13)
A speech recognition processing unit for converting speech recognized from a video object into speech data and generating text data from the converted speech data;
A text recognition processing unit for extracting text data from the video object;
A sign language translation processing unit for translating the generated text data and extracted text data into sign language words and fingerspelling information, and forming sign language sentences using the extracted sign language words and fingerspelling information; And
An output unit for outputting the generated sign language sentence;
And a sign language translation apparatus using speech recognition.
The speech recognition processing unit,
A voice extracting unit for recognizing and extracting a voice from a video object;
A data conversion unit for converting the extracted voice into voice data; And
A text data generation unit for generating the converted speech data as text data;
And a sign language translation apparatus using speech recognition.
The sign language translation processing unit,
A morpheme analysis unit for analyzing morphemes of the generated text data and the extracted text data;
A homonym processing unit for determining and handling homonyms in the generated text data and the extracted text data;
A sign language word extraction unit for extracting a sign language word based on the morpheme analysis result and the homonym determination result;
A fingerspelling information extraction unit for extracting fingerspelling information corresponding to the initial, medial, and final sounds of a morpheme when no sign language word exists for it as a result of the morpheme analysis; And
A sign language sentence formation unit for forming a sign language sentence by combining the extracted sign language word and fingerspelling information;
And a sign language translation apparatus using speech recognition.
The homonym processing unit determines whether a homonym exists in the morpheme analysis result; if the homonym is a noun, its meaning is resolved on the basis of the accompanying verb, and if the homonym is a verb, its meaning is resolved on the basis of the accompanying noun and particle, in the sign language translation apparatus using speech recognition.
Wherein the output unit divides the screen of the display unit to output a sign language sentence on a part of the divided screen, and outputs the sign language sentence to another divided part of the screen to be output as a sign language.
Further comprising a time log recording unit for recording a time log for a start time for starting extraction of speech from the video object and an end time for ending speech extraction.
Wherein the output unit outputs a sign language sentence on the display unit screen according to the recorded time log.
Converting, by a speech recognition processing unit, speech recognized from a video object into speech data and generating text data from the converted speech data;
Extracting, by a text recognition processing unit, text data inserted into the video object;
Translating the generated text data and extracted text data into sign language words and fingerspelling information by a sign language translation processing unit, and forming a sign language sentence using the extracted sign language words and fingerspelling information; And
Outputting a sign language sentence formed by the output unit;
And a sign language translation method using speech recognition.
The step of converting the voice recognized from the video object into the voice data and generating the converted voice data as text data,
Recognizing and extracting speech from a video object;
Converting the extracted voice into voice data; And
Generating converted speech data as text data;
And a sign language translation method using speech recognition.
Translating the generated text data and extracted text data into sign language words and fingerspelling information, and forming a sign language sentence using the extracted sign language words and fingerspelling information, comprises:
Analyzing the morphemes of the generated text data and the extracted text data;
Determining and handling homonyms in the generated text data and the extracted text data;
Extracting a sign language word based on the morpheme analysis result and the homonym determination result;
Extracting fingerspelling information corresponding to the initial, medial, and final sounds of a morpheme when no sign language word exists for it as a result of the morpheme analysis; And
Forming a sign language sentence by combining the extracted sign language word and fingerspelling information;
In a sign language translation method using speech recognition.
The step of determining whether homonyms exist in the generated text data and the extracted text data comprises determining whether a homonym exists in the morpheme analysis result; if the homonym is a noun, its meaning is resolved on the basis of the accompanying verb, and if it is a verb, on the basis of the accompanying noun and particle, in the sign language translation method using speech recognition.
In the step of converting the voice recognized from the video object into the voice data and generating the converted voice data as text data,
Further comprising the step of recording a time log of a start time for starting extraction of speech from the video and an ending time of ending speech extraction by the time log recording unit.
In the step of outputting the formed sign language sentence,
The screen of the display unit is divided so that the sign language sentence is output as text on one part of the screen and rendered in sign language on another part, and the output time of the sign language sentence is made to coincide with the audio start time of the video object, in the sign language translation method using speech recognition.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020140080858A KR20160002081A (en) | 2014-06-30 | 2014-06-30 | Apparatus and method for translating of sign language using speech recognition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020140080858A KR20160002081A (en) | 2014-06-30 | 2014-06-30 | Apparatus and method for translating of sign language using speech recognition |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20160002081A true KR20160002081A (en) | 2016-01-07 |
Family
ID=55168796
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020140080858A KR20160002081A (en) | 2014-06-30 | 2014-06-30 | Apparatus and method for translating of sign language using speech recognition |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20160002081A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20180072136A (en) | 2016-12-21 | 2018-06-29 | 주식회사 이앤지테크 | Communication system capable of displaying emotion information ,and Drive Method of the Same |
CN112420046A (en) * | 2020-10-22 | 2021-02-26 | 深圳市声活科技文化有限公司 | Multi-person conference method, system and device suitable for hearing-impaired people to participate |
KR20210097243A (en) * | 2020-01-29 | 2021-08-09 | 이웅희 | Method, system, program and computer readable recording medium for provoding virtual experiencing tour |
KR102463283B1 (en) * | 2022-05-17 | 2022-11-07 | 주식회사 엘젠 | automatic translation system of video contents for hearing-impaired and non-disabled |
- 2014-06-30: Application KR1020140080858A filed (KR); published as KR20160002081A; status: active (Search and Examination)
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108447486B (en) | Voice translation method and device | |
CN105244022B (en) | Audio-video method for generating captions and device | |
CN105704538A (en) | Method and system for generating audio and video subtitles | |
JP4448450B2 (en) | Multi-mode speech language translation and display | |
KR20180129486A (en) | Method for chunk-unit separation rule and display automated key word to develop foreign language studying, and system thereof | |
KR20140121580A (en) | Apparatus and method for automatic translation and interpretation | |
Pennell et al. | Normalization of text messages for text-to-speech | |
KR20160002081A (en) | Apparatus and method for translating of sign language using speech recognition | |
JP6887508B2 (en) | Information processing methods and devices for data visualization | |
JP2008158055A (en) | Language pronunciation practice support system | |
CN112329485A (en) | Translation method, device, system and storage medium | |
KR101130276B1 (en) | System and method for interpreting sign language | |
Shahriar et al. | A communication platform between bangla and sign language | |
JP7117629B2 (en) | translation device | |
KR101990019B1 (en) | Terminal for performing hybrid caption effect, and method thereby | |
KR102300589B1 (en) | Sign language interpretation system | |
US20220237379A1 (en) | Text reconstruction system and method thereof | |
Monga et al. | Speech to Indian Sign Language Translator | |
KR102148021B1 (en) | Information search method and apparatus in incidental images incorporating deep learning scene text detection and recognition | |
KR101553469B1 (en) | Apparatus and method for voice recognition of multilingual vocabulary | |
KR101920653B1 (en) | Method and program for edcating language by making comparison sound | |
KR20180130933A (en) | Analysis method for chunk and key word based on voice signal of video data, and system thereof | |
JP2020057401A (en) | Display support device, method and program | |
JPH1097280A (en) | Speech image recognition and translation device | |
Jiang | SDW-ASL: A Dynamic System to Generate Large Scale Dataset for Continuous American Sign Language |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
AMND | Amendment | ||
E601 | Decision to refuse application | ||
AMND | Amendment |