AU2021101012A4 - A system for translating sign language into speech and vice-versa - Google Patents

A system for translating sign language into speech and vice-versa Download PDF

Info

Publication number
AU2021101012A4
AU2021101012A4
Authority
AU
Australia
Prior art keywords
gesture
unit
sign language
hand
deaf
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2021101012A
Inventor
Jyoti Arora
Shilpi Gupta
Aman Kataria
HariKumar Pallathadka
Shaminder Singh Sohi
Nileshsingh V. Thakur
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual filed Critical Individual
Priority to AU2021101012A priority Critical patent/AU2021101012A4/en
Application granted granted Critical
Publication of AU2021101012A4 publication Critical patent/AU2021101012A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/017Gesture based interaction, e.g. based on a set of recognized hand gestures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20Movements or behaviour, e.g. gesture recognition
    • G06V40/28Recognition of hand or arm movements, e.g. recognition of deaf sign language
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L13/00Speech synthesis; Text to speech systems
    • G10L13/02Methods for producing synthetic speech; Speech synthesisers
    • G10L13/027Concept to speech synthesisers; Generation of natural phrases from machine-based concepts
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/06Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids
    • G10L21/10Transforming into visible information
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/03Arrangements for converting the position or the displacement of a member into a coded form
    • G06F3/0304Detection arrangements using opto-electronic means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16Sound input; Sound output
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/107Static hand or arm
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • General Engineering & Computer Science (AREA)
  • Social Psychology (AREA)
  • Psychiatry (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

A system for translating sign language of a deaf/dumb person into speech comprises an image capturing device, an image processing unit, a first repository, a feature matching unit, a feature recognition unit, and an audio unit. The image capturing device is configured to capture video of the deaf/dumb person speaking sign language and to generate a corresponding live image data stream. The image processing unit is configured to receive the captured live image data stream from the image capturing device and extract features of the gestures of the left and right hands of the deaf/dumb person. The feature matching unit is configured to cooperate with the image processing unit to receive the extracted gesture features and match them against the stored dataset. The feature recognition unit is configured to recognize the gesture information on the basis of temporal and spatial hand gesture variation.

Description

FIG. 1 Block diagram of a system for translating sign language into speech
Australian Government IP Australia INNOVATION PATENT APPLICATION AUSTRALIA PATENT OFFICE
1. TITLE OF THE INVENTION
A SYSTEM FOR TRANSLATING SIGN LANGUAGE INTO SPEECH AND VICE-VERSA
2. APPLICANT(S): NAME, NATIONALITY, ADDRESS
1. Dr. Nileshsingh V. Thakur, INDIAN, Department of Computer Science and Engineering, Prof Ram Meghe College of Engineering and Management, Badnera.
2. Dr. Shaminder Singh Sohi, INDIAN, Department of Computer Science and Engineering, Gulzar Group of Institutes, Khanna.
3. Jyoti Arora, INDIAN, Department of Computer Science and Engineering, Desh Bhagat University, Mandi Gobindgarh.
4. Dr. Harikumar Pallathadka, INDIAN, Manipur International University, Imphal, Manipur, India.
5. Aman Kataria, INDIAN, Department of Electrical and Instrumentation, Thapar Institute of Engineering and Technology, Patiala, India.
6. Shilpi Gupta, INDIAN, Department of Information Technology, B.S. Anangpuria Institute of Technology and Management, Faridabad.
3. PREAMBLE TO THE DESCRIPTION
COMPLETE SPECIFICATION
The following specification particularly describes the invention and the manner in which it is to be performed
A SYSTEM FOR TRANSLATING SIGN LANGUAGE INTO SPEECH AND VICE-VERSA
TECHNICAL FIELD
[0001] The present disclosure relates to the field of digital image processing, particularly video processing in real time.
BACKGROUND
[0002] The background information herein below relates to the present disclosure but is not necessarily prior art. Around the world there are about 466 million deaf and dumb people, 34 million of whom are children, and the WHO projects that this number will rise to 900 million by 2050. Hearing loss may result from genetic causes and complications at birth. Deaf and dumb people usually find it difficult to interact with normal persons because they cannot speak or hear, and hence they are unable to share their emotions with them. Many times the expressions of deaf and dumb people are misinterpreted by the normal person. As a result, these deaf/dumb people tend to withdraw and miss out on many opportunities, such as jobs.
[0003] Many communication platforms are available for deaf and dumb people to interact with normal persons. These platforms may involve wearing gloves fitted with sensors and tracking the movements of the hands to identify hand gestures and interpret the sign language. However, deaf and dumb people often cannot afford these platforms because of their high cost and low efficiency in converting sign language into text or speech. In conventional approaches, a video file containing sign language is processed and the output is delivered as audio or text so that the normal person can understand the expressions of the deaf/dumb person using sign language, but a noticeable delay is observed in converting the video file into audio.
[0004] Efforts have been made in the related prior art to provide different solutions for sign language recognition. For example, Chinese Patent No. CN103136986B discloses a sign language recognition method comprising the following steps: capturing an image containing a marked region; identifying the attitude (posture) of the marked region; generating a control instruction corresponding to the attitude; and converting the control instruction into natural language information. A corresponding sign language recognition system is also provided, and the method and system are said to improve recognition accuracy. However, the prior art fails to provide a system capable of bilateral communication between disabled people and others.
[0005] In some embodiments, the numbers expressing quantities or dimensions of items, and so forth, used to describe and claim certain embodiments of the invention are to be understood as being modified in some instances by the term "about." Accordingly, in some embodiments, the numerical parameters set forth in the written description and attached claims are approximations that can vary depending upon the desired properties sought to be obtained by a particular embodiment. In some embodiments, the numerical parameters should be construed in light of the number of reported significant digits and by applying ordinary rounding techniques. Notwithstanding that the numerical ranges and parameters setting forth the broad scope of some embodiments of the invention are approximations, the numerical values set forth in the specific examples are reported as precisely as practicable. The numerical values presented in some embodiments of the invention may contain certain errors necessarily resulting from the standard deviation found in their respective testing measurements.
[0006] As used in the description herein and throughout the claims that follow, the meaning of "a," "an," and "the" includes plural reference unless the context clearly dictates otherwise. Also, as used in the description herein, the meaning of "in" includes "in" and "on" unless the context clearly dictates otherwise.
OBJECTS OF THE INVENTION
[0007] It is an object of the present disclosure to provide a system for translating sign language into speech.
[0008] It is an object of the present disclosure to provide a method that improves user security.
[0009] It is an object of the present disclosure to provide a system for translating a speech into sign language.
[0010] It is an object of the present disclosure to provide a system for translating sign language that is highly efficient and cost effective.
[0011] Other objects of the present invention will become more apparent from the following description, which is not intended to limit the scope of the present disclosure.
SUMMARY
[0012] The present concept envisages a system for translating sign language into speech and vice versa. The system for translating sign language of a deaf/dumb person into speech comprises an image capturing device, an image processing unit, a first repository, a feature matching unit, a feature recognition unit, and an audio unit. The image capturing device is configured to capture video of the deaf/dumb person speaking sign language, in real-time, and is further configured to generate a corresponding live image data stream based on the captured video. The image processing unit is configured to receive the captured live image data stream from the image capturing device and is further configured to extract features of gesture of a left and right hand of the deaf/dumb person. The first repository is configured to store a dataset having a list of gestures and corresponding gesture information. The feature matching unit is configured to cooperate with the image processing unit to receive the extracted features of gesture of the left and right hand and is further configured to match extracted features of gesture of the left and right hand with the stored dataset having gesture information in the first repository. The feature recognition unit is configured to receive matched gesture information from the feature matching unit to recognize the gesture information on the basis of temporal and spatial hand gesture variation. The audio unit is configured to receive recognized gesture information from the feature recognition unit and is further configured to output the gesture information into a voice audio via an audio device. The image processing unit, the feature matching unit and the feature recognition unit are implemented using one or more processor(s).
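As an informal illustration of how the units described above could hand data to one another, the following Python sketch wires a toy feature matching unit, repository, and audio unit together. It is not part of the specification; the class names, the dictionary-based repository, and the single-entry example are assumptions made purely for readability.

```python
# Minimal structural sketch of the sign-to-speech path (illustrative only; the
# specification does not prescribe an implementation language or these APIs).

class FeatureMatchingUnit:
    def __init__(self, repository):
        # repository: dict mapping a hashable feature key to gesture information
        self.repository = repository

    def match(self, features):
        # Return the stored gesture information whose key equals the extracted
        # feature key, or None when no entry matches.
        return self.repository.get(features)

class AudioUnit:
    def speak(self, gesture_information):
        # Placeholder: a real audio unit would hand the text to a
        # text-to-speech engine (see the TTS sketch further below).
        print(f"[audio out] {gesture_information}")

def translate_frame(features, matcher, audio_unit):
    """Drive one pass of the sign-to-speech path: match, then voice."""
    info = matcher.match(features)
    if info is not None:
        audio_unit.speak(info)
    return info

# Usage with a toy single-entry repository.
matcher = FeatureMatchingUnit({("open_palm", "open_palm"): "hello"})
translate_frame(("open_palm", "open_palm"), matcher, AudioUnit())
```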
[0013] In an embodiment, the image processing unit includes a detection unit and a hand feature extraction unit. The detection unit is configured to identify a region of interest within the received captured live image data stream from the image capturing device, wherein the region of interest within the received captured live image data stream includes video of the coordinated left and right hands of the deaf/dumb person. The hand feature extraction unit is configured to receive the identified video of the coordinated left and right hands and is further configured to extract temporal and spatial features of the gestures of the left and right hands.
[0014] In another embodiment, the image capturing device is selected from the group consisting of a still camera, an IP camera, a 3D camera, an infrared camera, an image capturing sensor, a digital camera, and a CCD camera.
[0015] In still another embodiment, the gesture information in the first repository includes a plurality of classes related to hand gesture used for sign language, hand gesture for alphabets, hand gesture for numbers, hand gesture for words, hand gesture for expressions and the like.
[0016] In yet another embodiment, the feature recognition unit is configured to verify and recognize the correct gesture information on the basis of tracking of the left and right hands, movement occlusion, and position of the hands, using the temporal and spatial hand gesture variation and by employing segmentation techniques.
[0017] The system for translating speech of a user into sign language comprises an input unit, a speech to text convertor, a second repository, a gesture matching unit, and a display unit. The input unit is configured to accept input in the form of voice commands from the user. The speech to text convertor cooperates with the input unit to receive the voice commands and is further configured to convert the voice commands into text commands, in real time. The second repository is configured to store a second dataset having a list of gestures information associated with a text. The gesture matching unit is configured to extract a suitable gesture based on the received text commands by crawling through the stored list of gestures information via a crawler when the received text commands match the stored text in the second repository. The display unit is configured to display the corresponding gesture information received from the gesture matching unit. The speech to text convertor and the gesture matching unit are implemented using one or more processor(s).
[0018] In an embodiment, the system provides a bilateral communication to the deaf/dumb person speaking sign language and the user, in continuous live mode.
BRIEF DESCRIPTION OF THE DRAWINGS
[0019] The accompanying drawings are included to provide a further understanding of the present disclosure and are incorporated in and constitute a part of this specification. The drawings illustrate exemplary embodiments of the present disclosure and, together with the description, serve to explain the principles of the present disclosure.
[0020] FIG. 1 illustrates a block diagram of a system for translating sign language into speech, in real time, in accordance with the present disclosure.
[0021] FIG. 2 illustrates a block diagram of the system for translating speech into sign language, in real time, in accordance with the present disclosure.
[0022] Other objects, advantages and novel features of the invention will become apparent from the following detailed description of the present embodiment when taken in conjunction with the accompanying drawings.
DETAILED DESCRIPTION
[0023] Aspects of the present disclosure relate to a system for translating sign language into speech and vice-versa. It is to be understood that the foregoing description is only illustrative of the present invention and is not intended to be limiting or restrictive. Many other specific embodiments of the present invention will be apparent to one skilled in the art from the foregoing disclosure, and all substitutions, alterations and modifications to which the present invention is readily susceptible without departing from the spirit of the invention come within the scope of the following claims. The scope of the invention should therefore be determined with reference to the appended claims, along with the full scope of equivalents to which such claims are entitled.
[0024] Various methods described herein may be practiced by combining one or more machine-readable storage media containing the code/instructions according to the present invention with appropriate standard device hardware to execute the instructions contained therein. An apparatus for practicing various embodiments of the present invention may involve one or more computers (say, servers), or one or more processors within a single computer, and storage systems containing, or having network access to, computer program(s) coded in accordance with various methods described herein, and the method steps of the invention could be accomplished by modules, devices, routines, subroutines, or subparts of a computer program product.
[0025] If the specification states a component or feature "may", "can", "could", or "might" be included or have a characteristic, that particular component or feature is not required to be included or to have the characteristic.
[0026] Various terms as used herein are shown below. To the extent a term used in a claim is not defined below, it should be given the broadest definition persons in the pertinent art have given that term as reflected in printed publications and issued patents at the time of filing.
[0027] Referring to FIG. 1, the present disclosure envisages a system 100 for translating sign language into speech, in real time. The system 100 comprises an image capturing device 102, an image processing unit 104, a first repository 110, a feature matching unit 112, a feature recognition unit 114, and an audio unit 116.
[0028] In an aspect, the image capturing device 102 is configured to capture video of a deaf/dumb person speaking sign language, in real-time. Further, the image capturing device 102 is configured to generate a corresponding live image data stream based on the captured video. In an embodiment, the image capturing device 102 is selected from the group consisting of a camera, IP camera, a 3D camera, an infrared camera, an image capturing sensor, a digital camera, and a CCD camera.
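A minimal sketch of the capture step using OpenCV is shown below. The specification does not mandate OpenCV or a webcam; `cv2.VideoCapture(0)` and the frame loop are assumptions standing in for whichever of the listed camera types supplies the live image data stream.

```python
import cv2

capture = cv2.VideoCapture(0)  # default webcam; an IP camera URL could be passed instead
try:
    while True:
        grabbed, frame = capture.read()  # one BGR frame of the live image data stream
        if not grabbed:
            break
        cv2.imshow("live stream", frame)
        if cv2.waitKey(1) & 0xFF == ord("q"):  # press 'q' to stop capturing
            break
finally:
    capture.release()
    cv2.destroyAllWindows()
```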
[0029] In another aspect of the present invention, the image processing unit 104 is configured to cooperate with the image capturing device 102 to receive the captured live image data stream. The image processing unit 104 includes a detection unit 106 and a hand feature extraction unit 108. The detection unit 106 is configured to identify a region of interest within the received captured live image data stream. The region of interest within the received captured live image data stream is the identified video of the coordinated left and right hands of the deaf/dumb person. The hand feature extraction unit 108 is configured to receive the identified video of the coordinated left and right hands and is further configured to extract temporal and spatial features of the gestures of the left and right hands.
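The specification does not name a particular detector for the hand region of interest. One readily available option, shown here only as a hedged illustration, is MediaPipe Hands, which localises up to two hands per frame and returns 21 landmarks per hand that can serve as spatial features; temporal features can then be derived by differencing landmarks across consecutive frames.

```python
import cv2
import mediapipe as mp

# One detector instance reused across frames (an illustrative stand-in for the
# detection unit 106 and the hand feature extraction unit 108).
hands = mp.solutions.hands.Hands(static_image_mode=False,
                                 max_num_hands=2,
                                 min_detection_confidence=0.5)

def extract_hand_features(bgr_frame):
    """Return a list of per-hand landmark lists [(x, y, z), ...] for up to two hands."""
    results = hands.process(cv2.cvtColor(bgr_frame, cv2.COLOR_BGR2RGB))
    features = []
    if results.multi_hand_landmarks:
        for hand_landmarks in results.multi_hand_landmarks:
            features.append([(lm.x, lm.y, lm.z) for lm in hand_landmarks.landmark])
    return features
```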
[0030] In another aspect of the present invention, the first repository 110 is configured to store a dataset having a list of gestures and corresponding gesture information. In another embodiment, the gesture information may include a plurality of classes related to gestures used for sign language, such as hand gestures for alphabets, hand gestures for numbers, hand gestures for words, hand gestures for expressions, and the like. In another embodiment, the first repository 110 is configured to store a dataset having a list of gestures and corresponding gesture information in each and every language.
[0031] In another aspect of the present invention, the feature matching unit 112 is configured to cooperate with the hand feature extraction unit 108 and the first repository 110. The feature matching unit 112 is configured to receive the extracted features of the gestures of the left and right hands and is further configured to match them with the stored dataset having gesture information. After matching with suitable gesture information, the feature matching unit 112 is configured to transmit the gesture information to the feature recognition unit 114. The feature recognition unit 114 is configured to verify and recognize the correct gesture information on the basis of tracking of the left and right hands, movement occlusion, and position of the hands, using the temporal and spatial hand gesture variation and by employing segmentation techniques. Further, the feature recognition unit 114 is configured to transmit the temporal and spatial gesture information to the audio unit 116. In another embodiment, the feature recognition unit 114 is configured to recognize the gesture information in any language.
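Neither the matching metric nor the temporal verification scheme is fixed by the specification. The sketch below uses a simple nearest-neighbour distance against a toy repository and a majority vote over recent frames; the threshold, the window size, and the random placeholder reference vectors are assumptions for illustration only.

```python
import numpy as np

# Toy "first repository": gesture label -> reference feature vector.
# Real entries would be recorded or learned per sign; these are placeholders.
repository = {
    "hello": np.random.rand(42),   # e.g. 21 landmarks x (x, y) for one hand, flattened
    "thanks": np.random.rand(42),
}

def match_gesture(feature_vector, repo, threshold=1.0):
    """Return the repository label nearest to the extracted features, or None."""
    best_label, best_dist = None, float("inf")
    for label, reference in repo.items():
        dist = np.linalg.norm(feature_vector - reference)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label if best_dist <= threshold else None

def recognise_over_time(per_frame_features, repo, window=5):
    """Crude temporal verification: accept a label only if it wins a majority of
    the last `window` frames (a stand-in for the tracking/occlusion checks the
    specification attributes to the feature recognition unit 114)."""
    votes = [match_gesture(f, repo) for f in per_frame_features[-window:]]
    votes = [v for v in votes if v is not None]
    if not votes:
        return None
    top = max(set(votes), key=votes.count)
    return top if votes.count(top) > window // 2 else None

# Usage with random stand-in feature vectors.
print(match_gesture(np.random.rand(42), repository))
```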
[0032] In yet another aspect of the present invention, the audio unit 116 is configured to receive gesture information and is further configured to output the gesture information into a voice audio via an audio device. In an embodiment, the audio device can be a speaker. In an embodiment, the audio unit 116 includes a text to speech converter configured to convert the gesture information into the voice audio.
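The text-to-speech converter inside the audio unit 116 could be realised with any off-the-shelf engine; the snippet below uses the pyttsx3 package as one offline possibility, not as the method prescribed by the patent.

```python
import pyttsx3

def speak_gesture_information(text: str) -> None:
    """Voice the recognised gesture information through the default audio device."""
    engine = pyttsx3.init()
    engine.say(text)
    engine.runAndWait()

speak_gesture_information("hello, how are you")
```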
[0033] In an embodiment, the image processing unit 104, the feature matching unit 112 and the feature recognition unit 114 are implemented using one or more processor(s).
[0034] Referring to FIG. 2, the present disclosure envisages a system 100 for translating speech into sign language, in real time. The system 100 comprises an input unit 118, a speech to text convertor 120, a second repository 122, a gesture matching unit 124 and a display unit 126.
[0035] In yet another aspect of the present invention, the input unit 118 is configured to accept input in the form of voice commands from a user and is further configured to transmit the voice commands to the speech to text convertor 120. In an embodiment, the input unit 118 can be a microphone. The speech to text convertor 120 is configured to recognize the voice commands given by the user and is further configured to convert the voice commands into text commands. In an embodiment, the speech to text convertor 120 is configured to detect the voice commands, generate the text commands, and send the text commands to the gesture matching unit 124.
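For this reverse direction, the speech-to-text step could likewise rely on an existing recogniser. The sketch below uses the SpeechRecognition package with its Google Web Speech backend purely as an example; the library choice, the calibration step, and the backend are assumptions rather than details from the specification.

```python
import speech_recognition as sr

recognizer = sr.Recognizer()

def voice_command_to_text() -> str:
    """Capture one utterance from the microphone and return its transcript."""
    with sr.Microphone() as source:          # microphone access requires the PyAudio backend
        recognizer.adjust_for_ambient_noise(source)  # brief noise calibration
        audio = recognizer.listen(source)
    # recognize_google sends the audio to Google's free web API; offline engines
    # (e.g. recognize_sphinx) can be substituted.
    return recognizer.recognize_google(audio)
```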
[0036] In yet another aspect of the present invention, the second repository 122 is configured to store a second dataset having a list of gestures information associated with a text. The gesture matching unit 124 is configured to cooperate with the speech to text convertor 120 and the second repository 122. The gesture matching unit 124 is configured to receive the text commands, search for a suitable gesture associated with the text by crawling through the stored list of gestures information via a crawler (not shown in the figure), and extract the gesture when the received text commands match the stored text.
[0037] The display unit 126 is configured to cooperate with the gesture matching unit 124 to receive the gesture information and is further configured to display the corresponding gesture information in the form of a Graphics Interchange Format (GIF).
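A toy version of the second repository, the crawler-style lookup, and the GIF display might look as follows; the phrase-to-file mapping, the file names, and the use of the system's default viewer are all illustrative assumptions rather than details from the specification.

```python
import webbrowser
from pathlib import Path

# Toy "second repository": text phrase -> path of a sign-language GIF.
# The file names here are purely illustrative placeholders.
second_repository = {
    "hello": Path("gifs/hello.gif"),
    "thank you": Path("gifs/thank_you.gif"),
}

def display_gestures_for(text_command: str) -> list[Path]:
    """Look up each known phrase in the transcribed text and open its GIF."""
    shown = []
    lowered = text_command.lower()
    for phrase, gif_path in second_repository.items():   # crude "crawl" of the stored list
        if phrase in lowered and gif_path.exists():
            webbrowser.open(gif_path.resolve().as_uri())  # default viewer plays the GIF
            shown.append(gif_path)
    return shown

display_gestures_for("Hello, thank you for coming")
```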
[0038] Hence, the system 100 is configured to provide bilateral communication between the deaf/dumb person speaking sign language and the user (normal person) by converting the live sign language input of the deaf/dumb person into a voice output and converting voice commands from the normal person into GIFs of sign language, in continuous live mode, thereby providing lively interaction between the deaf and dumb person and the normal person without any delay.
[0039] In yet another aspect of the present invention, the speech to text convertor 120 and the gesture matching unit 124 are implemented using one or more processor(s). Advantageously, the processors can be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions.
[0040] The functions described herein may be implemented in hardware, software executed by a processor, firmware, or any combination thereof. If implemented in software executed by a processor, the functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Other examples and implementations are within the scope and spirit of the disclosure and appended claims. For example, due to the nature of software, functions described above can be implemented using software executed by a processor, hardware, firmware, hardwiring, or combinations of any of these. Features implementing functions may also be physically located at various positions, including being distributed such that portions of functions are implemented at different physical locations.
[0041] Thus, the scope of the present disclosure is defined by the appended claims and includes both combinations and sub-combinations of the various features described hereinabove as well as variations and modifications thereof, which would occur to persons skilled in the art upon reading the foregoing description.

Claims (7)

We Claim:
1. A system (100) for translating sign language of a deaf/dumb person into speech, said system (100) comprising:
an image capturing device (102) configured to capture video of the deaf/dumb person speaking sign language, in real-time, and further configured to generate a corresponding live image data stream based on the captured video;
an image processing unit (104) configured to receive said captured live image data stream from said image capturing device (102) and further configured to extract features of gesture of a left and right hand of the deaf/dumb person, wherein said image processing unit (104), said feature matching unit (112) and said feature recognition unit (114) are implemented using one or more processor(s);
a first repository (110) configured to store a dataset having a list of gestures and corresponding gesture information;
a feature matching unit (112) configured to cooperate with said image processing unit (104) to receive said extracted features of gesture of the left and right hand and further configured to match extracted features of gesture of the left and right hand with the stored dataset having gesture information in said first repository (110);
a feature recognition unit (114) configured to receive matched gesture information from said feature matching unit (112) to recognize the gesture information on the basis of temporal and spatial hand gesture variation; and
an audio unit (116) configured to receive recognized gesture information from said feature recognition unit (114) and further configured to output the gesture information into a voice audio via an audio device.
2. The system (100) as claimed in claim 1, wherein said image processing unit (104) includes:
a detection unit (106) configured to identify a region of interest within the received captured live image data stream from said image capturing device (102), wherein the region of interest within the received captured live image data stream includes video of coordinated left and right hand of the deaf/dumb person; and
a hand feature extraction unit (108) configured to receive the identified video of coordinated left and right hand and further configured to extract temporal and spatial feature of gesture of the left and right hand.
3. The system (100) as claimed in claim 1, wherein said image capturing device (102) is selected from the group consisting of a still camera, an IP camera, a 3D camera, an infrared camera, an image capturing sensor, a digital camera, and a CCD camera.
4. The system (100) as claimed in claim 1, wherein said gesture information in said first repository (110) includes a plurality of classes related to hand gesture used for sign language, hand gesture for alphabets, hand gesture for numbers, hand gesture for words, hand gesture for expressions and the like.
5. The system (100) as claimed in claim 1, wherein said feature recognition unit (114) is configured to verify and recognize the correct gesture information on the basis of tracking of the left and right hands, movement occlusion, and position of the hands on the basis of the temporal and spatial hand gesture variation by employing segmentation techniques.
6. A system (100) for translating speech into sign language, said system (100) comprising:
an input unit (118) configured to accept input in the form of voice commands from a user; a speech to text convertor (120) cooperating with said input unit (118) to receive said voice commands and further configured to convert said voice commands into text commands, in real time;
a second repository (122) configured to store a second dataset having a list of gestures information associated with a text;
a gesture matching unit (124) configured to extract a suitable gesture based on the received text commands by crawling through the stored list of gestures information via a crawler when said received text commands matches with the stored text in said second repository (122); and
a display unit (126) configured to display the corresponding gesture information received from said gesture matching unit (124).
7. The system (100) as claimed in claim 6, wherein the system (100) provides a bilateral communication to the deaf/dumb person speaking sign language and the user, in continuous live mode.
FIG. 1 Block diagram of a system for translating sign language into speech
FIG. 2 Block diagram of the system for translating speech into sign language
AU2021101012A 2021-02-24 2021-02-24 A system for translating sign language into speech and vice-versa Ceased AU2021101012A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2021101012A AU2021101012A4 (en) 2021-02-24 2021-02-24 A system for translating sign language into speech and vice-versa

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2021101012A AU2021101012A4 (en) 2021-02-24 2021-02-24 A system for translating sign language into speech and vice-versa

Publications (1)

Publication Number Publication Date
AU2021101012A4 true AU2021101012A4 (en) 2021-04-29

Family

ID=75625875

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2021101012A Ceased AU2021101012A4 (en) 2021-02-24 2021-02-24 A system for translating sign language into speech and vice-versa

Country Status (1)

Country Link
AU (1) AU2021101012A4 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022271381A1 (en) * 2021-06-24 2022-12-29 Microsoft Technology Licensing, Llc Sign language and gesture capture and detection


Similar Documents

Publication Publication Date Title
Truong et al. A translator for American sign language to text and speech
Zheng et al. Recent advances of deep learning for sign language recognition
Upendran et al. American Sign Language interpreter system for deaf and dumb individuals
US11587305B2 (en) System and method for learning sensory media association without using text labels
Shukla et al. A DTW and fourier descriptor based approach for Indian sign language recognition
AU2021101012A4 (en) A system for translating sign language into speech and vice-versa
Tasmere et al. Real time hand gesture recognition in depth image using cnn
Rathi et al. Development of full duplex intelligent communication system for deaf and dumb people
Dissanayake et al. Utalk: Sri Lankan sign language converter mobile app using image processing and machine learning
Aly et al. Arabic sign language recognition using spatio-temporal local binary patterns and support vector machine
AU2021101804A4 (en) A system for translating sign language into speech and vice- versa
Naseem et al. Developing a prototype to translate pakistan sign language into text and speech while using convolutional neural networking
Srinivasan et al. Python And Opencv For Sign Language Recognition
Bora et al. ISL gesture recognition using multiple feature fusion
CN114067362A (en) Sign language recognition method, device, equipment and medium based on neural network model
Enikeev et al. Russian Fingerspelling Recognition Using Leap Motion Controller
KR20200097446A (en) System and Method for Providing Multi-modality Contents and Apparatus for Indexing of Contents
Rai et al. Gesture recognition system
Thanneru et al. Image to audio, text to audio, text to speech, video to text conversion using, NLP techniques
Tazalli et al. Computer vision-based Bengali sign language to text generation
Lima et al. Using convolutional neural networks for fingerspelling sign recognition in brazilian sign language
Praneel et al. Malayalam Sign Language Character Recognition System
Kamble et al. Deep Learning-Based Sign Language Recognition and Translation
Goudar et al. A effective communication solution for the hearing impaired persons: A novel approach using gesture and sentence formation
Shinde et al. Two-way sign language converter for speech-impaired

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry