CN210402846U - Sign language translation terminal and sign language translation server - Google Patents

Info

Publication number
CN210402846U
CN210402846U CN201920711021.2U
Authority
CN
China
Prior art keywords
module
sign language
translation
language
camera
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201920711021.2U
Other languages
Chinese (zh)
Inventor
胡宝命
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China United Network Communications Group Co Ltd
Original Assignee
China United Network Communications Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China United Network Communications Group Co Ltd filed Critical China United Network Communications Group Co Ltd
Priority to CN201920711021.2U
Application granted
Publication of CN210402846U
Legal status: Active

Landscapes

  • Machine Translation (AREA)

Abstract

The utility model provides a sign language translation terminal and a sign language translation server, relating to the field of sign language translation. Embodiments of the utility model can translate sign language motions into a common language and convert that common language into audio data in a target timbre matching the user's listening habits, improving the user experience. The sign language translation terminal includes: a camera, a translation module, a voice editing module, and an audio output device. The camera is connected to the translation module, the translation module is connected to the voice editing module, and the voice editing module is connected to the audio output device. The camera captures a first sign language motion image; the translation module translates the sign language motions in the first sign language motion image into a first common language; the voice editing module selects a target timbre from multiple timbres and converts the first common language into audio data in the target timbre; and the audio output device plays the audio data. The utility model is applied to sign language translation.

Description

Sign language translation terminal and sign language translation server
Technical Field
The utility model relates to the field of sign language translation, and in particular to a sign language translation terminal and a sign language translation server.
Background
In everyday life, deaf-mute people usually communicate through sign language because they cannot express language through the vocal cords. However, few hearing people understand sign language, so deaf-mute people still have no way, or great difficulty, communicating with the general population. This severely restricts their social interaction and likewise prevents others from communicating with them, which in turn constrains social development: the de facto isolation of the deaf-mute community affects both economic progress and society as a whole.
The advent of sign language translation devices addresses these problems. However, existing sign language translation equipment suffers from drawbacks such as mechanical-sounding translation output and a poor user experience. The utility model therefore provides a sign language translation terminal and a sign language translation server that make the translated output conform to the user's habits and improve the user experience.
SUMMARY OF THE UTILITY MODEL
The utility model provides a sign language translation terminal and a sign language translation server that can translate sign language motions into a common language and convert that common language into audio data in a target timbre matching the user's listening habits, enabling communication between deaf-mute people and the general population while improving the user experience. To achieve the above object, embodiments of the utility model adopt the following technical solutions:
In a first aspect, an embodiment of the utility model provides a sign language translation terminal, including: a camera, a translation module, a voice editing module, and an audio output device; the camera is connected to the translation module, the translation module is connected to the voice editing module, and the voice editing module is connected to the audio output device;
the camera is used for capturing a first sign language motion image;
the translation module is used for translating the sign language motions in the first sign language motion image into a first common language;
the voice editing module is used for selecting a target timbre from multiple timbres and converting the first common language into audio data in the target timbre;
and the audio output device is used for playing the audio data.
In a second aspect, an embodiment of the utility model provides a sign language translation server, including: a transceiver, a translation module, and a voice editing module; the translation module is connected to the transceiver and to the voice editing module;
the transceiver is used for receiving image data, sent by a terminal, containing a first sign language motion image;
the translation module is used for translating the sign language motions in the first sign language motion image into a first common language;
the voice editing module is used for selecting a target timbre from multiple timbres and converting the first common language into audio data in the target timbre;
and the transceiver is further used for sending the audio data to the terminal.
The sign language translation terminal and sign language translation server provided by embodiments of the utility model can capture the sign language motions of a signer, translate those motions into a common language, convert the common language into audio data in a target timbre, and output the result, thereby enabling barrier-free communication between deaf-mute people and the general population. Compared with the prior art, the utility model adds a voice editing module that converts the common language into audio data in a target timbre better matching the user's listening habits before playback through the audio output device, improving the user experience. In addition, the translation module and the voice editing module can be deployed on a remote server, so that translation and timbre selection are performed by the server.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below.
Fig. 1 is a schematic structural diagram of a sign language translation terminal according to an embodiment of the present invention;
fig. 2 is a second schematic structural diagram of a sign language translation terminal according to an embodiment of the present invention;
fig. 3 is a third schematic structural diagram of a sign language translation terminal according to an embodiment of the present invention;
fig. 4 is a fourth schematic structural diagram of a sign language translation terminal according to an embodiment of the present invention;
fig. 5 is a fifth schematic structural diagram of a sign language translation terminal according to an embodiment of the present invention;
fig. 6 is a sixth schematic structural diagram of a sign language translation terminal according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of a sign language translation server according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention will be described below with reference to the accompanying drawings.
An embodiment of the utility model provides a sign language translation terminal. As shown in fig. 1, the sign language translation terminal 10 includes: a camera 101, a translation module 102, a voice editing module 103, and an audio output device 104. The camera 101 is connected to the translation module 102, the translation module 102 is connected to the voice editing module 103, and the voice editing module 103 is connected to the audio output device 104.
The camera 101 is used for capturing a first sign language motion image.
The translation module 102 is used for translating the sign language motions in the first sign language motion image into a first common language.
Specifically, the translation module 102 includes a processing chip capable of translating the sign language motions in a sign language motion image into a common language.
In one implementation, the translation module 102 may be a stand-alone embedded processor chip of the MIPS or ARM architecture, a DSP chip, or the like. Such chips store, in advance, programs that translate the sign language motions in a sign language motion image into a common language using existing image recognition algorithms.
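The patent leaves the recognition algorithm unspecified beyond "existing image recognition algorithms". A minimal illustrative sketch, assuming an upstream pose estimator supplies per-frame feature vectors and using simple nearest-neighbour matching against a gloss template bank (all names and values here are hypothetical):

```python
# Hypothetical sketch of the translation module's recognition step: the
# feature extraction and nearest-neighbour matching below are illustrative
# assumptions, not the patent's stored program.
from math import dist

# Template bank: gloss -> representative feature vector (e.g. flattened
# hand-landmark coordinates produced by an upstream pose estimator).
TEMPLATES = {
    "HELLO": (0.1, 0.9, 0.2),
    "THANKS": (0.8, 0.1, 0.5),
    "HELP": (0.4, 0.4, 0.9),
}

def recognize_sign(features):
    """Return the gloss whose template is nearest to the observed features."""
    return min(TEMPLATES, key=lambda g: dist(TEMPLATES[g], features))

def translate_to_common_language(feature_sequence):
    """Map a sequence of per-frame features to a space-joined gloss string."""
    return " ".join(recognize_sign(f) for f in feature_sequence)

print(translate_to_common_language([(0.1, 0.85, 0.25), (0.75, 0.15, 0.5)]))
# prints "HELLO THANKS"
```

A production system would replace the template bank with a trained classifier, but the module boundary (image features in, common-language text out) stays the same.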
The voice editing module 103 is used for selecting a target timbre from multiple timbres and converting the first common language into audio data in the target timbre.
Specifically, the voice editing module 103 includes a processing chip that converts the common language into audio data in the target timbre.
In one implementation, the voice editing module 103 may be a stand-alone embedded processor chip of the MIPS or ARM architecture, a DSP chip, or the like. Such chips store corresponding computer program code in advance, and the programs convert the common language into audio data in the target timbre using existing audio processing algorithms.
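The timbre selection described above can be pictured as a lookup over preset voice profiles. The preset names and parameter values below are invented for illustration; a real implementation would drive a TTS engine on the DSP chip or a cloud API:

```python
# Illustrative sketch of the voice editing module: the patent describes only
# "selecting a target timbre from multiple timbres"; presets are assumptions.
from dataclasses import dataclass

@dataclass(frozen=True)
class Timbre:
    name: str
    pitch_hz: float   # base pitch of the synthesized voice
    rate_wpm: int     # speaking rate in words per minute

PRESETS = {
    "female_soft": Timbre("female_soft", 220.0, 140),
    "male_deep": Timbre("male_deep", 110.0, 120),
    "child": Timbre("child", 300.0, 150),
}

def select_timbre(user_preference, presets=PRESETS):
    """Pick the user's preferred timbre, falling back to a default preset."""
    return presets.get(user_preference, presets["female_soft"])

def to_audio(text, timbre):
    """Stand-in for a TTS engine: return a dict describing the synthesis
    request a real back-end would receive."""
    return {"text": text, "pitch_hz": timbre.pitch_hz, "rate_wpm": timbre.rate_wpm}

audio = to_audio("hello", select_timbre("male_deep"))
print(audio["pitch_hz"])  # 110.0
```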
The audio output device 104 is used for playing the audio data.
Specifically, the audio output device 104 includes a detachable earphone and a loudspeaker.
In the embodiment of the utility model, the terminal 10 captures the sign language motions of the signer through the camera 101, and the translation module 102 translates the motions into a first common language, which may be text or speech. The voice editing module 103 converts the text or speech into audio data in a target timbre matching the user's listening habits, and the audio output device 104 plays the audio data, enabling communication between deaf-mute people and the general population and improving the user experience.
Further, to enable two-way communication between deaf-mute people and the general population, as shown in fig. 2, in the embodiment of the utility model the sign language translation terminal 10 further includes an acquisition module 105 and an image output device 106; the acquisition module 105 and the image output device 106 are each connected to the translation module 102.
The acquisition module 105 is used for acquiring a second common language.
Specifically, the acquisition module 105 may include one or more of a microphone, a keyboard, and a touch screen.
The translation module 102 is further used for translating the second common language into a second sign language motion image.
The image output device 106 is used for displaying the second sign language motion image.
Specifically, the image output device 106 includes an augmented reality (AR) device.
In the embodiment of the utility model, the terminal 10 can not only translate sign language motions into speech so that the general population can understand what the sign language expresses, but can also acquire a second common language through the acquisition module 105. The second common language may be text entered by the user through the keyboard or touch screen, or the user's voice picked up by the microphone. The translation module 102 translates the text or speech into sign language motions, which are displayed through the image output device 106 so that a deaf-mute person can understand what the text or speech expresses, achieving two-way communication between deaf-mute people and the general population.
In addition, in the embodiment of the utility model, the AR device may specifically be a pair of glasses capable of displaying AR images.
For example, when a deaf-mute person communicates face-to-face with a hearing person, the sign language translation terminal 10 may translate the words spoken by the hearing person into a sign language image, which is then displayed on the display of the glasses worn by the deaf-mute person. The deaf-mute person can thus see the corresponding sign language motions directly from the glasses without extra actions (for example, holding a display device by hand), making communication more convenient and helping deaf-mute people integrate into society.
Further, as shown in fig. 3, to let the terminal user select a suitable translation mode, the sign language translation terminal 10 further includes a translation mode switching module 107, which is connected to the camera 101 and to the acquisition module 105.
The translation mode switching module 107 is used for turning on the camera 101 after receiving a first operation input by the user, so that the camera 101 captures the first sign language motion image, and for turning on the acquisition module 105 after receiving a second operation input by the user, so that the acquisition module 105 acquires the second common language.
For example, the translation mode switching module 107 may specifically include a button on the terminal that can be toggled left and right. If the user is a hearing person who needs the terminal 10 to translate sign language into speech (i.e., the translation mode is sign language to speech), the user toggles the button to the left; upon receiving this operation, the translation mode switching module 107 turns on the camera 101 to capture the deaf-mute person's sign language motions, and the terminal 10 translates them into speech for playback.
If the user is a deaf-mute person who needs the terminal 10 to translate speech into sign language motions (i.e., the translation mode is speech to sign language), the user toggles the button to the right; upon receiving this operation, the translation mode switching module 107 activates the acquisition module 105 to obtain the hearing person's voice, and the terminal 10 translates the speech into sign language motions for display, achieving barrier-free communication between the general population and deaf-mute people.
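The left/right toggle logic above can be sketched as a small state machine in which exactly one input device is active at a time; the class and attribute names are illustrative, not from the patent:

```python
# Minimal sketch of the translation mode switching module (button toggled
# left/right). The device state is simulated with booleans.
class Terminal:
    def __init__(self):
        self.camera_on = False
        self.capture_on = False

    def switch_mode(self, direction):
        """'left' = sign-to-speech (camera on); 'right' = speech-to-sign
        (acquisition module on). Exactly one input device stays active."""
        if direction == "left":
            self.camera_on, self.capture_on = True, False
        elif direction == "right":
            self.camera_on, self.capture_on = False, True
        else:
            raise ValueError(f"unknown switch position: {direction}")

t = Terminal()
t.switch_mode("left")
print(t.camera_on, t.capture_on)  # True False
```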
Further, the terminal 10 also includes an ad hoc networking module, so that terminals can communicate with each other and exchange data. For example, when a hearing person communicates with a deaf-mute person, the hearing person sets terminal A to sign-language-to-speech mode and the deaf-mute person sets terminal B to speech-to-sign-language mode. Terminal A translates the captured sign language motions into a common language, which the voice editing module converts into audio data for playback. If the common language is text, the text is also transmitted to terminal B and shown on its image output device, so that the deaf-mute person can check the translation of the sign language motions and correct it in time.
Further, for the terminal to translate the first sign language motions into the first common language, in one implementation, as shown in fig. 4, the translation module 102 specifically includes a transceiver 1021.
The transceiver 1021 is used for sending image data containing the first sign language motion image to the server 20 and receiving the first common language returned by the server 20.
In another implementation, as shown in fig. 5, the sign language translation terminal 10 further includes a detection module 108 connected to the translation module 102.
The detection module 108 is used for detecting the network connection state of the sign language translation terminal 10.
The translation module 102 is used for translating the sign language motions in the first sign language motion image into the first common language using offline software when the detection module 108 detects that the sign language translation terminal 10 is not connected to the internet.
For example, the detection module 108 first detects the network connection state of the terminal 10 and sends the result to the translation module 102, which selects the translation mode accordingly. When the terminal 10 is connected to the internet, the translation module 102 sends the first sign language motion image captured by the camera 101 to the remote server 20 through the transceiver 1021 for online translation; the transceiver 1021 receives the first common language returned by the server 20, the voice editing module 103 converts it into audio data, and the audio output device 104 plays it. When the terminal 10 is not connected to the internet, the translation module 102 translates the first sign language motion image into speech offline using the offline software.
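The online/offline selection can be sketched as follows; the network probe and both translation back-ends are stubs standing in for the transceiver path and the offline software:

```python
# Sketch of the detection module's online/offline selection logic. The patent
# specifies only the fallback behaviour; the probe/back-end interfaces here
# are assumptions.
def is_connected(probe):
    """probe() returns True when the terminal can reach the internet."""
    try:
        return bool(probe())
    except OSError:
        return False

def translate(image, probe, online_backend, offline_backend):
    """Prefer server-side translation; fall back to on-device software."""
    backend = online_backend if is_connected(probe) else offline_backend
    return backend(image)

result = translate("frame-001",
                   probe=lambda: False,              # simulate no network
                   online_backend=lambda img: f"online:{img}",
                   offline_backend=lambda img: f"offline:{img}")
print(result)  # offline:frame-001
```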
Further, the sign language translation terminal 10 may also include an augmented reality (AR) device 110 connected to the translation module 102.
The AR device 110 is used for displaying the text corresponding to the first common language.
The terminal 10 translates the sign language motions into the first common language; if the first common language is text, it is converted into audio data by the voice editing module 103 for playback, and at the same time the text can be displayed directly on the AR device 110.
Further, the camera 101 may include a steerable camera 101a.
The steerable camera 101a is used for adjusting its shooting angle according to the position of the signer in the captured frame.
The steerable camera 101a can track the signer's gestures in real time, adjust its orientation promptly, capture the signer's sign language motions, and transmit them to the translation module 102. The camera 101 may also be a 3D infrared camera 101b, which can capture clear sign language motion images even at night.
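One plausible way the steerable camera could compute its adjustment is proportional control on the signer's horizontal offset in the frame; the gain and frame geometry below are assumptions, as the patent does not specify a control law:

```python
# Sketch of a tracking step for the steerable camera: compute a pan
# correction from the signer's horizontal offset. Gain is illustrative.
def pan_correction(speaker_x, frame_width, gain=0.1):
    """Positive result pans right, negative pans left, zero when centred."""
    offset = speaker_x - frame_width / 2
    return gain * offset

print(pan_correction(400, 640))  # 8.0
```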
Further, to make the translation result more natural, as shown in fig. 6, the sign language translation terminal 10 further includes an expression recognition module 109, which is connected to the steerable camera 101a and to the translation module 102.
The expression recognition module 109 is used for recognizing the expression of the signer in the frame captured by the steerable camera 101a, generating expression information, and sending the expression information to the translation module 102.
The translation module 102 is specifically used for generating the first common language according to the expression information.
For example, the expression recognition module 109 recognizes the signer's expression, generates expression information, and sends it to the translation module 102, which adds intonation and speech-rate marks to the first common language according to the expression information, so that the speech played by the audio output device 104 carries emotion and sounds more natural.
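The "intonation and speech-rate marks" could take the form of SSML-like prosody tags wrapped around the translated text; the mapping from expressions to prosody values below is purely illustrative and not taken from the patent:

```python
# Illustrative sketch of expression-driven prosody mark-up. Expression labels
# and the SSML-like tag scheme are assumptions.
PROSODY = {
    "happy": {"pitch": "+15%", "rate": "fast"},
    "sad": {"pitch": "-10%", "rate": "slow"},
    "neutral": {"pitch": "+0%", "rate": "medium"},
}

def annotate(text, expression):
    """Wrap the common-language text in prosody marks chosen by expression;
    unknown expressions fall back to neutral prosody."""
    p = PROSODY.get(expression, PROSODY["neutral"])
    return f'<prosody pitch="{p["pitch"]}" rate="{p["rate"]}">{text}</prosody>'

print(annotate("nice to meet you", "happy"))
# prints <prosody pitch="+15%" rate="fast">nice to meet you</prosody>
```

A downstream TTS engine that understands such mark-up would then render the audio with the indicated emotion.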
The sign language translation terminal described in the above embodiments may be integrated, with all functional modules built into one terminal, which suits one-on-one communication and is easy to carry and use. Alternatively, it may be partially separated, with individual functional modules as stand-alone devices, which suits scenarios where one party communicates with many people in turn, such as a service counter, a selection event, or an interview.
In addition, the sign language translation terminal provided by the utility model may further include a storage battery, a wireless charging module, and the like. Their connections to the other functional modules and the functions they provide follow the prior art and are not described again here.
In another embodiment, as shown in fig. 7, the utility model further provides a sign language translation server 20, including: a transceiver 201, a translation module 202, and a voice editing module 203; the translation module 202 is connected to the transceiver 201 and to the voice editing module 203.
The transceiver 201 is used for receiving image data, sent by the terminal 10, containing a first sign language motion image.
The translation module 202 is used for translating the sign language motions in the first sign language motion image into a first common language.
Specifically, the translation module 202 includes a processing chip capable of translating the sign language motions in a sign language motion image into a common language.
In one implementation, the translation module 202 may be a stand-alone embedded processor chip of the MIPS or ARM architecture, a DSP chip, or the like. Such chips store, in advance, programs that translate the sign language motions in a sign language motion image into a common language using existing image recognition algorithms.
The voice editing module 203 is used for selecting a target timbre from multiple timbres and converting the first common language into audio data in the target timbre.
Specifically, the voice editing module 203 includes a processing chip that converts the common language into audio data in the target timbre.
In one implementation, the voice editing module 203 may be a stand-alone embedded processor chip of the MIPS or ARM architecture, a DSP chip, or the like. Such chips store corresponding computer program code in advance, and the programs convert the common language into audio data in the target timbre using existing audio processing algorithms.
The transceiver 201 is further used for sending the audio data to the terminal 10.
In the embodiment of the utility model, the server 20 receives the sign language motions sent by the terminal 10, translates them into speech, and sends the speech back to the terminal 10 for playback. The server 20 of this embodiment not only enables communication between deaf-mute people and the general population, but also, through the voice editing module 203, converts the text or speech into audio data in a target timbre matching the user's listening habits, improving the user experience.
In summary, the sign language translation terminal and sign language translation server provided by embodiments of the utility model can capture the sign language motions of a signer, translate those motions into a common language, convert the common language into audio data in a target timbre, and output the result, enabling barrier-free communication between deaf-mute people and the general population. Compared with the prior art, the added voice editing module converts the common language into audio data in a target timbre better matching the user's listening habits before playback, improving the user experience; and the translation module and the voice editing module can be deployed on a remote server so that translation and timbre selection are performed server-side.
The above description covers only specific embodiments of the utility model; the scope of protection is not limited thereto. Any change or substitution that a person skilled in the art can readily conceive within the technical scope disclosed by the utility model shall fall within the scope of protection, which shall therefore be subject to the scope of the claims.

Claims (10)

1. A sign language translation terminal, comprising: a camera, a translation module, a voice editing module, and an audio output device; the camera is connected to the translation module, the translation module is connected to the voice editing module, and the voice editing module is connected to the audio output device;
the camera is used for capturing a first sign language motion image;
the translation module is used for translating the sign language motions in the first sign language motion image into a first common language;
the voice editing module is used for selecting a target timbre from a plurality of timbres and converting the first common language into audio data in the target timbre;
the audio output device is used for playing the audio data;
the translation module specifically comprises: a processing chip having the function of translating the sign language motions in the sign language motion image into a common language;
the voice editing module specifically comprises: a processing chip that converts the common language into audio data in the target timbre.
2. The sign language translation terminal according to claim 1, further comprising: an acquisition module and an image output device;
the acquisition module is connected to the translation module, and the image output device is connected to the translation module;
the acquisition module is used for acquiring a second common language;
the translation module is further used for translating the second common language into a second sign language motion image;
and the image output device is used for displaying the second sign language motion image.
3. The sign language translation terminal according to claim 2, wherein the image output device comprises: an augmented reality (AR) device.
4. The sign language translation terminal according to claim 2, further comprising a translation mode switching module connected to the camera and to the acquisition module;
the translation mode switching module is used for turning on the camera after receiving a first operation input by a user, so that the camera captures the first sign language motion image, and for turning on the acquisition module after receiving a second operation input by the user, so that the acquisition module acquires the second common language.
5. The sign language translation terminal according to claim 1, wherein the translation module specifically comprises: a transceiver;
the transceiver is used for sending image data containing the first sign language motion image to a server and receiving the first common language sent by the server.
6. The sign language translation terminal according to claim 5, further comprising a detection module connected to the translation module;
the detection module is used for detecting the network connection state of the sign language translation terminal;
the translation module is used for translating the sign language motions in the first sign language motion image into the first common language using offline software when the detection module detects that the sign language translation terminal is not connected to the internet.
7. The sign language translation terminal according to claim 1, further comprising: an augmented reality (AR) device connected to the translation module;
the AR device is used for displaying the text corresponding to the first common language.
8. The sign language translation terminal according to any one of claims 1 to 7, wherein the camera comprises a steerable camera;
the steerable camera is used for adjusting its shooting angle according to the position of a signer in the captured frame.
9. The sign language translation terminal according to claim 8, further comprising: an expression recognition module connected to the steerable camera and to the translation module;
the expression recognition module is used for recognizing the expression of a signer in the frame captured by the steerable camera, generating expression information, and sending the expression information to the translation module;
the translation module is specifically used for generating the first common language according to the expression information.
10. A sign language translation server, comprising: a transceiver, a translation module, and a voice editing module; the translation module is connected to the transceiver and to the voice editing module;
the transceiver is used for receiving image data, sent by a terminal, containing a first sign language motion image;
the translation module is used for translating the sign language motions in the first sign language motion image into a first common language;
the voice editing module is used for selecting a target timbre from a plurality of timbres and converting the first common language into audio data in the target timbre;
and the transceiver is further used for sending the audio data to the terminal.
CN201920711021.2U 2019-05-17 2019-05-17 Sign language translation terminal and sign language translation server Active CN210402846U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201920711021.2U CN210402846U (en) 2019-05-17 2019-05-17 Sign language translation terminal and sign language translation server

Publications (1)

Publication Number Publication Date
CN210402846U 2020-04-24

Family

ID=70346064




Legal Events

Date Code Title Description
GR01 Patent grant