CN112256827A - Sign language translation method and device, computer equipment and storage medium - Google Patents

Sign language translation method and device, computer equipment and storage medium

Info

Publication number
CN112256827A
CN112256827A
Authority
CN
China
Prior art keywords
translation
data
sign language
model group
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011122840.7A
Other languages
Chinese (zh)
Inventor
洪振厚
王健宗
瞿晓阳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011122840.7A priority Critical patent/CN112256827A/en
Priority to PCT/CN2020/134561 priority patent/WO2021179703A1/en
Publication of CN112256827A publication Critical patent/CN112256827A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/635 Filtering based on additional data, e.g. user or group profiles
    • G06F16/636 Filtering based on additional data, e.g. user or group profiles by using biological or physiological data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/02 Preprocessing
    • G06F2218/04 Denoising
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08 Feature extraction

Abstract

The invention discloses a sign language translation method and device, computer equipment and a storage medium, relating to the field of sign language translation. The method can translate the sign language of deaf-mute people in different regions: sign language data carrying region information sent by a user is acquired, and the translation model group associated with that region information is selected, so that using region-specific translation models improves the accuracy of the translation result. The translation model group then translates the sign language data to obtain translation data, enabling barrier-free communication between deaf-mute people and hearing people.

Description

Sign language translation method and device, computer equipment and storage medium
Technical Field
The present invention relates to the field of sign language translation, and in particular, to a sign language translation method and apparatus, a computer device, and a storage medium.
Background
With the development of public health care for the disabled in China, the demand of deaf-mute people to participate in society continues to grow. In recent years, with continuing advances in linguistics, computer science, graphics and imaging, precision machinery and other disciplines, research on sign language translation systems has deepened both in China and abroad, and a number of portable sign-language/speech inter-translation devices have appeared on the market. Such devices enable people who do not know sign language to communicate smoothly with sign language users, facilitating daily communication between hearing-impaired people and hearing people; this research has mainly focused on vision-based sign language translators.
The main working process of a vision-based sign language translator is as follows: image acquisition equipment captures the motion of key points of the hands to obtain gesture data, and the sign language is then converted into visible text or read aloud by speech software; conversely, the speech of hearing people is converted into text, so that both parties can communicate. Although existing vision-based sign language translators combine sign language recognition and sign language synthesis to translate sign language data, different countries or regions adopt different sign language standards and the gestures of sign language are not uniform. Prior-art sign language translation systems ignore the individual and regional differences of users, so intelligent sign language translation equipment misrecognizes signs during recognition, disturbing communication between deaf-mute and hearing people. Regional differences likewise lead to inaccurate sign language translation results.
In summary, the main problems of conventional sign language translation equipment are low gesture recognition precision and insufficient translation accuracy caused by differences in sign language actions across regions.
Disclosure of Invention
Aiming at the problems of low gesture recognition precision and insufficient translation accuracy caused by regional differences in sign language actions in existing sign language translation equipment, a sign language translation method and device, computer equipment and a storage medium are provided to improve the precision of sign language translation results for different regions.
In order to achieve the above object, the present invention provides a sign language translation method, including:
acquiring sign language data carrying regional information sent by a user;
selecting a translation model group associated with a preset region range according to the region information;
each translation model group is associated with a preset area range, and comprises at least two translation models;
translating the sign language data by adopting the translation model in the translation model group to acquire translation data;
converting the translated data into audio data.
Preferably, selecting a translation model group associated with a preset region range according to the region information includes:
matching the area information with a plurality of preset area ranges to obtain the preset area ranges matched with the area information;
and acquiring the translation model group associated with the preset region range.
Preferably, the selecting, according to the region information, a translation model group associated with the region information further includes:
acquiring a training sample set associated with the area information and a testing sample set associated with the area information;
training each initial classification model in the initial classification model group by adopting the training sample set;
testing each trained initial classification model by adopting the test sample set, and taking the trained initial classification model as a translation model if a test result meets a preset requirement;
each set of translation models associated with the region information includes a plurality of the translation models.
Preferably, translating the sign language data by using the translation model group to obtain translation data includes:
translating the sign language data by adopting each translation model in the translation model group respectively to acquire semantic probability;
and taking the semantic data corresponding to the highest semantic probability in all the semantic probabilities as translation data.
Preferably, translating the sign language data by using each translation model in the translation model group to obtain the semantic probability includes:
the sign language data comprises an EMG signal;
extracting the EMG signal from the sign language data, denoising the EMG signal by calculating and removing its mean, and segmenting the denoised signal to obtain feature data;
inputting the feature data into the translation model, and identifying the feature data through the translation model to acquire the semantic probability.
Preferably, the converting the translation data into audio data includes:
mapping the translation data to a preset sign language voice library to obtain audio data matched with the translation data;
wherein the preset sign language voice library includes: translation data and audio data associated with the translation data.
Preferably, the converting the translation data into audio data includes:
and identifying semantic information of the translation data, and converting the semantic information into the audio data by adopting a voice converter.
In order to achieve the above object, the present invention also provides a sign language interpretation apparatus, comprising:
the acquisition unit is used for acquiring sign language data carrying area information sent by a user;
the model selection unit is used for selecting a translation model group associated with a preset region range according to the region information;
the translation unit is used for translating the sign language data by adopting the translation models in the translation model group to acquire translation data;
a conversion unit for converting the translation data into audio data.
To achieve the above object, the present invention further provides a computer device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the above method when executing the computer program.
To achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program, which, when being executed by a processor, carries out the steps of the above method.
The sign language translation method and device, computer equipment and storage medium can translate the sign language of deaf-mute people in different regions: sign language data carrying region information sent by a user is acquired, and the translation model group associated with that region information is selected, so that using region-specific translation models improves the accuracy of the translation results. The translation model group then translates the sign language data to obtain translation data, enabling barrier-free communication between deaf-mute people and hearing people.
Drawings
FIG. 1 is a flowchart of an embodiment of a sign language translation method according to the present invention;
FIG. 2 is a flowchart of a method according to an embodiment of the present invention before selecting a set of translation models associated with a predetermined range of regions according to the region information;
FIG. 3 is a flowchart of a method for translating sign language data to obtain translated data according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for translating the sign language data using each of the set of translation models to obtain semantic probabilities according to an embodiment of the present invention;
FIG. 5 is a block diagram of an embodiment of a sign language translation device according to the present invention;
fig. 6 is a schematic hardware architecture diagram of an embodiment of a computer device according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
The sign language translation method and device, computer equipment and storage medium are suitable for the field of intelligent medical services. The method can translate the sign language of deaf-mute people in different regions: sign language data carrying region information sent by a user is acquired, and the translation model group associated with that region information is selected, so that using region-specific translation models improves the accuracy of the translation result. The translation model group then translates the sign language data to obtain translation data, enabling barrier-free communication between deaf-mute people and hearing people.
Example one
Referring to fig. 1, a sign language translation method of the present embodiment includes the following steps:
s1, acquiring sign language data carrying regional information sent by a user;
in this step, the area information is location information of a user (hearing-impaired person), and the location information may include positioning information and home location information. The location information can be obtained through a location module in the mobile terminal used by the user, the area information can be the information of the home location of the user, and the sign language areas (such as different countries or regions) used by the user are distinguished according to the location information. The positioning information may be the current location of the user, such as information obtained by positioning according to a positioning module in the intelligent terminal. The home location information may be the home location information of the user, or may be information filled by the user.
The sign language data originates from bioelectrical signals formed by the weak currents produced when muscles are at rest or contracting, captured by sensors such as a bracelet or arm ring. The sensors, made of conductive yarn, capture hand motions and the positions of the corresponding fingers, which represent the letters, numbers, words and phrases of sign language. The bracelet device converts finger motion into electrical signals and sends them to a circuit board on the bracelet, which wirelessly transmits the signals to a mobile terminal device such as a smartphone to generate the sign language data.
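As a concrete illustration, the payload that the wristband client might send to the mobile terminal could be structured as below. This is a minimal sketch: the type and field names (SignLanguagePacket, RegionInfo, emg_samples) are assumptions for illustration, not the patent's actual data format.

```python
# Minimal sketch of a sign-language data packet carrying region information.
# All names here are illustrative assumptions, not the patent's API.
from dataclasses import dataclass
from typing import List

@dataclass
class RegionInfo:
    positioning: str     # current location from the phone's positioning module, e.g. "Sichuan"
    home_location: str   # user-declared home location, e.g. "Sichuan"

@dataclass
class SignLanguagePacket:
    emg_samples: List[float]   # raw EMG amplitudes captured by the conductive-yarn sensors
    sample_rate_hz: int        # sampling rate of the bracelet's acquisition circuit
    region: RegionInfo         # region information carried with the sign language data

packet = SignLanguagePacket(
    emg_samples=[0.02, 0.15, -0.07],   # toy values for illustration
    sample_rate_hz=1000,
    region=RegionInfo(positioning="Sichuan", home_location="Sichuan"),
)
```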
S2, selecting a translation model group associated with a preset area range according to the area information;
each translation model group is associated with a preset area range, and the translation model group comprises at least two translation models.
Further, the step S2 includes:
s21, matching the area information with a plurality of preset area ranges to obtain the preset area ranges matched with the area information;
in this embodiment, each translation model group is associated with a preset region range, different preset region ranges are not overlapped with each other, different preset region ranges correspond to different translation model groups, and the translation model groups can be stored by using a database. The sign language data in this embodiment carries the region information, and when translating the sign language data from different regions, the database may be queried according to the region information corresponding to the sign language data to select a preset region range matching with the region information.
And S22, acquiring the translation model group associated with the preset area range.
In this embodiment, the corresponding translation model group is determined according to the matched preset region range, so that the translation models in that group are used to translate the sign language data.
By way of example and not limitation, when the region information carried by the sign language data is Sichuan province, a translation model group matched with Sichuan province is selected from the database to translate the sign language data; when the region information carried by the sign language data is Jiangsu province, a translation model group matched with Jiangsu province is selected from the database to translate the sign language data, and so on.
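In code, this selection step can be pictured as a lookup keyed by the matched preset region range, as in the sketch below. The table contents, group identifiers and the load_model_group helper are illustrative assumptions; the patent does not specify a storage layout.

```python
# Minimal sketch of step S2: match region information to a preset region
# range and fetch the associated translation model group. Names are assumed.
PRESET_REGION_RANGES = {
    "Sichuan": "model_group_sichuan",   # each preset region range maps to one model group
    "Jiangsu": "model_group_jiangsu",
}

def load_model_group(group_id: str) -> list:
    # Placeholder: a real system would deserialize the trained models
    # stored in the database under this group identifier.
    return []

def select_model_group(region_info: str) -> list:
    group_id = PRESET_REGION_RANGES.get(region_info)
    if group_id is None:
        # No matching model group in the database: per the note below, a new
        # group would be trained and the database updated.
        raise LookupError(f"no translation model group for region {region_info!r}")
    return load_model_group(group_id)
```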
It should be noted that if no translation model matching the region information exists in the database storing the translation model groups, a translation model group is trained, stored in the database, and the database's translation model groups are updated, so that sign language data from different regions can be better matched to translation models and sign language translation results with higher accuracy and stronger regional specificity are obtained.
The translation model set may be obtained by training an initial classification model set, or may be a translation model set trained in advance.
Further, before the step S2, the method further includes (shown with reference to fig. 2):
A1. acquiring a training sample set associated with the area information and a test sample set associated with the area information;
in this step, the training sample set is a set of data for finding and predicting potential relationships, including sign language data without sign language action semantic labeling, and the testing sample set is a set of data for evaluating the strength and utility of the predicted relationships, including sign language data with sign language action semantic labeling.
The labeling can be performed manually, annotating the action semantics of the sign language.
A2. Training each initial classification model in the initial classification model group by adopting the training sample set;
In this step, multiple users of different genders and ages from different regions can demonstrate the same sign language gesture under different emotional states, following prompts on a mobile phone. Sensors such as a bracelet or arm ring capture the bioelectrical signals formed by the weak currents produced when muscles are at rest or contracting, yielding sign language data. The acquired sign language data is translated by the initial classification model, the translation result is fed back and updated through a feedback mechanism, and a corresponding sign language translation library is generated. For example, if the actual semantics of the sign language action captured by the sensors is 'where is the convenience store' and the initial classification model also translates it as 'where is the convenience store', no update or feedback is performed and the semantics is stored in the corresponding sign language translation library; if the translation result is not 'where is the convenience store', the wrong translation result is fed back and the initial classification model is updated.
In the training stage of the initial classification model group, the training samples can be provided by users in the same region (community), and only the training set is used during model training. The test set is used only when testing the accuracy of the resulting model; it is independent of the training set but follows the same probability distribution as the data in the training set.
A3. Testing each trained initial classification model by adopting the test sample set, and taking the trained initial classification model as a translation model if the test result meets a preset requirement;
In this step, the test result refers to the result of translating sign language data with the trained initial classification model, and the translation accuracy of the model can be tested with the semantically labeled sign language data in the test sample set. For example, if the test sample set contains 100 groups of sign language data and each group is translated by the initial model, the model is judged to meet the preset requirement when the accuracy of the test results is greater than or equal to 90%, and the trained initial classification model is then taken as a translation model.
A4. Each translation model group associated with the region information includes a plurality of the translation models;
In this embodiment, at least two of the following models are selected as translation models:
the long short-term memory model, the gated recurrent unit model, and the sequence-to-sequence model.
The Long Short-Term Memory model (LSTM) is a special recurrent neural network (RNN) applicable to speech recognition, language modeling and translation. In a conventional RNN, the effect of long-term memory cannot be reflected during training, so a memory cell is needed to store long-term state; this is why the LSTM was proposed. A conventional neural network also does not relate information across time. For example, when input sign language data has the semantics 'hello', a conventionally trained neural network capable of sign language translation cannot correctly translate the same sign language data in the future; that is, it cannot infer the next event from a previous judgment. The LSTM's network structure contains loops so that earlier training information is retained. Although a conventional RNN can also address this problem, the LSTM performs better and is therefore selected as a translation model.
The Gated Recurrent Unit (GRU) is a commonly used gated recurrent neural network and a variant of the LSTM. It preserves the effectiveness of the LSTM while having a simpler structure and faster processing, which makes it very popular; this model is therefore also selected as a translation model in this scheme.
Sequence-to-Sequence models (Seq2Seq) also perform well in tasks such as translation and speech recognition, and can handle series data with sequential relationships, such as speech, text and video. A Seq2Seq model combines two recurrent neural networks: one receives the source sentence (the encoding process) and the other outputs the sentence in the target language (the decoding process). Using this encoder-decoder structure when actually training the translation model avoids error accumulation.
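For illustration, two of the candidate architectures named above (an LSTM classifier and a GRU classifier over windows of EMG features) could be defined with Keras as below. The layer sizes, input shape and semantics vocabulary size are assumptions, not values given in the patent.

```python
# Minimal sketch of LSTM and GRU translation models over EMG feature windows.
# Shapes and sizes are illustrative assumptions.
import tensorflow as tf

def make_recurrent_classifier(cell: str, num_semantics: int = 500,
                              timesteps: int = 64, features: int = 8) -> tf.keras.Model:
    Recurrent = tf.keras.layers.LSTM if cell == "lstm" else tf.keras.layers.GRU
    return tf.keras.Sequential([
        tf.keras.Input(shape=(timesteps, features)),   # windows of EMG feature vectors
        Recurrent(128),                                # recurrent layer keeps temporal context
        tf.keras.layers.Dense(num_semantics, activation="softmax"),  # one probability per semantics
    ])

# A translation model group contains at least two translation models.
model_group = [make_recurrent_classifier("lstm"), make_recurrent_classifier("gru")]
```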
S3, translating the sign language data by adopting the translation model in the translation model group to obtain translation data;
In this step, sign language data from different regions is translated by the corresponding trained translation models to obtain accurate translation results, and corresponding region-specific sign language translation libraries are generated. This improves the accuracy of the translation results for sign language data from different regions and addresses the problem that the same sign language action may correspond to different meanings in different regions.
Further, referring to fig. 3, the step S3 may include:
s31, translating the sign language data by adopting each translation model in the translation model group respectively to obtain semantic probability;
in this step, translating the sign language data refers to translating the acquired sign language data with the same sign language semantic respectively by using each model in the model group, and acquiring a translation result respectively.
And S32, taking the semantic data corresponding to the highest semantic probability among all the semantic probabilities as translation data.
In this step, the translation results obtained for the same sign language semantics are compared. For example, if the semantic probabilities obtained by the different translation models are 90%, 92% and 95%, the semantic data with the 95% probability is selected as the translation data.
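A minimal sketch of this highest-probability selection (steps S31-S32) follows, assuming each model exposes a Keras-style predict method that returns one probability per candidate semantics; the names are illustrative.

```python
# Minimal sketch of steps S31/S32: translate the same feature data with every
# model in the group and keep the semantics with the highest probability.
import numpy as np

def translate(feature_data: np.ndarray, model_group, semantics_vocab):
    best_prob, best_semantics = 0.0, None
    for model in model_group:
        probs = model.predict(feature_data[np.newaxis, ...])[0]  # S31: one probability per semantics
        idx = int(np.argmax(probs))
        if probs[idx] > best_prob:            # S32: e.g. 0.95 beats 0.92 and 0.90
            best_prob = float(probs[idx])
            best_semantics = semantics_vocab[idx]
    return best_semantics, best_prob
```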
Further, the step S31 may include (shown with reference to fig. 4):
s311, extracting an EMG signal in the sign language data, denoising the EMG signal in a calculation and division average mode, and cutting the denoised signal to obtain characteristic data;
specifically, in this embodiment, the start and end points of the EMG signal are determined, the EMG signal is subjected to a score average, and db12 wavelet transform denoising is performed on the algorithmically averaged signal; and identifying whether the signal is within a preset threshold range, if so, determining the signal as an active segment (if the signal is higher than the initial threshold and lower than the offset threshold, determining the signal as the active segment), and extracting the characteristic data corresponding to the active segment.
S312, inputting the feature data into the translation model, and identifying the feature data through the translation model to obtain the semantic probability;
In this step, after the translation model completes feature extraction on the sign language data, it finally outputs the probabilities representing the current sign language semantics. The translation model can be a long short-term memory model, a gated recurrent unit model, or a sequence-to-sequence model. For example, with a long short-term memory model, the model itself is trained on the sign language data in the training sample set, determines the features of the current sign language data, and finally outputs the probabilities representing the current sign language semantics. Since this replaces the conventional step of obtaining the probability of a given feature through a manually provided feature extraction function, the dependence of conditional random field models on such functions, and its effect on translation results, is also alleviated.
And S4, converting the translation data into audio data.
In this step, text-to-speech (TTS) technology can be used to convert the translation data into audio data.
Further, the step S4 may include:
mapping the translation data to a preset sign language voice library to obtain audio data matched with the translation data;
wherein the preset sign language voice library includes: translation data and audio data associated with the translation data.
In the present embodiment, the audio data is stored by a preset sign language voice library.
It is emphasized that, to further ensure the privacy and security of the audio data, the audio data may also be stored in a node of a blockchain.
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
In this embodiment, the obtained sign language voice library can be fed back to users in different regions, or stored in the cloud so that users who need it can download it themselves. The audio data may be data already in the sign language voice library, or audio data obtained by translating the data.
The audio can be played through a pronunciation module of the sign language translation equipment. For example, if the text represented by the audio data is 'where is the convenience store', the pronunciation module reads the audio data aloud so that the deaf-mute user can communicate normally with hearing people.
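A minimal sketch of the voice-library lookup described above is shown below. The library contents are illustrative, and the gTTS call is an assumed stand-in for the TTS component used when no matching entry exists, not the mechanism specified by the patent.

```python
# Minimal sketch of step S4 (first variant): map translation data to the
# preset sign language voice library, falling back to TTS synthesis.
from gtts import gTTS

SIGN_LANGUAGE_VOICE_LIBRARY = {
    "where is the convenience store": "audio/convenience_store.mp3",
}

def to_audio(translation_text: str) -> str:
    path = SIGN_LANGUAGE_VOICE_LIBRARY.get(translation_text)
    if path is None:                        # no matching entry in the voice library
        path = "audio/generated.mp3"
        gTTS(translation_text).save(path)   # synthesize speech from the translation data
    return path
```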
Further, the step S4 may further include:
and identifying semantic information of the translation data, and converting the semantic information into the audio data by adopting a voice converter.
In this embodiment, natural language processing (NLP) technology may be used to process the semantic information of the translation data. The semantic information of the translation data may be text information, such as a sentence or a word. Through syntactic and semantic analysis, the semantic information of the translation data is parsed and polysemous words are disambiguated, yielding complete semantic information with high accuracy. For example, if the semantic information of the translation data is 'you are asking where the convenience store is, thank you', processing it with NLP technology yields semantic information that is clearer in expression. The voice converter then uses TTS (text-to-speech) technology to convert the translation data into audio data, realizing barrier-free communication between deaf-mute people and hearing people.
In this embodiment, the sign language translation method can translate the sign language of deaf-mute people in different regions: sign language data carrying region information sent by a user is acquired, and the translation model group associated with that region information is selected, so that using region-specific translation models improves the accuracy of the translation results. The translation model group then translates the sign language data to obtain translation data, enabling barrier-free communication between deaf-mute people and hearing people.
Example two
Referring to fig. 5, the sign language translation apparatus 1 of this embodiment comprises an acquisition unit 11, a model selection unit 12, a translation unit 13 and a conversion unit 14, wherein:
the acquisition unit 11 is configured to acquire sign language data carrying area information sent by a user;
the area information is the position information of the user (hearing-impaired person), and the position information can comprise positioning information and attribution information. The location information can be obtained through a location module in the mobile terminal used by the user, the area information can be the information of the home location of the user, and the sign language areas (such as different countries or regions) used by the user are distinguished according to the location information. The sign language data can capture a bioelectricity signal formed by weak current generated when muscles are static or contract through sensors such as a bracelet and an arm ring, and the signal is sent to mobile terminals such as a mobile phone to generate sign language data.
The model selection unit 12 is configured to select a translation model group associated with a preset region range according to the region information;
in this embodiment, the translation model group may be obtained by training an initial classification model group, or may be a translation model group trained in advance. Before the selecting the translation model group associated with the region information according to the region information, the method further comprises the following steps:
acquiring a training sample set associated with the area information and a testing sample set associated with the area information;
training each initial classification model in the initial classification model group by adopting the training sample set;
testing each trained initial classification model by adopting the test sample set, and taking the trained initial classification model as a translation model if a test result meets a preset requirement;
each set of translation models associated with the region information includes a plurality of the translation models.
The translation unit 13 is configured to translate the sign language data by using the translation model in the translation model group to obtain translation data;
the translation model selects at least two of the following models: long and short term memory model, gated cyclic unit model, sequence-to-sequence model. And for sign language data from different regions, translating the input sign language data from different regions through a trained translation model to obtain an accurate translation result, and generating a corresponding specific sign language translation library to improve the accuracy of the translation result of the sign language data for different regions.
A conversion unit 14 for converting the translation data into audio data;
the conversion from translation data To audio data can be completed by adopting a TTS (text To speech) text To speech technology, the data audio can be displayed in a pronunciation mode by a pronunciation module of sign language translation equipment, and the audio data can be read by the pronunciation module for the normal communication between the deaf-mute and the prosperous person.
In this embodiment, the apparatus can translate the sign language of deaf-mute people in different regions: sign language data carrying region information sent by a user is acquired, and the translation model group associated with that region information is selected, so that using region-specific translation models improves the accuracy of the translation results. The translation model group then translates the sign language data to obtain translation data, enabling barrier-free communication between deaf-mute people and hearing people.
EXAMPLE III
In order to achieve the above object, the present invention further provides a computer device 2. The computer device 2 may comprise a plurality of computer devices 2, and the components of the sign language interpretation apparatus 1 of the second embodiment may be dispersed across different computer devices 2. A computer device 2 may be a smartphone, a tablet computer, a notebook computer, a desktop computer, a rack-mounted server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster formed by a plurality of servers) that executes programs. The computer device 2 of the present embodiment includes at least, but is not limited to: a memory 21, a processor 23, a network interface 22, and the sign language interpretation apparatus 1, which can be communicatively connected to each other through a system bus (refer to fig. 6). It is noted that fig. 6 only shows the computer device 2 with some of its components, but it is to be understood that not all of the shown components are required to be implemented, and more or fewer components may be implemented instead.
In this embodiment, the memory 21 includes at least one type of computer-readable storage medium, which includes a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the storage 21 may be an internal storage unit of the computer device 2, such as a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), or the like provided on the computer device 2. Of course, the memory 21 may also comprise both an internal storage unit of the computer device 2 and an external storage device thereof. In this embodiment, the memory 21 is generally used for storing an operating system installed in the computer device 2 and various application software, such as program codes of the sign language translation method in the first embodiment. Further, the memory 21 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 23 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or other data Processing chip in some embodiments. The processor 23 is typically used for controlling the overall operation of the computer device 2, such as performing control and processing related to data interaction or communication with the computer device 2. In this embodiment, the processor 23 is configured to run the program code stored in the memory 21 or process data, for example, run the sign language interpretation apparatus 1.
The network interface 22 may comprise a wireless network interface or a wired network interface, and the network interface 22 is typically used to establish a communication connection between the computer device 2 and other computer devices 2. For example, the network interface 22 is used to connect the computer device 2 to an external terminal through a network, establish a data transmission channel and a communication connection between the computer device 2 and the external terminal, and the like. The network may be a wireless or wired network such as an Intranet (Intranet), the Internet (Internet), a Global System of Mobile communication (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth (Bluetooth), Wi-Fi, and the like.
In this embodiment, the sign language interpretation apparatus 1 stored in the memory 21 may be further divided into one or more program modules, and the one or more program modules are stored in the memory 21 and executed by one or more processors (in this embodiment, the processor 23) to complete the present invention.
Example four
To achieve the above objects, the present invention also provides a computer-readable storage medium including a plurality of storage media such as a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application store, etc., on which a computer program is stored, which when executed by the processor 23, implements corresponding functions. The computer readable storage medium of the embodiment is used for storing the sign language interpretation apparatus 1, and when being executed by the processor 23, the computer readable storage medium implements the sign language interpretation method of the first embodiment.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A sign language translation method, comprising:
acquiring sign language data carrying regional information sent by a user;
selecting a translation model group associated with a preset region range according to the region information;
each translation model group is associated with a preset area range, and comprises at least two translation models;
translating the sign language data by adopting the translation model in the translation model group to acquire translation data;
converting the translated data into audio data.
2. The sign language translation method according to claim 1, wherein selecting a translation model group associated with a preset region range according to the region information comprises:
matching the area information with a plurality of preset area ranges to obtain the preset area ranges matched with the area information;
and acquiring the translation model group associated with the preset region range.
3. The sign language translation method according to claim 1, wherein selecting a translation model group associated with the region information according to the region information further comprises:
acquiring a training sample set associated with the area information and a testing sample set associated with the area information;
training each initial classification model in the initial classification model group by adopting the training sample set;
testing each trained initial classification model by adopting the test sample set, and taking the trained initial classification model as a translation model if a test result meets a preset requirement;
each set of translation models associated with the region information includes a plurality of the translation models.
4. The sign language translation method according to claim 1, wherein translating the sign language data by using the translation model in the translation model group to obtain translation data comprises:
translating the sign language data by adopting each translation model in the translation model group respectively to acquire semantic probability;
and taking the semantic data corresponding to the highest semantic probability in all the semantic probabilities as translation data.
5. The sign language translation method according to claim 4, wherein translating the sign language data by using each translation model in the set of translation models to obtain semantic probabilities comprises:
the sign language data comprises an EMG signal;
extracting the EMG signal from the sign language data, denoising the EMG signal by calculating and removing its mean, and segmenting the denoised signal to obtain feature data;
inputting the feature data into the translation model, and identifying the feature data through the translation model to acquire the semantic probability.
6. The sign language translation method according to claim 1, converting the translation data into audio data, comprising:
mapping the translation data to a preset sign language voice library to obtain audio data matched with the translation data;
wherein the preset sign language voice library includes: translation data and audio data associated with the translation data.
7. The sign language translation method according to claim 1, converting the translation data into audio data, comprising:
and identifying semantic information of the translation data, and converting the semantic information into the audio data by adopting a voice converter.
8. A sign language interpretation apparatus comprising:
the acquisition unit is used for acquiring sign language data carrying area information sent by a user;
the model selection unit is used for selecting a translation model group associated with a preset region range according to the region information;
the translation unit is used for translating the sign language data by adopting the translation models in the translation model group to acquire translation data;
a conversion unit for converting the translation data into audio data.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202011122840.7A 2020-10-20 2020-10-20 Sign language translation method and device, computer equipment and storage medium Pending CN112256827A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011122840.7A CN112256827A (en) 2020-10-20 2020-10-20 Sign language translation method and device, computer equipment and storage medium
PCT/CN2020/134561 WO2021179703A1 (en) 2020-10-20 2020-12-08 Sign language interpretation method and apparatus, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011122840.7A CN112256827A (en) 2020-10-20 2020-10-20 Sign language translation method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112256827A true CN112256827A (en) 2021-01-22

Family

ID=74244342

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011122840.7A Pending CN112256827A (en) 2020-10-20 2020-10-20 Sign language translation method and device, computer equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112256827A (en)
WO (1) WO2021179703A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113780013A (en) * 2021-07-30 2021-12-10 阿里巴巴(中国)有限公司 Translation method, translation equipment and readable medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114157920B (en) * 2021-12-10 2023-07-25 深圳Tcl新技术有限公司 Method and device for playing sign language, intelligent television and storage medium

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106295603A (en) * 2016-08-18 2017-01-04 广东技术师范学院 Chinese sign language bidirectional translation system, method and apparatus
CN106383579A (en) * 2016-09-14 2017-02-08 西安电子科技大学 EMG and FSR-based refined gesture recognition system and method
WO2017161741A1 (en) * 2016-03-23 2017-09-28 乐视控股(北京)有限公司 Method and device for communicating information with deaf-mutes, smart terminal
CN109214347A (en) * 2018-09-19 2019-01-15 北京因时机器人科技有限公司 A kind of sign language interpretation method across languages, device and mobile device
CN109271901A (en) * 2018-08-31 2019-01-25 武汉大学 A kind of sign Language Recognition Method based on Multi-source Information Fusion
CN109960814A (en) * 2019-03-25 2019-07-02 北京金山数字娱乐科技有限公司 Model parameter searching method and device
CN110008839A (en) * 2019-03-08 2019-07-12 西安研硕信息技术有限公司 A kind of intelligent sign language interactive system and method for adaptive gesture identification
CN110413106A (en) * 2019-06-18 2019-11-05 中国人民解放军军事科学院国防科技创新研究院 A kind of augmented reality input method and system based on voice and gesture
CN110992783A (en) * 2019-10-29 2020-04-10 东莞市易联交互信息科技有限责任公司 Sign language translation method and translation equipment based on machine learning
CN111354246A (en) * 2020-01-16 2020-06-30 浙江工业大学 System and method for helping deaf-mute to communicate

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9282377B2 (en) * 2007-05-31 2016-03-08 iCommunicator LLC Apparatuses, methods and systems to provide translations of information into sign language or other formats
CN110210721B (en) * 2019-05-14 2023-11-21 株洲手之声信息科技有限公司 Remote sign language online translation customer service distribution method and device

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017161741A1 (en) * 2016-03-23 2017-09-28 乐视控股(北京)有限公司 Method and device for communicating information with deaf-mutes, smart terminal
CN106295603A (en) * 2016-08-18 2017-01-04 广东技术师范学院 Chinese sign language bidirectional translation system, method and apparatus
CN106383579A (en) * 2016-09-14 2017-02-08 西安电子科技大学 EMG and FSR-based refined gesture recognition system and method
CN109271901A (en) * 2018-08-31 2019-01-25 武汉大学 A kind of sign Language Recognition Method based on Multi-source Information Fusion
CN109214347A (en) * 2018-09-19 2019-01-15 北京因时机器人科技有限公司 A kind of sign language interpretation method across languages, device and mobile device
CN110008839A (en) * 2019-03-08 2019-07-12 西安研硕信息技术有限公司 A kind of intelligent sign language interactive system and method for adaptive gesture identification
CN109960814A (en) * 2019-03-25 2019-07-02 北京金山数字娱乐科技有限公司 Model parameter searching method and device
CN110413106A (en) * 2019-06-18 2019-11-05 中国人民解放军军事科学院国防科技创新研究院 A kind of augmented reality input method and system based on voice and gesture
CN110992783A (en) * 2019-10-29 2020-04-10 东莞市易联交互信息科技有限责任公司 Sign language translation method and translation equipment based on machine learning
CN111354246A (en) * 2020-01-16 2020-06-30 浙江工业大学 System and method for helping deaf-mute to communicate

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113780013A (en) * 2021-07-30 2021-12-10 阿里巴巴(中国)有限公司 Translation method, translation equipment and readable medium

Also Published As

Publication number Publication date
WO2021179703A1 (en) 2021-09-16

Similar Documents

Publication Publication Date Title
US10977452B2 (en) Multi-lingual virtual personal assistant
CN110795543B (en) Unstructured data extraction method, device and storage medium based on deep learning
CN109918680B (en) Entity identification method and device and computer equipment
CN112650854B (en) Intelligent reply method and device based on multiple knowledge graphs and computer equipment
CN113704428B (en) Intelligent inquiry method, intelligent inquiry device, electronic equipment and storage medium
CN111382261B (en) Abstract generation method and device, electronic equipment and storage medium
CN111274797A (en) Intention recognition method, device and equipment for terminal and storage medium
CN112131368B (en) Dialogue generation method and device, electronic equipment and storage medium
CN111695338A (en) Interview content refining method, device, equipment and medium based on artificial intelligence
CN106713111B (en) Processing method for adding friends, terminal and server
CN112256827A (en) Sign language translation method and device, computer equipment and storage medium
CN108304387B (en) Method, device, server group and storage medium for recognizing noise words in text
CN112463942A (en) Text processing method and device, electronic equipment and computer readable storage medium
CN115309877A (en) Dialog generation method, dialog model training method and device
US20220358297A1 (en) Method for human-machine dialogue, computing device and computer-readable storage medium
CN116050352A (en) Text encoding method and device, computer equipment and storage medium
CN111414453A (en) Structured text generation method and device, electronic equipment and computer readable storage medium
CN112836019B (en) Public medical health named entity identification and entity linking method and device, electronic equipment and storage medium
Ramadani et al. A new technology on translating Indonesian spoken language into Indonesian sign language system.
CN113515593A (en) Topic detection method and device based on clustering model and computer equipment
CN115132182B (en) Data identification method, device, equipment and readable storage medium
CN114611529B (en) Intention recognition method and device, electronic equipment and storage medium
CN112836522B (en) Method and device for determining voice recognition result, storage medium and electronic device
CN116189663A (en) Training method and device of prosody prediction model, and man-machine interaction method and device
CN114067362A (en) Sign language recognition method, device, equipment and medium based on neural network model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40043402

Country of ref document: HK

SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination