CN116932706A - Chinese translation method and electronic equipment - Google Patents

Chinese translation method and electronic equipment

Info

Publication number
CN116932706A
Authority
CN
China
Prior art keywords
user
mouth
action
hand
vocabulary
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210396448.4A
Other languages
Chinese (zh)
Inventor
谢雨晨
常亚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huawei Technologies Co Ltd
Original Assignee
Huawei Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co Ltd filed Critical Huawei Technologies Co Ltd
Priority to CN202210396448.4A priority Critical patent/CN116932706A/en
Priority to PCT/CN2023/086870 priority patent/WO2023197949A1/en
Publication of CN116932706A publication Critical patent/CN116932706A/en
Pending legal-status Critical Current


Classifications

    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/35 Clustering; Classification
    • G06F40/295 Named entity recognition
    • G09B21/00 Teaching, or communicating with, the blind, deaf or mute
    • G09B21/009 Teaching or communicating with deaf persons

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application provides a Chinese translation method and an electronic device. The method includes: in response to a user input, the electronic device acquires text information, the text information including a keyword; the electronic device displays a hand action corresponding to the text information; and the electronic device displays a mouth action corresponding to the keyword. By adding, when the text information is translated into sign language, mouth actions that conform to the habits of sign language users, the method improves the accuracy of language expression when Chinese is translated into sign language, reduces misunderstanding of the translation result by sign language users, strengthens communication with sign language users, and improves the application experience of users of the electronic device.

Description

Chinese translation method and electronic equipment
Technical Field
The present application relates to the field of computers, and in particular to a Chinese translation method and an electronic device.
Background
Digital humans can help sign language users understand language information through hand actions and/or mouth actions.
When a sign language digital human performs hand actions and simultaneously performs corresponding mouth actions to express a sentence or a meaning, the mouth actions sometimes fail to assist understanding of the hand actions and may instead cause unnecessary misunderstanding. For example, when the hand actions follow natural sign language, whose word order differs from that of spoken language, mouthing the sentence the way a hearing person would speak it means that the hand actions and mouth actions may not express the same content at the same time, which can lead to misunderstanding.
Disclosure of Invention
The present application provides a Chinese translation method in which, when text information is translated into sign language, mouth actions are added only for keywords, which helps improve the accuracy of language expression when Chinese is translated into sign language.
In a first aspect, a Chinese translation method is provided, including: in response to a user input, an electronic device acquires text information, the text information including a keyword; the electronic device displays a hand action corresponding to the text information; and the electronic device displays a mouth action corresponding to the keyword.
In one possible implementation, the keyword is identified by the electronic device from the text information input by the user in one or more of the following ways: according to the content of the text information, according to the user's translation history information, or according to keywords that other users have determined in the same text information.
It should be noted that the keyword may be one or more characters contained in the text information, or one or more words contained in the text information.
It should be noted that the order in which a sign language user expresses vocabulary when performing sign language actions may differ from the order of natural spoken language; accordingly, the order of the hand actions corresponding to the text information displayed by the electronic device may be determined according to the habits of sign language users.
When the sign language is played, corresponding mouth actions are added only for the keywords, so the text information is translated into sign language that conforms to the expression habits of sign language users. This improves the accuracy of the translation result, reduces the probability that sign language users misunderstand the translated sign language, and strengthens communication with sign language users.
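As an illustration only, and not a limitation of the claimed solution, the following minimal sketch shows one way the keywords that receive mouth actions could be selected from the segmented text information, for example according to a proper-noun lexicon, the user's translation history, and words the user explicitly requested. The lexicon contents, function names, and example words are assumptions made purely for exposition.

    # Minimal sketch; the lexicon, names, and example words are illustrative assumptions.
    PROPER_NOUNS = {"华为", "深圳"}  # hypothetical proper-noun lexicon

    def select_keywords(words, history=None, requested=None):
        """Return the words in the segmented text that should get a mouth action.
        words:     text information already segmented into vocabulary
        history:   words the user previously asked to add mouth actions for
        requested: words the user explicitly requested in this translation"""
        history = set(history or [])
        requested = set(requested or [])
        keywords = []
        for w in words:
            if w in requested or w in history or w in PROPER_NOUNS:
                keywords.append(w)  # keyword: hand action plus mouth action
            # otherwise: ordinary vocabulary, hand action only
        return keywords

    # Example: only the proper noun "华为" would be mouthed.
    print(select_keywords(["我", "在", "华为", "工作"]))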
With reference to the first aspect, in certain implementations of the first aspect, the keyword is determined according to the language habits of sign language users.
When the sign language is played, corresponding mouth actions are added for the keywords determined according to the language habits of sign language users, so the text information is translated into sign language that conforms to the expression habits of sign language users. This improves the accuracy of the translation result, reduces the probability that sign language users misunderstand the translated sign language, and strengthens communication with sign language users.
With reference to the first aspect, in some implementations of the first aspect, the electronic device does not display mouth actions corresponding to common vocabulary, where the text information includes the common vocabulary and the common vocabulary is different from the keyword.
In this solution, mouth actions are not displayed for common vocabulary that is not a keyword, which reduces the amount of data transferred during translation, improves the efficiency of data transmission and processing during translation, and improves the application experience of users of the electronic device.
With reference to the first aspect, in some implementations of the first aspect, the electronic device displays the mouth action while displaying the hand action corresponding to the keyword.
In this solution, the mouth action corresponding to the keyword is displayed while the hand action corresponding to the keyword is performed, which helps ensure the correspondence between hand actions and mouth actions, further improves the accuracy of the translation of the text information, and improves sign language users' understanding of the translated sign language.
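The following sketch, offered only as an illustration under assumed data structures, shows how a mouth action could be displayed over the same time span as the hand action of its keyword when a playback timeline is assembled.

    def build_timeline(hand_segments, keywords, mouth_db):
        """hand_segments: list of (word, start_ms, end_ms), one per hand action.
        keywords: words that should also receive a mouth action.
        mouth_db: assumed mapping from a keyword to its mouth-action frames.
        Mouth events reuse the keyword's time span, so the mouth action is
        shown while the keyword's hand action is being displayed."""
        timeline = []
        for word, start, end in hand_segments:
            timeline.append(("hand", word, start, end))
            if word in keywords:
                timeline.append(("mouth", mouth_db[word], start, end))
        return timeline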
With reference to the first aspect, in certain implementations of the first aspect, the keyword is a proper noun.
The proper noun may include one or more of the following: person names, place names, organization names, titles of works, and other proper nouns.
Adding mouth actions for proper nouns helps improve sign language users' understanding of proper nouns that are otherwise difficult to understand, and strengthens communication with sign language users.
With reference to the first aspect, in some implementations of the first aspect, before displaying the mouth action corresponding to the keyword, the electronic device displays a first vocabulary, where the text information includes the first vocabulary and the first vocabulary is a vocabulary recommended for adding a mouth action; and in response to a confirmation operation by the user, the electronic device determines the first vocabulary to be a keyword.
In this solution, keywords are recommended to the user of the electronic device, and mouth actions are added to a recommended keyword after the user confirms it. This helps sign language learners better understand how sign language is used, improves the application experience of users of the electronic device, and improves the efficiency with which learners study sign language.
With reference to the first aspect, in some implementations of the first aspect, before displaying the mouth action corresponding to the keyword, the electronic device obtains a second vocabulary in response to a first input by the user, where the second vocabulary is a vocabulary for which the user requests that a mouth action be added; when the text information contains the second vocabulary, the electronic device determines the second vocabulary to be a keyword; when the text information does not contain the second vocabulary, the electronic device displays update request information, where the update request information is used to prompt that the text information does not contain the second vocabulary; and in response to a second input by the user, the electronic device obtains an updated second vocabulary.
This solution identifies the vocabulary for which the user requests an added mouth action and informs the user whether the text information contains that vocabulary. This improves the efficiency of translating Chinese into sign language, the accuracy of the translation of the text information, and the user's application experience.
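A hedged sketch of the flow described above follows; the prompt text and input handling are assumptions made for illustration and are not part of the claimed solution.

    def request_mouth_action(text_words, requested_word, ask_user_again):
        """text_words: segmented text information.
        requested_word: the word the user wants a mouth action for (first input).
        ask_user_again: callback that returns an updated word (second input).
        Returns the word finally determined as a keyword."""
        while requested_word not in text_words:
            # update request information: the text does not contain the word
            print(f"'{requested_word}' is not in the text; please enter another word.")
            requested_word = ask_user_again()
        return requested_word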
With reference to the first aspect, in some implementations of the first aspect, the first vocabulary is determined according to the user's translation history, where the translation history includes second vocabulary previously input by the user, and the second vocabulary is vocabulary for which the user requested that a mouth action be added.
The second vocabulary included in the translation history reflects, to a certain extent, the language habits and usage habits of the user of the electronic device. In this solution, vocabulary for added mouth actions is recommended to the user according to the history of the user's requests for added mouth actions. This helps determine the Chinese translation result according to the user's habits, improves the translation effect, and improves the application experience of users of the electronic device.
With reference to the first aspect, in certain implementations of the first aspect, the mouth action is determined according to the pronunciation mouth shape of the Chinese pinyin of the keyword.
With reference to the first aspect, in some implementations of the first aspect, the blend shape values corresponding to the pronunciation mouth shape are stored in a mouth action database.
By establishing a mouth action database, when a mouth action needs to be displayed, the electronic device sends a request message to the server and the server retrieves the required mouth action data from the database. Compared with obtaining mouth action data through schemes such as deep learning, this simplifies the process of obtaining mouth action data, improves translation efficiency, and improves the application experience of users of the electronic device.
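For illustration only, the sketch below shows one possible shape of such a mouth action database: each pinyin syllable maps to a set of blend shape values, and the syllables of a keyword's pinyin are looked up in turn. The syllables, channel names, and weights shown are invented for exposition and are not part of the claimed database.

    # Hypothetical mouth action database: pinyin syllable -> blend shape values.
    MOUTH_DB = {
        "hua": {"jawOpen": 0.55, "mouthPucker": 0.30},
        "wei": {"jawOpen": 0.25, "mouthStretch": 0.45},
    }

    def mouth_frames_for(pinyin_syllables):
        """Look up blend shape values for each syllable of a keyword's pinyin.
        Unknown syllables fall back to a neutral (closed) mouth shape."""
        neutral = {"jawOpen": 0.0}
        return [MOUTH_DB.get(s, neutral) for s in pinyin_syllables]

    # e.g. the keyword 华为, whose pinyin syllables are ["hua", "wei"]
    frames = mouth_frames_for(["hua", "wei"])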
With reference to the first aspect, in certain implementations of the first aspect, the hand action includes a first hand action and a second hand action, the first hand action being before the second hand action; the electronic device receives first hand action data from a server, the first hand action data being used to display the first hand action; and while displaying the first hand action, the electronic device receives second hand action data from the server, the second hand action data being used to display the second hand action.
Here, the first hand action or the second hand action may be one specific action, or may be one or more action frames contained in one specific action.
In this solution, the electronic device first receives the hand action data that needs to be displayed first, and receives the hand action data to be displayed later while displaying the earlier hand action. Transmitting the hand action data in slices and displaying while transmitting shortens the waiting time caused by data transmission and improves the user's application experience.
With reference to the first aspect, in certain implementations of the first aspect, the mouth action includes a first mouth action and a second mouth action, the first mouth action being before the second mouth action; the electronic device receives first mouth action data from the server, the first mouth action data being used to display the first mouth action; and while displaying the first mouth action, the electronic device receives second mouth action data from the server, the second mouth action data being used to display the second mouth action.
Here, the first mouth action or the second mouth action may be one specific action, or may be one or more action frames contained in one specific action.
In this solution, the electronic device first receives the mouth action data that needs to be displayed first, and receives the mouth action data to be displayed later while displaying the earlier mouth action. Transmitting the mouth action data in slices and displaying while transmitting shortens the waiting time caused by data transmission and improves the user's application experience.
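The sketch below illustrates, under assumed network and rendering interfaces, the idea of receiving action data in slices and displaying each slice while the next one is still being transferred. fetch_chunk and render_chunk are placeholders invented for this example, not a real client API.

    import queue
    import threading

    def play_streamed_actions(fetch_chunk, render_chunk, num_chunks):
        """fetch_chunk(i) returns slice i of hand or mouth action data;
        render_chunk(data) displays one slice (e.g. the first hand action)."""
        chunks = queue.Queue()

        def downloader():
            for i in range(num_chunks):
                chunks.put(fetch_chunk(i))  # later slices arrive during playback

        threading.Thread(target=downloader, daemon=True).start()
        for _ in range(num_chunks):
            # display the current slice while the next one downloads
            render_chunk(chunks.get())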
With reference to the first aspect, in some implementations of the first aspect, before displaying the hand action corresponding to the text information, the electronic device receives a response message from the server, where the response message is used to indicate that the text information does not contain sensitive information.
Before the text information is translated into hand actions and/or mouth actions, a text risk-control check is first performed on the text information. This helps filter out undesirable text information and improves the application experience of users of the electronic device.
In a second aspect, a Chinese translation method is provided, including: a server receives a translation request message, where the translation request message includes text information, the text information includes a keyword, the keyword is determined according to the language habits of sign language users, the translation request message is used to request hand action data corresponding to the text information, and the translation request message is also used to request mouth action data corresponding to the keyword; and the server determines, according to the text information, whether to send the hand action data and/or the mouth action data.
Here, the hand action data is used to display the hand actions corresponding to the text information, and the mouth action data is used to display the mouth actions corresponding to the keyword.
In one possible implementation, the keyword is identified by the electronic device from the text information input by the user in one or more of the following ways: according to the content of the text information, according to the user's translation history information, or according to keywords that other users have determined in the same text information.
It should be noted that the keyword may be one or more characters contained in the text information, or one or more words contained in the text information.
In this solution, mouth actions are added only for keywords determined according to the user's habits. This reduces the amount of data transferred between the electronic device and the server when the text information is translated into sign language and improves the efficiency with which the electronic device translates the text information.
With reference to the second aspect, in some implementations of the second aspect, the keyword is a proper noun.
With reference to the second aspect, in certain implementations of the second aspect, the server determines whether the text information contains sensitive information; if the text information contains sensitive information, the server sends a first response message, where the first response message is used to indicate that the text information contains sensitive information; if the text information does not contain sensitive information, the server sends a second response message, which includes the hand action data and/or the mouth action data.
Before the text information is translated into hand actions and/or mouth actions, a text risk-control check is first performed on the text information. This helps filter out undesirable text information and improves the application experience of users of the electronic device.
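The following is a minimal, assumption-laden sketch of the server-side flow: check the text information for sensitive content, return a first response message if it is found, and otherwise return a second response message carrying the hand action data and/or mouth action data. The message fields, word list, and helper functions are illustrative only.

    SENSITIVE_WORDS = {"example-banned-word"}  # placeholder lexicon

    def lookup_hand_actions(text):
        """Stub: fetch hand action data for the whole text information."""
        return {"text": text, "frames": []}

    def lookup_mouth_actions(keywords):
        """Stub: fetch mouth action data only for the keywords."""
        return {kw: [] for kw in keywords}

    def handle_translation_request(request):
        text = request["text"]                  # text information
        keywords = request.get("keywords", [])  # keywords needing mouth actions
        if any(w in text for w in SENSITIVE_WORDS):
            # first response message: the text contains sensitive information
            return {"status": "rejected", "reason": "sensitive information"}
        # second response message: carries the requested action data
        return {
            "status": "ok",
            "hand_action_data": lookup_hand_actions(text),
            "mouth_action_data": lookup_mouth_actions(keywords),
        }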
With reference to the second aspect, in certain implementations of the second aspect, the hand action data includes first hand action data used to display a first hand action and second hand action data used to display a second hand action, the first hand action being before the second hand action, and the server sends the second hand action data after sending the first hand action data.
Here, the first hand action or the second hand action may be one specific action, or may be one or more action frames contained in one specific action.
In this solution, the server first sends the hand action data that needs to be displayed first, and sends the hand action data to be displayed later while the earlier hand action is being displayed. Transmitting the hand action data in slices and displaying while transmitting shortens the waiting time caused by data transmission and improves the user's application experience.
With reference to the second aspect, in certain implementations of the second aspect, the mouth action data includes first mouth action data used to display a first mouth action and second mouth action data used to display a second mouth action, the first mouth action being before the second mouth action, and the server sends the second mouth action data after sending the first mouth action data.
Here, the first mouth action or the second mouth action may be one specific action, or may be one or more action frames contained in one specific action.
In this solution, the server first sends the mouth action data that needs to be displayed first, and sends the mouth action data to be displayed later while the earlier mouth action is being displayed. Transmitting the mouth action data in slices and displaying while transmitting shortens the waiting time caused by data transmission and improves the user's application experience.
With reference to the second aspect, in some implementations of the second aspect, the server obtains the mouth action data from a mouth action database, where the mouth action database includes blend shape values corresponding to pinyin pronunciation mouth shapes.
By creating a database for mouth action data, when a mouth action needs to be displayed, the electronic device sends a request message to the server and the server retrieves the required mouth action data from the database. Compared with obtaining mouth action data through schemes such as deep learning, this simplifies the process of obtaining mouth action data, improves translation efficiency, and improves the application experience of users of the electronic device.
In a third aspect, an electronic device is provided, including a processor and a memory, where the memory stores one or more computer programs, the one or more computer programs including instructions that, when executed by the processor, cause the electronic device to: in response to an input by the user, acquire text information, where the text information includes a keyword and the keyword is determined according to the language habits of sign language users; the processor is further configured to display the hand action corresponding to the text information; and the processor is further configured to display the mouth action corresponding to the keyword.
With reference to the third aspect, in some implementations of the third aspect, the keyword is determined according to language habits of the sign language user.
With reference to the third aspect, in some implementations of the third aspect, the processor is further configured to not display a mouth action corresponding to a common vocabulary, where the text information includes the common vocabulary, and the common vocabulary is different from the keyword.
With reference to the third aspect, in some implementations of the third aspect, the processor is specifically configured to display the mouth motion while displaying a hand motion corresponding to the keyword.
With reference to the third aspect, in some implementations of the third aspect, the processor is further configured to display a first vocabulary, where the text information includes the first vocabulary, and the first vocabulary is a vocabulary for recommending an additional mouth action; the processor is further configured to determine the first vocabulary as a keyword in response to a confirmation operation by the user.
With reference to the third aspect, in some implementations of the third aspect, in response to a first input by the user, the processor is configured to obtain a second vocabulary, where the second vocabulary is a vocabulary for which the user requests that a mouth action be added; the processor is further configured to determine the second vocabulary to be a keyword if the text information includes the second vocabulary; the processor is further configured to display update request information when the text information does not include the second vocabulary, where the update request information is used to prompt that the text information does not include the second vocabulary; and the processor is further configured to obtain an updated second vocabulary in response to a second input by the user.
With reference to the third aspect, in certain implementations of the third aspect, the hand actions include a first hand action and a second hand action, the first hand action being before the second hand action; the processor is further configured to receive first hand action data from the server, the first hand action data being used to display the first hand action; and while the first hand action is displayed, the processor is further configured to receive second hand action data from the server, the second hand action data being used to display the second hand action.
With reference to the third aspect, in certain implementations of the third aspect, the mouth action includes a first mouth action and a second mouth action, the first mouth action being before the second mouth action; the processor is further configured to receive first mouth action data from the server, the first mouth action data being used to display the first mouth action; and while the first mouth action is displayed, the processor is further configured to receive second mouth action data from the server, the second mouth action data being used to display the second mouth action.
With reference to the third aspect, in some implementations of the third aspect, the processor is further configured to receive a response message from the server, where the response message is used to indicate that the text information does not contain sensitive information.
In a fourth aspect, a server is provided, including a processor and a memory, where the memory stores one or more computer programs, the one or more computer programs including instructions that, when executed by the processor, cause the server to: receive a translation request message, where the translation request message includes text information, the translation request message is used to request hand action data corresponding to the text information, the text information includes a keyword, the keyword is determined according to the language habits of sign language users, and the translation request message is also used to request mouth action data corresponding to the keyword; the processor is further configured to determine, according to the text information, whether to send the hand action data and/or the mouth action data.
With reference to the fourth aspect, in some implementations of the fourth aspect, the processor is further configured to determine whether the text information includes sensitive information; in the case that the text information contains sensitive information, the processor is further configured to send a first response message, where the first response message is used to indicate that the text information contains sensitive information; in the case that the text information does not contain sensitive information, the processor is further configured to send a second response message, the second response message including hand motion data and/or mouth motion data.
With reference to the fourth aspect, in some implementations of the fourth aspect, the hand movement data includes first hand movement data for displaying a first hand movement and second hand movement data for displaying a second hand movement, the first hand movement being before the second hand movement, the processor being further configured to send the second hand movement data after sending the first hand movement data.
With reference to the fourth aspect, in certain implementations of the fourth aspect, the mouth action data includes first mouth action data used to display a first mouth action and second mouth action data used to display a second mouth action, the first mouth action being before the second mouth action, and the processor is further configured to send the second mouth action data after sending the first mouth action data.
With reference to the fourth aspect, in some implementations of the fourth aspect, the processor is further configured to obtain the mouth action data from a mouth action database, where the mouth action database includes blend shape values corresponding to pinyin pronunciation mouth shapes.
In a fifth aspect, a Chinese translation apparatus is provided, including an obtaining unit and a processing unit, where the obtaining unit is configured to obtain, in response to an input by the user, text information, the text information including a keyword, the keyword being determined according to the language habits of sign language users; the processing unit is configured to display the hand action corresponding to the text information; and the processing unit is further configured to display the mouth action corresponding to the keyword.
With reference to the fifth aspect, in certain implementations of the fifth aspect, the keyword is determined according to language habits of the sign language user.
With reference to the fifth aspect, in some implementations of the fifth aspect, the processing unit is further configured to not display a mouth action corresponding to a common vocabulary, where the text information includes the common vocabulary, and the common vocabulary is different from the keyword.
With reference to the fifth aspect, in some implementations of the fifth aspect, the processing unit is further configured to display the mouth motion while displaying a hand motion corresponding to the keyword.
With reference to the fifth aspect, in some implementations of the fifth aspect, the processing unit is further configured to display a first vocabulary, where the text information includes the first vocabulary, and the first vocabulary is a vocabulary for recommending the additional mouth action, and in response to a confirmation operation by the user, the processing unit is further configured to determine the first vocabulary as a keyword.
With reference to the fifth aspect, in some implementations of the fifth aspect, the obtaining unit is further configured to obtain, in response to a first input by the user, a second vocabulary, where the second vocabulary is a vocabulary for which the user requests that a mouth action be added; the processing unit is further configured to determine the second vocabulary to be a keyword when the text information includes the second vocabulary; the processing unit is further configured to display update request information when the text information does not include the second vocabulary, where the update request information is used to prompt that the text information does not include the second vocabulary; and the obtaining unit is further configured to obtain an updated second vocabulary in response to a second input by the user.
With reference to the fifth aspect, in some implementations of the fifth aspect, the Chinese translation apparatus further includes a communication unit, the hand actions include a first hand action and a second hand action, and the first hand action is before the second hand action; before the hand action corresponding to the text information is displayed, the communication unit is configured to receive first hand action data from a server, where the first hand action data is used to display the first hand action; and while the first hand action is displayed, the communication unit is further configured to receive second hand action data from the server, where the second hand action data is used to display the second hand action.
With reference to the fifth aspect, in certain implementations of the fifth aspect, the mouth action includes a first mouth action and a second mouth action, the first mouth action being before the second mouth action; the communication unit is further configured to receive first mouth action data from the server, the first mouth action data being used to display the first mouth action; and while the first mouth action is displayed, the communication unit is further configured to receive second mouth action data from the server, the second mouth action data being used to display the second mouth action.
With reference to the fifth aspect, in some implementations of the fifth aspect, before displaying the hand motion corresponding to the text information, the communication unit is further configured to receive a response message from the server, where the response message is configured to indicate that the text information does not include sensitive information.
In a sixth aspect, a Chinese translation apparatus is provided, including a communication unit and a processing unit, where the communication unit is configured to receive a translation request message, the translation request message including text information, the translation request message being used to request hand action data corresponding to the text information, the text information including a keyword, the keyword being determined according to the language habits of sign language users, and the translation request message also being used to request mouth action data corresponding to the keyword; and the processing unit is configured to determine, according to the text information, whether to send the hand action data and/or the mouth action data.
With reference to the sixth aspect, in some implementations of the sixth aspect, the processing unit is further configured to determine whether the text information includes sensitive information; in the case that the text information contains sensitive information, the communication unit is further configured to send a first response message, where the first response message is used to indicate that the text information contains sensitive information; in case the text information does not contain sensitive information, the communication unit is further adapted to send a second response message comprising hand motion data and/or mouth motion data.
With reference to the sixth aspect, in certain implementations of the sixth aspect, the hand movement data includes first hand movement data for displaying a first hand movement and second hand movement data for displaying a second hand movement, the first hand movement being before the second hand movement, and the communication unit is further configured to send the second hand movement data after sending the first hand movement data.
With reference to the sixth aspect, in certain implementations of the sixth aspect, the mouth action data includes first mouth action data used to display a first mouth action and second mouth action data used to display a second mouth action, the first mouth action being before the second mouth action, and the communication unit is further configured to send the second mouth action data after sending the first mouth action data.
With reference to the sixth aspect, in some implementations of the sixth aspect, the processing unit is further configured to obtain the mouth action data from a mouth action database, where the mouth action database includes blend shape values corresponding to pinyin pronunciation mouth shapes.
In a seventh aspect, a computer program product is provided, comprising computer program code which, when run on a computer, causes the method of the first aspect or any possible implementation thereof to be performed.
In an eighth aspect, there is provided a computer program product comprising computer program code for causing the method of the second aspect or any possible implementation thereof to be performed when the computer program code is run on a computer.
In a ninth aspect, there is provided a computer readable storage medium having stored therein computer instructions which, when run on a computer, cause the method of the first aspect or any possible implementation thereof to be performed.
In a tenth aspect, there is provided a computer readable storage medium having stored therein computer instructions which, when run on a computer, cause the method of the second aspect or any possible implementation thereof to be performed.
In an eleventh aspect, a chip is provided, including a processor configured to read instructions stored in a memory; when the instructions are executed by the processor, the chip is caused to implement the method of the first aspect or any possible implementation thereof.
In a twelfth aspect, a chip is provided, including a processor configured to read instructions stored in a memory; when the instructions are executed by the processor, the chip is caused to implement the method of the second aspect or any possible implementation thereof.
Drawings
Fig. 1 is a schematic diagram of a hardware architecture of an electronic device according to an embodiment of the present application.
Fig. 2 is a schematic diagram of an electronic device software architecture suitable for use in an embodiment of the present application.
Fig. 3 is a schematic diagram of a Chinese translation method according to an embodiment of the present application.
Fig. 4 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
Fig. 5 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
Fig. 6 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
Fig. 7 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
Fig. 8 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
Fig. 9 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
Fig. 10 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
Fig. 11 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
Fig. 12 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
Fig. 13 is a schematic diagram of a Chinese translation apparatus according to an embodiment of the present application.
Fig. 14 is a schematic diagram of another Chinese translation apparatus according to an embodiment of the present application.
Fig. 15 is a schematic diagram of an electronic device according to an embodiment of the present application.
Fig. 16 is a schematic diagram of a server according to an embodiment of the present application.
Detailed Description
The technical scheme of the application will be described below with reference to the accompanying drawings.
The terminology used in the following embodiments is for the purpose of describing particular embodiments only and is not intended to limit the application. As used in the specification of the application and the appended claims, the singular forms "a," "an," and "the" are intended to also include expressions such as "one or more," unless the context clearly indicates otherwise. It should also be understood that in the following embodiments of the present application, "at least one" and "one or more" mean one, two, or more than two. The term "and/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, A and/or B may represent: A alone, both A and B, and B alone, where A and B may be singular or plural. The character "/" generally indicates that the associated objects are in an "or" relationship.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
The method provided by the embodiments of the present application can be applied to electronic devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, and personal digital assistants (PDA); the embodiments of the present application do not limit the specific type of the electronic device.
By way of example, fig. 1 shows a schematic diagram of an electronic device 100. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (universal serial bus, USB) interface 130, a charge management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display 194, and a subscriber identity (subscriber identification module, SIM) card interface 195, etc. The sensor module 180 may include a pressure sensor 180A, a gyro sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.
It should be understood that the illustrated structure of the embodiment of the present application does not constitute a specific limitation on the electronic device 100. In other embodiments of the application, electronic device 100 may include more or fewer components than shown, or certain components may be combined, or certain components may be split, or different arrangements of components. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
The processor 110 may include one or more processing units, such as: the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processor (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a memory, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural network processor (neural-network processing unit, NPU), etc. Wherein the different processing units may be separate devices or may be integrated in one or more processors.
The controller may be a neural hub and a command center of the electronic device 100, among others. The controller can generate operation control signals according to the instruction operation codes and the time sequence signals to finish the control of instruction fetching and instruction execution.
A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache memory. The memory may hold instructions or data that the processor 110 has just used or recycled. If the processor 110 needs to reuse the instruction or data, it can be called directly from the memory. Repeated accesses are avoided and the latency of the processor 110 is reduced, thereby improving the efficiency of the system.
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an integrated circuit (inter-integrated circuit, I2C) interface, an integrated circuit built-in audio (inter-integrated circuit sound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, a universal asynchronous receiver transmitter (universal asynchronous receiver/transmitter, UART) interface, a mobile industry processor interface (mobile industry processor interface, MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (subscriber identity module, SIM) interface, and/or a universal serial bus (universal serial bus, USB) interface, among others.
The I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 110 may contain multiple sets of I2C buses. The processor 110 may be coupled to the touch sensor 180K, the charger, the flash, the camera 193, etc., respectively, through different I2C bus interfaces. For example, the processor 110 may be coupled to the touch sensor 180K through an I2C interface, so that the processor 110 communicates with the touch sensor 180K through the I2C bus interface to implement the touch function of the electronic device 100.
The I2S interface may be used for audio communication. In some embodiments, the processor 110 may contain multiple sets of I2S buses. The processor 110 may be coupled to the audio module 170 via an I2S bus to enable communication between the processor 110 and the audio module 170. In some embodiments, the audio module 170 may transmit an audio signal to the wireless communication module 160 through the I2S interface, to implement a function of answering a call through the bluetooth headset.
PCM interfaces may also be used for audio communication to sample, quantize and encode analog signals. In some embodiments, the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface. In some embodiments, the audio module 170 may also transmit audio signals to the wireless communication module 160 through the PCM interface to implement a function of answering a call through the bluetooth headset. Both the I2S interface and the PCM interface may be used for audio communication.
The UART interface is a universal serial data bus for asynchronous communications. The bus may be a bi-directional communication bus. It converts the data to be transmitted between serial communication and parallel communication. In some embodiments, a UART interface is typically used to connect the processor 110 with the wireless communication module 160. For example: the processor 110 communicates with a bluetooth module in the wireless communication module 160 through a UART interface to implement a bluetooth function. In some embodiments, the audio module 170 may transmit an audio signal to the wireless communication module 160 through a UART interface, to implement a function of playing music through a bluetooth headset.
The MIPI interface may be used to connect the processor 110 to peripheral devices such as a display 194, a camera 193, and the like. The MIPI interfaces include camera serial interfaces (camera serial interface, CSI), display serial interfaces (display serial interface, DSI), and the like. In some embodiments, processor 110 and camera 193 communicate through a CSI interface to implement the photographing functions of electronic device 100. The processor 110 and the display 194 communicate via a DSI interface to implement the display functionality of the electronic device 100.
The GPIO interface may be configured by software. The GPIO interface may be configured as a control signal or as a data signal. In some embodiments, a GPIO interface may be used to connect the processor 110 with the camera 193, the display 194, the wireless communication module 160, the audio module 170, the sensor module 180, and the like. The GPIO interface may also be configured as an I2C interface, an I2S interface, a UART interface, an MIPI interface, etc.
The USB interface 130 is an interface conforming to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 may be used to connect a charger to charge the electronic device 100, and may also be used to transfer data between the electronic device 100 and a peripheral device. It can also be used to connect headphones and play audio through them. The interface may also be used to connect other electronic devices, such as AR devices.
It should be understood that the interfacing relationship between the modules illustrated in the embodiments of the present application is only illustrative, and is not meant to limit the structure of the electronic device 100. In other embodiments of the present application, the electronic device 100 may also employ different interfacing manners in the above embodiments, or a combination of multiple interfacing manners.
The charge management module 140 is configured to receive a charge input from a charger. The charger can be a wireless charger or a wired charger. In some wired charging embodiments, the charge management module 140 may receive a charging input of a wired charger through the USB interface 130. In some wireless charging embodiments, the charge management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100. The charging management module 140 may also supply power to the electronic device through the power management module 141 while charging the battery 142.
The power management module 141 is used for connecting the battery 142, and the charge management module 140 and the processor 110. The power management module 141 receives input from the battery 142 and/or the charge management module 140 and provides power to the processor 110, the internal memory 121, the external memory, the display 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 may also be configured to monitor battery capacity, battery cycle number, battery health (leakage, impedance) and other parameters. In other embodiments, the power management module 141 may also be provided in the processor 110. In other embodiments, the power management module 141 and the charge management module 140 may be disposed in the same device.
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, a modem processor, a baseband processor, and the like.
The antennas 1 and 2 are used for transmitting and receiving electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover a single or multiple communication bands. Different antennas may also be multiplexed to improve the utilization of the antennas. For example: the antenna 1 may be multiplexed into a diversity antenna of a wireless local area network. In other embodiments, the antenna may be used in conjunction with a tuning switch.
The mobile communication module 150 may provide a solution for wireless communication including 2G/3G/4G/5G, etc., applied to the electronic device 100. The mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA), etc. The mobile communication module 150 may receive electromagnetic waves from the antenna 1, perform processes such as filtering, amplifying, and the like on the received electromagnetic waves, and transmit the processed electromagnetic waves to the modem processor for demodulation. The mobile communication module 150 can amplify the signal modulated by the modem processor, and convert the signal into electromagnetic waves through the antenna 1 to radiate. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be disposed in the processor 110. In some embodiments, at least some of the functional modules of the mobile communication module 150 may be provided in the same device as at least some of the modules of the processor 110.
The modem processor may include a modulator and a demodulator. The modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal. The demodulator is used for demodulating the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low frequency baseband signal to the baseband processor for processing. The low frequency baseband signal is processed by the baseband processor and then transferred to the application processor. The application processor outputs sound signals through an audio device (not limited to the speaker 170A, the receiver 170B, etc.), or displays images or video through the display screen 194. In some embodiments, the modem processor may be a stand-alone device. In other embodiments, the modem processor may be provided in the same device as the mobile communication module 150 or other functional module, independent of the processor 110.
The wireless communication module 160 may provide solutions for wireless communication including wireless local area network (wireless local area networks, WLAN) (e.g., wireless fidelity (wireless fidelity, wi-Fi) network), bluetooth (BT), global navigation satellite system (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), near field wireless communication technology (near field communication, NFC), infrared technology (IR), etc., as applied to the electronic device 100. The wireless communication module 160 may be one or more devices that integrate at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, modulates the electromagnetic wave signals, filters the electromagnetic wave signals, and transmits the processed signals to the processor 110. The wireless communication module 160 may also receive a signal to be transmitted from the processor 110, frequency modulate it, amplify it, and convert it to electromagnetic waves for radiation via the antenna 2.
In some embodiments, antenna 1 and mobile communication module 150 of electronic device 100 are coupled, and antenna 2 and wireless communication module 160 are coupled, such that electronic device 100 may communicate with a network and other devices through wireless communication techniques. The wireless communication techniques may include the Global System for Mobile communications (global system for mobile communications, GSM), general packet radio service (general packet radio service, GPRS), code division multiple access (code division multiple access, CDMA), wideband code division multiple access (wideband code division multiple access, WCDMA), time division code division multiple access (time-division code division multiple access, TD-SCDMA), long term evolution (long term evolution, LTE), BT, GNSS, WLAN, NFC, FM, and/or IR techniques, among others. The GNSS may include a global satellite positioning system (global positioning system, GPS), a global navigation satellite system (global navigation satellite system, GLONASS), a beidou satellite navigation system (beidou navigation satellite system, BDS), a quasi zenith satellite system (quasi-zenith satellite system, QZSS) and/or a satellite based augmentation system (satellite based augmentation systems, SBAS).
The electronic device 100 implements display functions through a GPU, a display screen 194, an application processor, and the like. The GPU is a microprocessor for image processing, and is connected to the display 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display screen 194 is used to display images, videos, and the like. The display 194 includes a display panel. The display panel may employ a liquid crystal display (liquid crystal display, LCD), an organic light-emitting diode (organic light-emitting diode, OLED), an active-matrix organic light-emitting diode (active-matrix organic light emitting diode, AMOLED), a flexible light-emitting diode (flexible light-emitting diode, FLED), a Mini LED, a Micro LED, a Micro-OLED, a quantum dot light-emitting diode (quantum dot light emitting diodes, QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, N being a positive integer greater than 1.
The electronic device 100 may implement photographing functions through an ISP, a camera 193, a video codec, a GPU, a display screen 194, an application processor, and the like.
The ISP is used to process data fed back by the camera 193. For example, when photographing, the shutter is opened, light is transmitted to the photosensitive element of the camera through the lens, the optical signal is converted into an electrical signal, and the photosensitive element of the camera transmits the electrical signal to the ISP for processing, so that the signal is converted into an image visible to the naked eye. The ISP can also optimize the noise, brightness, and skin color of the image. The ISP can further optimize parameters such as exposure and color temperature of a shooting scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture still images or video. The object generates an optical image through the lens and projects the optical image onto the photosensitive element. The photosensitive element may be a charge coupled device (charge coupled device, CCD) or a Complementary Metal Oxide Semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal, which is then transferred to the ISP to be converted into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard RGB, YUV, or the like format. In some embodiments, electronic device 100 may include 1 or N cameras 193, N being a positive integer greater than 1.
The digital signal processor is used for processing digital signals, and can process other digital signals in addition to digital image signals. For example, when the electronic device 100 performs frequency selection, the digital signal processor is used to perform Fourier transform on the frequency bin energy, and the like.
Video codecs are used to compress or decompress digital video. The electronic device 100 may support one or more video codecs. In this way, the electronic device 100 may play or record video in a variety of encoding formats, such as: moving picture experts group (moving picture experts group, MPEG)-1, MPEG-2, MPEG-3, MPEG-4, etc.
The NPU is a neural-network (NN) computing processor, and can rapidly process input information by referencing a biological neural network structure, for example, referencing a transmission mode between human brain neurons, and can also continuously perform self-learning. Applications such as intelligent awareness of the electronic device 100 may be implemented through the NPU, for example: image recognition, face recognition, speech recognition, text understanding, etc.
The external memory interface 120 may be used to connect an external memory card, such as a Micro SD card, to enable expansion of the memory capabilities of the electronic device 100. The external memory card communicates with the processor 110 through an external memory interface 120 to implement data storage functions. For example, files such as music, video, etc. are stored in an external memory card.
The internal memory 121 may be used to store computer executable program code including instructions. The processor 110 executes various functional applications of the electronic device 100 and data processing by executing instructions stored in the internal memory 121. The internal memory 121 may include a storage program area and a storage data area. The storage program area may store an application program (such as a sound playing function, an image playing function, etc.) required for at least one function of the operating system, etc. The storage data area may store data created during use of the electronic device 100 (e.g., audio data, phonebook, etc.), and so on. In addition, the internal memory 121 may include a high-speed random access memory, and may further include a nonvolatile memory such as at least one magnetic disk storage device, a flash memory device, a universal flash memory (universal flash storage, UFS), and the like.
The electronic device 100 may implement audio functions through an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone interface 170D, an application processor, and the like. Such as music playing, recording, etc.
The audio module 170 is used to convert digital audio information into an analog audio signal output and also to convert an analog audio input into a digital audio signal. The audio module 170 may also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be disposed in the processor 110, or a portion of the functional modules of the audio module 170 may be disposed in the processor 110.
The speaker 170A, also referred to as a "horn," is used to convert audio electrical signals into sound signals. The electronic device 100 may listen to music, or to hands-free conversations, through the speaker 170A.
The receiver 170B, also referred to as an "earpiece", is used to convert the audio electrical signal into a sound signal. When the electronic device 100 is answering a telephone call or a voice message, voice may be received by placing the receiver 170B close to the human ear.
The microphone 170C, also referred to as a "mike" or "mic", is used to convert sound signals into electrical signals. When making a call or transmitting voice information, the user can speak near the microphone 170C, inputting a sound signal to the microphone 170C. The electronic device 100 may be provided with at least one microphone 170C. In other embodiments, the electronic device 100 may be provided with two microphones 170C, which may implement a noise reduction function in addition to collecting sound signals. In other embodiments, the electronic device 100 may also be provided with three, four, or more microphones 170C to enable collection of sound signals, noise reduction, identification of sound sources, directional recording functions, etc.
The earphone interface 170D is used to connect a wired earphone. The earphone interface 170D may be the USB interface 130, or may be a 3.5 mm open mobile terminal platform (open mobile terminal platform, OMTP) standard interface or a Cellular Telecommunications Industry Association of the USA (cellular telecommunications industry association of the USA, CTIA) standard interface.
The keys 190 include a power-on key, a volume key, etc. The keys 190 may be mechanical keys. Or may be a touch key. The electronic device 100 may receive key inputs, generating key signal inputs related to user settings and function controls of the electronic device 100.
The motor 191 may generate a vibration cue. The motor 191 may be used for incoming call vibration alerting as well as for touch vibration feedback. For example, touch operations acting on different applications (e.g., photographing, audio playing, etc.) may correspond to different vibration feedback effects. The motor 191 may also correspond to different vibration feedback effects by touching different areas of the display screen 194. Different application scenarios (such as time reminding, receiving information, alarm clock, game, etc.) can also correspond to different vibration feedback effects. The touch vibration feedback effect may also support customization.
The indicator 192 may be an indicator light, may be used to indicate a state of charge, a change in charge, a message indicating a missed call, a notification, etc.
The SIM card interface 195 is used to connect a SIM card. The SIM card may be inserted into the SIM card interface 195, or removed from the SIM card interface 195 to enable contact and separation with the electronic device 100. The electronic device 100 may support 1 or N SIM card interfaces, N being a positive integer greater than 1. The SIM card interface 195 may support Nano SIM cards, micro SIM cards, and the like. The same SIM card interface 195 may be used to insert multiple cards simultaneously. The types of the plurality of cards may be the same or different. The SIM card interface 195 may also be compatible with different types of SIM cards. The SIM card interface 195 may also be compatible with external memory cards. The electronic device 100 interacts with the network through the SIM card to realize functions such as communication and data communication. In some embodiments, the electronic device 100 employs an embedded SIM (eSIM) card, namely: an embedded SIM card. The eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100.
It should be understood that the phone cards in embodiments of the present application include, but are not limited to, SIM cards, eSIM cards, universal subscriber identity cards (universal subscriber identity module, USIM), universal integrated phone cards (universal integrated circuit card, UICC), and the like.
The software system of the electronic device 100 may employ a layered architecture, an event driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. In the embodiment of the application, taking an Android system with a layered architecture as an example, a software structure of the electronic device 100 is illustrated.
Fig. 2 is a software configuration block diagram of the electronic device 100 according to the embodiment of the present application. The layered architecture divides the software into several layers, each with distinct roles and divisions of labor. The layers communicate with each other through software interfaces. In some embodiments, the Android system is divided into four layers, from top to bottom: an application layer, an application framework layer, an Android runtime (Android runtime) and system libraries, and a kernel layer. The application layer may include a series of application packages.
As shown in fig. 2, the application package may include applications for cameras, gallery, calendar, phone calls, maps, navigation, WLAN, bluetooth, music, video, short messages, etc.
The application framework layer provides an application programming interface (application programming interface, API) and programming framework for application programs of the application layer. The application framework layer includes a number of predefined functions.
As shown in FIG. 2, the application framework layer may include a window manager, a content provider, a view system, a telephony manager, a resource manager, a notification manager, and the like.
The window manager is used for managing window programs. The window manager can acquire the size of the display screen, judge whether a status bar exists, lock the screen, intercept the screen and the like.
The content provider is used to store and retrieve data and make such data accessible to applications. The data may include video, images, audio, calls made and received, browsing history and bookmarks, phonebooks, etc.
The view system includes visual controls, such as controls to display text, controls to display pictures, and the like. The view system may be used to build applications. The display interface may be composed of one or more views. For example, a display interface including a text message notification icon may include a view displaying text and a view displaying a picture.
The telephony manager is used to provide the communication functions of the electronic device 100. Such as the management of call status (including on, hung-up, etc.).
The resource manager provides various resources for the application program, such as localization strings, icons, pictures, layout files, video files, and the like.
The notification manager allows an application to display notification information in the status bar, can be used to convey notification-type messages, and can make them automatically disappear after a short stay without requiring user interaction. For example, the notification manager is used to notify of download completion, message alerts, and the like. The notification manager may also present notifications in the form of a chart or scroll bar text in the status bar at the top of the system, such as notifications of applications running in the background, or present notifications on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt tone is emitted, the electronic device vibrates, an indicator light blinks, and the like.
The Android runtime includes core libraries and a virtual machine. The Android runtime is responsible for scheduling and management of the Android system.
The core library consists of two parts: one part is the functions that the Java language needs to call, and the other part is the core library of Android.
The application layer and the application framework layer run in a virtual machine. The virtual machine executes java files of the application program layer and the application program framework layer as binary files. The virtual machine is used for executing the functions of object life cycle management, stack management, thread management, security and exception management, garbage collection and the like.
The system library may include a plurality of functional modules. For example: surface manager (surface manager), media library (media library), three-dimensional graphics processing library (e.g., openGL ES), 2D graphics engine (e.g., SGL), etc.
The surface manager is used to manage the display subsystem and provides a fusion of 2D and 3D layers for multiple applications.
The media library supports playback and recording of a variety of commonly used audio and video formats, still image files, and the like. The media library may support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
The three-dimensional graphic processing library is used for realizing three-dimensional graphic drawing, image rendering, synthesis, layer processing and the like.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is a layer between hardware and software. The inner core layer at least comprises a display driver, a camera driver, an audio driver and a sensor driver.
It should be understood that the technical solutions in the embodiments of the present application can be used in Android, iOS, HarmonyOS (HongMeng), and other systems.
The architecture of the hardware and software of the electronic device suitable for the translation method provided by the present application is described above with reference to fig. 1 and fig. 2, and the Chinese translation method provided by the embodiment of the present application is described below with reference to fig. 3 to fig. 16. Before formally describing the embodiments of the present application, some terms that may be used in the following embodiments are first described.
1. Chinese sign language (Chinese sign language, CSL): the Chinese universal sign language is mainly used in China.
2. Speech recognition (automatic speech recognition, ASR): also called speech to text (speech to text, STT); its goal is to automatically convert human speech content into corresponding text by a computer.
3. Optical character recognition (optical character recognition, OCR): the method refers to a process of analyzing, identifying and processing the image file of the text data to obtain the text and layout information.
4. Software development kit (software development kit, SDK): refers to a collection of development tools for creating application software for a particular software package, software framework, hardware platform, operating system, etc.
5. Blend shape (blendshape): a technique for defining shapes by manipulating the vertices of a three-dimensional model mesh, which can be used to control the facial expression of a virtual character.
6. Digital person: a digitization of the human body structure by computer technology, so that a visual and controllable virtual human body form appears on a computer screen, to which human body functional information is further added. Through cross fusion with virtual reality technology, the digital human body can simulate a real human body to make various reactions, and, if devices for sound and force feedback are provided, can also give intuitive, natural, real-time sensations such as sight, hearing, and touch.
7. Sign language (sign language): a language that expresses meaning through limb movements and facial expressions, using a visual-gesture modality rather than an auditory-speech modality.
8. Part of speech: a characteristic of a word used to divide words into classes. Words of modern Chinese can be divided into two major categories, content words and function words. Content words are words that can serve as syntactic components alone or, in most cases, serve as the main components of phrases; they have both lexical meaning and grammatical meaning, and include nouns, verbs, adjectives, adverbs, numerals, measure words, pronouns, and onomatopoeia. Function words cannot serve as syntactic components alone and in most cases only serve as auxiliary components of a sentence; they have only grammatical meaning, and include prepositions, conjunctions, auxiliary words, and interjections.
Table 1 shows a part-of-speech classification method, where proper nouns may include: name of person, place name, organization group, name of work, other proper nouns, etc.
TABLE 1 part of speech tags and their meanings
Fig. 3 is a schematic diagram of a Chinese translation method according to an embodiment of the present application. The process in which an electronic device translates text information into corresponding sign language by using App1 is described below as an example.
It should be noted that, in the following embodiments, the user of App1 (i.e., the user of the electronic device, or simply the user) may be a hearing-impaired person or a person without hearing impairment.
The electronic device user may input data to be translated through App 1's input functionality control 304, which input functionality control 304 may be used to input one or more of the following data types to App 1: text (e.g., content shown at 303), images, documents, audio, video, etc.
When the electronic equipment user inputs the text, the App1 can directly acquire the text information contained in the text input by the electronic equipment user. The text may be manually entered by the user of the electronic device or may be one or more texts provided by App1 (common sentences built into App 1) from which the user of the electronic device selects.
When the electronic device user inputs an image, App1 recognizes the text information contained in the image by OCR after receiving the image data. After the electronic device user inputs document data (e.g., example.txt), App1 parses the document after receiving the document data to obtain the text information contained in the document.
When the electronic device user inputs audio or video data, app1 recognizes text information contained in the audio or video data through ASR and/or OCR after receiving the audio or video data. When the video data received by the App1 contains subtitles, the App1 can recognize text information in the video through OCR, when the video data received by the App1 contains audio data, the App1 can recognize the text information contained in the video data through ASR, and when the video data received by the App1 contains both subtitles and audio data, the App1 can recognize the text information contained in the video through ASR and OCR at the same time, and perform mutual correction, thereby improving the accuracy of text recognition.
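The routing of the different input types to OCR and/or ASR can be sketched as follows. This is a minimal illustration only; the helper functions run_ocr and run_asr and the similarity-based mutual correction heuristic are assumptions, not part of the embodiment described above.

```python
from difflib import SequenceMatcher

def extract_text(data, kind, run_ocr, run_asr):
    """Illustrative routing of input data to OCR/ASR for text recognition."""
    if kind == "text":
        return data                          # plain text is used as-is
    if kind == "image":
        return run_ocr(data)                 # OCR on the picture
    if kind == "document":
        return data                          # documents are parsed directly
    if kind == "audio":
        return run_asr(data)                 # ASR on the audio content
    if kind == "video":
        subtitle_text = run_ocr(data)        # empty string if there are no subtitles
        speech_text = run_asr(data)          # empty string if there is no audio track
        if subtitle_text and speech_text:
            # Mutual correction (heuristic): keep the ASR result when the two
            # recognitions agree closely, otherwise fall back to the subtitles.
            ratio = SequenceMatcher(None, subtitle_text, speech_text).ratio()
            return speech_text if ratio > 0.8 else subtitle_text
        return subtitle_text or speech_text
    raise ValueError(f"unsupported input type: {kind}")
```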
App1 obtains the text information "John goes to the movie theater in the afternoon" as shown at 303. App1 may obtain hand motion data corresponding to the text information, and further drive the virtual character model using the hand motion data, so that the virtual character model can display the hand motions corresponding to the text information.
In some embodiments, before App1 translates the acquired text into sign language, App1 further identifies the parts of speech of the different vocabularies included in the data to be translated input by the user, and for a proper noun, App1 further acquires the mouth action data corresponding to the proper noun. When the virtual character displays the hand action of the proper noun, the virtual character displays the mouth action corresponding to the proper noun under the driving of the mouth action data.
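A minimal sketch of this proper-noun-based keyword selection is shown below. The tag set and the example tagging are assumptions for illustration; the actual tags correspond to Table 1, which is not reproduced in this text.

```python
# Assumed tags for proper nouns (person name, place name, organization, work title, ...).
PROPER_NOUN_TAGS = {"nr", "ns", "nt", "nw", "nz"}

def keywords_needing_mouth_action(tagged_words):
    """Pick proper nouns from (word, tag) pairs as keywords for additional mouth actions."""
    return [word for word, tag in tagged_words if tag in PROPER_NOUN_TAGS]

# Hypothetical tagging of "John goes to the movie theater in the afternoon."
tagged = [("John", "nr"), ("afternoon", "t"), ("go", "v"), ("movie theater", "n")]
print(keywords_needing_mouth_action(tagged))   # ['John']
```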
In some embodiments, when the electronic device user opens the interface shown in fig. 3, App1 displays, in response to the user's operation, a prompt 302 for prompting the electronic device user about the method or steps of using App1.
For example, the prompt may be used to prompt the electronic device user to input data to be translated to App1 via the input functionality control 304.
The prompt may also be used, for example, to prompt the user of the electronic device to enter a word or words that require additional oral actions.
Alternatively, app1 may also display the processing state information. Illustratively, the operation currently being performed by App1 or the operation being performed by the user is displayed as at 301.
Fig. 4 is a schematic diagram of another Chinese translation method according to an embodiment of the present application.
In the embodiment of the present application, App1 recognizing the text information in an image through OCR is taken as an example to describe how App1 processes data to be translated from which text information needs to be recognized, such as images and documents.
The electronic device user inputs, through the input functionality control of App1, a picture containing the text information "John goes to the movie theater in the afternoon." After receiving the picture input by the user, App1 recognizes the text information in the picture through OCR.
In some embodiments, App1 recognizes the text information correctly ("John goes to the movie theater in the afternoon."), App1 displays a text recognition result confirmation prompt window (as shown in fig. 4 (a)), the user clicks "confirm", and App1 obtains the confirmation instruction from the user and performs the next operation, that is, the operation shown in fig. 4 (c).
In other embodiments, App1 recognizes the text information incorrectly ("John goes to the movie theater in the morning."), and App1 displays a text recognition result confirmation prompt window. The user clicks "modify" after confirming that the text recognized by App1 is wrong, App1 obtains the modification indication of the user and displays a modified text recognition result prompt window as shown in (b) of fig. 4, and the user clicks "confirm" after inputting the correct text information ("John goes to the movie theater in the afternoon."). In response to the user input, App1 obtains the modified text information and performs the next operation, i.e., the operation shown in (c) of fig. 4.
As shown in fig. 4 (c), app1 may translate the text information into a corresponding hand motion after acquiring the text information confirmed by the user.
In some embodiments, before translating the confirmed text information into hand actions, App1 displays an operation prompt "please enter keywords that require additional mouth actions:". According to the operation prompt information, the electronic device user inputs "John" to App1 through the input function control. After acquiring the keyword "John", App1 acquires the hand motion data corresponding to the vocabularies of the text information and the mouth motion data corresponding to the keyword "John", so that App1 can drive the virtual character to display the corresponding hand motions and mouth motions by using the acquired hand motion data and the mouth motion data of the keyword.
In other embodiments, before translating the confirmed text information into sign language, App1 analyzes that the text information includes the proper noun "John". App1 uses the proper noun as a keyword that needs an additional mouth action, and obtains the mouth action data corresponding to the proper noun while obtaining the hand motion data corresponding to the text information, so that App1 can drive the virtual character to display the corresponding hand motions and mouth motions by using the obtained hand motion data and the mouth motion data of the keyword.
It should be noted that the keyword may be a vocabulary including one or more Chinese characters.
Fig. 5 is a schematic diagram of another Chinese translation method according to an embodiment of the present application. In this embodiment of the present application, the electronic device user inputs the keywords needing additional mouth actions, and App1 checks the keywords input by the user to reduce the possibility of errors in the process of translating the text information into hand actions.
As shown in fig. 5 (a), the text information to be translated input by the electronic device user is "Water is running out, please turn off the faucet." In response to the user input, App1 displays a prompt "please input keywords requiring additional mouth actions:".
In some embodiments, the electronic device user inputs the keywords according to the prompt information: "turn off" and "faucet". In response to the user input, App1 checks and determines that the text information contains the keywords input by the user, and executes the corresponding translation operation.
In other embodiments, the electronic device user enters the keywords according to the prompt information: "turn off, water tap". In response to the user input, App1 checks and determines that the text information contains the keyword "turn off" but does not contain the keyword "water tap". App1 displays a keyword confirmation prompt as shown in (b) in fig. 5: ""water tap" not found; please confirm whether "faucet" is meant?" According to the prompt information, the electronic device user confirms that the input keyword was wrong and that the keyword identified by App1 is correct, and clicks "confirm". In response to the confirmation operation by the user, App1 updates the keywords requiring additional mouth actions as: "turn off" and "faucet".
Or, when the user confirms that the keyword which has been input is wrong and the keyword which App1 recognizes is also incorrect, the user can click "modify" to input the correct keyword which requires the additional oral action.
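The keyword check described above can be sketched as follows. The fuzzy-match suggestion and the whitespace-based word extraction are illustrative assumptions only; a real implementation would use proper Chinese word segmentation.

```python
import re
from difflib import get_close_matches

def check_keywords(text, keywords):
    """Check user-entered keywords against the text to be translated (illustrative)."""
    vocabulary = re.findall(r"\w+", text)   # a real Chinese word segmenter would be used
    confirmed, to_confirm = [], []
    for kw in keywords:
        if kw in text:
            confirmed.append(kw)
        else:
            # Keyword not found in the text: suggest the closest word, if any,
            # so that App1 can ask the user to confirm or modify it.
            match = get_close_matches(kw, vocabulary, n=1, cutoff=0.6)
            to_confirm.append((kw, match[0] if match else None))
    return confirmed, to_confirm

ok, pending = check_keywords("Water is running out, please turn off the faucet.",
                             ["off", "facuet"])
print(ok)       # ['off']
print(pending)  # [('facuet', 'faucet')] -> prompt the user to confirm the correction
```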
In still other embodiments, the electronic device user enters a keyword according to the above prompt information, for example "faucet". In response to the user input, App1 checks and determines that the text information contains the keyword input by the user, but the mouth action data corresponding to the keyword cannot be obtained. App1 then sends out prompt information, which may be, for example: "the mouth action data corresponding to the "faucet" you entered cannot be obtained; manual customer service has been requested from the background for you."
Optionally, app1 may establish a video connection with a human customer service for the user, and after the connection is established, the human customer service may display the mouth actions of the keywords that cannot be obtained for the user. Or, the manual customer service supplements the mouth action data of the keywords which cannot be obtained in the background and then calls the App1, and the App1 obtains the mouth action data and displays the mouth action data to the user of the electronic equipment.
In still other embodiments, the electronic device user inputs the entire content of the text information to be translated according to the prompt. In response to the user input, App1 checks and finds that there are too many keywords requiring additional mouth action data, and App1 may send a prompt message to prompt the user: there are currently too many keywords to which mouth action data needs to be added; please re-enter the keywords that require additional mouth actions.
In still other embodiments, if the user of the electronic device does not input any keyword according to the prompt information, and App1 detects that the keyword input by the user is not acquired within the preset duration, app1 may send a prompt information according to the content of the text information input by the identified user, where the prompt information includes a keyword for recommending the additional mouth action.
In one embodiment, the keywords recommending additional mouth actions may be determined based on the parts of speech of the Chinese vocabulary as shown in Table 1, such as proper nouns, time words, etc.
In another embodiment, the keyword for recommending the additional mouth action may also be determined according to the habit of the user. For example, the user has taken "John" as a keyword requiring an additional mouth action in the translation query history in App1; when App1 finds that the data to be translated input by the user also contains "John", App1 can take "John" as a keyword recommending the additional mouth action.
As another example, if, when selecting keywords for additional mouth actions for text information to be translated, the user has repeatedly determined the subject and the object of a sentence as keywords for additional mouth actions, then when App1 does not acquire, within the preset duration, keywords requiring additional mouth actions input by the user, App1 may determine the subject and the object of the text information to be translated input by the user as keywords recommending the additional mouth action.
In yet another embodiment, the keywords recommending additional mouth actions may be determined according to a determination method of other users. For example, for the same video, 80% of users determine "cinema" and "playground" as keywords requiring additional oral actions, and when the user inputs the same video and App1 does not acquire the keywords requiring additional oral actions input by the user in a preset period of time, app1 may use "cinema" and "playground" as keywords recommending additional oral actions.
Optionally, when the keyword input by the user requiring additional mouth motion data only includes "movie theatre", App1 may also send a prompt message asking the user whether to add mouth motion data for "playground" as well. When the electronic device user determines that mouth motion data is to be added for "playground", in response to the operation of the user, App1 takes "movie theatre" and "playground" as keywords requiring the additional mouth motion.
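The recommendation logic described in the preceding embodiments can be combined as in the following sketch. The thresholds, field names, and the way the three signals are merged are assumptions for illustration.

```python
def recommend_keywords(words_with_pos, user_history, peer_counts, peer_total,
                       pos_whitelist=("proper noun", "time word"), peer_ratio=0.8):
    """Recommend keywords for additional mouth actions (illustrative heuristic).

    words_with_pos: [(word, part_of_speech), ...] for the text to be translated
    user_history:   keywords this user chose in earlier translation queries
    peer_counts:    {word: number of other users who chose it for the same content}
    peer_total:     number of other users who translated the same content
    """
    recommended = []
    for word, pos in words_with_pos:
        by_pos = pos in pos_whitelist                      # e.g. proper nouns, time words
        by_history = word in user_history                  # the user's own habit
        by_peers = peer_total > 0 and peer_counts.get(word, 0) / peer_total >= peer_ratio
        if by_pos or by_history or by_peers:
            recommended.append(word)
    return recommended
```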
As shown in fig. 5 (c), after App1 obtains the keyword confirmed by the user, app1 also displays a prompt message for prompting the updated keyword.
The text information input process of the Chinese translation method provided by the present application is described above with reference to fig. 3 to 5. The processes of displaying and using the translation result of the Chinese translation method provided by the embodiment of the present application are described below with reference to fig. 6 to 11.
After App1 obtains hand motion data and mouth motion data corresponding to the text information according to the input of the user, app1 displays a translation result interface as shown in fig. 6.
The translation result interface may include a processing hint 630 that is used to hint that translation of the text message has been completed. Optionally, the prompt information is further used for prompting the user of the electronic device to use the translation result.
The translation result interface may also include an overall presentation area 611 for presenting the overall appearance of the virtual character when the character is performing sign language. Optionally, when the user confirms that mouth motion data needs to be added for one or more keywords, the overall display area is used for displaying the hand motions of the text information to be translated and the mouth motions of the keywords to which mouth motion data has been added.
The translation result interface may further include a hand motion display area 613 for displaying details of hand motions of the text information to be translated entered by the user. Optionally, the hand motion display area may include auxiliary lines and/or auxiliary text for helping the user understand sign language details such as the motion track of the finger.
The translation result interface may also include a mouth action presentation area 612 for presenting a mouth action that a user requests additional mouth action data words or a mouth action that recommends additional mouth action data words. Optionally, the mouth motion display area may include auxiliary lines and/or auxiliary text for helping the user understand mouth motion details such as mouth motion trajectories.
The translation result interface may also include a text status display area 614 for displaying text corresponding to the currently displayed hand action and/or mouth action. Optionally, the text state display area further includes a pinyin annotation area, where the pinyin annotation area is used to display pinyin annotations of text corresponding to the currently displayed hand motion and/or mouth motion.
In some embodiments, the text status presentation area displays the corresponding vocabulary in the order of hand movements.
Here, the order of the hand movements (the sign language word order) may not be the same as the natural left-to-right reading order of the text.
For example, for "I did not bring my cell phone.", the sign language expression order is: "cell phone", "I", "bring", "not". Therefore, if "cell phone" is used as a keyword needing an additional mouth action, the sentence can be displayed in the text status display area in the following form: "cell phone", "I", "bring", "not" are displayed in turn from left to right. For "cell phone", highlighting such as highlighting or bolding may be performed.
In other embodiments, the text status display area displays text information in the order of natural spoken language and highlights the vocabulary corresponding to the hand actions in the order of the hand actions.
For example, the default color of the text in the text status display area is black, the text corresponding to the currently displayed hand motion is red, and the text corresponding to the currently displayed mouth motion is green and bolded.
As another example, for "I did not bring my cell phone.", the sign language expression order is: "cell phone", "I", "bring", "not". Therefore, if "cell phone" is used as a keyword needing an additional mouth action, the sentence is displayed in the text status display area in the following form: "cell phone" is shown in bold green, and "I", "bring", and "not" are shown in red in turn as the corresponding hand actions are displayed.
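A minimal sketch of how the text status display area could order and highlight the vocabulary is given below. The markers used for "red" and "bold green" are placeholders for whatever styling the interface actually applies.

```python
def render_text_status(natural_order, sign_order, current_index, keywords):
    """Build a plain-text view of the text status display area (illustrative).

    natural_order: words in natural spoken-language order
    sign_order:    words in sign language (hand action) order
    current_index: index in sign_order of the hand action currently displayed
    keywords:      words that have additional mouth action data
    """
    current = sign_order[current_index]
    rendered = []
    for word in natural_order:
        if word == current and word in keywords:
            rendered.append(f"**{word}**")   # placeholder for bold green (mouth action)
        elif word == current:
            rendered.append(f"*{word}*")     # placeholder for red (current hand action)
        else:
            rendered.append(word)            # default color
    return " ".join(rendered)

# "I did not bring my cell phone." -> sign order: cell phone, I, bring, not
print(render_text_status(["I", "not", "bring", "cell phone"],
                         ["cell phone", "I", "bring", "not"],
                         current_index=0, keywords={"cell phone"}))
```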
In some embodiments, the above-mentioned whole display area 611, the mouth action display area 612, the hand action display area 613, the text status display area 614, and the prompt 630 form a translation result interface.
In other embodiments, the above-described overall display area 611, oral action display area 612, hand action display area 613, and text status display area 614 form a translation result display area 610 of the translation result interface, and the translation result display area 610 is a part of the translation result interface.
Optionally, the translation result display area 610 may further include a prompt 630 and an input area 620, where the input area is used to display the text information to be translated that has been input by the user, the input prompt sent by App1, the keywords that need additional mouth actions that have been input by the user, and so on. Optionally, the user of the electronic device may further reenter keywords in the input area that require additional mouth motion data. When the user inputs the keywords needing additional mouth motion data again in the input area, in response to the input of the user, app1 acquires the mouth motion data corresponding to the keywords input again by the user, and updates the whole display area, the mouth motion display area, the hand motion display area and the character state display area in the translation result display area.
Illustratively, after the user has determined that "cell phone" in the text information "I did not bring my cell phone." is a vocabulary requiring an additional mouth action, the user re-enters "I" in the input area shown in fig. 6, and in response to the user's input, App1 determines "I" as a keyword requiring an additional mouth action.
The basic components of the translation result display interface are described above in connection with fig. 6, and the functions that each component of the translation result display interface may have are described in detail below in connection with fig. 7.
The electronic device user triggers App1 to display the function options of the translation result display area by clicking, double-clicking, or long-pressing a blank part of the translation result display area.
The electronic device user triggers App1 to display the function tabs of the overall display area, the mouth action display area, the hand action display area, or the text status display area by clicking, double-clicking, or long-pressing the corresponding area.
The function tabs described above may include one or more of the following functions: "full screen view", "double speed play", "insert into audio/video", "hide", "save" or "share", etc.
When the electronic equipment user selects a 'full screen view' function option, in response to the operation of the user, app1 fully displays the whole display area or the oral action display area or the hand action display area or the text state display area.
When the electronic equipment user selects the function option of 'double-speed playing', in response to the operation of the user, app1 displays a playing speed adjusting function window, and the user can select or input the playing speed required to be set in the playing speed adjusting function window. After the play rate selected or input by the user is obtained, app1 plays the content contained in the whole display area or the mouth action display area or the hand action display area or the text state display area according to the corresponding rate (slow or fast).
When the electronic device user selects the "insert audio/video" function option, app1 inserts one or more of the overall display area or the oral action display area or the hand action display area or the text state display area into the corresponding audio or video in response to the user's operation. Alternatively, app1 may save the modified audio file in the format of a video file after inserting any of the above-described areas in the audio file.
When the electronic equipment user selects the 'hiding' function option, in response to the operation of the user, the App1 hides the whole display area or the oral action display area or the hand action display area or the text state display area. When the user clicks the hidden area again, the function options corresponding to the area may include a "display" function option, and when the user selects the "display" function option, app1 displays the hidden area in response to the user operation.
Here, if the text information to be translated does not include any mouth motion, or if the user selects not to add a mouth motion to any keyword, the mouth motion display area may be hidden by default.
When the electronic equipment user selects the 'save' function option, in response to the operation of the user, the App1 saves the data corresponding to the area selected by the user. Optionally, in response to the operation of the user, app1 may further display a save prompt window, where the save prompt window is used to prompt the user whether to save data corresponding to other relevant areas at the same time, and the save prompt window is further used to obtain the indication information of the user. For example, when the user selects to save the data corresponding to other relevant areas at the same time, app1 saves both the data corresponding to the user selected area and the data corresponding to the relevant areas to the local electronic device in response to the user operation.
Illustratively, when the user selects the "save" function option in the overall presentation area, app1 displays a prompt: "is data of the mouth motion display area, the hand motion display area, and the character state display area simultaneously saved? "when the user selects to save the mouth action display area, app1 saves the data corresponding to the whole display area and the mouth action display area at the same time in response to the user's selection.
When the electronic device user selects the "share" function option, in response to the operation of the user, App1 displays a sharing function control, where the sharing function control includes one or more sharing paths. The electronic device user can select one or more sharing paths, and in response to the selection of the user, App1 shares the data corresponding to the region selected by the user through the one or more sharing paths selected by the user.
Optionally, when the electronic device user selects the "share" function option, in response to the operation of the user, App1 may further display a share prompt window, where the share prompt window is used to prompt the user whether to share data corresponding to other relevant areas at the same time, and the share prompt window is further used to obtain the indication information of the user. For example, when the user selects to share the data corresponding to the other relevant areas at the same time, in response to the operation of the user, App1 takes both the data corresponding to the user-selected area and the data corresponding to the relevant areas as the data to be shared.
For example, when the user selects the "share" function option in the overall presentation area, app1 displays a prompt message: "is data of the mouth motion display area, the hand motion display area, and the text state display area shared at the same time? And when the user selects the mouth action display area, in response to the selection of the user, app1 simultaneously shares the data corresponding to the whole display area and the mouth action display area.
And for the data corresponding to different display areas stored locally in the electronic equipment, the user of the electronic equipment can open to view, share, edit and the like again.
FIG. 8 is an interface of a repository for classifying, ranking, and displaying data corresponding to different display regions locally stored on an electronic device according to certain rules, including classification rules and ranking rules.
Wherein the classification rules may include any of the following rules: an area (an overall display area, an oral action display area, a hand action display area, etc.), a time (a time saved locally to the electronic device, e.g., today, yesterday, one week ago, etc.), or a source (e.g., from a current electronic device, from an electronic device of the same account, from a home electronic device, etc.), etc.
The ordering rules may include any of the following rules: time (e.g., from far to near or near to far), text information order (e.g., text information initial alphabet order) contained in the data, or order of the additional mouth action keywords (stroke order of the first word of the keywords).
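The classification and ordering of locally saved data could be organized as in the following sketch; the item fields are assumptions for illustration.

```python
from collections import defaultdict

def organize_repository(items, classify_by="area", order_by="saved_at", reverse=False):
    """Group and sort locally saved translation data (illustrative sketch).

    Each item is assumed to be a dict such as:
    {"area": "hand motion", "saved_at": "2023-04-01 10:00",
     "text": "...", "source": "current device"}
    """
    grouped = defaultdict(list)
    for item in items:
        grouped[item[classify_by]].append(item)     # classification rule
    for key in grouped:
        grouped[key].sort(key=lambda it: it[order_by], reverse=reverse)  # ordering rule
    return dict(grouped)
```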
The electronic device user can select the "sort mode" function option 801 of the repository to set different sort modes for the locally stored data. The electronic device user may also select the "arrangement" feature 802 of the repository to set a different arrangement for the locally stored data.
In some embodiments, the repository further includes a search box 805 in which the electronic device user can enter words, terms, time, regions, sources, etc. to quickly find corresponding data.
In other embodiments, the repository further includes a "recycle bin" function option 803 that the electronic device user may select to view data that has been stored in the "recycle bin". The 'recycle bin' is used for storing data deleted by a user temporarily, and the data which is not recovered by the user after a preset time period or the deleted data which is confirmed by the user in the 'recycle bin' can be erased from a storage medium of the electronic equipment by App 1.
In still other embodiments, the repository further includes a "share" feature option 804 that the electronic device user can select to share one or more data in the repository.
When the user of the electronic device selects any data in the resource library to be opened, the electronic device can display a playing interface as shown in fig. 9 in response to the operation of the user.
Similar to the translation result display area 610 shown in fig. 6, the playing interface may include one or more of an overall display area, a mouth action display area, a hand action display area, and a text status display area. Each of these areas can open the option functions corresponding to the areas shown in fig. 6; for the detailed triggering manners and specific functions of the option functions, reference may be made to the related descriptions of fig. 6, and details are not repeated here.
In some embodiments, the playback interface may include a playback functionality control 901 that may control the start and stop of the playback of data, and may also view the progress of the current playback of data.
Optionally, the play functionality control may also include a prompt control 902 for the additional mouth action data keywords, which the electronic device user may select (e.g., click on) to view the mouth actions of the keywords directly.
In some embodiments, the playback interface may include a "share" function option 903 that the electronic device user may select to share one or more of the data being played of the playback interface.
The following details about the sharing process of the translation data in conjunction with fig. 10 will be described, where the sharing process may be triggered by the sharing function shown in fig. 6, or may be triggered by the sharing function in the interface of the resource library shown in fig. 8, or may be triggered by the sharing function of the playing interface shown in fig. 9, or may be triggered by other manners, which is not limited in this disclosure.
As shown in fig. 10, the sharing interface includes a sharing selection prompt 1001, a sharing data preview area 1002, and a sharing route selection window 1003.
The sharing selection prompt information 1001 is used for prompting information of data to be shared that has been selected currently, where the sharing selection prompt information may include the number of data to be shared, and the sharing selection prompt information may also include a category included in the data to be shared.
For example, when the electronic device user selects data corresponding to 3 hand motion display areas, data corresponding to 4 mouth motion display areas, and data corresponding to 4 text status display areas, the sharing selection prompt information may be displayed as: 11 items have been selected, including: data (hand motion) corresponding to the hand motion display area, data (mouth motion) corresponding to the mouth motion display area, and data (text) corresponding to the text status display area.
The shared data preview area 1002 is used for displaying data to be shared. For example, when the user of the electronic device selects to share the data of the integral display area, the shared data preview area may display a frame of screen sharing the integral display area for previewing the data of the integral display area.
Optionally, the shared data preview area may further include a function check box 1004, and the electronic device user may select the data to be shared or deselect the data to be shared by clicking the function check box.
The sharing pathway selection window 1003 is used to expose one or more available sharing pathways, and is also used to obtain one or more sharing pathways selected by the user of the electronic device. For example, as shown in fig. 10, the one or more sharing approaches described above may include: bluetooth sharing, uploading to a cloud disk, or sending through mail, etc.
The translation method provided by the embodiment of the present application is described in detail by taking App1 as an example in conjunction with fig. 3 to 10, and one or more functions of App1 described above may be turned on or off by setting function options of App 1. The setting function of App1 is described below with reference to fig. 11.
The setting function options may include a function option of "automatically identifying and converting keywords"; through this function option, the electronic device user can enable or disable App1 automatically identifying, during input, the keywords in the input text, video, and audio data, where the keywords refer to keywords needing additional mouth action data.
The setting function options may also include a "keyword automatic correction" function option; through this function option, the electronic device user can enable or disable App1 prompting and/or automatically correcting errors in the keywords input by the user during the input process.
The setting function option may further include a "translation acceleration function" function option, through which the electronic device user may start a function for improving the efficiency of text information translation, and how to improve the efficiency of text information translation in detail is described in the following embodiments.
The setting function option can also comprise a function option of 'result display content', and the electronic equipment user can select the content to be displayed on the translation result display interface through the function option. For example, when the electronic device user selects "hand motion" and "mouth motion" in the function options, the entire display area and the text status display area are not displayed by default, and the hand motion display area and the mouth motion display area are displayed by default in the interface shown in fig. 6.
The setting function option can also comprise a function option of 'a default classification mode of a resource library', and the electronic equipment user can select the default classification mode of different data stored in the local electronic equipment in the resource library through the function option.
The setting function option can also comprise a function option of 'default ordering mode of the resource library', and the electronic equipment user can select the default ordering mode of different data stored in the local electronic equipment in the resource library by the user through the function option.
The Chinese translation method provided by the embodiment of the present application is described above from the perspective of the electronic device user. The Chinese translation method provided by the embodiment of the present application is described below with reference to fig. 12 from the perspective of the implementation flow inside the electronic device.
S1201, the electronic equipment acquires text information to be translated.
The text information to be translated can be directly input to the electronic equipment by the user of the electronic equipment, or can be obtained by the electronic equipment through recognition according to the text, the picture, the audio or the video and other data input by the user. The specific method for obtaining the text information to be translated can refer to the related descriptions in fig. 3 to 5.
In some embodiments, the electronic device also obtains keywords that require additional mouth motion data.
S1202, the electronic equipment sends a translation request to the server, and the server receives the translation request correspondingly.
The translation request is used for requesting to acquire hand motion data corresponding to the text information to be translated. When the electronic device further obtains a keyword that needs additional mouth motion data in S1201, the translation request is further used for requesting to obtain mouth motion data corresponding to the keyword.
In some embodiments, the translation request is for requesting acquisition of the mouth action data corresponding to the keyword requiring additional mouth action data.
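A possible shape of the translation request message is sketched below; the field names are assumptions for illustration and are not defined by this embodiment.

```python
import json

# Hypothetical structure of the translation request sent in S1202.
translation_request = {
    "text": "John goes to the movie theater in the afternoon.",
    "need_hand_action": True,
    # Present only when keywords requiring additional mouth action data exist:
    "mouth_action_keywords": ["John"],
}

payload = json.dumps(translation_request, ensure_ascii=False)
# The payload would then be sent to the server over the wireless connection.
```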
S1203, the server transmits the hand motion data and/or the mouth motion data, and the electronic device receives the hand motion data and/or the mouth motion data accordingly.
The server determines to transmit hand motion data and/or mouth motion data to the electronic device according to the content of the translation request message received in S1202.
Optionally, before sending the hand motion data and/or the mouth motion data to the electronic device, the server first performs a text wind control check on the text information requested to be translated by the electronic device, where the text wind control check is used to check whether the text information to be translated includes sensitive information, so as to play a role in filtering bad text information.
In some embodiments, the server determines that the text information to be translated passes through the text wind control inspection and then directly sends the hand motion data and/or the mouth motion data to the electronic device.
In other embodiments, after determining that the text information to be translated passes the text wind control inspection, the server sends indication information to the electronic device, where the indication information is used to indicate that the text information to be translated passes the text wind control inspection. After receiving the indication information, the electronic device sends, to the server, a text-to-sign-language request corresponding to the text that has passed the text wind control inspection, and after receiving the text-to-sign-language request, the server sends the hand motion data and/or the mouth motion data to the electronic device.
In still other embodiments, if the server determines that the text information to be translated fails the text wind control check, the server sends indication information to the electronic device, where the indication information is used to indicate that the text information to be translated fails the text wind control check.
The server may determine hand motion data corresponding to the text information to be translated from the hand motion database and send the hand motion data to the electronic device.
Similarly, the server may determine the mouth motion data corresponding to the keyword from the mouth motion database, and transmit the mouth motion data to the electronic device.
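For illustration only, a hand action database lookup might be sketched as follows; keying the database by individual vocabulary items and the frame layout shown here are assumptions for this example.

```python
# Illustrative sketch of looking up hand action data in a hand action database.
# The vocabulary keys and frame contents below are placeholders.
HAND_ACTION_DB = {
    "欢迎": [{"t": 0.00, "hand": "frame data"}, {"t": 0.04, "hand": "frame data"}],
    "华为": [{"t": 0.00, "hand": "frame data"}],
}

def lookup_hand_actions(words):
    """Concatenate the stored hand action frames for each vocabulary item."""
    frames = []
    for word in words:
        frames.extend(HAND_ACTION_DB.get(word, []))   # skip words without an entry
    return frames

print(len(lookup_hand_actions(["欢迎", "华为"])))        # 3 frames in this toy example
```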
In some embodiments, the server includes a part-of-speech tagging module and a mouth action database. The part-of-speech tagging module is used to tag each word in the text information received from the electronic device with a part-of-speech tag; the specific meanings of the part-of-speech tags are shown in table 1. The mouth action database stores the blend shape values corresponding to the mouth shapes of Chinese pinyin, and these blend shape values can be used to display the mouth actions corresponding to the keywords.
Specifically, a video recording device first records a mouth shape video of a model's face pronouncing a single pinyin syllable, such as the pinyin mouth shape wu, and the blend shape values of each recorded frame are stored in the mouth action database.
Table 2 lists the Chinese pinyin for which mouth shape videos need to be recorded when creating the mouth action database; the mouth actions of different Chinese characters are determined according to their corresponding pinyin. The mouth shape videos corresponding to the different pinyin are recorded and converted into data that can drive the mouth of the virtual character. When a keyword requiring additional mouth action data is obtained, the server can invoke a mouth shape generation algorithm to fetch the data converted from the mouth shape video of the pinyin pronunciation corresponding to the keyword and send it to the electronic device, so that the electronic device can use the received data to drive the virtual character to make the corresponding mouth action.
TABLE 2 Pinyin
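For illustration only, the following sketch shows how a mouth action database keyed by pinyin could be queried for a keyword; the blend shape channel names, the numeric values and the tiny character-to-pinyin mapping are placeholders invented for this example, whereas a real implementation would use the per-frame blend shape values recorded from the mouth shape videos described above.

```python
# Illustrative sketch of the mouth action database: for each pinyin syllable,
# a list of frames, each frame holding blend shape weights. All values below
# are placeholders.
MOUTH_ACTION_DB = {
    "hua": [{"jawOpen": 0.6, "mouthPucker": 0.1}, {"jawOpen": 0.3, "mouthPucker": 0.2}],
    "wei": [{"jawOpen": 0.2, "mouthPucker": 0.5}],
    "wu":  [{"jawOpen": 0.1, "mouthPucker": 0.8}],
}

CHAR_TO_PINYIN = {"华": "hua", "为": "wei"}   # tiny hypothetical mapping

def mouth_action_frames(keyword: str):
    """Return concatenated blend shape frames for the keyword's pinyin pronunciation."""
    frames = []
    for ch in keyword:
        syllable = CHAR_TO_PINYIN.get(ch)
        if syllable in MOUTH_ACTION_DB:
            frames.extend(MOUTH_ACTION_DB[syllable])
    return frames

print(mouth_action_frames("华为"))   # frames that drive the avatar's mouth for "hua wei"
```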
In some embodiments, the server transmits the mouth motion data and hand motion data to the electronic device together.
In other embodiments, the server sequentially sends the hand motion data of different time frames in slices according to the sequence of the hand motions.
In still other embodiments, the server sequentially sends the mouth motion data of different time frames in slices according to the sequence of mouth motions.
In still other embodiments, the hand motion data and the mouth motion data are provided with the same time stamp, and the server sends the hand motion data and the mouth motion data in different time frames in a slicing manner according to the sequence of the hand motion or the mouth motion.
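For illustration only, slicing the timestamped hand and mouth action data for sequential sending could look like the sketch below; the slice size and frame layout are assumptions for this example.

```python
# Illustrative sketch of slicing timestamp-ordered hand/mouth action frames so
# that the server can send them in playback order. Slice size is a placeholder.
def slice_frames(frames, frames_per_slice=30):
    """Split a frame list into slices, ordered by the shared time stamp."""
    frames = sorted(frames, key=lambda f: f["t"])
    return [frames[i:i + frames_per_slice]
            for i in range(0, len(frames), frames_per_slice)]

hand_frames  = [{"t": i / 30.0, "hand": "frame data"} for i in range(90)]
mouth_frames = [{"t": i / 30.0, "mouth": {"jawOpen": 0.5}} for i in range(90)]

# Because both streams carry the same time stamps, the device can align each
# mouth action slice with the corresponding hand action slice.
for hand_slice, mouth_slice in zip(slice_frames(hand_frames), slice_frames(mouth_frames)):
    pass  # send(hand_slice); send(mouth_slice) -- the transport is device specific
```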
S1204, driving the virtual character.
The electronic device drives the virtual character model to display the hand motion corresponding to the text to be translated and/or the mouth motion of the keyword according to the hand motion data and/or the mouth motion data received in S1203.
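For illustration only, the driving step S1204 can be pictured as replaying the received frames against the virtual character model; the Avatar interface below (set_hand_pose, set_blend_shapes) is a hypothetical stand-in for the rendering engine actually used by the electronic device.

```python
# Illustrative sketch of S1204: replay received hand/mouth action frames on the
# virtual character. The Avatar class is a hypothetical rendering stand-in.
import time

class Avatar:
    def set_hand_pose(self, pose):
        print("hand pose:", pose)              # a real engine would pose the hand joints
    def set_blend_shapes(self, weights):
        print("mouth blend shapes:", weights)  # a real engine would deform the face mesh

def drive_avatar(avatar, hand_frames, mouth_frames, fps=30):
    """Apply hand frames in time-stamp order, adding mouth frames where present."""
    mouth_by_time = {frame["t"]: frame for frame in mouth_frames}
    for frame in sorted(hand_frames, key=lambda f: f["t"]):
        avatar.set_hand_pose(frame["hand"])
        if frame["t"] in mouth_by_time:        # keyword span: also show the mouth action
            avatar.set_blend_shapes(mouth_by_time[frame["t"]]["mouth"])
        time.sleep(1.0 / fps)
```

Calling drive_avatar(Avatar(), hand_frames, mouth_frames) with frames like those in the previous sketch would replay the hand actions and, for the frames that also carry mouth data, the corresponding mouth actions.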
After the translation result corresponding to the text information is obtained, the electronic device user may save, share, edit and set the translation result, and the detailed execution process may refer to the related descriptions in fig. 6 to 11, which are not repeated herein for brevity.
Based on the same inventive concept, as shown in fig. 13, an embodiment of the present application further provides a Chinese translation apparatus 1300. The Chinese translation apparatus 1300 includes an obtaining unit 1310, configured to obtain the information input by the user of the electronic device in the embodiments shown in fig. 3 to 11, and a processing unit 1320, configured to perform the processing operations performed by the electronic device in the embodiments shown in fig. 3 to 11, such as obtaining the corresponding hand motion data according to the text information input by the user.
Optionally, the Chinese translation apparatus 1300 may further include a communication unit 1330, configured to perform the communication and data transmission operations with the server that are performed by the electronic device in the embodiments shown in fig. 3 to 11, and the like.
As shown in fig. 14, an embodiment of the present application further provides another Chinese translation apparatus 1400. The Chinese translation apparatus 1400 includes a processing unit 1410 and a communication unit 1420, where the processing unit is configured to perform the text risk-control check and other operations on the text information to be translated sent by the electronic device, and the communication unit is configured to perform the communication and data transmission operations performed by the server with the electronic device in the embodiments shown in fig. 3 to 11, and the like.
Optionally, the Chinese translation apparatus 1400 may further include a storage unit 1430, configured to store one or more computer programs, hand motion data, mouth motion data, and the like.
As shown in fig. 15, an embodiment of the present application further provides an electronic device 1500, which includes a processor 1510 and a memory 1520. The processor is configured to perform the processing operations performed by the electronic device in the embodiments shown in fig. 3 to 11, such as obtaining the corresponding hand motion data based on the text information input by the user. The memory stores one or more computer programs, and the one or more computer programs include instructions that, when executed by the one or more processors, cause any of the Chinese translation methods described above to be performed.
As shown in fig. 16, an embodiment of the present application further provides a server 1600, which includes a processor 1610 and a memory 1620. The processor is configured to perform the text risk-control check and other operations on the text information to be translated sent by the electronic device. The memory stores one or more computer programs, hand motion data, mouth motion data, and the like, and the one or more computer programs include instructions that, when executed by the one or more processors, cause any of the Chinese translation methods described above to be performed.
Embodiments of the present application also provide a computer program product including computer program code that, when run on a computer, causes the computer to implement the methods in the embodiments shown in fig. 3 to 12.
Embodiments of the present application also provide a computer-readable storage medium storing computer instructions that, when executed on a computer, cause the computer to implement the method of the embodiments shown in fig. 3 to 12.
The embodiment of the application also provides a chip, which includes a processor configured to read instructions stored in a memory; when the processor executes the instructions, the chip is caused to implement the methods in the embodiments shown in fig. 3 to 12.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided by the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
The foregoing is merely illustrative of the present application, and the present application is not limited thereto, and any person skilled in the art will readily recognize that variations or substitutions are within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (18)

1. A Chinese translation method, comprising:
responding to input of a user, and acquiring text information by the electronic equipment, wherein the text information comprises keywords;
the electronic equipment displays the hand actions corresponding to the text information;
and the electronic equipment displays the mouth action corresponding to the keyword.
2. The method of claim 1, wherein the keywords are determined based on language habits of a sign language user.
3. The method according to claim 1 or 2, characterized in that the method further comprises: the electronic equipment does not display a mouth action corresponding to a common vocabulary, wherein the text information comprises the common vocabulary, and the common vocabulary is different from the keywords.
4. A method according to any one of claims 1 to 3, wherein the electronic device displaying the mouth action corresponding to the keyword comprises:
The electronic equipment displays the mouth action while displaying the hand action corresponding to the keyword.
5. The method of any one of claims 1 to 4, wherein the keyword is a proper noun.
6. The method according to any one of claims 1 to 5, wherein before displaying the mouth action corresponding to the keyword, the method further comprises:
the electronic equipment displays a first vocabulary, wherein the text information comprises the first vocabulary, and the first vocabulary is a vocabulary for recommending additional mouth actions;
and responding to the confirmation operation of the user, and determining the first vocabulary as the keyword by the electronic equipment.
7. The method according to any one of claims 1 to 6, wherein before displaying the mouth action corresponding to the keyword, the method further comprises:
responding to a first input of a user, the electronic equipment acquires a second vocabulary, wherein the second vocabulary is a vocabulary of the user requesting additional mouth actions;
when the text information contains the second vocabulary, the electronic equipment determines the second vocabulary as the keyword;
when the text information does not contain the second vocabulary, the electronic equipment displays update request information, wherein the update request information is used for prompting that the text information does not contain the second vocabulary;
And responding to a second input of the user, and acquiring the updated second vocabulary by the electronic equipment.
8. The method of claim 6, wherein the first vocabulary is determined based on a translation history of the user, the translation history comprising a second vocabulary entered by the user, the second vocabulary being a vocabulary for which the user requests additional mouth actions.
9. The method according to any one of claims 1 to 8, wherein the mouth action is determined from a pronunciation mouth shape of the Chinese pinyin of the keyword.
10. The method of claim 9, wherein the blend shape values corresponding to the pronunciation mouth shapes are stored in a mouth action database.
11. The method of any one of claims 1 to 10, wherein the hand actions include a first hand action and a second hand action, the first hand action being prior to the second hand action, and the electronic device displaying the hand action corresponding to the text information comprises:
the electronic device receives first hand action data from a server, wherein the first hand action data is used for displaying the first hand action;
the electronic device receives second hand action data from the server while displaying the first hand action, wherein the second hand action data is used for displaying the second hand action.
12. The method of any one of claims 1 to 11, wherein the mouth actions include a first mouth action and a second mouth action, the first mouth action being prior to the second mouth action, and the electronic device displaying the mouth action corresponding to the keyword comprises:
the electronic device receives first mouth action data from a server, wherein the first mouth action data is used for displaying the first mouth action;
the electronic device receives second mouth action data from the server while displaying the first mouth action, wherein the second mouth action data is used for displaying the second mouth action.
13. The method of any one of claims 1 to 12, wherein before displaying the hand action corresponding to the text information, the method further comprises:
the electronic device receives a response message from the server, wherein the response message is used for indicating that the text information does not contain sensitive information.
14. An electronic device, comprising a processor and a memory, wherein the memory is configured to store program instructions, and the processor is configured to invoke the program instructions to perform the method of any one of claims 1 to 13.
15. A Chinese translation apparatus, comprising means for implementing the method of any one of claims 1 to 13.
16. A computer program product, characterized in that it comprises computer program code which, when run on a computer, causes the method according to any one of claims 1 to 13 to be performed.
17. A computer-readable storage medium, characterized in that a computer program is stored thereon, which computer program, when being executed by a computer, causes the method of any of claims 1 to 13 to be implemented.
18. A chip, comprising: a processor, configured to read instructions stored in a memory, wherein when the processor executes the instructions, the chip is caused to implement the method of any one of claims 1 to 13.
