CN112153213A - Method and equipment for determining voice information - Google Patents

Method and equipment for determining voice information

Info

Publication number
CN112153213A
Authority
CN
China
Prior art keywords
voice information
user
terminal
information
corresponding relation
Prior art date
Legal status
Pending
Application number
CN201910579343.0A
Other languages
Chinese (zh)
Inventor
赵云
李凯
皮素梅
孟庆霞
Current Assignee
Hisense Mobile Communications Technology Co Ltd
Original Assignee
Hisense Mobile Communications Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Hisense Mobile Communications Technology Co Ltd filed Critical Hisense Mobile Communications Technology Co Ltd
Priority to CN201910579343.0A
Publication of CN112153213A
Status: Pending

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/28 - Constructional details of speech recognition systems
    • G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L 2015/225 - Feedback of the input speech

Abstract

The invention discloses a method and equipment for determining voice information, which are used for solving the prior-art problem that the conversation content is monotonous when a user carries out a voice conversation with a terminal. The invention first receives the voice information of a target user, then determines the terminal voice information corresponding to the received target user voice information according to a user-defined corresponding relation between user voice information and terminal voice information, and finally notifies the user of the determined terminal voice information. Because the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information used for establishing that relation, rich terminal voice information can be determined according to the user-defined voice information.

Description

Method and equipment for determining voice information
Technical Field
The present invention relates to the field of wireless communication technologies, and in particular, to a method and a device for determining voice information.
Background
As a novel interactive technology, the voice function of electronic products is increasingly popular with users. At present, its development falls mainly into two directions: in one, the electronic product helps the user complete tasks through the user's voice, such as querying the weather, making a phone call, or opening software; in the other, the user can hold a conversation with the terminal.
In the prior art, a manufacturer configures conversation content in a remote server, where the conversation content includes correspondences between user voice information and terminal voice information. When a user converses with a terminal, the terminal sends the acquired user voice information to the remote server; after receiving it, the remote server searches for that voice information in the conversation content it stores and, if it exists, returns the corresponding terminal voice information to the terminal. For example, if the terminal voice information configured in the server for the user voice information "hello" is "hello, master", then whenever the user voice information acquired by the terminal is "hello", the remote server returns only "hello, master" to the terminal.
In summary, in the prior art, when a user holds a voice conversation with a terminal, the conversation content is relatively monotonous.
Disclosure of Invention
The invention provides a method and equipment for determining voice information, which are used for solving the prior-art problem that the conversation content is monotonous when a user holds a voice conversation with a terminal.
In a first aspect, an embodiment of the present invention provides a method for determining voice information, where the method includes:
receiving voice information of a target user;
determining terminal voice information corresponding to the received target user voice information according to a user-defined corresponding relation between the user voice information and the terminal voice information, wherein the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information for establishing the user-defined corresponding relation;
and informing the user of the determined terminal voice information.
In this method, the voice information of a target user is first received; then the terminal voice information corresponding to the received target user voice information is determined according to the user-defined corresponding relation between user voice information and terminal voice information; finally the user is notified of the determined terminal voice information. Because the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information used for establishing that relation, rich terminal voice information can be determined according to the user-defined voice information.
In one possible implementation, the custom correspondence is determined by:
after receiving user voice information for establishing a user-defined corresponding relation, determining the user voice information and terminal voice information in the voice information for establishing the user-defined corresponding relation;
taking the determined corresponding relation between the user voice information and the terminal voice information as the self-defined corresponding relation;
the determining the user voice information and the terminal voice information in the voice information for establishing the corresponding relationship includes:
extracting the user voice information and the terminal voice information from the user voice information for establishing the user-defined corresponding relation; or
converting the user voice information for establishing the user-defined corresponding relation into conversation content, and extracting the user voice information and the terminal voice information from the converted information.
The method comprises the steps of firstly determining user voice information and terminal voice information in voice information for establishing a corresponding relation, and then taking the determined corresponding relation between the user voice information and the terminal voice information as the user-defined corresponding relation; when the user voice information and the terminal voice information in the voice information for establishing the corresponding relationship are determined, the user voice information and the terminal voice information can be directly extracted from the received user voice information for establishing the customized corresponding relationship, or conversation content conversion can be performed on the user voice information for establishing the customized corresponding relationship, and then the user voice information and the terminal voice information are extracted from the converted information. The user-defined voice information established by the method is rich, so that the terminal voice information determined according to the user-defined voice information is also rich.
In a possible implementation manner, the determining, according to the customized correspondence between the user voice information and the terminal voice information, the terminal voice information corresponding to the received target user voice information includes:
matching the received target user voice information with the user voice information in the user-defined corresponding relation;
and if the matching is successful, determining the terminal voice information corresponding to the received target user voice information according to the user-defined corresponding relation.
The method provides a mode for determining the terminal voice information corresponding to the received target user voice information, firstly, the received target user voice information is matched with the user voice information in the user-defined corresponding relation, and if the matching is successful, the terminal voice information corresponding to the received target user voice information is determined according to the user-defined corresponding relation. Because the target user voice information may not exist in the customized corresponding relation, when the terminal voice information corresponding to the target voice information is determined according to the customized corresponding relation, the target voice information can be matched with the user voice information in the customized corresponding relation, and therefore the success rate of determining the terminal voice information can be improved.
In one possible implementation, the method further comprises:
if the matching fails, extracting target keyword information in the voice information of the target user;
replacing the target keyword information with keyword information in a keyword information set corresponding to the target keyword information;
and matching the target user voice information after the keyword information is replaced with the user voice information in the user-defined corresponding relation.
According to the method, another mode for determining the terminal voice information corresponding to the received target user voice information is provided, the target user voice information is matched with the user voice information in the custom corresponding relation, if the matching fails, the keyword information in the target user voice information needs to be replaced, and the replaced target user voice information is matched with the user voice information in the custom corresponding relation, so that the success rate of determining the terminal voice information can be improved.
In one possible implementation, the method further includes:
and responding to the instruction for displaying the voice information to display or play the voice information for establishing the user-defined corresponding relation.
In this method, because the user may forget the voice information used for establishing the user-defined corresponding relation, that voice information can be displayed or played, which reduces the error rate of determining the terminal voice information.
In a second aspect, an embodiment of the present invention provides an apparatus for determining voice information, where the apparatus includes: a processing unit and a storage unit, wherein the storage unit stores program code, and when one or more computer programs stored by the storage unit are executed by the processing unit, the apparatus is caused to execute the following processes:
receiving voice information of a target user;
determining terminal voice information corresponding to the received target user voice information according to a user-defined corresponding relation between the user voice information and the terminal voice information, wherein the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information for establishing the user-defined corresponding relation;
and informing the user of the determined terminal voice information.
In a possible implementation manner, when the user-defined correspondence is determined in the following manner, the processing unit is specifically configured to:
after receiving user voice information for establishing a user-defined corresponding relation, determining the user voice information and terminal voice information in the voice information for establishing the user-defined corresponding relation;
taking the determined corresponding relation between the user voice information and the terminal voice information as the self-defined corresponding relation;
extracting the user voice information and the terminal voice information from the user voice information for establishing the user-defined corresponding relation; or
converting the user voice information for establishing the user-defined corresponding relation into conversation content, and extracting the user voice information and the terminal voice information from the converted information.
In a possible implementation manner, the processing unit is specifically configured to:
matching the received target user voice information with the user voice information in the user-defined corresponding relation;
and if the matching is successful, determining the terminal voice information corresponding to the received target user voice information according to the user-defined corresponding relation.
In one possible implementation, the processing unit is further configured to:
if the matching fails, extracting target keyword information in the voice information of the target user;
replacing the target keyword information with keyword information in a keyword information set corresponding to the target keyword information;
and matching the target user voice information after the keyword information is replaced with the user voice information in the user-defined corresponding relation.
In one possible implementation, the processing unit is further configured to:
and responding to the instruction for displaying the voice information to display or play the voice information for establishing the user-defined corresponding relation.
In a third aspect, an embodiment of the present invention further provides an apparatus for determining voice information, where the apparatus includes a receiving module, a determining module, and a notifying module:
the receiving module is used for receiving voice information of a target user;
the determining module is used for determining the terminal voice information corresponding to the received target user voice information according to the user-defined corresponding relation between the user voice information and the terminal voice information, wherein the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information for establishing the user-defined corresponding relation;
and the notification module is used for notifying the user of the determined terminal voice information.
In a fourth aspect, the present application also provides a computer storage medium having a computer program stored thereon, which when executed by a processing unit, performs the steps of the method of the first aspect.
In addition, for technical effects brought by any one implementation manner of the second aspect to the fourth aspect, reference may be made to technical effects brought by different implementation manners of the first aspect, and details are not described here.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the description of the embodiments will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without inventive exercise.
Fig. 1 is a schematic flowchart of a method for determining voice information according to an embodiment of the present invention;
fig. 2 is a schematic view of a scenario in which a terminal determines voice information according to an embodiment of the present invention;
fig. 3 is a schematic diagram illustrating that a terminal responds to an instruction for displaying voice information to display the voice information on a display interface of the terminal according to an embodiment of the present invention;
fig. 4 is a schematic view of a scenario in which a server determines voice information according to an embodiment of the present invention;
FIG. 5 is a flow chart illustrating a customized voice dialog provided by an embodiment of the present invention;
FIG. 6 is a flow chart illustrating a voice memorization process according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an apparatus for determining voice information according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a second apparatus for determining speech information according to an embodiment of the present invention;
fig. 9 is a schematic structural diagram of a third apparatus for determining speech information according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the present invention will be described in further detail with reference to the accompanying drawings, and it is apparent that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The popularization of intelligent electronic products has brought great convenience to people's lives. For example, a voice assistant in a smart phone enables a user to make a call, send a short message and open software through the voice assistant, and the user can also talk with the voice assistant.
The server stores some corresponding relations of conversation content, for example, "thank you" corresponds to "you're welcome" and "hello" corresponds to "hello". These corresponding relations are stored in the server; when a user converses with the smart phone, the smart phone sends the acquired voice information to the server, the server queries whether the voice information exists in the corresponding relations, and if so, returns the content corresponding to that voice information.
At present, a conversation between a user and an intelligent terminal does not yield different conversation content for different terminals; that is, no matter which user talks with which terminal connected to the same server, the content corresponding to the same voice information is exactly the same, so the conversation content is relatively monotonous.
If some conversation content could be customized, the conversation would become richer and more interesting. For example, a user may customize some questions and answers and store the user-defined question-and-answer content in the terminal or the server; when the user converses with the terminal and the user voice information acquired by the terminal is information in the user-defined question-and-answer content, the corresponding user-defined content can be returned.
For example, the user says: I say "what is your name", you say "my name is Harry"; the user-defined corresponding relation between "what is your name" and "my name is Harry" is then stored. When the user later says "what is your name", the terminal first looks up whether the voice information "what is your name" exists in the user-defined corresponding relation and, if so, notifies the user of the corresponding voice content "my name is Harry".
The application scenario described in the embodiment of the present invention is for more clearly illustrating the technical solution of the embodiment of the present invention, and does not form a limitation on the technical solution provided in the embodiment of the present invention, and it can be known by a person skilled in the art that with the occurrence of a new application scenario, the technical solution provided in the embodiment of the present invention is also applicable to similar technical problems.
For the application scenario, the present application provides a method for determining voice information, as shown in fig. 1, the method includes the following steps:
s100, receiving voice information of a target user;
s101, determining terminal voice information corresponding to received target user voice information according to a user-defined corresponding relation between the user voice information and the terminal voice information, wherein the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to collected voice information used for establishing the user-defined corresponding relation;
and S102, informing the user of the determined terminal voice information.
In the embodiment of the invention, the voice information of the target user is first received; then the terminal voice information corresponding to the received target user voice information is determined according to the user-defined corresponding relation between user voice information and terminal voice information; finally the user is notified of the determined terminal voice information. Because the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information used for establishing that relation, rich terminal voice information can be determined according to the user-defined voice information.
The execution subject of the embodiment of the present invention may be a terminal or a server, and will be described below.
In the first scenario, the execution subject is a terminal.
Fig. 2 is a scene diagram of a terminal determining voice information. In fig. 2, a user sends target user voice information to the terminal; after receiving it, the terminal determines the terminal voice information corresponding to the received target user voice information according to the user-defined corresponding relation between user voice information and terminal voice information, and then notifies the user of the determined terminal voice information.
In implementation, the terminal may determine, in different manners, user voice information and terminal voice information in the voice information for establishing the customized correspondence according to different target user voice information, and then use the determined relationship between the user voice information and the terminal voice information as the customized correspondence.
The following exemplifies the determination of user voice information and terminal voice information in the voice information for establishing the customized correspondence in different ways according to different target user voice information.
In the first mode, the user voice information and the terminal voice information are directly extracted from the voice information used for establishing the user-defined corresponding relation.
For example, the voice information collected by the terminal is: I say "what is your name", you say "my name is Harry". The terminal can directly extract the user voice information "what is your name" and the terminal voice information "my name is Harry", and then take the corresponding relation between the user voice information "what is your name" and the terminal voice information "my name is Harry" as the user-defined corresponding relation.
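As an illustration of this direct-extraction mode, the following minimal Python sketch parses a transcribed entry utterance of the form I say "...", you say "..." into a pair of user voice information and terminal voice information. It assumes the speech has already been converted to text; the pattern, the function name and the quoting convention are illustrative assumptions, not details fixed by this embodiment.

```python
import re

# Minimal sketch of mode 1 (direct extraction), assuming the collected
# voice information has already been transcribed to text and follows the
# entry pattern: I say "...", you say "...". Names are illustrative.
ENTRY_PATTERN = re.compile(
    r'I say\s*[::]?\s*"(?P<user>[^"]+)"\s*,?\s*you say\s*[::]?\s*"(?P<terminal>[^"]+)"'
)

def extract_direct(transcript: str) -> tuple[str, str] | None:
    """Return (user voice information, terminal voice information), or None."""
    match = ENTRY_PATTERN.search(transcript)
    if match is None:
        return None  # direct extraction failed
    return match.group("user").strip(), match.group("terminal").strip()

print(extract_direct('I say "what is your name", you say "my name is Harry"'))
# -> ('what is your name', 'my name is Harry')
```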
In the second mode, the terminal first performs conversation content conversion on the voice information used for establishing the user-defined corresponding relation, and then extracts the user voice information and the terminal voice information from the converted information.
For example, the voice information collected by the terminal is "please remember that you like to eat fruit". The terminal can convert the conversation content "please remember that you like to eat fruit" into "what do you like to eat" and "I like to eat fruit", then extract from the converted information the user voice information "what do you like to eat" and the terminal voice information "I like to eat fruit", and finally take the corresponding relation between the user voice information "what do you like to eat" and the terminal voice information "I like to eat fruit" as the user-defined corresponding relation.
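By way of illustration only, this conversion mode can be sketched with a couple of hand-written templates; the entry phrase, the templates and the function name below are assumptions made for the sketch, not a definitive implementation of the embodiment.

```python
import re

# Minimal sketch of mode 2 (conversation content conversion), assuming a
# fixed "please remember that ..." entry phrase and a small set of
# hand-written templates; both are illustrative assumptions.
CONVERSION_TEMPLATES = [
    # (statement pattern, question template, answer template)
    (re.compile(r"you like to eat (?P<thing>.+)"),
     "what do you like to eat", "I like to eat {thing}"),
    (re.compile(r"you like (?P<thing>.+)"),
     "what do you like", "I like {thing}"),
]

def convert_memory(transcript: str) -> tuple[str, str] | None:
    """Turn a remembered statement into a (question, answer) pair."""
    prefix = "please remember that "
    if not transcript.startswith(prefix):
        return None
    statement = transcript[len(prefix):].strip()
    for pattern, question, answer in CONVERSION_TEMPLATES:
        match = pattern.fullmatch(statement)
        if match is not None:
            return question, answer.format(**match.groupdict())
    return None

print(convert_memory("please remember that you like to eat fruit"))
# -> ('what do you like to eat', 'I like to eat fruit')
```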
The user-defined corresponding relation may be stored in the terminal and may be described in the form of a table; for example, table 1 is a table of the user-defined corresponding relation provided in the embodiment of the present invention.

User voice information        Terminal voice information
what is your name             my name is Harry
what do you like to eat       I like to eat fruit
what sports do you like       I like playing basketball

TABLE 1
It should be noted that, as to how the terminal determines in which manner to extract the user voice information and the terminal voice information, the terminal may first attempt direct extraction and, if direct extraction fails, convert the voice information and extract from the converted information;
the terminal may also make the determination according to the voice identifier in the voice information, that is, the terminal first extracts the voice identifier in the voice information and then performs extraction according to the extraction manner corresponding to that voice identifier.
For example, the voice information collected by the terminal is: I say "what is your name", you say "my name is Harry". The terminal extracts the voice identifiers "I say" and "you say"; the extraction manner corresponding to these voice identifiers is direct extraction, so the terminal can directly extract the user voice information and the terminal voice information from the voice information.
For another example, the voice information collected by the terminal is "please remember that you like to eat fruit". The terminal extracts the voice identifier "please remember"; the extraction manner corresponding to this voice identifier is to first perform conversation content conversion and then extract the user voice information and the terminal voice information from the converted information.
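Combining the two modes, the voice-identifier dispatch just described can be sketched as follows; this reuses extract_direct and convert_memory from the sketches above, and the identifier strings stand in for whatever entry corpus is actually configured.

```python
# Sketch of choosing the extraction manner by voice identifier, reusing
# extract_direct and convert_memory from the sketches above. "I say" and
# "please remember" are illustrative stand-ins for the configured corpus.
def build_correspondence(transcript: str) -> tuple[str, str] | None:
    if transcript.startswith("I say"):
        return extract_direct(transcript)   # identifier -> direct extraction
    if transcript.startswith("please remember"):
        return convert_memory(transcript)   # identifier -> convert, then extract
    return None  # not an utterance for establishing a custom correspondence
```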
In implementation, when the terminal determines the terminal voice information corresponding to the received target user voice information, in order to improve the success rate of determining the terminal voice information, the received target user voice information may be first matched with the user voice information in the customized correspondence relationship determined in the above embodiment, and the next operation is performed according to the matching result.
If the matching is successful, the terminal voice information corresponding to the target user voice information is determined according to the user-defined corresponding relation.
For example, the user-defined corresponding relation is shown in table 1 and the user voice information received by the terminal is "what is your name". The terminal matches "what is your name" against the user voice information in the user-defined corresponding relation in table 1; because "what is your name" exists in that user voice information, the matching succeeds, and the terminal directly notifies the user of the terminal voice information "my name is Harry" corresponding to "what is your name" in table 1.
When the received target user voice information is matched with the user voice information in the user-defined corresponding relation determined in the above embodiment, the matching may fail. If the matching fails, the terminal needs to extract the target keyword information in the target user voice information, then replace the target keyword information with keyword information in the keyword information set corresponding to the target keyword information, and finally match the target user voice information after keyword replacement with the user voice information in the user-defined corresponding relation.
It should be noted that the keyword information sets are preset and stored in the terminal. Each keyword set may contain many keywords with similar meanings; for example, "like", "love" and "prefer" may form a keyword set, and when the extracted keyword is "like", the keyword set corresponding to the keyword is {like, love, prefer}.
The following description will be given with specific examples.
For example, the user-defined corresponding relation is shown in table 1 and the target user voice information received by the terminal is "what sports do you prefer". The terminal matches "what sports do you prefer" against the user voice information in the user-defined corresponding relation in table 1; because "what sports do you prefer" does not exist in that user voice information, the matching fails. At this time, the terminal needs to extract the keyword "prefer" in the target user voice information and then, according to the extracted keyword information, search the keyword information sets stored in the terminal for the keyword set corresponding to the keyword information.
The found keyword information set corresponding to the keyword "prefer" is {like, love, prefer}. "prefer" may first be replaced with "love", so the replaced target user voice information is "what sports do you love". The terminal matches "what sports do you love" against the user voice information in table 1; because "what sports do you love" does not exist in the user voice information in table 1, the matching fails.
After the matching fails, the terminal can replace the keyword information again, this time replacing "prefer" with "like" from the keyword information set. The replaced target user voice information is "what sports do you like". The terminal matches "what sports do you like" against the user voice information in table 1; the matching succeeds, and the terminal notifies the user of the terminal voice information "I like playing basketball" corresponding to "what sports do you like".
If the matching is not successful after all the keyword information in the keyword information set is replaced in the matching process, the terminal can match the target voice information with the user voice information in the corresponding relationship configured in the server by the manufacturer, the subsequent matching process is the same as the prior art, and the description is omitted here.
If the matching is still unsuccessful after all the keyword information in the keyword information set has been tried, the terminal can also notify the user of a specific piece of terminal voice information, such as "I can't understand what you are saying".
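The exact-match and keyword-replacement flow just described can be sketched as below. The dictionary (table 1 rendered as a Python dict), the synonym set and the fallback reply are illustrative assumptions for the sketch, not the storage format prescribed by the embodiment.

```python
# Minimal sketch of the matching flow: exact match first, then retry with
# synonyms from the keyword information set, then fall back. The stored
# correspondence (table 1 as a dict) and the synonym set are assumptions.
CUSTOM_CORRESPONDENCE = {
    "what is your name": "my name is Harry",
    "what do you like to eat": "I like to eat fruit",
    "what sports do you like": "I like playing basketball",
}

KEYWORD_SETS = [
    {"like", "love", "prefer"},
]

def find_terminal_reply(user_text: str) -> str:
    # Step 1: exact match against the user-defined corresponding relation.
    if user_text in CUSTOM_CORRESPONDENCE:
        return CUSTOM_CORRESPONDENCE[user_text]
    # Step 2: replace each keyword with other words from its set and retry.
    for word in user_text.split():
        for keyword_set in KEYWORD_SETS:
            if word in keyword_set:
                for synonym in keyword_set - {word}:
                    candidate = user_text.replace(word, synonym)
                    if candidate in CUSTOM_CORRESPONDENCE:
                        return CUSTOM_CORRESPONDENCE[candidate]
    # Step 3: in a full system, forward the question to the
    # manufacturer-configured server; here we just return a default reply.
    return "I can't understand what you are saying"

print(find_terminal_reply("what sports do you prefer"))
# -> I like playing basketball
```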
Because the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information for establishing that relation, a user who has established many such relations may forget which voice information was once used to establish them. Therefore, the terminal can respond to an instruction for displaying the voice information by displaying or playing the voice information used for establishing the user-defined corresponding relation.
As shown in fig. 3, the terminal responds to the instruction for displaying the voice information by displaying the voice information on the display interface of the terminal. As can be seen from fig. 3, there are 7 pieces of voice information, and the user can hold a conversation with the terminal according to the displayed voice information.
The above is a description of a scenario in which the execution agent is a terminal, and the following is a description of a scenario in which the execution agent is a server.
In the second scenario, the execution subject is a server.
Fig. 4 is a scene diagram of a server determining voice information. In fig. 4, a user sends target user voice information to the terminal, and the terminal forwards it to the server after receiving it; the server then determines the terminal voice information corresponding to the received target user voice information according to the user-defined corresponding relation between user voice information and terminal voice information and sends the determined terminal voice information to the terminal; finally the terminal notifies the user of the terminal voice information returned by the server.
It should be noted that, in the embodiment of the present invention, this server and the server in which the manufacturer configures the voice conversation corresponding relation may be different servers.
In implementation, the manner in which the server determines the user voice information and the terminal voice information in the voice information for establishing the customized correspondence is the same as the manner in which the terminal determines, and reference may be made to the manner in which the terminal determines the user voice information and the terminal voice information in the voice information for establishing the customized correspondence, which is not described herein again.
Of course, the way in which the server determines the customized corresponding relationship is also the same as the way in which the terminal determines the customized corresponding relationship, and reference may be made to the way in which the terminal determines the customized corresponding relationship, which is not described herein again.
In implementation, when the server determines the terminal voice information corresponding to the received target user voice information sent by the terminal, in order to improve the success rate of determining the terminal voice information, the received target user voice information may be first matched with the user voice information in the customized correspondence relationship determined in the above embodiment, and the next operation is performed according to the matching result.
The server matches the voice information of the target user with the voice information in the custom corresponding relationship, and the operation after matching is the same as the operation after matching the voice information of the target user with the voice information in the custom corresponding relationship by the terminal in the above embodiment, and the matching mode of the terminal can be referred to, which is not described herein again.
In order to avoid that the user forgets which voice messages are used to establish the custom corresponding relationship, the voice messages used for establishing the custom corresponding relationship can be displayed or played on the terminal.
Specifically, the server receives an instruction for displaying the voice information sent by the terminal, then sends the voice information for establishing the customized corresponding relationship to the terminal, and displays or plays the voice information for establishing the customized corresponding relationship on the terminal.
For displaying the voice information on the terminal, see fig. 3, which is not described herein.
The present invention will be described below with reference to specific examples.
Fig. 5 is a schematic flow chart of a customized voice dialog according to an embodiment of the present invention.
S500, configuring entry corpus information of the user-defined voice conversation;
for example, the entry corpus information of the custom voice dialog is "i say XXX", and you say "XXX".
S501, monitoring voice information of a user, and determining that corpus information of a user-defined voice conversation exists in the monitored voice information;
s502, extracting questions and answers in the user voice information;
s503, storing the corresponding relation between the extracted questions and the answers;
s504, monitoring voice information of a user;
s505, judging whether the voice information of the user is a question in the stored corresponding relation or not according to the stored corresponding relation of the question and the answer, if so, executing S506, otherwise, executing S507;
s506, notifying the answer corresponding to the question in the stored corresponding relation to the user;
and S507, sending the question to a server configured with question and answer content by the manufacturer.
Fig. 6 is a schematic flow chart of a voice memorizing process according to an embodiment of the present invention.
S600, configuring an entry corpus of voice memory;
for example, the entry corpus information of the phonetic memory is "please remember XXX".
S601, monitoring voice information of a user, and determining that voice memory corpus information exists in the monitored voice information;
s602, carrying out conversation content conversion on the monitored user voice information;
s603, extracting user voice information and terminal voice information in the converted information;
s604, storing the corresponding relation between the converted user voice information and the terminal voice information;
s605, monitoring voice information of a user;
s606, judging whether the user voice information is the user voice information stored in the corresponding relation or not according to the corresponding relation between the stored converted user voice information and the terminal voice information, if so, executing S607, and otherwise, executing S608;
s607, notifying the terminal voice information corresponding to the monitored user voice information in the stored corresponding relation to the user;
and S608, sending the monitored voice information of the user to a server configured with the question and answer content by the manufacturer.
Based on the same inventive concept, the embodiment of the present invention further provides a device for determining voice information, and because the principle of the device for solving the problem is similar to the method for determining voice information in the embodiment of the present invention, the implementation of the device may refer to the implementation of the method, and repeated details are not repeated.
As shown in fig. 7, an apparatus for determining speech information according to an embodiment of the present invention includes: a processing unit 700 and a storage unit 701, wherein the storage unit 701 stores program code, and when one or more computer programs stored in the storage unit 701 are executed by the processing unit 700, the apparatus is caused to perform the following processes:
receiving voice information of a target user;
determining terminal voice information corresponding to the received target user voice information according to a user-defined corresponding relation between the user voice information and the terminal voice information, wherein the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information for establishing the user-defined corresponding relation;
and informing the user of the determined terminal voice information.
Optionally, when the user-defined corresponding relationship is determined in the following manner, the processing unit 700 is specifically configured to:
after receiving user voice information for establishing a user-defined corresponding relation, determining the user voice information and terminal voice information in the voice information for establishing the user-defined corresponding relation;
taking the determined corresponding relation between the user voice information and the terminal voice information as the self-defined corresponding relation;
extracting the user voice information and the terminal voice information from the user voice information for establishing the user-defined corresponding relation; or
converting the user voice information for establishing the user-defined corresponding relation into conversation content, and extracting the user voice information and the terminal voice information from the converted information.
Optionally, the processing unit 700 is specifically configured to:
matching the received target user voice information with the user voice information in the user-defined corresponding relation;
and if the matching is successful, determining the terminal voice information corresponding to the received target user voice information according to the user-defined corresponding relation.
Optionally, the processing unit 700 is further configured to:
if the matching fails, extracting target keyword information in the voice information of the target user;
replacing the target keyword information with keyword information in a keyword information set corresponding to the target keyword information;
and matching the target user voice information after the keyword information is replaced with the user voice information in the user-defined corresponding relation.
Optionally, the processing unit 700 is further configured to:
and responding to the instruction for displaying the voice information to display or play the voice information for establishing the user-defined corresponding relation.
Based on the same inventive concept, if the execution subject of the embodiment of the present invention is the terminal, the embodiment of the present invention further provides a second apparatus for determining voice information.
As shown in fig. 8, a second apparatus 800 for determining speech information according to an embodiment of the present invention includes: a Radio Frequency (RF) circuit 810, a power supply 820, a processor 830, a memory 840, an input unit 850, a display unit 860, a camera 870, a communication interface 880, and a Wireless Fidelity (WiFi) module 890. Those skilled in the art will appreciate that the configuration of the terminal shown in fig. 8 is not intended to be limiting, and that the terminal provided by the embodiments of the present application may include more or less components than those shown, or some components may be combined, or a different arrangement of components may be provided.
The following describes the various components of the terminal 800 in detail with reference to fig. 8:
the RF circuitry 810 may be used for receiving and transmitting data during a communication or conversation. Specifically, the RF circuit 810 sends downlink data of the base station to the processor 830 for processing; and in addition, sending the uplink data to be sent to the base station. Generally, the RF circuit 810 includes, but is not limited to, an antenna, at least one Amplifier, a transceiver, a coupler, a Low Noise Amplifier (LNA), a duplexer, and the like.
In addition, the RF circuit 810 may also communicate with networks and other terminals via wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to Global System for Mobile communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), and the like.
The WiFi technology belongs to a short-distance wireless transmission technology, and the terminal 800 is connected to an Access Point (AP) through a WiFi module 890, so as to achieve Access to a data network. The WiFi module 890 can be used for receiving and transmitting data during communication.
The terminal 800 may be physically connected to other terminals through the communication interface 880. Optionally, the communication interface 880 is connected to the communication interface of the other terminal through a cable, so as to implement data transmission between the terminal 800 and the other terminal.
In the embodiment of the present application, the terminal 800 can implement a communication service and send information to other contacts, so the terminal 800 needs to have a data transmission function, that is, the terminal 800 needs to include a communication module inside. Although fig. 8 shows communication modules such as the RF circuit 810, the WiFi module 890 and the communication interface 880, it is understood that at least one of the above components or other communication modules (e.g., bluetooth modules) for enabling communication are present in the terminal 800 for data transmission.
For example, when the terminal 800 is a mobile phone, the terminal 800 may include the RF circuit 810 and may further include the WiFi module 890; when the terminal 800 is a computer, the terminal 800 may include the communication interface 880 and may further include the WiFi module 890; when the terminal 800 is a tablet computer, the terminal 800 may include the WiFi module.
The memory 840 may be used to store software programs and modules. The processor 830 executes various functional applications and data processing of the terminal 800 by running the software programs and modules stored in the memory 840; when the processor 830 executes the program codes in the memory 840, part or all of the processes of fig. 1 in the embodiments of the present invention can be implemented.
Alternatively, the memory 840 may mainly include a program storage area and a data storage area. Wherein, the storage program area can store an operating system, various application programs (such as communication application), various modules for WLAN connection, and the like; the storage data area may store data created according to the use of the terminal, and the like.
Further, the memory 840 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The input unit 850 may be used to receive numeric or character information input by a user and generate key signal inputs related to user settings and function control of the terminal 800.
Alternatively, the input unit 850 may include a touch panel 851 and other input terminals 852.
The touch panel 851, also referred to as a touch screen, can collect touch operations of a user on or near the touch panel 851 (for example, operations of the user on or near the touch panel 851 using any suitable object or accessory such as a finger or a stylus), and drive the corresponding connection device according to a preset program. Alternatively, the touch panel 851 may include two parts, namely, a touch detection device and a touch controller. The touch detection device detects the touch direction of a user, detects a signal brought by touch operation and transmits the signal to the touch controller; the touch controller receives touch information from the touch sensing device, converts the touch information into touch point coordinates, sends the touch point coordinates to the processor 830, and can receive and execute commands sent by the processor 830. In addition, the touch panel 851 may be implemented by various types, such as a resistive type, a capacitive type, an infrared ray, and a surface acoustic wave.
Optionally, the other input terminals 852 may include, but are not limited to, one or more of a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 860 may be used to display information input by a user or information provided to a user and various menus of the terminal 800. The display unit 860 is a display system of the terminal 800, and is configured to present an interface and implement human-computer interaction.
The display unit 860 may include a display panel 861. Alternatively, the Display panel 861 may be configured in the form of a Liquid Crystal Display (LCD), an Organic Light-Emitting Diode (OLED), or the like.
Further, the touch panel 851 may cover the display panel 861, and when the touch panel 851 detects a touch operation thereon or nearby, the touch operation is transmitted to the processor 830 to determine the type of touch event, and then the processor 830 provides a corresponding visual output on the display panel 861 according to the type of touch event.
Although in fig. 8, the touch panel 851 and the display panel 861 are two separate components to implement the input and output functions of the terminal 800, in some embodiments, the touch panel 851 and the display panel 861 may be integrated to implement the input and output functions of the terminal 800.
The processor 830 is a control center of the terminal 800, connects various components using various interfaces and lines, and performs various functions of the terminal 800 and processes data by operating or executing software programs and/or modules stored in the memory 840 and calling data stored in the memory 840, thereby implementing various services based on the terminal.
Optionally, the processor 830 may include one or more processing units. Optionally, the processor 830 may integrate an application processor and a modem processor, wherein the application processor mainly processes an operating system, a user interface, an application program, and the like, and the modem processor mainly processes wireless communication. It will be appreciated that the modem processor described above may not be integrated into the processor 830.
The camera 870 is configured to implement a shooting function of the terminal 800, and shoot pictures or videos.
The terminal 800 also includes a power supply 820 (e.g., a battery) for powering the various components. Optionally, the power supply 820 may be logically connected to the processor 830 through a power management system, so as to implement functions of managing charging, discharging, power consumption, and the like through the power management system.
Although not shown, the terminal 800 may further include at least one sensor, an audio circuit, and the like, which will not be described herein.
Based on the same inventive concept, another device for determining voice information is also provided in the embodiments of the present invention, and since the principle of solving the problem of the device is similar to the method for determining voice information in the embodiments of the present invention, the implementation of the device may refer to the implementation of the method, and repeated details are not repeated.
As shown in fig. 9, the third apparatus for determining voice information according to the embodiment of the present invention includes a receiving module 900, a determining module 901, and a notification module 902:
a receiving module 900, configured to receive voice information of a target user;
a determining module 901, configured to determine, according to a customized correspondence between user voice information and terminal voice information, terminal voice information corresponding to received target user voice information, where the user voice information and the terminal voice information in the customized correspondence are determined according to collected voice information used for establishing the customized correspondence;
a notification module 902, configured to notify the user of the determined terminal voice information.
Optionally, when the user-defined corresponding relationship is determined in the following manner, the determining module 901 is specifically configured to:
after receiving user voice information for establishing a user-defined corresponding relation, determining the user voice information and terminal voice information in the voice information for establishing the user-defined corresponding relation;
taking the determined corresponding relation between the user voice information and the terminal voice information as the self-defined corresponding relation;
extracting the user voice information and the terminal voice information from the user voice information for establishing the user-defined corresponding relation; or
converting the user voice information for establishing the user-defined corresponding relation into conversation content, and extracting the user voice information and the terminal voice information from the converted information.
Optionally, the determining module 901 is specifically configured to:
matching the received target user voice information with the user voice information in the user-defined corresponding relation;
and if the matching is successful, determining the terminal voice information corresponding to the received target user voice information according to the user-defined corresponding relation.
Optionally, the determining module 901 is further configured to:
if the matching fails, extracting target keyword information in the voice information of the target user;
replacing the target keyword information with keyword information in a keyword information set corresponding to the target keyword information;
and matching the target user voice information after the keyword information is replaced with the user voice information in the user-defined corresponding relation.
Optionally, the notification module 902 is further configured to:
and responding to the instruction for displaying the voice information to display or play the voice information for establishing the user-defined corresponding relation.
Further, embodiments of the present invention also provide a computer-readable medium on which a computer program is stored, which, when being executed by a processing unit, implements the steps of any of the methods described above.
The present application is described above with reference to block diagrams and/or flowchart illustrations of methods, apparatus (systems) and/or computer program products according to embodiments of the application. It will be understood that one block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, and/or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer and/or other programmable data processing apparatus, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks.
Accordingly, the subject application may also be embodied in hardware and/or in software (including firmware, resident software, micro-code, etc.). Furthermore, the present application may take the form of a computer program product on a computer-usable or computer-readable storage medium having computer-usable or computer-readable program code embodied in the medium for use by or in connection with an instruction execution system. In the context of this application, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method for determining speech information, the method comprising:
receiving voice information of a target user;
determining terminal voice information corresponding to the received target user voice information according to a user-defined corresponding relation between the user voice information and the terminal voice information, wherein the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information for establishing the user-defined corresponding relation;
and informing the user of the determined terminal voice information.
2. The method of claim 1, wherein the custom correspondence is determined by:
after receiving voice information for establishing a user-defined corresponding relation, determining the user voice information and terminal voice information in the voice information for establishing the user-defined corresponding relation;
taking the determined corresponding relation between the user voice information and the terminal voice information as the user-defined corresponding relation;
wherein the determining the user voice information and the terminal voice information in the voice information for establishing the user-defined corresponding relation comprises:
extracting the user voice information and the terminal voice information from the voice information for establishing the user-defined corresponding relation; or
converting the voice information for establishing the user-defined corresponding relation into conversation content, and extracting the user voice information and the terminal voice information from the converted information.
3. The method of claim 1, wherein the determining the terminal voice information corresponding to the received target user voice information according to the customized correspondence between the user voice information and the terminal voice information comprises:
matching the received target user voice information with the user voice information in the user-defined corresponding relation;
and if the matching is successful, determining the terminal voice information corresponding to the received target user voice information according to the user-defined corresponding relation.
4. The method of claim 3, further comprising:
if the matching fails, extracting target keyword information in the voice information of the target user;
replacing the target keyword information with keyword information in a keyword information set corresponding to the target keyword information;
and matching the target user voice information, after the keyword information has been replaced, against the user voice information in the user-defined corresponding relation.
5. The method of any of claims 1 to 4, further comprising:
and in response to an instruction to display the voice information, displaying or playing the voice information for establishing the user-defined corresponding relation.
6. An apparatus for determining speech information, the apparatus comprising: a processing unit and a storage unit, wherein the storage unit stores program code that, when executed by the processing unit, causes the apparatus to perform the following:
receiving voice information of a target user;
determining terminal voice information corresponding to the received target user voice information according to a user-defined corresponding relation between the user voice information and the terminal voice information, wherein the user voice information and the terminal voice information in the user-defined corresponding relation are determined according to the collected voice information for establishing the user-defined corresponding relation;
and informing the user of the determined terminal voice information.
7. The device of claim 6, wherein the processing unit is further configured to determine the user-defined corresponding relation by:
after receiving voice information for establishing a user-defined corresponding relation, determining the user voice information and terminal voice information in the voice information for establishing the user-defined corresponding relation;
taking the determined corresponding relation between the user voice information and the terminal voice information as the user-defined corresponding relation;
wherein the user voice information and the terminal voice information are determined by extracting the user voice information and the terminal voice information from the voice information for establishing the user-defined corresponding relation; or
by converting the voice information for establishing the user-defined corresponding relation into conversation content, and extracting the user voice information and the terminal voice information from the converted information.
8. The device of claim 6, wherein the processing unit is specifically configured to:
matching the received target user voice information with the user voice information in the user-defined corresponding relation;
and if the matching is successful, determining the terminal voice information corresponding to the received target user voice information according to the user-defined corresponding relation.
9. The device of claim 8, wherein the processing unit is further configured to:
if the matching fails, extracting target keyword information in the voice information of the target user;
replacing the target keyword information with keyword information in a keyword information set corresponding to the target keyword information;
and matching the target user voice information, after the keyword information has been replaced, against the user voice information in the user-defined corresponding relation.
10. The apparatus of any of claims 6 to 9, wherein the processing unit is further configured to:
and in response to an instruction to display the voice information, displaying or playing the voice information for establishing the user-defined corresponding relation.
CN201910579343.0A 2019-06-28 2019-06-28 Method and equipment for determining voice information Pending CN112153213A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910579343.0A CN112153213A (en) 2019-06-28 2019-06-28 Method and equipment for determining voice information

Publications (1)

Publication Number Publication Date
CN112153213A 2020-12-29

Family

ID=73891530

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910579343.0A Pending CN112153213A (en) 2019-06-28 2019-06-28 Method and equipment for determining voice information

Country Status (1)

Country Link
CN (1) CN112153213A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1703923A (en) * 2002-10-18 2005-11-30 中国科学院声学研究所 Portable digital mobile communication apparatus and voice control method and system thereof
CN101075435A (en) * 2007-04-19 2007-11-21 深圳先进技术研究院 Intelligent chatting system and its realizing method
CN102842306A (en) * 2012-08-31 2012-12-26 深圳Tcl新技术有限公司 Voice control method and device as well as voice response method and device
CN103413549A (en) * 2013-07-31 2013-11-27 深圳创维-Rgb电子有限公司 Voice interaction method and system and interaction terminal
CN105488032A (en) * 2015-12-31 2016-04-13 杭州智蚁科技有限公司 Speech recognition input control method and system
CN109584860A (en) * 2017-09-27 2019-04-05 九阳股份有限公司 A kind of voice wakes up word and defines method and system
US20190198013A1 (en) * 2017-12-21 2019-06-27 International Business Machines Corporation Personalization of conversational agents through macro recording


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20201229)