CN107610690B - Data processing method and device


Info

Publication number: CN107610690B
Application number: CN201710930363.9A
Authority: CN (China)
Prior art keywords: voice, output result, voice request, identifier, input
Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Other versions: CN107610690A (Chinese)
Inventor: 蔡明祥
Current and original assignee: Lenovo Beijing Ltd
Filing history: application filed by Lenovo Beijing Ltd; priority to CN201710930363.9A; publication of CN107610690A; application granted; publication of CN107610690B

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention relates to the field of multimedia processing, and in particular to a data processing method and device. The method is applied to a multimedia terminal and comprises the following steps: receiving a first input; generating a first voice request according to the first input; acquiring a first voice output result obtained by processing the first voice request; judging whether the first voice output result meets a first preset condition, and obtaining a first judgment result; and when the first judgment result shows that the first voice output result does not meet the first preset condition, not playing the first voice output result. With the method provided by the invention, the voice output result played by the multimedia terminal always corresponds to the latest voice request, so that the voice output result matches the voice request and the played result meets the user's expectation.

Description

Data processing method and device
This application is a divisional application of the application filed on December 11, 2012, with application number 201210533421.1, entitled "Data processing method and apparatus".
Technical Field
The present invention relates to the field of multimedia processing, and in particular, to a data processing method and apparatus.
Background
TTS (Text To Speech) is a speech synthesis technique that converts a user's text input into speech data and plays it back to the user. Because the speech produced by TTS sounds natural and lifelike, the technology is widely used in the field of voice control and provides a very good user experience. In the prior art, TTS generally works in an asynchronous playing mode: after a client requests a voice event from a TTS server, the client waits for the TTS server to feed back voice information, and plays that information once the server returns it. If the user quickly makes another voice event request while the client is still waiting for the server's feedback, playing the feedback for the first voice event request obviously does not meet the user's expectation. Therefore, the TTS asynchronous voice output method in the prior art cannot guarantee that the played voice data matches the user's voice request.
Disclosure of Invention
In order to solve the above technical problem, embodiments of the present invention provide a data processing method and apparatus, which can implement a matching correspondence between a voice request and played voice data. The technical scheme is as follows:
according to a first aspect of the embodiments of the present invention, a data processing method is disclosed, which is applied to a multimedia terminal, and the method includes:
receiving a first input;
generating a first voice request according to the first input;
acquiring a first voice output result obtained by processing the first voice request;
judging whether the first voice output result meets a first preset condition or not, and obtaining a first judgment result;
and when the first judgment result shows that the first voice output result does not meet the first preset condition, not playing the first voice output result.
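The claimed flow can be sketched as follows. This is a minimal Python illustration, not the patent's own implementation; all class, method, and field names are hypothetical, since the patent defines the method abstractly:

```python
class MultimediaTerminal:
    """Sketch of the first-aspect method: play a voice output result
    only when it satisfies the first preset condition, i.e. when it
    corresponds to the latest voice request."""

    def __init__(self):
        self._next_id = 0
        self.latest_id = None  # third identifier: id of the newest request

    def generate_request(self, user_input):
        # Generate a voice request with a unique first identifier and
        # remember that identifier as the latest one.
        self._next_id += 1
        request = {"input": user_input, "id": self._next_id}
        self.latest_id = request["id"]
        return request

    def maybe_play(self, request, voice_output):
        # First preset condition: the result's request identifier must
        # equal the identifier of the most recent request; otherwise the
        # stale result is silently dropped.
        if request["id"] == self.latest_id:
            return voice_output  # "play" the result
        return None              # do not play
```

In this sketch, issuing a second request before the first result arrives causes the first result to be dropped, which is exactly the behavior the claim describes.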
Preferably, after receiving the first input, the method further comprises:
receiving a second input;
generating a second voice request according to the second input;
acquiring a second voice output result obtained by processing the second voice request;
when the first voice output result is judged not to meet the first preset condition, judging whether the second voice output result meets the first preset condition or not, and obtaining a second judgment result;
and when the second judgment result shows that the second voice output result meets a first preset condition, playing a second voice output result corresponding to the second voice request.
Preferably, the generating a first voice request according to the first input comprises:
processing the first input to obtain a first processing result;
and taking the first processing result as a first voice request.
Preferably, the generating a first voice request according to the first input comprises:
and generating a first voice request and a first identifier corresponding to the first voice request according to the first input, and storing the corresponding relation between the first voice request and the first identifier.
Preferably, the determining whether the first voice output result meets a first preset condition includes:
acquiring a first voice request corresponding to a first voice output result according to the first voice output result;
acquiring a first identifier according to the corresponding relation between the first voice request and the first identifier;
acquiring a third identifier, comparing the first identifier with the third identifier, and determining that a first preset condition is met when the first identifier is the same as the third identifier; wherein the third identification corresponds to a most recent voice request.
Preferably, the obtaining a first voice output result obtained by processing the first voice request includes:
sending the first voice request to a server so that the server processes the first voice request to obtain a first voice output result;
and receiving a first voice output result sent by the server.
Preferably, the first identifier is a timestamp, a universally unique identifier (UUID), or a hash value.
Preferably, when the first identifier is a timestamp, the generating a first voice request and a first identifier corresponding to the first voice request according to the first input includes:
generating a first voice request according to the first input;
generating a first local timestamp corresponding to the first voice request as a first identifier according to the time at which the first voice request is generated, and storing the corresponding relation between the first voice request and the first local timestamp;
the method further comprises the following steps:
generating a global timestamp as a third identifier according to the time at which the first voice request is generated; the third identifier is updated when a new voice request is generated.
Preferably, the acquiring a third identifier and comparing the first identifier with the third identifier includes:
acquiring a global timestamp corresponding to a latest voice request;
and comparing a first local timestamp corresponding to the first voice request with the global timestamp.
According to a second aspect of the embodiments of the present invention, there is disclosed a data processing apparatus, the apparatus including:
a first receiving unit for receiving a first input;
a first generating unit, configured to generate a first voice request according to the first input;
the first acquisition unit is used for acquiring a first voice output result obtained by processing the first voice request;
the first judging unit is used for judging whether the first voice output result meets a first preset condition or not and acquiring a first judging result;
and the output unit is used for not playing the first voice output result when the first judgment result shows that the first voice output result does not meet the first preset condition.
Preferably, the apparatus further comprises:
a second receiving unit for receiving a second input;
a second generating unit, configured to generate a second voice request according to the second input;
the second acquisition unit is used for acquiring a second voice output result obtained by processing the second voice request;
the second judging unit is used for judging whether the second voice output result meets the first preset condition or not when the first voice output result does not meet the first preset condition, and acquiring a second judging result;
the output unit is further configured to play a second voice output result corresponding to the second voice request when the second determination result indicates that the second voice output result satisfies a first preset condition.
Preferably, the first generating unit is specifically configured to process the first input to obtain a first processing result; and taking the first processing result as a first voice request.
Preferably, the first generating unit is further configured to generate a first voice request and a first identifier corresponding to the first voice request according to the first input, and store a corresponding relationship between the first voice request and the first identifier.
Preferably, the first judging unit includes:
the second acquisition unit is used for acquiring a first voice request corresponding to a first voice output result according to the first voice output result;
a third obtaining unit, configured to obtain a first identifier according to a corresponding relationship between the first voice request and the first identifier;
the comparison unit is used for acquiring a third identifier, comparing the first identifier with the third identifier, and determining that a first preset condition is met when the first identifier is the same as the third identifier; wherein the third identification corresponds to a most recent voice request.
Preferably, the first obtaining unit includes:
the sending unit is used for sending the first voice request to a server so that the server processes the first voice request to obtain a first voice output result;
and the receiving unit is used for receiving the first voice output result sent by the server.
Preferably, the first identifier is a timestamp, a universally unique identifier (UUID), or a hash value.
Preferably, when the first identifier is a timestamp, the first generating unit includes:
a voice request generating unit, configured to generate a first voice request according to the first input;
a first identifier generation unit, configured to generate a first local timestamp corresponding to the first voice request as a first identifier according to the time when the first voice request is generated, and store a correspondence between the first voice request and the first local timestamp;
a third identifier generating unit, configured to generate a global timestamp as a third identifier according to the time at which the first voice request is generated; the third identifier is updated when a new voice request is generated.
Preferably, the comparing unit is specifically configured to obtain a global timestamp, where the global timestamp corresponds to the latest voice request; comparing a first local timestamp corresponding to the first voice request to the global timestamp.
The embodiment of the invention has the following beneficial effects in one aspect: the invention provides a data processing method, which is applied to a multimedia terminal, wherein the multimedia terminal receives a first input, generates a first voice request according to the first input, and acquires a first voice output result obtained by processing the first voice request. Judging whether the first voice output result meets a first preset condition or not, and obtaining a first judgment result; and when the first judgment result shows that the first voice output result does not meet the first preset condition, not playing the first voice output result. In this way, when the multimedia terminal judges that the returned first voice output result does not meet the preset condition, the returned first voice output result is determined not to correspond to the latest voice request, the first voice output result is not played, and the first voice output result is played only when the first voice output result corresponds to the latest voice request. Therefore, the voice output result played by the multimedia terminal always corresponds to the latest voice request, the matching of the voice output result and the voice request is realized, and the voice playing result meets the expectation of a user.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of a first embodiment of a data processing method according to an embodiment of the present invention;
fig. 2 is a schematic diagram of a second embodiment of a data processing method according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a third embodiment of a data processing method according to an embodiment of the present invention;
fig. 4 is a schematic diagram of an embodiment of a data processing apparatus according to the present invention.
Detailed Description
The embodiment of the invention provides a data processing method and a data processing device, which can solve the problem that a voice request is matched with played voice data.
In order to help those skilled in the art better understand the technical solution of the present invention, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings. The described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
Referring to fig. 1, a flowchart of a data processing method according to a first embodiment of the present invention is shown.
The method provided by the first embodiment of the invention is applied to a multimedia terminal which is provided with an output unit for outputting audio data. The multimedia terminal can be an electronic device such as a smart television, a mobile phone, a PAD, a computer and the like.
S101, receiving a first input.
The multimedia terminal receives a first input, which may be a key input, a gesture input, a cursor input, or a voice input. The multimedia terminal may have a user interface for receiving the first input from a user, the first input being associated with a voice request. The user can trigger the voice request through a preset key action, an input instruction, a mouse click, a cursor click or movement, or a preset gesture. Alternatively, text information entered by the user serves as the first input, or a voice input of the user serves as the first input. When the first input is a voice input, the multimedia terminal should have an audio collecting unit for collecting the user's voice. Of course, the first input may also be control information or data from another electronic device.
S102, generating a first voice request according to the first input.
In specific implementation, when the first input is a non-text input, the first input is processed and converted into a text input, and the text input result is used as the first voice request. Further, when the first input is a voice input, voice recognition processing is performed to convert the voice input into a text input. Preferably, semantic recognition processing is then performed on the text input result obtained from the voice input, and the semantic recognition result is used as the first voice request. The purpose of the semantic recognition processing is to perform semantic analysis on the text input result so as to obtain a result that can be recognized by a computing device having a processor. Generally, the result of semantic recognition or analysis may include one or more of an action, a target of the action, or an application scenario. The invention is not limited in this regard.
Further, one possible implementation manner of generating the first voice request according to the first input is as follows: processing the first input to obtain a first processing result; and taking the first processing result as a first voice request. In specific implementation, a user performs a first input through a multimedia terminal to initiate a first voice request, and when the user desires to play a processing result of the first input, the user needs to process the first input first to obtain a first processing result, and the first processing result is used as the first voice request.
Further, another implementation manner of generating the first voice request according to the first input is as follows: and generating a first voice request and a first identifier corresponding to the first voice request according to the first input, and storing the corresponding relation between the first voice request and the first identifier. The first identification may be a timestamp, a universally unique identifier UUID, or a hash value. Wherein the first identifier is used for uniquely identifying the first voice request. The invention is not limited to the specific manner of the first identifier, and other implementations obtained by those skilled in the art without inventive labor fall within the scope of the invention.
S103, acquiring a first voice output result obtained by processing the first voice request.
In this embodiment of the present invention, the multimedia terminal further has a communication module for performing data connection with the server. Preferably, the server is a cloud TTS server.
Step S103 is specifically realized by the following steps:
S103A, the multimedia terminal sends the first voice request to the server, so that the server processes the first voice request to obtain a first voice output result.
The multimedia terminal sends the first voice request to the server, and the server responds to the first voice request and processes it to obtain a first voice output result. The specific way in which the server obtains the first voice output result from the first voice request may follow the prior art, and is not described here again.
S103B, receiving the first voice output result sent by the server.
And after the server processes the first voice request, sending the obtained first voice output result to the multimedia terminal, and receiving the first voice output result sent by the server by the multimedia terminal.
S104, judging whether the first voice output result meets a first preset condition or not, and obtaining a first judgment result.
In the first embodiment of the present invention, in order to achieve that the currently played voice output result of the multimedia terminal is always matched with the latest voice request, a first preset condition is set, and when it is determined that the first voice output result satisfies the first preset condition, the first voice output result is played. And when the first voice output result is judged not to meet the first preset condition, the first voice output result is not played. The first preset condition is used for judging whether the currently acquired voice output result is matched with the latest voice request. Corresponding to the step of the first example, the first preset condition is used for judging whether the acquired first voice output result is matched with the latest voice request. In a specific implementation, the first preset condition may be preset by a system or a user.
Preferably, when the implementation manner of generating the first voice request is to generate the first voice request and the first identifier corresponding to the first voice request according to the first input, the determining whether the first voice output result satisfies the first preset condition may specifically include:
S104A, according to the first voice output result, acquiring a first voice request corresponding to the first voice output result.
In the embodiment of the invention, the multimedia terminal is provided with a communication module that can carry out data communication with the server. The communication module has a processing mechanism that can maintain the correspondence between a sent voice request and the voice output result returned by the server. In specific implementation, the processing mode of the communication module may be set to a synchronous mode: after a sub-module of the communication module sends a voice request, that sub-module waits for the server to return the voice output result obtained by processing that request. The communication module may have a plurality of sub-modules for transmitting/receiving data, and these sub-modules may be further divided into a transmitting unit and a receiving unit.
And when the multimedia terminal receives a first voice output result returned by the server, acquiring a first voice request corresponding to the first voice output result.
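The synchronous sub-module behavior described above can be sketched as follows. The `send` callable stands in for the real server transport, which the patent does not specify; all names are illustrative:

```python
import queue
import threading


def send_request_sync(send, voice_request, timeout=5.0):
    """Sketch of a synchronous sub-module: after sending a voice
    request, block until the server's voice output result for that
    same request arrives, so the result is inherently paired with
    the request that produced it (cf. step S104A)."""
    reply = queue.Queue(maxsize=1)
    send(voice_request, on_result=reply.put)  # server delivers the result here
    return reply.get(timeout=timeout)         # wait for this request's own result
```

Because each sub-module waits only for its own reply, the terminal can map every returned voice output result back to the voice request that generated it, which is what step S104A relies on.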
S104B, obtaining a first identifier according to the corresponding relation between the first voice request and the first identifier.
And acquiring the first identifier according to the pre-stored corresponding relation between the first voice request and the first identifier.
S104C, acquiring a third identifier, comparing the first identifier with the third identifier, and determining that a first preset condition is met when the first identifier is the same as the third identifier; wherein the third identification corresponds to a most recent voice request.
Wherein the third identifier corresponds to the most recent voice request. In the first embodiment of the invention, each time the multimedia terminal receives a user input, it generates a voice request corresponding to that input and assigns it a unique identifier. When there are multiple user inputs, the third identifier is the most recently generated identifier, corresponding to the most recent voice request.
The first identifier corresponding to the first voice request/first voice output result is compared with the third identifier. If the two are the same, the first voice output result is determined to correspond to the latest voice request and is judged to meet the first preset condition. If the first identifier differs from the third identifier, the first voice output result is determined not to correspond to the latest voice request and is judged not to meet the first preset condition.
S105, when the first judgment result shows that the first voice output result does not meet the first preset condition, the first voice output result is not played.
In the first embodiment of the present invention, the first speech output result is played only when the first speech output result satisfies the first preset condition, and the first speech output result is not played when the first speech output result does not satisfy the first preset condition. Therefore, the voice output result played by the multimedia terminal is ensured to always correspond to the latest voice request, the matching of the voice output result and the voice request is realized, the true expectation of a user is better met, and the user experience is improved.
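Steps S104A-S104C can be sketched with a UUID as the first identifier (one of the options the claims name). This is an illustrative Python example; the table keyed by a request string, and all function names, are assumptions:

```python
import uuid

request_table = {}          # stored correspondence: request -> first identifier
state = {"third_id": None}  # third identifier: id of the most recent request


def generate_voice_request(request):
    # S102 variant: generate the voice request together with a UUID
    # first identifier and store the correspondence (the claims also
    # allow timestamps or hash values).
    first_id = uuid.uuid4()
    request_table[request] = first_id
    state["third_id"] = first_id  # newest request updates the third identifier
    return request


def satisfies_first_condition(request):
    # S104B: recover the first identifier from the stored correspondence.
    first_id = request_table[request]
    # S104C: the first preset condition holds only when the first
    # identifier equals the third identifier.
    return first_id == state["third_id"]
```

After a newer request arrives, the older request's identifier no longer matches the third identifier, so its result is never played (step S105).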
Referring to fig. 2, a flowchart of a data processing method according to a second embodiment of the present invention is shown.
The method provided by the second embodiment of the invention is applied to a multimedia terminal which is provided with an output unit for outputting audio data. The multimedia terminal can be an electronic device such as a smart television, a mobile phone, a PAD, a computer and the like.
In the second embodiment of the present invention, the case where the multimedia terminal receives two input requests is described, and it will be understood by those skilled in the art that the method provided in the second embodiment of the present invention can also be applied to the case where the multimedia terminal receives a plurality of input requests. Those skilled in the art can make modifications and variations to the present invention without inventive step, and all such modifications and variations are within the scope of the present invention.
S201, receiving a first input.
S202, generating a first voice request according to the first input.
In specific implementation, when the first input is a non-text input, the first input is processed and converted into a text input, and the text input result is used as the first voice request. Further, when the first input is a voice input, voice recognition processing is performed to convert the voice input into a text input. Preferably, semantic recognition processing is then performed on the text input result obtained from the voice input, and the semantic recognition result is used as the first voice request. The purpose of the semantic recognition processing is to perform semantic analysis on the text input result so as to obtain a result that can be recognized by a computing device having a processor. Generally, the result of semantic recognition or analysis may include one or more of an action, a target of the action, or an application scenario. The invention is not limited in this regard.
Further, one possible implementation manner of generating the first voice request according to the first input is as follows: processing the first input to obtain a first processing result; and taking the first processing result as a first voice request. In specific implementation, a user performs a first input through a multimedia terminal to initiate a first voice request, and when the user desires to play a processing result of the first input, the user needs to process the first input first to obtain a first processing result, and the first processing result is used as the first voice request.
Further, another implementation manner of generating the first voice request according to the first input is as follows: and generating a first voice request and a first identifier corresponding to the first voice request according to the first input, and storing the corresponding relation between the first voice request and the first identifier. The first identification may be a timestamp, a universally unique identifier UUID, or a hash value. Wherein the first identifier is used for uniquely identifying the first voice request. The invention is not limited to the specific manner of the first identifier, and other implementations obtained by those skilled in the art without inventive labor fall within the scope of the invention.
Further, after generating the first identifier and storing the corresponding relationship between the first identifier and the first voice request, the method provided by the present invention further comprises: generating a third identifier. The third identifier corresponds to the most recent voice request. In specific implementation, when the first voice request and the first identifier are generated, a copy of the first identifier is used as the third identifier. The third identifier is updated when a new voice request is generated.
S203, a first voice output result obtained by processing the first voice request is obtained.
And S204, receiving a second input.
Wherein the second input occurs after the first input.
And S205, generating a second voice request according to the second input.
The implementation manner of generating the second voice request according to the second input is the same as that of generating the first voice request according to the first input. In specific implementation, a second voice request and a second identifier corresponding to the second voice request are generated according to the second input, and the corresponding relation between the second voice request and the second identifier is stored. The second identifier may be a timestamp, a universally unique identifier (UUID), or a hash value, and is used to uniquely identify the second voice request. The invention is not limited to the specific form of the second identifier, and other implementations obtained by those skilled in the art without inventive labor fall within the scope of the invention. Typically, the first identifier is of the same type as the second identifier.
Further, as mentioned earlier, a third identifier is generated at the same time as or after the first identifier, and the third identifier corresponds to the most recent voice request. Therefore, when a new voice request is generated, i.e., when the second voice request is generated, the third identifier is updated. Specifically, when the second voice request and the second identifier are generated, a copy of the second identifier is taken as the third identifier. In this way, the third identifier is updated whenever a new voice request is generated.
It will be understood by those skilled in the art that although the second input is generated later than the first input, the steps of processing the first input (S202, S203) and the steps of processing the second input (S205, S206) may be performed in reverse order or in parallel.
S206, a second voice output result obtained by processing the second voice request is obtained.
S207, judging whether the first voice output result meets a first preset condition or not, and obtaining a first judgment result.
In specific implementation, the first preset condition is used for judging whether the currently acquired voice output result is matched with the latest voice request. And when the first voice output result is judged to meet the first preset condition, playing the first voice output result. When it is determined that the first speech output result does not satisfy the first preset condition, the first speech output result is not played, and the process proceeds to step S208.
The description takes as an example a first preset condition of determining whether the identifier corresponding to the currently acquired voice output result matches the identifier corresponding to the most recently updated voice request. In specific implementation, the first preset condition is to determine whether the first identifier corresponding to the first voice output result is the same as the third identifier. Since the third identifier was updated (replaced with a copy of the second identifier) when the second voice request was generated, comparing the first identifier with the third identifier yields the determination result that the two are different, and the process proceeds to step S208.
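The identifier bookkeeping of the second embodiment can be sketched as follows, taking a UUID as the unique identifier; the names `request_ids` and `latest_id` are illustrative, not from the patent:

```python
import uuid

# Sketch of steps S202/S205 and S207/S208: each voice request receives a
# unique identifier, a copy of which becomes the "third identifier" tracking
# the most recent request. An output result is played only if its identifier
# still matches the third identifier.

latest_id = None   # third identifier: identifier of the most recent voice request
request_ids = {}   # stored correspondence: voice request -> identifier

def generate_voice_request(text):
    """Generate a voice request and its unique identifier (S202/S205)."""
    global latest_id
    request_id = str(uuid.uuid4())   # could equally be a timestamp or hash value
    request_ids[text] = request_id
    latest_id = request_id           # copy of the new identifier -> third identifier
    return request_id

def should_play(result_id):
    """First preset condition (S207/S208): does the result match the latest request?"""
    return result_id == latest_id

first_id = generate_voice_request("first input")
second_id = generate_voice_request("second input")   # updates the third identifier
assert not should_play(first_id)    # first result is stale and is discarded
assert should_play(second_id)       # second result corresponds to the latest request
```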
S208, when the first voice output result is judged not to meet the first preset condition, judging whether the second voice output result meets the first preset condition or not, and obtaining a second judgment result.
The first preset condition is further used for judging whether the currently acquired voice output result (i.e. the second voice output result) is matched with the latest voice request.
Still taking the first preset condition as determining whether the identifier corresponding to the current voice output result matches the identifier corresponding to the most recently updated request: in this step, the first preset condition is to determine whether the second identifier corresponding to the second voice output result is the same as the third identifier. Since the third identifier was updated (replaced with a copy of the second identifier) when the second voice request was generated, comparing the second identifier with the third identifier yields the determination result that the two are the same; it is therefore determined that the second voice output result satisfies the first preset condition, and the process proceeds to step S209.
S209, when the second judgment result shows that the second voice output result meets the first preset condition, playing a second voice output result corresponding to the second voice request.
When the second voice output result satisfies the first preset condition, the second voice output result corresponding to the second voice request is played. If there are more than two current inputs and the second voice output result is judged not to satisfy the first preset condition, that is, the second voice output result is determined not to correspond to the latest voice request, the second voice output result is not played.
In the second embodiment of the present invention, when the multimedia terminal receives two or more inputs requesting voice, a voice output result is played only when it is determined that the currently acquired voice output result corresponds to the latest voice request; otherwise, the voice output result is discarded and not played. In specific implementation, a unique identifier is assigned to each voice request, and the identifier corresponding to the currently acquired voice output result is compared with the identifier corresponding to the latest voice request. Only when the two identifiers are judged to be the same, that is, only when the currently acquired voice output result is determined to correspond to the latest voice request, is that result output, so that the matching of the voice output result and the voice request is realized and the user experience is improved. Moreover, because the multimedia terminal matches the voice request with the voice output result entirely by means of the assigned unique identifier, no additional operation is required of the server, which avoids retrofitting the server and saves network transmission resources.
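Because the server may return results out of order, the discard logic above is easiest to see with concurrent requests. A minimal asyncio-based sketch follows; the concurrency model and all names are assumptions, not prescribed by the patent:

```python
import asyncio
import itertools

# Two requests are issued; the first result arrives later than the second
# (its simulated processing is slower), so its identifier no longer matches
# the latest one and it is discarded.

counter = itertools.count(1)
latest_id = 0   # third identifier: tracks the newest voice request

async def handle_input(text, processing_delay):
    global latest_id
    request_id = next(counter)                 # unique identifier for this request
    latest_id = request_id                     # copy becomes the third identifier
    await asyncio.sleep(processing_delay)      # stand-in for server processing
    result = f"result for {text!r}"
    if request_id == latest_id:                # first preset condition
        return result                          # play the result
    return None                                # stale result: not played

async def main():
    # the first request is slower than the second, so its result arrives later
    r1, r2 = await asyncio.gather(
        handle_input("first input", 0.2),
        handle_input("second input", 0.1),
    )
    assert r1 is None                          # first voice output result discarded
    assert r2 == "result for 'second input'"   # latest result is played

asyncio.run(main())
```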
Referring to fig. 3, a flowchart of a data processing method according to a third embodiment of the present invention is shown.
In the methods provided in the first and second embodiments of the present invention, the unique identifier assigned to a generated voice request may specifically be a timestamp, a universally unique identifier (UUID), or a hash value, and is used to uniquely identify the voice request and the voice output result corresponding to it. The specific application scenario of the invention is described below taking the unique identifier as a timestamp as an example; the following method can also be used where other identifiers are adopted. Those skilled in the art can modify and adapt the methods provided in the following examples to other forms of identifier without inventive labor, and the embodiments obtained thereby fall within the protection scope of the invention.
In the third embodiment of the present invention, the case where the multimedia terminal receives two input requests is again taken as an example; it can be understood by those skilled in the art that the method provided in the third embodiment can also be applied where the multimedia terminal receives more than two input requests. Modifications and variations made without inventive labor fall within the protection scope of the invention.
S301, receiving a first input.
S302, generating a first voice request according to the first input, generating a first local timestamp corresponding to the first voice request, and generating a global timestamp according to the time at which the first voice request is generated.
In a specific implementation, one possible manner of generating the first voice request according to the first input is as follows: processing the first input to obtain a first processing result, and taking the first processing result as the first voice request. In specific implementation, a user performs a first input through the multimedia terminal to initiate a first voice request; when the user expects the processing result of the first input to be played, the first input must first be processed to obtain a first processing result, which is then used as the first voice request. To illustrate with an example: the user sends an input (which may be a text input or a voice input) to the multimedia terminal asking "what time is it now"; the multimedia terminal then needs to process the input, that is, obtain the current time, and take the result of processing the input (for example, "it is now 12 o'clock") as the first voice request. Of course, this is only a simple example; processing the first input may involve more complex operations such as querying, retrieving, translating, or converting, which the present invention does not limit.
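A minimal sketch of this step, with the time lookup standing in for the more complex processing the paragraph mentions; the function name and wording of the result are illustrative:

```python
import datetime

# Sketch of step S302: turn a raw first input into a first processing result,
# which is then used as the first voice request.

def process_input(text):
    """Process an input and return the processing result (the voice request)."""
    if "what time" in text.lower():
        # map the 24-hour clock to a 12-hour reading; hour 0 becomes 12
        hour = datetime.datetime.now().hour % 12 or 12
        return f"It is now {hour} o'clock"
    # placeholder for the querying / retrieving / translating cases
    return f"Processed: {text}"

first_voice_request = process_input("What time is it now?")
```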
When the first voice request is generated according to the first input, a first local timestamp corresponding to the first voice request is generated as the first identifier according to the time at which the first voice request was generated, and the corresponding relation between the first voice request and the first local timestamp is stored.
Further, after generating the first local timestamp and storing the corresponding relation between the first local timestamp and the first voice request, the method provided by the present invention further includes: generating a global timestamp as a third identifier according to the time at which the first voice request was generated. The global timestamp corresponds to the most recent voice request. Specifically, when the first voice request and the first local timestamp are generated, a copy of the first local timestamp is taken as the global timestamp. The global timestamp is updated whenever a new voice request is generated.
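The local/global timestamp bookkeeping of S302 and S305 can be sketched as follows. The names `request_table` and `global_timestamp` are illustrative; note also that a production implementation would have to guarantee that two requests generated within the same clock tick still receive distinct timestamps:

```python
import time

# Sketch of the timestamp form of the identifier: each request gets a local
# timestamp, and a copy of the newest local timestamp is kept as the global
# timestamp (the third identifier).

request_table = {}        # voice request -> local timestamp
global_timestamp = None   # third identifier: timestamp of the latest request

def register_request(voice_request):
    """Generate a local timestamp for the request and update the global one."""
    global global_timestamp
    local_ts = time.monotonic_ns()           # first/second local timestamp
    request_table[voice_request] = local_ts  # store the correspondence
    global_timestamp = local_ts              # copy becomes the global timestamp
    return local_ts

def matches_latest(local_ts):
    """First preset condition of S307/S308: result matches the latest request?"""
    return local_ts == global_timestamp

ts1 = register_request("first voice request")
ts2 = register_request("second voice request")   # updates the global timestamp
assert matches_latest(ts2)   # only the newest request's result may be played
```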
S303, acquiring a first voice output result obtained by processing the first voice request.
S304, receiving a second input.
Wherein the second input occurs after the first input.
S305, generating a second voice request according to the second input, generating a second local timestamp corresponding to the second voice request, and updating the global timestamp according to the time generated by the second voice request.
The implementation manner of generating the second voice request according to the second input is the same as that of generating the first voice request according to the first input. In specific implementation, a second voice request and a second local timestamp corresponding to the second voice request are generated according to the second input, and the corresponding relation between the second voice request and the second local timestamp is stored.
Further, as mentioned previously, a global timestamp is generated at the same time as, or after, the first local timestamp, and corresponds to the most recent voice request. The global timestamp is therefore updated when a new voice request, namely the second voice request, is generated. Specifically, when the second voice request and the second local timestamp are generated, a copy of the second local timestamp is taken as the global timestamp. In this way, the global timestamp is updated whenever a new voice request is generated.
It will be understood by those skilled in the art that although the second input occurs later than the first input, the steps of processing the first input (S302, S303) and the steps of processing the second input (S305, S306) may be executed in reverse order or in parallel.
S306, acquiring a second voice output result obtained by processing the second voice request.
S307, acquiring the global timestamp, comparing the first local timestamp with the global timestamp to obtain a first determination result, and, when the first determination result indicates that the first local timestamp is different from the global timestamp, proceeding to step S308.
S308, judging whether a second local timestamp corresponding to the second voice output result is the same as the global timestamp, and acquiring a second judgment result.
S309, when the second judgment result shows that the second local timestamp corresponding to the second voice output result is the same as the global timestamp, playing the second voice output result corresponding to the second voice request.
When the second local timestamp corresponding to the second voice output result is judged to be the same as the global timestamp, it is determined that the second voice output result corresponds to the latest voice request, and the second voice output result corresponding to the second voice request is played. If there are more than two current inputs and the second local timestamp corresponding to the second voice output result is judged to be different from the global timestamp, that is, the second voice output result is determined not to correspond to the latest voice request, the second voice output result is not played.
In the third embodiment of the present invention, a timestamp is used as the unique identifier assigned to each voice request. The identifier corresponding to the currently acquired voice output result is compared with the timestamp corresponding to the latest voice request, and only when the two are judged to be the same, that is, only when the currently acquired voice output result is determined to correspond to the latest voice request, is that result output. Matching between the voice output result and the voice request is thus achieved, user experience is improved, and the method is simple to implement.
Furthermore, in the first, second, and third embodiments of the present invention, after the multimedia terminal plays the voice output result, the method may further include: converting the voice output result that satisfies the first preset condition into control signalling, and controlling the multimedia terminal to execute the control signalling. For example, when the user inputs "play Liu Dehua's 'Forgetting Water'" through text or voice, the voice output result obtained after the multimedia terminal processes the input is "now playing Liu Dehua's 'Forgetting Water' for you"; at this point, while playing the voice output result, the multimedia terminal may control its processing unit to search the media library and play the audio data matched with the voice output result. The above is only an example and is not to be taken as a limitation of the invention; other embodiments obtained by those skilled in the art without inventive labor fall within the protection scope of the invention.
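A sketch of the control-signalling conversion, assuming a hypothetical media-library lookup; the library contents, dictionary shape, and file path are invented for illustration:

```python
# Convert a voice output result that passed the first preset condition into a
# control signal the terminal can execute. MEDIA_LIBRARY is a stand-in for a
# real media index; the matching is a deliberately naive substring check.

MEDIA_LIBRARY = {
    "forgetting water": "/media/liu_dehua/forgetting_water.mp3",
}

def to_control_signal(voice_output_result):
    """Derive a playback command from a played voice output result, if any."""
    text = voice_output_result.lower()
    for title, path in MEDIA_LIBRARY.items():
        if title in text:
            return {"action": "play_audio", "source": path}
    return None   # no executable command in this output result

signal = to_control_signal("Now playing Liu Dehua's Forgetting Water for you")
assert signal == {"action": "play_audio",
                  "source": "/media/liu_dehua/forgetting_water.mp3"}
```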
Fig. 4 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention.
The device comprises:
a first receiving unit 401, configured to receive a first input.
A first generating unit 402, configured to generate a first voice request according to the first input.
A first obtaining unit 403, configured to obtain a first voice output result obtained by processing the first voice request.
A first determining unit 404, configured to determine whether the first voice output result meets a first preset condition, and obtain a first determination result.
An output unit 405, configured to not play the first voice output result when the first determination result indicates that the first voice output result does not satisfy the first preset condition.
Preferably, the apparatus further comprises:
a second receiving unit for receiving a second input;
a second generating unit, configured to generate a second voice request according to the second input;
the second acquisition unit is used for acquiring a second voice output result obtained by processing the second voice request;
the second judging unit is used for judging whether the second voice output result meets the first preset condition or not when the first voice output result does not meet the first preset condition, and acquiring a second judging result;
the output unit is further configured to play a second voice output result corresponding to the second voice request when the second determination result indicates that the second voice output result satisfies a first preset condition.
Preferably, the first generating unit is specifically configured to process the first input to obtain a first processing result; and taking the first processing result as a first voice request.
Preferably, the first generating unit is further configured to generate a first voice request and a first identifier corresponding to the first voice request according to the first input, and store a corresponding relationship between the first voice request and the first identifier.
Preferably, the first judging unit includes:
the second acquisition unit is used for acquiring a first voice request corresponding to a first voice output result according to the first voice output result;
a third obtaining unit, configured to obtain a first identifier according to a corresponding relationship between the first voice request and the first identifier;
the comparison unit is used for acquiring a third identifier, comparing the first identifier with the third identifier, and determining that a first preset condition is met when the first identifier is the same as the third identifier; wherein the third identification corresponds to a most recent voice request.
Preferably, the first obtaining unit includes:
the sending unit is used for sending the first voice request to a server so that the server processes the first voice request to obtain a first voice output result;
and the third receiving unit is used for receiving the first voice output result sent by the server.
Preferably, the first identifier is a timestamp, a universally unique identifier UUID, or a hash value.
Preferably, when the first identifier is a timestamp, the first generating unit includes:
a voice request generating unit, configured to generate a first voice request according to the first input;
a first identifier generation unit, configured to generate a first local timestamp corresponding to the first voice request as a first identifier according to the time when the first voice request is generated, and store a correspondence between the first voice request and the first local timestamp;
a third identifier generating unit, configured to generate a global timestamp as a third identifier according to the time at which the first voice request is generated; the third identifier is updated when a new voice request is generated.
Preferably, the comparing unit is specifically configured to obtain a global timestamp, where the global timestamp corresponds to the latest voice request; comparing a first local timestamp corresponding to the first voice request to the global timestamp.
Preferably, the data processing apparatus may further include an audio acquisition unit for acquiring a voice input.
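The units of Fig. 4 can be sketched as one class, with the unit boundaries taken from the figure and the method bodies assumed rather than specified by the patent:

```python
import uuid

class DataProcessingApparatus:
    """Illustrative mapping of the receiving, generating, judging, and output
    units onto one object; all internals are assumptions."""

    def __init__(self):
        self._table = {}     # request -> identifier (generating unit storage)
        self._latest = None  # third identifier: latest voice request

    def receive_input(self, text):
        """Receiving unit + generating unit: make a request and its identifier."""
        request_id = str(uuid.uuid4())
        self._table[text] = request_id
        self._latest = request_id        # copy becomes the third identifier
        return request_id

    def on_output_result(self, request_id, result):
        """Judging unit + output unit: play only a non-stale output result."""
        if request_id == self._latest:   # first preset condition
            return result                # play the voice output result
        return None                      # stale result: not played

apparatus = DataProcessingApparatus()
rid1 = apparatus.receive_input("first input")
rid2 = apparatus.receive_input("second input")
assert apparatus.on_output_result(rid1, "r1") is None   # stale, discarded
assert apparatus.on_output_result(rid2, "r2") == "r2"   # latest, played
```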
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The invention may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The invention may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, for the apparatus embodiment, since it is substantially similar to the method embodiment, it is relatively simple to describe, and reference may be made to some descriptions of the method embodiment for relevant points. The above-described embodiments of the apparatus are merely illustrative, wherein the modules described as separate parts may or may not be physically separate, and the parts displayed as modules may or may not be physical modules, may be located in one place, or may be distributed on a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The foregoing is directed to embodiments of the present invention, and it is understood that various modifications and improvements can be made by those skilled in the art without departing from the spirit of the invention.

Claims (9)

1. A data processing method, applied to a multimedia terminal, the method comprising:
receiving a first input;
generating a first voice request according to the first input;
acquiring a first voice output result obtained by processing the first voice request;
receiving a second input;
generating a second voice request according to the second input;
acquiring a second voice output result obtained by processing the second voice request;
judging whether the first voice output result meets a first preset condition or not, and obtaining a first judgment result;
when the first judgment result shows that the first voice output result does not meet the first preset condition, judging whether the second voice output result meets the first preset condition or not, and obtaining a second judgment result;
and when the second judgment result shows that the second voice output result meets a first preset condition, playing the second voice output result corresponding to the second voice request, and not playing the first voice output result.
2. The method of claim 1, wherein generating a first voice request based on the first input comprises:
processing the first input to obtain a first processing result;
and taking the first processing result as a first voice request.
3. The method of claim 1 or 2, wherein generating a first voice request based on the first input comprises:
and generating a first voice request and a first identifier corresponding to the first voice request according to the first input, and storing the corresponding relation between the first voice request and the first identifier.
4. The method of claim 1, wherein obtaining the first speech output result from processing the first speech request comprises:
sending the first voice request to a server so that the server processes the first voice request to obtain a first voice output result;
and receiving a first voice output result sent by the server.
5. The method of claim 3, wherein the first identifier is a timestamp, a Universally Unique Identifier (UUID), or a hash value.
6. A data processing method, applied to a multimedia terminal, the method comprising:
receiving a first input;
generating a first voice request and a first identifier corresponding to the first voice request according to the first input, and storing the corresponding relation between the first voice request and the first identifier;
acquiring a first voice output result obtained by processing the first voice request;
acquiring a first identifier according to the corresponding relation between the first voice request and the first identifier;
acquiring a third identifier, comparing the first identifier with the third identifier, and determining that a first preset condition is met when the first identifier is the same as the third identifier; wherein the third identification corresponds to a most recent voice request;
and when the first judgment result shows that the first voice output result does not meet the first preset condition, not playing the first voice output result.
7. The method of claim 6, wherein acquiring the third identifier and comparing the first identifier with the third identifier comprises:
obtaining a global timestamp corresponding to a latest voice request;
comparing a first local timestamp corresponding to the first voice request to the global timestamp.
8. A data processing method, applied to a multimedia terminal, the method comprising:
receiving a first input;
generating a first voice request and a first identifier corresponding to the first voice request according to the first input, and storing the corresponding relation between the first voice request and the first identifier, wherein the first identifier is a timestamp, a universal unique identification code (UUID) or a hash value;
acquiring a first voice output result obtained by processing the first voice request;
judging whether the first voice output result meets a first preset condition or not, and obtaining a first judgment result;
when the first judgment result shows that the first voice output result does not meet a first preset condition, the first voice output result is not played;
when the first identifier is a timestamp, generating a first voice request and a first identifier corresponding to the first voice request according to the first input, and storing a corresponding relationship between the first voice request and the first identifier includes:
generating a first voice request according to the first input;
generating a first local timestamp corresponding to the first voice request as a first identifier according to the time of the first voice request generation, and storing the corresponding relation between the first voice request and the first local timestamp;
the method further comprises the following steps:
generating a global timestamp as a third identifier according to the time at which the first voice request is generated; the third identifier is updated when a new voice request is generated.
9. A data processing apparatus, characterized in that the apparatus comprises:
a first receiving unit for receiving a first input;
a first generating unit, configured to generate a first voice request according to the first input;
the first acquisition unit is used for acquiring a first voice output result obtained by processing the first voice request;
a second receiving unit for receiving a second input;
a second generating unit, configured to generate a second voice request according to the second input;
the second acquisition unit is used for acquiring a second voice output result obtained by processing the second voice request;
the first judging unit is used for judging whether the first voice output result meets a first preset condition or not and acquiring a first judging result;
the second judging unit is used for judging whether the second voice output result meets the first preset condition or not when the first voice output result does not meet the first preset condition, and acquiring a second judging result;
and the output unit is used for playing the second voice output result corresponding to the second voice request and not playing the first voice output result when the second judgment result shows that the second voice output result meets a first preset condition.
CN201710930363.9A 2012-12-11 2012-12-11 Data processing method and device Active CN107610690B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710930363.9A CN107610690B (en) 2012-12-11 2012-12-11 Data processing method and device

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710930363.9A CN107610690B (en) 2012-12-11 2012-12-11 Data processing method and device
CN201210533421.1A CN103871410B (en) 2012-12-11 2012-12-11 A kind of data processing method and device

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
CN201210533421.1A Division CN103871410B (en) 2012-12-11 2012-12-11 A kind of data processing method and device

Publications (2)

Publication Number Publication Date
CN107610690A CN107610690A (en) 2018-01-19
CN107610690B true CN107610690B (en) 2021-09-14

Family

ID=50909874

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201710930363.9A Active CN107610690B (en) 2012-12-11 2012-12-11 Data processing method and device
CN201210533421.1A Active CN103871410B (en) 2012-12-11 2012-12-11 A kind of data processing method and device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201210533421.1A Active CN103871410B (en) 2012-12-11 2012-12-11 A kind of data processing method and device

Country Status (1)

Country Link
CN (2) CN107610690B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107767872A (en) * 2017-10-13 2018-03-06 深圳市汉普电子技术开发有限公司 Audio recognition method, terminal device and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2308821A1 (en) * 1999-05-25 2000-11-25 Command Audio Corporation Playing audio of one kind in response to user action while playing audio of another kind
CN1245704C (en) * 2003-09-29 2006-03-15 微星科技股份有限公司 Voice output / input system and method
CN101253547A (en) * 2005-04-29 2008-08-27 摩托罗拉公司 Speech dialog method and system

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3715584B2 (en) * 2002-03-28 2005-11-09 富士通株式会社 Device control apparatus and device control method
WO2005091128A1 (en) * 2004-03-18 2005-09-29 Nec Corporation Voice processing unit and system, and voice processing method
US8099289B2 (en) * 2008-02-13 2012-01-17 Sensory, Inc. Voice interface and search for electronic devices including bluetooth headsets and remote systems
JP5466519B2 (en) * 2010-01-20 2014-04-09 日立コンシューマエレクトロニクス株式会社 Information processing apparatus and signal processing method for information processing apparatus
CN102255780A (en) * 2010-05-20 2011-11-23 株式会社曙飞电子 Home network system and control method
CN102262879B (en) * 2010-05-24 2015-05-13 乐金电子(中国)研究开发中心有限公司 Voice command competition processing method and device as well as voice remote controller and digital television
CN102316227B (en) * 2010-07-06 2014-06-04 宏碁股份有限公司 Data processing method for voice call process


Also Published As

Publication number Publication date
CN103871410B (en) 2017-09-29
CN103871410A (en) 2014-06-18
CN107610690A (en) 2018-01-19

Similar Documents

Publication Publication Date Title
US11240050B2 (en) Online document sharing method and apparatus, electronic device, and storage medium
CN109474843B (en) Method for voice control of terminal, client and server
KR101777392B1 (en) Central server and method for processing of voice of user
CN103970793B (en) Information query method, client and server
US11310066B2 (en) Method and apparatus for pushing information
CN109688475B (en) Video playing skipping method and system and computer readable storage medium
CN107423070B (en) Page generation method and device
CN109271130B (en) Audio playing method, medium, device and computing equipment
CN105072146B (en) Music information sharing method and device
CN110196927B (en) Multi-round man-machine conversation method, device and equipment
KR101919257B1 (en) Application program switch method, apparatus and electronic terminal
CN110418181B (en) Service processing method and device for smart television, smart device and storage medium
CN103701994A (en) Automatic responding method and automatic responding device
CN112331202A (en) Voice screen projection method and device, electronic equipment and computer readable storage medium
CN104853251A (en) Online collection method and device for multimedia data
CN113823282B (en) Voice processing method, system and device
CN111883131A (en) Voice data processing method and device
CN110083768B (en) Information sharing method, device, equipment and medium
CN111063348A (en) Information processing method, device and equipment and computer storage medium
CN110928603A (en) Service providing method and device
CN113420159A (en) Target customer intelligent identification method and device and electronic equipment
CN104239371B (en) A kind of command information processing method and processing device
CN108509442B (en) Search method and apparatus, server, and computer-readable storage medium
CN107610690B (en) Data processing method and device
CN106792125A (en) A kind of video broadcasting method and its terminal, system

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant