CN106951091B - Processing method and device - Google Patents

Processing method and device

Info

Publication number
CN106951091B
Authority
CN
China
Prior art keywords
data, sound, monitoring, data set, obtaining
Prior art date
Legal status
Active
Application number
CN201710201531.0A
Other languages
Chinese (zh)
Other versions
CN106951091A (en)
Inventor
魏云龙 (Wei Yunlong)
Current Assignee
Lenovo Beijing Ltd
Original Assignee
Lenovo Beijing Ltd
Priority date
Filing date
Publication date
Application filed by Lenovo Beijing Ltd
Priority to CN201710201531.0A
Publication of CN106951091A
Application granted
Publication of CN106951091B
Status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3343 Query execution using phonetics

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure provides a processing method and apparatus. The processing method includes: obtaining a first sound, where the first sound corresponds to first data comprising an ordered first portion and second portion; and obtaining second data, where the second data comprises an ordered third portion and fourth portion, and the data of the third portion matches the data of the second portion; the second data corresponds to a second sound.

Description

Processing method and device
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a processing method and device.
Background
With the increasingly widespread use of smart devices, such devices play a growing role in social interaction. For example, a device with a voice recognition function can recognize a person's voice and then perform corresponding processing to meet that person's needs, providing a variety of services to the user. As another example, during information exchange between people, a smart device can assist the interaction by recognizing the information being exchanged.
Disclosure of Invention
One aspect of the present disclosure provides a processing method, including:
obtaining a first sound, wherein the first sound corresponds to first data comprising an ordered first portion and a second portion;
obtaining second data, wherein the second data comprises an ordered third portion and fourth portion, and the data of the third portion matches the data of the second portion;
wherein the second data corresponds to a second sound.
Optionally, after obtaining the second data, the method further comprises:
outputting the second sound.
Optionally, the obtaining the second data includes:
dividing the first data according to a first policy to obtain the first portion and the second portion;
determining at least one search datum using the data of the second portion;
searching for the at least one search datum in a determined candidate data set to obtain a search result;
when the search result includes at least one piece of data, determining the second data from the at least one piece of data according to a second policy, where each piece of data in the at least one piece of data includes an ordered third portion and fourth portion, and the third portion of each piece includes one of the at least one search datum.
Optionally, the determining the second data from the at least one piece of data according to the second policy includes:
extracting, from the at least one piece of data, target data that does not match any data in a monitoring data set;
if at least one piece of target data exists, selecting one piece of the target data as the second data, where the monitoring data set comprises at least the first data; and
storing the second data in the monitoring data set.
Optionally, the method further comprises:
if the search result is empty or the target data is empty, outputting a prompt and emptying the monitoring data set.
Optionally, after obtaining the first sound, the method further comprises:
obtaining a judgment result indicating whether the first data matches data in a monitoring data set, where the monitoring data set stores the data obtained since it was last emptied; and
outputting the judgment result at least when it indicates that the first data matches data in the monitoring data set.
Optionally, after obtaining the judgment result, the method further includes:
when the judgment result indicates that the monitoring data set is empty or that the first data matches no data in it, storing the first data in the monitoring data set.
Optionally, the obtaining second data comprises:
obtaining a second sound;
obtaining the second data according to the second sound;
after obtaining the second data, the method further comprises:
obtaining a judgment result indicating whether the second data matches data in the monitoring data set, where the monitoring data set comprises at least the first data; when the judgment result indicates no match, storing the second data in the monitoring data set; and
outputting the judgment result at least when it indicates that the second data matches data in the monitoring data set.
Optionally, after obtaining the judgment result, the method further includes:
when the judgment result indicates that the second data matches data in the monitoring data set, emptying the monitoring data set.
Another aspect of the present disclosure provides a processing device, including:
a sound acquisition device configured to obtain a first sound, where the first sound corresponds to first data, and the first data comprises an ordered first portion and second portion; and
a data acquisition device configured to obtain second data, where the second data comprises an ordered third portion and fourth portion, and the data of the third portion matches the data of the second portion;
wherein the second data corresponds to a second sound.
Drawings
For a more complete understanding of the present disclosure and the advantages thereof, reference is now made to the following descriptions taken in conjunction with the accompanying drawings, in which:
FIG. 1 illustrates a method flow diagram of a processing method in an exemplary embodiment of the present disclosure;
FIG. 2 illustrates a method flow diagram of step 102 in an exemplary embodiment of the present disclosure;
FIG. 3 shows a method flowchart of step 204 in an exemplary embodiment of the present disclosure;
FIG. 4 illustrates a flowchart of a method for determining whether first data duplicates data in the monitoring data set in an exemplary embodiment of the present disclosure;
FIG. 5 illustrates a flowchart of a method for determining whether first data duplicates data in the monitoring data set in another exemplary embodiment of the present disclosure;
FIG. 6 shows a block diagram of a processing device in an exemplary embodiment of the present disclosure;
FIG. 7 shows a block diagram of a processing device in an exemplary embodiment of the present disclosure.
Detailed Description
Other aspects, advantages, and salient features of the disclosure will become apparent to those skilled in the art from the following detailed description, which, taken in conjunction with the annexed drawings, discloses exemplary embodiments of the disclosure.
In the present disclosure, the terms "include" and "comprise," as well as derivatives thereof, mean inclusion without limitation; the term "or" is inclusive, meaning and/or.
In this specification, the various embodiments described below which are used to describe the principles of the present disclosure are by way of illustration only and should not be construed in any way to limit the scope of the invention. The following description with reference to the accompanying drawings is provided to assist in a comprehensive understanding of exemplary embodiments of the present disclosure as defined by the claims and their equivalents. The following description includes various specific details to aid understanding, but such details are to be regarded as illustrative only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Moreover, descriptions of well-known functions and constructions are omitted for clarity and conciseness. Moreover, throughout the drawings, the same reference numerals are used for similar functions and operations.
Some block diagrams and/or flow diagrams are shown in the figures. It will be understood that some blocks of the block diagrams and/or flowchart illustrations, or combinations thereof, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the instructions, which execute via the processor, create means for implementing the functions/acts specified in the block diagrams and/or flowchart block or blocks.
Accordingly, the techniques of this disclosure may be implemented in hardware and/or software (including firmware, microcode, etc.). Additionally, the techniques of this disclosure may take the form of a computer program product on a computer-readable medium having instructions stored thereon for use by an instruction execution system. In the context of this disclosure, a computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the instructions. For example, the computer readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the computer readable medium include: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and/or wired/wireless communication links.
Fig. 1 shows a flowchart of a processing method proposed by an exemplary embodiment of the present disclosure. As shown in fig. 1, the method includes steps 101 and 102, wherein:
in step 101, a first sound is obtained, where the first sound corresponds to first data, and the first data comprises an ordered first portion and second portion;
in step 102, second data is obtained, where the second data comprises an ordered third portion and fourth portion, and the data of the third portion matches the data of the second portion; the second data corresponds to a second sound.
In this embodiment, a first sound corresponding to first data is obtained, and second data corresponding to a second sound is obtained; the first data comprises an ordered first portion and second portion, the second data comprises an ordered third portion and fourth portion, and the data of the third portion matches the data of the second portion. In other words, after the first sound is obtained, second data is obtained that bears a matching relation to the first data: the front portion of the second data matches the rear portion of the first data. In this way, speech can be recognized intelligently, data related to the recognized speech can be obtained by matching, and convenience can be provided for the user.
In the embodiment of the present disclosure, the first sound corresponds to the first data in the sense that the first sound is the spoken form and the first data is the textual form, the first sound matching the pronunciation of the characters of the first data. Similarly, the second sound matches the pronunciation of the characters of the second data: the second sound is the spoken form and the second data is the textual form.
In the embodiment of the present disclosure, the first data may be a word, idiom, or sentence composed of Chinese characters, or an English word, phrase, or sentence composed of letters. The first portion and the second portion may be two length ranges obtained by dividing the first data by character count, word count, or syllable count. For example, if the first data is a four-character idiom and the first portion has length 1 while the second portion has length 3, the data of the first portion is the first Chinese character of the first data and the data of the second portion is the last three characters; if the first data is an English word of five letters with the first portion of length 1 and the second portion of length 4, the data of the first portion is the first letter and the data of the second portion is the last four letters. Similarly, the second data may be a word, idiom, or sentence composed of Chinese characters, or an English word, phrase, or sentence, and the third and fourth portions may likewise be two length ranges defined by character, word, or syllable count.
In the embodiment of the present disclosure, the data of the second portion of the first data may be a single Chinese character, word, or phrase, with the data of the first portion being the remainder of the first data; for English, the data of the second portion may be a single letter, word, or phrase, with the data of the first portion being the rest. The same holds for the second data: the data of the third portion may be a single character, letter, word, or phrase, and the data of the fourth portion is the remainder of the second data.
In the embodiment of the present disclosure, saying that the data of the third portion matches the data of the second portion means that the two follow a given matching rule: for example, they may be identical, or they may be homophones or near-homophones; for English data, the third portion and the second portion may also be words or letters sharing the same initial letter or the same or similar initial pronunciation.
In the embodiment of the present disclosure, the first data and the second data may contain the same number of characters, with the data of the second portion and the data of the third portion also matching in character, word, or syllable count. For example, the first data and the second data are both four-character idioms, the data of the second portion is the last character of the first data, and the data of the third portion is the first character of the second data. The counts may also differ between the first and second data while the second and third portions still match in length: for example, the first data and the second data are each a line of lyrics or verse, the data of the second portion is the last word of the first data, and the data of the third portion is the first word of the second data. As another example, the first data is the English word "good" and the second data is an English word beginning with "d" (such as "dog"): the data of the second portion is the last letter of the first data and the data of the third portion is the first letter of the second data, both "d".
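By way of illustration only, the following Python sketch shows one such matching rule for Chinese data: two portions match when they are identical or homophonic. The pypinyin dependency and the helper name portions_match are assumptions of this sketch, not part of the disclosure.

```python
# A minimal sketch of one possible matching rule: two portions match when
# they are identical or share the same pronunciation (homophones).
# The pypinyin dependency and all names here are illustrative assumptions.
from pypinyin import lazy_pinyin

def portions_match(second_portion: str, third_portion: str) -> bool:
    """True if the third portion matches the second portion: identical
    characters, or the same toneless pinyin (homophones)."""
    if second_portion == third_portion:
        return True
    return lazy_pinyin(second_portion) == lazy_pinyin(third_portion)

print(portions_match("竹", "竹"))  # True: identical characters
print(portions_match("竹", "烛"))  # True: both pronounced "zhu"
```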
In an embodiment of the present disclosure, after the second data is obtained, the method further includes outputting the second sound. That is, once the second data is obtained, the second sound corresponding to it is output. In this way, after the first sound is obtained, second data correlated with the first data is obtained (the second portion of the first data matches the third portion of the second data), and the corresponding second sound is produced and output, realizing voice interaction with the user and providing a light, enjoyable entertainment activity.
In an embodiment of the present disclosure, the executor of the method may be an intelligent electronic device that recognizes the user's first sound, processes it, and outputs the second sound to the user; the executor may also be a server that obtains the first sound from a voice-enabled electronic device, performs the corresponding processing, and controls that device to play the second sound. Either way, voice interaction with the user is realized and the user experience improves.
In an embodiment of the present disclosure, the method executor takes part in the voice interaction as one of the participants, interacting with the participant users. In some embodiments, the executor, besides participating, also monitors whether the first sounds obtained from users comply with the voice interaction rules, i.e., it also plays a supervising role.
In these embodiments, the first sound is obtained from a participant user, and the executor, acting as a participant, obtains second data from the previous user's first sound and then obtains and outputs the second sound. The step of obtaining the first sound may be performed one or more times depending on the number and order of participant users: once per cycle if there is a single participant user, and multiple times per cycle otherwise. The executor's steps of obtaining second data and outputting the second sound are performed once in each cycle of a round of voice interaction. A cycle here means that every participant, including the executor, interacts once before the round ends: for example, with n-1 participant users and n participants in total, one cycle runs from the first participant in a preset order through the n-1 users providing first sounds and the executor providing the second sound, provided each of them supplies a sound that complies with the voice interaction rules. If any participant (user or executor) fails to provide a compliant sound within a cycle, the cycle ends and so does the current round; a round may thus run for several complete cycles or end before one cycle completes.
Although the method executor is described as a participant, in practice the participant may be any electronic device capable of recognizing and playing speech, and the steps of obtaining the first sound, obtaining the second data, and producing the corresponding second sound may be performed on a background server or in the cloud, as the actual situation requires.
Fig. 2 shows a flowchart of a method of step 102 provided in an exemplary embodiment of the present disclosure, and as shown in fig. 2, step 102 includes steps 201 to 204, where:
in step 201, the first data is divided according to a first policy to obtain the first portion and the second portion;
in step 202, at least one search datum is determined using the data of the second portion;
in step 203, the at least one search datum is searched for in a determined candidate data set to obtain a search result;
in step 204, when the search result includes at least one piece of data, the second data is determined from the at least one piece of data according to a second policy; each piece of data in the at least one piece of data includes an ordered third portion and fourth portion, and the third portion of each piece includes one of the at least one search datum.
In this embodiment, to obtain the second data, the first data is first divided into the first portion and the second portion according to a preset first policy. The first policy may be set according to actual requirements: for example, if the first data is a four-character idiom, the first policy divides it by length into a three-character part and a one-character part; if the first data is a line of lyrics or verse, the first policy divides it into a part of n-1 characters (where n is the length of the first data) and a part of one character.
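As an illustration of step 201, the following is a minimal sketch of one such first policy, assuming the division is simply "everything but the last character" versus "the last character"; the function name is hypothetical.

```python
def divide_by_first_policy(first_data: str) -> tuple[str, str]:
    """Split first_data into an ordered (first portion, second portion):
    here, everything but the last character, and the last character.
    A sketch of one possible first policy; other policies may split by a
    different length, by words, or by syllables."""
    return first_data[:-1], first_data[-1]

head, tail = divide_by_first_policy("胸有成竹")  # a four-character idiom
print(head, tail)  # 胸有成 竹
```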
In this embodiment, when the search data is determined using the data of the second portion, the determination follows the matching rule between the second portion of the first data and the third portion of the second data, the rule being preset according to actual requirements. For example, when the rule is that the two are identical or homophonic, every character that is identical or homophonic to the data of the second portion is taken as a search datum. Once the search data is determined, a search is performed in the candidate data set. If several search data are determined, one of them may be selected for the search; in other embodiments, several or all of the search data may of course be used, according to actual requirements.
In this embodiment, the candidate data set is a predetermined set containing multiple pieces of data; there may be one or several candidate data sets, configured according to actual requirements. For example, in an idiom-chain voice interaction, the candidate data set contains four-character idioms and may include all collected idioms or only a subset, chosen with difficulty, entertainment value, and other factors in mind: for a harder interaction, idioms ending in characters that rarely begin other idioms can be included; for a more entertaining interaction, amusing idioms can be selected. Which candidate data set to use may also follow a preset scheme, for example an easy set in the first round with harder sets later, or the same set throughout.
In this embodiment, searching the candidate data set for the search data may follow a matching rule related to how the third and fourth portions of the second data are divided: for example, when the second data is divided into its first character/word/syllable and the remainder, the search returns data whose first character/word/syllable equals the search datum. In short, the data in the candidate data set can be divided into third and fourth portions just as the second data can, so each piece of data in the search result comprises an ordered third portion and fourth portion, the third portion containing the search datum. When the search yields exactly one result, that result is directly taken as the second data; when it yields several, one of them is selected according to the second policy. The second policy is set according to the actual situation: the second data may be chosen arbitrarily from the search results, or selected under some rule for specific conditions.
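Putting steps 202 to 204 together, the sketch below uses the simplest rule: the search datum is the second-portion data itself, and a candidate's third portion is its first character. The toy candidate set and all names are illustrative; the portions_match predicate sketched earlier could widen the match to homophones.

```python
CANDIDATES = ["竹报平安", "竹篮打水", "安居乐业"]  # toy candidate data set

def search_candidates(second_portion: str, candidates=CANDIDATES) -> list[str]:
    """Return every candidate whose third portion (first character)
    equals the search datum derived from the second portion."""
    return [item for item in candidates if item[0] == second_portion]

results = search_candidates("竹")
print(results)             # ['竹报平安', '竹篮打水']
second_data = results[0]   # one possible second policy: take the first hit
```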
The applicable scenario of the method is described below taking idiom chaining as an example. In this example, the intelligent electronic device is a participant in an idiom-chain game with players A, B, and C. If player A is the first participant, the intelligent electronic device is the next, followed by players B and C. When the game starts, player A speaks a first sound, the idiom 胸有成竹 (xiong you cheng zhu). After acquiring the first sound, the intelligent electronic device determines the corresponding first data as the text with that pronunciation, divides the first data into the first-portion data 胸有成 and the second-portion data 竹, determines at least one search datum from 竹, and searches the candidate database for four-character idioms beginning with 竹. If one of the results is 竹报平安 (zhu bao ping an), it is determined as the second data and the corresponding second sound is output. Then player B takes the next turn, and the cycle continues until the game ends.
Fig. 3 shows a flowchart of a method of step 204 in an exemplary embodiment of the present disclosure. As shown in fig. 3, the determining the second data from the at least one piece of data according to the second policy in step 204 includes steps 301 to 303, where:
in step 301, target data that does not match any data in the monitoring data set is extracted from the at least one piece of data;
in step 302, if at least one piece of target data exists, one piece of the target data is selected as the second data; the monitoring data set comprises at least the first data;
in step 303, the second data is stored in the monitoring data set.
In this embodiment, when the search result contains several pieces of data, target data that matches nothing in the monitoring data set is selected as the second data. Within one round of voice interaction, the monitoring data set holds the first data corresponding to each acquired first sound and the second data determined from the candidate data set. A round of voice interaction runs from the first acquisition of a first sound, through one or more cycles of the processing method of this disclosure, until an end condition is reached.
A round may contain zero or more cycles, a cycle being one pass in which every participant interacts once. In the first cycle, when the first sound is obtained for the first time, the corresponding first data is recognized and stored in the monitoring data set (no duplicate check is needed for this first entry). If the next participant is another user, the step of obtaining a first sound is repeated until the next participant is the method executor or an end condition is reached (for example, the first data duplicates the monitoring data set, or no first sound is obtained within a preset time). If the next participant is the method executor, the step of obtaining second data is performed: second data is obtained from the first data and stored in the monitoring data set; if no suitable second data is found, the current round of voice interaction ends.
After a round ends, the monitoring data set is emptied, so it is empty at the start of each round. Within a round, the first sound may be obtained once or many times, and each time the corresponding first data, and any determined second data, is stored in the monitoring data set provided the preset voice interaction rules are satisfied. The monitoring data set therefore stores the data obtained since it was last emptied, namely the first data for every first sound obtained in the current round together with the second data that has been determined. To avoid repetition, each time a search result is obtained from the candidate data set, data not duplicating the monitoring data set is selected as the second data, and the selected second data is then stored in the monitoring data set as well.
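A minimal sketch of this second policy follows, under the assumption that an arbitrary (here random) choice is made among the non-duplicating targets; all names are hypothetical.

```python
import random

def choose_second_data(search_results: list[str], monitoring_set: set[str]):
    """Steps 301 to 303 as a sketch: drop results already used this round,
    pick one remaining item as the second data, and record it."""
    targets = [d for d in search_results if d not in monitoring_set]
    if not targets:
        return None                          # target data empty: round ends
    second_data = random.choice(targets)     # one possible second policy
    monitoring_set.add(second_data)
    return second_data

used = {"胸有成竹"}                           # this round's monitoring data set
print(choose_second_data(["竹报平安", "竹篮打水"], used))
```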
FIG. 4 is a flowchart of a method for determining whether first data duplicates data in the monitoring data set in an exemplary embodiment of the present disclosure. As shown in fig. 4, after step 101, the method further includes steps 401 to 403, where:
in step 401, a judgment result is obtained, indicating whether the first data matches data in the monitoring data set; the monitoring data set stores the data obtained since it was last emptied;
in step 402, the judgment result is output at least when it indicates that the first data matches data in the monitoring data set;
in step 403, when the judgment result indicates that the monitoring data set is empty or that the first data matches no data in it, the first data is stored in the monitoring data set.
In this embodiment, the step of obtaining a first sound may be performed multiple times within one round of voice interaction. Each time a first sound is obtained, the corresponding first data is recognized and checked against the monitoring data set. A match means the sound corresponding to the first data has already occurred in this round, whether as an earlier first sound or as second data (second data is ultimately presented as a second sound). To ensure that no first or second sound repeats within a round, each obtained first sound is therefore judged by whether its first data matches the monitoring data set. When it matches, a judgment result is output: this may be a prompt that the current first sound repeats an earlier one, or, after repeated prompts where the user's first data still duplicates the monitoring data set, a prompt that the current round of voice interaction is over. After each round ends, the monitoring data set is emptied. If the first sound is the first one of the round, the monitoring data set is empty; if it is a later one that duplicates nothing in the monitoring data set, it likewise complies with the rules. In both cases the corresponding first data is stored in the monitoring data set.
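The duplicate check of steps 401 to 403 might look like the following sketch, in which the judgment result is reduced to a printed prompt and a boolean; all names are assumptions.

```python
def check_and_store_first_data(first_data: str, monitoring_set: set[str]) -> bool:
    """Report whether first_data repeats anything said this round;
    store it only when it is new (steps 401 to 403 as a sketch)."""
    if first_data in monitoring_set:
        print(f"'{first_data}' has already been used this round")  # judgment output
        return False
    monitoring_set.add(first_data)
    return True

used: set[str] = set()
print(check_and_store_first_data("胸有成竹", used))  # True: new, stored
print(check_and_store_first_data("胸有成竹", used))  # False: prompt is output
```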
In one embodiment, besides a round ending because an obtained first sound repeats an earlier one, a round may also end because acquisition of second data fails. As in the foregoing embodiment, when the candidate data set is searched and the search result or the target data is empty, a prompt is output and the monitoring data set is emptied, which likewise marks the end of the round: if no suitable second data is found in the candidate data set, the current round of voice interaction ends.
For example, in an idiom-chain game, the user speaks an idiom (corresponding to a first sound), and the method executor searches for another idiom whose first character matches the last character of the user's idiom and presents it by voice.
In the idiom-chain game, the monitoring data set stores the idioms spoken or output by all players, including the intelligent electronic device, in the current game, such as 胸有成竹 and 竹报平安. Suppose the turn comes back to player A, who speaks a first sound, an idiom ending in 胸 (for example 昂首挺胸), and the search result obtained by the intelligent electronic device in the manner described above contains only 胸有成竹. Since 胸有成竹 already exists in the monitoring data set, i.e., it has been used in the current round of the game, the conclusion is that the target data is empty: the intelligent electronic device cannot output an idiom that complies with the current round, concedes, the current idiom-chain game ends, and the data in the monitoring data set is emptied.
In addition, when it is another player's turn, the intelligent electronic device can also monitor the idiom that player speaks to determine whether it has already been used in the current round. For example, when player B's turn comes and B speaks the idiom 安富尊荣 (an fu zun rong), the intelligent electronic device matches the first data corresponding to B's first sound against the data in the monitoring data set; if nothing matches, it stores the first data 安富尊荣 in the monitoring data set; if 安富尊荣 already exists in the monitoring data set, player B can be prompted that the idiom has been used.
In the processing method of the foregoing embodiments, the method executor completes the voice interaction together with the user as a participant; during the interaction it may also act as a supervisor, checking whether every participant's voice interaction in the current round complies with the rules. After the user speaks the first sound, the executor obtains the first data from it and judges whether that sound has already occurred in the current round; if it has, the user can be prompted to speak again, or, if after several prompts the user's first sound still violates the rules, the current round of voice interaction ends. Since the executor is also a participant, when its turn to output speech comes, it obtains second data from the first data corresponding to the previous participant's first sound and outputs the corresponding second sound.
It will be appreciated that a round of voice interaction may involve one or more participant users. With multiple participant users, the method executor must establish its own position in the turn order relative to the other participants; the other participants are handled by the step of obtaining the first sound, and the executor's own turn by the steps of obtaining the second data and outputting the second sound, as the sketch after this paragraph illustrates. That is, when there are multiple participant users, the step of obtaining the first sound may be performed several times in succession. For example, suppose there are m-1 participant users (m an integer greater than 2) and the executor is the (m-x)-th of the m participants, so that m-x-1 users precede it and x users follow it. Assuming no participant makes an error, each cycle then proceeds as follows: the step of obtaining the first sound is performed m-x-1 times, the steps of obtaining the second data and outputting the second sound are performed once, and the step of obtaining the first sound is then performed m-1 times in succession (the last x participants of this cycle and the first m-x-1 participants of the next) before the executor's turn comes again, and so on.
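The turn order just described can be pictured with the following sketch of one cycle, in which the executor sits at a fixed position among the participants; get_user_sound and device_reply stand in for the obtaining steps and are hypothetical callables.

```python
def run_cycle(participants, device_index, get_user_sound, device_reply):
    """One cycle of a round: user participants provide first sounds in turn;
    at device_index the executor produces the second sound instead.
    Callables return the data produced, or None on a rule violation."""
    previous = None
    for i, name in enumerate(participants):
        if i == device_index:
            previous = device_reply(previous)   # obtain second data, output second sound
        else:
            previous = get_user_sound(name)     # obtain first sound
        if previous is None:                    # non-compliant sound ends the round
            return False
    return True                                 # cycle complete; the round may loop

# Toy run: three users with the executor in second position.
ok = run_cycle(["A", "dev", "B", "C"], 1,
               get_user_sound=lambda name: f"idiom-from-{name}",
               device_reply=lambda prev: f"reply-to-{prev}")
print(ok)  # True
```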
Some further details are described below.
In an embodiment of the present disclosure, after the first sound is obtained, the method further includes: judging whether the first sound matches the preset sound corresponding to the current position in a sound order set, where the sound order set comprises at least two preset sounds in a cyclic order; and, when the first sound matches the preset sound for the current position, taking the next preset sound in the set as the one for the current position.
In this embodiment, with multiple participant users, the method executor can recognize from the speech whether the current participant's turn matches the current preset sound in the sound order set. The sound order set may be configured manually before the voice interaction begins, for example by having each participant user speak a sentence for the executor to recognize, recording each user's voice characteristics in order; it may also be built by self-learning from the start of the interaction, meaning that while obtaining and processing first sounds, the executor stores each participant user's voice characteristics into the set in order until every user has taken part once. The executor's own position in the order can, of course, be set manually or determined randomly according to actual requirements.
After each first sound from a participant user is obtained, it is matched against the voice characteristics for the current position in the sound order set. If it does not match, the user can be informed of the order error by outputting a prompt; if it matches, the next sound in the set becomes the current one and the first sound of the next participant user is awaited. In this embodiment the executor can both interact with users as a participant and supervise whether the participant users comply with the voice interaction rules, which facilitates voice interaction among users.
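A sketch of this order check follows, treating speaker-feature extraction as a black box; extract_features and every other name here are assumptions, since the disclosure does not specify how voice characteristics are represented.

```python
def check_turn_order(first_sound, sound_order, current_index, extract_features):
    """Compare the incoming first sound's speaker features with the preset
    sound for the current position; advance cyclically only on a match."""
    expected = sound_order[current_index]
    if extract_features(first_sound) != expected:
        print("Order error: it is not this participant's turn")
        return current_index                        # stay on the same position
    return (current_index + 1) % len(sound_order)   # advance cyclically

# Toy run with string "features": the set was learned in order A, B, C.
order = ["voice-A", "voice-B", "voice-C"]
idx = check_turn_order(("hello", "voice-B"), order, 0,
                       extract_features=lambda s: s[1])
print(idx)  # 0: player B spoke out of turn, so the position does not advance
```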
Taking the above idiom-chain game as an example: at the start of the game, beginning with player A, the intelligent electronic device records each player's voice characteristics and builds a sound order set in game order. If the preset sound for the current position in the set is player A's voice but the received first sound is recognized as player B's voice, a wrong-order prompt can be output.
In the embodiment of the present disclosure, the method executor also monitors whether a participant user's first sound is obtained within a preset time; if the time is exceeded, it prompts the user about the timeout and may even output the interaction result, for example that the current user has lost the current round of voice interaction.
In the embodiment of the present disclosure, the method executor obtains the first sound of a participant user within a predetermined time and divides the corresponding first data into portions according to the rules of the foregoing embodiments. If this user is not the first participant of the current round, the first data must also be compared with the previous data corresponding to the previous participant's sound: the previous data is divided into an ordered first portion and second portion, the current user's data is divided into an ordered third portion and fourth portion, and whether the current first sound complies with the voice interaction rules is determined by whether the second portion and the third portion match. If they do not match, the user may be prompted to re-input the first sound; if after a preset number of prompts the data still does not match the previous data, the result, namely that the current participant user has lost, may be output and the current round of voice interaction ends. For example, in an idiom-chain voice interaction, the current participant's idiom A is divided into a third portion (its first character) and a fourth portion (its last three characters), the previous participant's idiom B is divided into a first portion (its first three characters) and a second portion (its last character), and it is checked whether the second portion of idiom B and the third portion of idiom A are homophonic.
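The chain check between consecutive participants, with the retry policy described above, might be sketched as follows; the three-attempt limit and all names are assumptions.

```python
def validate_reception(previous_data: str, first_data: str,
                       attempts: int, max_attempts: int = 3) -> str:
    """The previous data's second portion (last character) must match the
    current data's third portion (first character)."""
    if previous_data[-1] == first_data[0]:  # homophones could also be accepted
        return "accepted"
    if attempts + 1 >= max_attempts:
        return "eliminated"                 # output the result; the round ends
    return "retry"                          # prompt the user to re-input

print(validate_reception("胸有成竹", "竹报平安", attempts=0))  # accepted
print(validate_reception("胸有成竹", "安富尊荣", attempts=2))  # eliminated
```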
Still taking the above idiom-chain game as an example: player B says 安富尊荣 and the turn passes to player C. If, within the predetermined time, player C says something like "easy to make mistakes", which is not an idiom, the intelligent electronic device prompts the user after recognition; if player C again says something that is not an idiom, or an idiom whose first character is not pronounced "rong", or exceeds the predetermined time, the intelligent electronic device outputs a prompt that the attempt or time limit has been exceeded, determines that player C has lost, and ends the current round of the idiom-chain game.
In other embodiments of the present disclosure, the method executor does not act as a participant but only supervises whether the sounds obtained from users during the voice interaction comply with the voice interaction rules, as the following embodiments describe in detail.
FIG. 5 illustrates a flowchart of a method for determining whether first data duplicates data in the monitoring data set in another exemplary embodiment of the present disclosure. As shown in fig. 5, step 102 includes steps 501 and 502, and after step 102 the method further includes steps 503 and 504, where:
in step 501, a second sound is obtained;
in step 502, the second data is obtained from the second sound.
In step 503, obtaining a determination result, where the determination result indicates whether the second data matches data in the monitoring data set; the monitoring data set comprises at least the first data; when the judgment result is that the data in the monitoring data set is not matched, storing the second data into the monitoring data set; and
in step 504, a determination result is output at least when the determination result indicates that the second data matches the data in the monitoring data set.
In this embodiment, after the first sound is obtained, the second data is obtained from the second sound and matched against the monitoring data set. When the second data matches no data in the monitoring data set, it is stored there; when it does match, the judgment result is output. In other words, it is determined whether the second data corresponding to the obtained second sound complies with the voice interaction rules, i.e., whether it avoids duplicating the existing data in the monitoring data set: if it duplicates, the judgment result is output; if not, the second data is stored in the monitoring data set. In this implementation, the method executor acts as a supervisor checking whether the second sound obtained from a user complies with the voice interaction rules, which facilitates voice interaction among users. For the definition and details of the monitoring data set, refer to the description in the foregoing embodiments, which is not repeated here.
In this embodiment, if the second sound complies with the voice interaction rules, the second sound of the next participant user can be obtained and judged in the same way. The method of this embodiment requires at least two participants: the first sound obtained from a participant is treated as the previous participant's sound, whose corresponding first data is already stored in the monitoring data set provided it complied with the rules, so only the second sound obtained from the current participant needs to be judged. In practice, therefore, the sound obtained from the current participant is processed as the second sound, while the first sound plays the role of the previous participant's already-processed, rule-compliant sound.
In an embodiment of the present disclosure, after the judgment result is obtained, the method further includes: when the judgment result indicates that the second data matches data in the monitoring data set, emptying the monitoring data set. If the second data violates the voice interaction rules, i.e., duplicates data in the monitoring data set, the current round of voice interaction can end and the monitoring data set is emptied so that a new round can begin. In other embodiments, the current participant may instead be prompted to provide the second sound again, the round ending only after the number of prompts exceeds a preset threshold. For details, refer to the embodiments above in which the method executor also acts as a participant.
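Combining the duplicate check of steps 501 to 504 with the chain check used in the example below, a supervisor-only sketch might look like this; all names are illustrative assumptions.

```python
def supervise_second_sound(second_data: str, previous_data: str,
                           monitoring_set: set[str]) -> bool:
    """The recognized second data must chain off the previous data and
    must not repeat anything used this round."""
    if previous_data and previous_data[-1] != second_data[0]:
        print("Does not continue the previous entry")   # rule violation
        return False
    if second_data in monitoring_set:
        print("Already used this round")                # judgment result output
        monitoring_set.clear()                          # the round ends
        return False
    monitoring_set.add(second_data)
    return True

used = {"胸有成竹"}
print(supervise_second_sound("竹报平安", "胸有成竹", used))  # True: stored
print(supervise_second_sound("竹报平安", "胸有成竹", used))  # False: already used
```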
The applicable scenario of the above method is again described using idiom chaining as an example. Here the intelligent electronic device is only a supervisor and does not play; for example, three players A, B, and C play an idiom-chain game, with player A first, followed by B and C. When the game starts, player A speaks a first sound, 胸有成竹. Since this is the first idiom of the round, the corresponding first data can be stored directly in the monitoring data set. The second sound of the next player, B, is then acquired. If player B speaks the idiom 竹报平安, the intelligent electronic device determines the corresponding second data as the text pronounced "zhu bao ping an", divides it into the third-portion data 竹 and the fourth-portion data 报平安, and judges whether the third-portion data 竹 matches the second portion 竹 of the previous player A's first data. If they match, the second data is matched against the monitoring data set; if no duplicate is found, the second data is stored in the monitoring data set, and the second sound of the next player, C, is received.
If, later in the same round, player B again utters 竹报平安 and the intelligent electronic device finds by matching that the idiom already exists in the monitoring data set, it outputs a prompt that the idiom has been used and may empty the monitoring data set directly; alternatively, the monitoring data set is emptied when no correct idiom is received from player B within the allotted time or the number of wrong idioms from player B exceeds the limit, ending the current round of the idiom-chain game.
Fig. 6 shows a block diagram of a processing device 600 provided by an exemplary embodiment of the present disclosure. As shown in fig. 6, the processing apparatus includes:
a sound acquiring device 601, configured to acquire a first sound, where the first sound corresponds to first data, and the first data includes an ordered first portion and second portion;
a data obtaining device 602, configured to obtain second data, where the second data includes an ordered third portion and fourth portion, and data of the third portion matches data of the second portion;
wherein the second data corresponds to a second sound.
In this embodiment, the sound acquiring apparatus 601 obtains a first sound corresponding to first data, and the data acquiring apparatus 602 obtains second data corresponding to a second sound; the first data comprises an ordered first portion and second portion, the second data comprises an ordered third portion and fourth portion, and the data of the third portion matches the data of the second portion. Thus, after the first sound is obtained, second data bearing a matching relation to the first data is obtained: the front portion of the second data matches the rear portion of the first data. In this way, speech can be recognized intelligently, related data can be obtained by matching, and convenience can be provided for the user.
In the embodiment of the present disclosure, the sound acquiring apparatus 601 may be the voice pickup of an intelligent electronic device with a voice recognition function, and the data acquiring apparatus 602 may be a processor on the intelligent electronic device or a background server. For the detailed functions of the sound acquiring apparatus 601 and the data acquiring apparatus 602, refer to the description of the method above, which is not repeated here.
According to an exemplary embodiment of the present disclosure, the data obtaining device 602 further has the following functions: dividing the first data according to a first strategy to obtain the first part and the second part; determining at least one search datum using the second portion of data; searching the at least one search datum in the determined candidate data set to obtain a search result; and, when the search result comprises at least one piece of data, determining the second data from the at least one piece of data according to a second strategy, wherein each piece of data in the at least one piece of data comprises a third portion and a fourth portion which are ordered, and the third portion of each of the at least one piece of data comprises one of the at least one search datum.
In this embodiment, to obtain the second data, the first data is first divided into a first part and a second part according to a preset first policy. The first policy may be set according to actual requirements: for example, if the first data is a four-character idiom, the first policy may divide it by length into two parts of three characters and one character; if the first data is a line of lyrics or poetry of length n, the first policy may divide it by length into two parts of n-1 characters and one character.
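As a minimal sketch of such a first policy, assuming the length-based split described above (the helper name divide_first_data is illustrative):

```python
def divide_first_data(data: str):
    """Split the first data into an ordered first part (the leading n-1
    characters) and second part (the final character), per the first policy."""
    if len(data) < 2:
        raise ValueError("first data must contain at least two characters")
    return data[:-1], data[-1]

print(divide_first_data("胸有成竹"))    # ('胸有成', '竹') for a four-character idiom
print(divide_first_data("床前明月光"))  # ('床前明月', '光') for a line of poetry
```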
In this embodiment, when the search data is determined using the second-part data, the determination is made according to a matching rule between the second-part data of the first data and the third-part data of the second data, the matching rule being preset according to actual requirements. For example, when the matching rule is that the two are homophonic characters, all characters with the same or similar pronunciation as the second-part data are determined as search data. After the search data is determined, a search is performed in the candidate data set based on it. In one embodiment, when the determined search data includes multiple items, one of them may be selected for searching; of course, in other embodiments several or all of the search data may be used, according to actual requirements.
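The homophone rule might look like the following sketch, in which the pinyin table is a toy stand-in for a full pronunciation dictionary (the patent does not specify one):

```python
# Toy pronunciation table; a real system would consult a complete dictionary.
PINYIN = {"竹": "zhu", "烛": "zhu", "逐": "zhu", "安": "an", "光": "guang"}

def determine_search_data(second_part: str):
    """Collect every character pronounced the same as the second-part data."""
    target = PINYIN.get(second_part)
    if target is None:
        return []
    return [ch for ch, py in PINYIN.items() if py == target]

print(determine_search_data("竹"))  # ['竹', '烛', '逐'] — same-sound candidates
```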
In this embodiment, the candidate data set is a predetermined set including a plurality of data items; there may be one or more candidate data sets, set according to actual requirements. For example, in an idiom-chain voice interaction, the candidate data set is a set of four-character idioms; it may include all collected four-character idioms or only a subset, chosen with interaction difficulty, entertainment value and other considerations in mind. For a high-difficulty interaction, idioms whose last character is uncommon, or rarely appears at the beginning of other idioms, can be placed in the candidate data set; for an interaction that emphasizes entertainment, more amusing idioms can be selected. The choice among candidate data sets may also follow a preset rule, for example selecting an easy candidate data set in the first round and progressively harder ones later, or always using the same one, and so on.
In this embodiment, the search for the search data in the candidate data set may follow a matching rule related to the rule for dividing the second data into the third and fourth portions. For example, when the second data is divided into its first word/character/syllable and the remaining words/characters/syllables, data whose first word/character/syllable equals the search data is retrieved from the candidate data set. In short, data in the candidate data set can be divided into third and fourth parts in the same way as the second data. Thus each piece of data in the search results retrieved from the candidate data set includes an ordered third portion and fourth portion, and the third portion includes the search data. When only one search result is obtained from the candidate data set, it is directly determined as the second data; when several are obtained, one is selected according to a second policy and determined as the second data. The second policy is set according to the actual situation: the second data may be chosen arbitrarily from the search results, or selected based on some rule under specific conditions.
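Putting these steps together, a hedged sketch of the candidate-set search and one possible second policy (random choice here; any selection rule could be substituted) might read:

```python
import random

CANDIDATES = ["竹报平安", "烛照数计", "逐鹿中原", "安然无恙"]  # toy candidate data set

def search_candidates(search_data, candidates):
    """Return each candidate whose third part (first character) is a search datum."""
    return [c for c in candidates if c[0] in search_data]

def apply_second_policy(results):
    """Second policy: pick one result (random here; difficulty- or
    entertainment-based rules could stand in instead)."""
    return random.choice(results) if results else None

results = search_candidates(["竹", "烛", "逐"], CANDIDATES)
print(results)                       # ['竹报平安', '烛照数计', '逐鹿中原']
print(apply_second_policy(results))  # one of them, determined as the second data
```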
According to an embodiment of the present disclosure, the data obtaining device 602 further has the following functions: extracting, from the at least one piece of data, target data which does not match any data in the monitoring data set; if at least one piece of target data exists, selecting one piece of the target data as the second data, wherein the monitoring data set at least comprises the first data; and storing the second data in the monitoring data set.
In this embodiment, when the search result includes multiple pieces of data, target data that does not match any data in the monitoring data set is selected as the second data. Within one round of voice interaction, the monitoring data set includes the first data corresponding to each acquired first sound and the second data determined from the candidate data set. A round of voice interaction runs from the first time the first sound is obtained, through one or more processing passes, until an end condition is reached. A round may include zero or at least one loop, a loop being one pass in which all participants take part. In the first loop, when the first sound is obtained for the first time, the first data corresponding to it is identified and stored in the monitoring data set (no match check is needed at this point, since it is the first data obtained); then, if the next participant is still a participant user, the step of obtaining the first sound continues to be executed until the next participant is the processing device or an end condition is reached (for example, the first data duplicates data in the monitoring data set, or no first sound is obtained within a predetermined time range). If the next participant is the processing device, the step of obtaining second data is performed, i.e., second data is obtained according to the first data and stored in the monitoring data set; of course, if no suitable second data is found, the current round of voice interaction ends. After a round of voice interaction ends, the monitoring data set is emptied, so that at the start of each round the monitoring data set is empty. Within each round, the first sound may be obtained once or many times, and the first data corresponding to each obtained first sound, as well as each determined second data, is stored in the monitoring data set provided the preset voice interaction rule is satisfied. The monitoring data set is thus used to store the data obtained since it was last emptied, namely the first data corresponding to each first sound obtained in the current round and the second data determined in it. Therefore, to avoid duplication, each time a search result is obtained from the candidate data set, data that does not duplicate the data in the monitoring data set is selected as the second data, and after being selected the second data is also stored in the monitoring data set.
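The monitoring-data-set lifecycle described here can be sketched as follows; class and method names are illustrative rather than taken from the patent:

```python
class InteractionRound:
    """One round of voice interaction and its monitoring data set."""

    def __init__(self):
        self.monitoring_set = []   # empty at the start of every round

    def store(self, data):
        """Store first or second data unless it duplicates data from this round."""
        if data in self.monitoring_set:
            return False           # duplicate found: prompt or end the round
        self.monitoring_set.append(data)
        return True

    def end(self):
        """End the round and empty the monitoring data set."""
        self.monitoring_set.clear()

game = InteractionRound()
assert game.store("胸有成竹")      # first data of the round, stored directly
assert game.store("竹报平安")      # determined second data, no duplicate
assert not game.store("竹报平安")  # repetition detected
game.end()                         # the set is empty again for the next round
```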
According to an embodiment of the present disclosure, the processing device further includes a device having the following functions: obtaining a judgment result, wherein the judgment result indicates whether the first data matches data in a monitoring data set, the monitoring data set being used to store data obtained since it was last emptied; outputting the judgment result at least when the judgment result shows that the first data matches data in the monitoring data set; and storing the first data in the monitoring data set when the judgment result is that the monitoring data set is empty or that the first data matches no data in the monitoring data set.
In this embodiment, the step of obtaining the first sound may be performed multiple times within a round of voice interaction. Each time a first sound is obtained, the first data corresponding to it is identified and checked against the data in the monitoring data set. When the first data matches data in the monitoring data set, the sound corresponding to the first data has already occurred in the current round, whether as a previously obtained first sound or as second data (second data is ultimately presented as a second sound). To ensure that the first and second sounds appearing within each round are not repeated, after a first sound is obtained, whether its corresponding first data matches data in the monitoring data set is determined. When it matches, a judgment result is output; the result may be a prompt that the currently obtained first sound duplicates an earlier one, or, after repeated prompts in which the first data corresponding to the user's first sound still duplicates data in the monitoring data set, a prompt that the current round of voice interaction has ended. After each round of voice interaction ends, the monitoring data set is emptied. If the first sound is obtained for the first time in a round, the monitoring data set is empty; if it is obtained for the second time or later and duplicates no data in the monitoring data set, it likewise complies with the voice interaction rule. In both cases, the first data corresponding to the first sound is stored in the monitoring data set.
In one embodiment, besides a round of voice interaction ending because an obtained first sound repeats an earlier one, a failure to obtain second data may also end the round. Referring to the foregoing embodiment, when searching the candidate data set, if the search result is empty or the target data is empty, a prompt is output and the monitoring data set is emptied, which likewise indicates that the round of voice interaction has ended. In other words, if no suitable second data is found in the candidate data set, the current round of voice interaction ends.
For example, in an idiom-chain game, a user provides an idiom (corresponding to a first sound) by voice, and the processing device searches the idiom set for another idiom whose first character matches the last character of that idiom and presents it in voice form.
In the processing method according to the foregoing embodiments of the present disclosure, the processing device, as a participant, completes the voice interaction process together with the user. During the voice interaction, besides being one of the participants, the processing device may also act as a supervisor, checking whether the voice interactions of all participants in the current round comply with the rules. After a user speaks a first sound, the processing device obtains first data from it and judges, based on the first data, whether that sound has already appeared in the current round; if so, the user may be prompted to speak again, or, if after multiple prompts the user's first sound still does not comply with the rule, the current round of voice interaction ends. Since the processing device is also a participant, when it is the device's turn to output a voice, it obtains second data from the first data corresponding to the first sound spoken by the previous participant and outputs the second sound corresponding to the second data.
It will be appreciated that in a round of voice interaction there may be one or more participant users. With multiple participant users, the processing device needs to determine the order between itself and the other participants: each other participant's turn is handled with reference to the step of obtaining the first sound, and the device's own turn with reference to the steps of obtaining the second data and outputting the second sound. That is to say, in the processing method of the disclosed embodiments, when there are multiple participant users, the step of obtaining the first sound may be performed several times in succession. For example, suppose there are m - 1 participant users in total (m being an integer greater than 2) and the processing device, as one participant, occupies the (m - x)-th position in the interaction order, so that m - x - 1 participant users precede it and x participant users follow it. Assuming no participant errs, the step of obtaining the first sound is first performed m - x - 1 times, then the steps of obtaining the second data and outputting the second sound are performed once; thereafter the step of obtaining the first sound is performed m - 1 times in succession (the x participants after the device followed by the m - x - 1 participants before it in the next loop), the steps of obtaining the second data and outputting the second sound are performed once more, and so on, as sketched below.
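The turn schedule just described can be made concrete with a small sketch; the parameters m and x follow the text, and the schedule assumes no participant errs:

```python
def schedule(m, x, loops):
    """Yield the action at each turn: m participants in total, with the
    processing device in the (m - x)-th position."""
    device_position = m - x
    for _ in range(loops):
        for position in range(1, m + 1):
            if position == device_position:
                yield "obtain second data and output second sound"
            else:
                yield "obtain first sound"

# m = 4, x = 2: one first sound, then the device's turn; thereafter m - 1 = 3
# consecutive first sounds between device turns.
for turn, action in enumerate(schedule(m=4, x=2, loops=2), start=1):
    print(turn, action)
```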
Further details are described below.
In an embodiment of the present disclosure, the processing device further determines whether the first sound matches a preset sound corresponding to the current order in a sound order set, the sound order set including at least two preset sounds in a cyclic order relation; when the first sound matches the preset sound corresponding to the current order, the next preset sound in the sound order set is taken as the preset sound corresponding to the current order. When there are multiple participant users, this embodiment can thus further recognize from the speech whether the current participant is speaking in the correct order. The sound order set may be set manually before the voice interaction begins, for example by having each participant user utter a sentence for the processing device to recognize and recording the voice characteristics of each participant user in order as the sound order set; it may also be set by self-learning from the beginning of the voice interaction. Self-learning means that during the voice interaction, while obtaining and processing the first sound, the processing device stores each participant user's voice characteristics in the sound order set in sequence until all participant users have taken part once. It is understood that the processing device's own position in the order can be set manually or randomly, according to actual requirements. After each participant user's first sound is obtained, it is matched against the voice characteristics corresponding to the current order in the sound order set; if they do not match, the user can be informed of the order error by an output prompt, and if they do match, the next sound in the sound order set is taken as the sound corresponding to the current order and the first sound of the next participant user is received. In this embodiment, the processing device not only takes part in the voice interaction as a participant but also supervises whether the participant users comply with the voice interaction rules, thereby facilitating voice interaction among users.
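A much-simplified sketch of the self-learning sound order set follows; a voice "feature" is reduced to a speaker label here, whereas real feature extraction and matching would be far richer:

```python
class SoundOrderSet:
    """Cyclic speaker order, learned on the fly during the first loop."""

    def __init__(self):
        self.order = []       # voice features in participation order
        self.current = 0      # index of the preset sound for the current turn
        self.learning = True  # self-learning until everyone has spoken once

    def check(self, voice_feature):
        if self.learning:
            if voice_feature not in self.order:
                self.order.append(voice_feature)  # record this participant
                return True
            self.learning = False                 # first repeat ends learning
        if voice_feature != self.order[self.current]:
            return False                          # order error: output a prompt
        self.current = (self.current + 1) % len(self.order)  # advance cyclically
        return True

order = SoundOrderSet()
for speaker in ["A", "B", "C", "A", "B"]:
    print(speaker, order.check(speaker))  # all True: the cyclic order is kept
print(order.check("A"))                   # False: it is C's turn, not A's
```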
In an embodiment of the present disclosure, the processing device further monitors whether a participant user's first sound is obtained within a preset time; if the preset time is exceeded, the user is prompted about the timeout, and the processing device may even output the result of the voice interaction, for example that the current user has lost the current round.
In an embodiment of the present disclosure, the processing device obtains the first sound of a participant user within a predetermined time. If the participant user is not the first participant of the current round of voice interaction, the first data corresponding to the first sound also needs to be compared with the previous data corresponding to the previous sound of the previous participant user. Following the foregoing embodiments, the previous data is divided into a first part and a second part according to the rule described above, and the first data of the current participant user is divided into a third part and a fourth part according to the rule described above; whether the current participant's first sound complies with the voice interaction rule is determined by comparing whether the second part and the third part match. If not, the user may be prompted to re-input the first sound; if after more than a preset threshold number of prompts the first data still does not match the previous data, the result may be output, namely that the current participant user has lost, and the current round of voice interaction ends. For example, in an idiom-chain voice interaction, the first data corresponding to the current participant user's first sound is idiom A, which is divided into a third part comprising its first character and a fourth part comprising its remaining three characters; the previous data corresponding to the previous sound input by the previous participant is idiom B, which is divided into a first part comprising its first three characters and a second part comprising its last character; and it is determined whether the third-part data of idiom A and the second-part data of idiom B are homophonic characters.
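A sketch of this cross-participant check with a bounded number of re-input prompts; the pinyin helper is again a toy stand-in, and all names are hypothetical:

```python
PINYIN = {"竹": "zhu", "烛": "zhu", "安": "an"}  # toy pronunciation table

def parts_match(second_part, third_part):
    """Homophone rule: the two characters must share a pronunciation."""
    py = PINYIN.get(second_part)
    return py is not None and py == PINYIN.get(third_part)

def check_current_user(previous_data, attempts, max_prompts=3):
    """Compare the previous data's second part with each attempted first
    data's third part; after max_prompts failures the current user loses."""
    second_part = previous_data[-1]  # last character of idiom B
    for count, first_data in enumerate(attempts, start=1):
        if parts_match(second_part, first_data[0]):
            return f"accepted: {first_data}"
        if count >= max_prompts:
            return "current participant user loses; round ends"
    return "no further input; round ends"

print(check_current_user("胸有成竹", ["安然无恙", "烛照数计"]))  # 烛 ~ 竹: accepted
```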
In other embodiments of the present disclosure, the processing device does not act as a participant at all, but only as a supervisor that checks whether the sounds obtained from users during the voice interaction comply with the voice interaction rules, as described in detail below by way of example.
According to an embodiment of the present disclosure, the data obtaining device 602 further has the following functions: obtaining a second sound; and obtaining the second data according to the second sound. The processing device further comprises a device with the following functions: obtaining a judgment result, wherein the judgment result indicates whether the second data matches data in the monitoring data set, the monitoring data set comprising at least the first data; storing the second data in the monitoring data set when the judgment result is that the second data matches no data in the monitoring data set; and outputting the judgment result at least when the judgment result shows that the second data matches data in the monitoring data set.
In this embodiment, after the first sound is obtained, the second data is obtained according to the second sound, and it is determined whether the second data matches data in the monitoring data set. When the second data matches no data in the monitoring data set, the second data is stored in the monitoring data set; when the second data matches data in the monitoring data set, the judgment result is output. That is, this embodiment determines whether the obtained second data corresponding to the second sound complies with the voice interaction rule, namely whether it duplicates existing data in the monitoring data set: if it does, the judgment result is output, and if it does not, the second data is stored in the monitoring data set. Here the processing device acts as a supervisor checking whether the second sound obtained from a user complies with the voice interaction rule, providing convenience for voice interaction among users. For the definition and details of the monitoring data set, reference may be made to the foregoing embodiments, which are not repeated here.
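In supervisor-only mode the core check reduces to the following sketch (names illustrative):

```python
def supervise_second_data(second_data, monitoring_set):
    """Judge the second data against the monitoring data set and store it
    when it complies with the voice interaction rule."""
    if second_data in monitoring_set:
        return "judgment result: data already used in this round"  # output it
    monitoring_set.append(second_data)
    return "judgment result: accepted and stored"

history = ["胸有成竹"]                             # the first data is already stored
print(supervise_second_data("竹报平安", history))  # accepted
print(supervise_second_data("竹报平安", history))  # repeated: result is output
```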
In this embodiment, if the second sound complies with the voice interaction rule, the second sound of the next participant user can be obtained and judged in the same way. The method of this embodiment requires at least two participants in the voice interaction; the sound obtained from the previous participant is treated as the first sound and, having been found to comply with the voice interaction rule, its corresponding first data is already stored in the monitoring data set, so only the second sound obtained from the current participant needs to be judged against the voice interaction rule. In practice, therefore, the sound obtained from the current participant is processed as the second sound, while the first sound corresponds to the previous participant's sound that has already been processed and found to comply with the rule.
In an embodiment of the present disclosure, the processing device further includes a device with the following function: after the judgment result is obtained, emptying the data in the monitoring data set when the judgment result shows that the second data matches data in the monitoring data set. If the second data does not comply with the voice interaction rule, that is, when it duplicates data in the monitoring data set, the current round of the voice interaction can be ended and the monitoring data set emptied so that a new round can begin. It will of course be appreciated that in other embodiments the current participant may instead be prompted to provide the second sound again, the current round ending only after the number of prompts exceeds a predetermined threshold.
Fig. 7 schematically shows a block diagram of a processing device according to an embodiment of the present disclosure.
As shown in fig. 7, a processing device according to an embodiment of the present disclosure for use in a voice interaction process includes a processor 710 and a computer-readable storage medium 720.
In particular, processor 710 may comprise, for example, a general purpose microprocessor, an instruction set processor and/or associated chipset, and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), and/or the like. The processor 710 may also include on-board memory for caching purposes. Processor 710 may be a single processing unit or a plurality of processing units for performing different actions of the method flows described with reference to fig. 1, 3, 4, 6-7, and other embodiments of the disclosure, in accordance with various embodiments of the disclosure.
Computer-readable storage medium 720 may be, for example, any medium that can contain, store, communicate, propagate, or transport the instructions. For example, a readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of the readable storage medium include: magnetic storage devices, such as magnetic tape or Hard Disk Drives (HDDs); optical storage devices, such as compact disks (CD-ROMs); a memory, such as a Random Access Memory (RAM) or a flash memory; and/or wired/wireless communication links.
The computer-readable storage medium 720 may include a computer program 721, which computer program 721 may include code/computer-executable instructions that, when executed by the processor 710, cause the processor 710 to perform the method flows described in the embodiments of the present disclosure and any variations thereof.
The computer program 721 may be configured with, for example, computer program code comprising computer program modules. For example, in an example embodiment, the code in computer program 721 may include one or more program modules, such as module 721A, module 721B, and so on. It should be noted that the division and number of modules are not fixed; those skilled in the art may use suitable program modules or combinations of program modules according to the actual situation, and when these program modules are executed by the processor 710, the processor 710 can execute the method flows described in the embodiments of the present disclosure and any variations thereof.
In accordance with embodiments of the present disclosure, the processor 710 may use the signal transmitter 730 and the signal receiver 740 to perform the method flows described by the embodiments of the present disclosure and any variations thereof.
The above methods, apparatuses, units and/or modules according to embodiments of the present disclosure may be implemented by an electronic device with computing capabilities executing software containing computer instructions. The system may include storage devices to implement the various storage described above. The computing-capable electronic device may include, but is not limited to, a general-purpose processor, a digital signal processor, a special-purpose processor, a reconfigurable processor, and the like capable of executing computer instructions. Execution of such instructions causes the electronic device to be configured to perform the operations described above in accordance with the present disclosure. The above devices and/or modules may be implemented in one electronic device, or may be implemented in different electronic devices. Such software may be stored in a computer readable storage medium. The computer readable storage medium stores one or more programs (software modules) comprising instructions which, when executed by one or more processors in an electronic device, cause the electronic device to perform the methods of the present disclosure.
Such software may be stored in the form of volatile memory or non-volatile storage (such as storage devices like ROM), whether erasable or rewritable, or in the form of memory (e.g. RAM, memory chips, devices or integrated circuits), or on optically or magnetically readable media (such as CD, DVD, magnetic disks or tapes, etc.). It should be appreciated that the storage devices and storage media are embodiments of machine-readable storage suitable for storing one or more programs that include instructions, which when executed, implement embodiments of the present disclosure. Embodiments provide a program and a machine-readable storage device storing such a program, the program comprising code for implementing the apparatus or method of any one of the claims of the present disclosure. Further, these programs may be delivered electronically via any medium (e.g., communication signals carried via a wired connection or a wireless connection), and embodiments suitably include these programs.
Methods, apparatus, units and/or modules according to embodiments of the present disclosure may also be implemented using hardware or firmware, or in any suitable combination of software, hardware and firmware implementations, for example, Field Programmable Gate Arrays (FPGAs), Programmable Logic Arrays (PLAs), system on a chip, system on a substrate, system on a package, Application Specific Integrated Circuits (ASICs), or in any other reasonable manner for integrating or packaging circuits. The system may include a storage device to implement the storage described above. When implemented in these manners, the software, hardware, and/or firmware used is programmed or designed to perform the corresponding above-described methods, steps, and/or functions according to the present disclosure. One skilled in the art can implement one or more of these systems and modules, or one or more portions thereof, using different implementations as appropriate to the actual needs. Such implementations are within the scope of the present disclosure.
While the disclosure has been shown and described with reference to certain exemplary embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims and their equivalents. Accordingly, the scope of the present disclosure should not be limited to the above-described embodiments, but should be defined not only by the appended claims, but also by equivalents thereof.

Claims (9)

1. A method of processing, comprising:
obtaining a first sound, wherein the first sound corresponds to first data comprising an ordered first portion and a second portion;
obtaining second data, wherein the second data comprises a third part and a fourth part which are ordered, and the data of the third part is matched with the data of the second part;
wherein the second data corresponds to a second sound;
the obtaining second data comprises:
dividing the first data according to a first strategy to obtain the first part and the second part;
determining at least one search datum using the second portion of data;
searching the search data in the determined candidate data set to obtain a search result;
when the search result comprises at least one piece of data, determining the second data from the at least one piece of data according to a second strategy; each piece of data in the at least one piece of data comprises a third portion and a fourth portion which are ordered; the third portion of each of the at least one piece of data comprises one of the at least one search datum; the determining the second data from the at least one piece of data according to the second policy includes:
extracting target data which is not matched with any data in the monitoring data set from the at least one piece of data;
and if at least one piece of target data exists, selecting one piece of data in the target data as second data.
2. The method of claim 1, after the obtaining second data, the method further comprising:
and outputting the second sound.
3. The method of claim 1, the determining the second data from the at least one piece of data according to a second policy, further comprising:
and storing the second data in the monitoring data set, wherein the monitoring data set at least comprises the first data.
4. The method of claim 3, further comprising:
and if the search result is empty or the target data is empty, outputting a prompt and emptying the monitoring data set.
5. The method of claim 2, after the obtaining the first sound, the method further comprising:
obtaining a judgment result, wherein the judgment result indicates whether the first data is matched with data in a monitoring data set; the monitoring data set is used for storing data obtained after the monitoring data set is emptied for the last time;
and outputting a judgment result at least when the judgment result shows that the first data is matched with the data in the monitoring data set.
6. The method of claim 5, after obtaining the determination, further comprising:
and when the judgment result is that the monitoring data set is empty or the data in the monitoring data set is not matched, storing the first data into the monitoring data set.
7. The method of claim 1, wherein,
the obtaining second data comprises:
obtaining a second sound;
obtaining the second data according to the second sound; after the obtaining second data, the method further comprises:
obtaining a judgment result, wherein the judgment result indicates whether the second data is matched with data in the monitoring data set; the monitoring data set comprises at least the first data; when the judgment result is that the data in the monitoring data set is not matched, storing the second data into the monitoring data set; and
and outputting a judgment result at least when the judgment result shows that the second data is matched with the data in the monitoring data set.
8. The method of claim 7, wherein after said obtaining the determination result, further comprising:
and when the judgment result shows that the second data is matched with the data in the monitoring data set, emptying the data in the monitoring data set.
9. A processing device, comprising:
the device comprises a sound acquisition device, a processing device and a processing device, wherein the sound acquisition device is used for acquiring a first sound, the first sound corresponds to first data, and the first data comprises a first part and a second part which are ordered;
the data acquisition device is used for acquiring second data, the second data comprises a third part and a fourth part which are ordered, and the data of the third part is matched with the data of the second part;
wherein the second data corresponds to a second sound;
the obtaining second data comprises:
dividing the first data according to a first strategy to obtain the first part and the second part;
determining at least one search datum using the second portion of data;
searching the search data in the determined candidate data set to obtain a search result;
when the search result comprises at least one piece of data, determining the second data from the at least one piece of data according to a second strategy; each piece of data in the at least one piece of data comprises a third portion and a fourth portion which are ordered; the third portion of each of the at least one piece of data comprises one of the at least one search datum; the determining the second data from the at least one piece of data according to the second policy includes:
extracting target data which is not matched with any data in the monitoring data set from the at least one piece of data;
and if at least one piece of target data exists, selecting one piece of data in the target data as second data.
CN201710201531.0A 2017-03-29 2017-03-29 Processing method and device Active CN106951091B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710201531.0A CN106951091B (en) 2017-03-29 2017-03-29 Processing method and device

Publications (2)

Publication Number Publication Date
CN106951091A CN106951091A (en) 2017-07-14
CN106951091B true CN106951091B (en) 2020-06-23

Family

ID=59473925

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710201531.0A Active CN106951091B (en) 2017-03-29 2017-03-29 Processing method and device

Country Status (1)

Country Link
CN (1) CN106951091B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104036782A (en) * 2013-03-06 2014-09-10 联想(北京)有限公司 Noise reduction method and handheld mobile terminal
CN105474212A (en) * 2013-08-27 2016-04-06 高通股份有限公司 Method and apparatus for classifying data items based on sound tags
CN106205612A (en) * 2016-07-08 2016-12-07 北京光年无限科技有限公司 Information processing method and system towards intelligent robot

Also Published As

Publication number Publication date
CN106951091A (en) 2017-07-14

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant