CN102868740A - Method and system for controlling toy based on mobile communication terminal and internet voice interaction - Google Patents

Info

Publication number
CN102868740A
Authority
CN
China
Prior art keywords
communication terminal
speech recognition
voice
toy
recognition conversion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012103297631A
Other languages
Chinese (zh)
Inventor
吴玉胜
李新岗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN SILICON ELECTRONICS CO Ltd
Original Assignee
SHENZHEN SILICON ELECTRONICS CO Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN SILICON ELECTRONICS CO Ltd
Priority to CN2012103297631A
Publication of CN102868740A

Abstract

The invention relates to a method and a system for controlling a toy based on a mobile communication terminal and internet voice interaction. The system comprises a toy with a communication connection, a mobile communication terminal with voice input, and a network server with speech recognition and conversion. Voice input is recognized through the mobile communication terminal and the internet, and content on the network server is called over the internet. This provides large content storage, good recognition results, and timely content updates, so that the functions of the toy become more powerful while cost is greatly reduced.

Description

Toy control method and system based on a mobile communication terminal and internet voice interaction
Technical field
The present invention relates to a voice control method and system for toys, and in particular to a toy control method and system based on a mobile communication terminal and internet voice interaction.
Background art
With the development of society and advances in voice technology, voice toys are used more and more widely. Most existing voice toys carry a speech recognition chip that stores simple voice instructions and content; after speech recognition, the stored instructions and content are called to operate the toy. The prior art has the following defects: 1. Toys usually need to keep costs down, and a low-cost toy has limited storage capacity for instructions and content, so its content is sparse. 2. Each toy must itself carry a complete set of voice input, speech recognition chip, and memory module, which makes the cost high.
Summary of the invention
The technical problem solved by the present invention is: to construct a toy control method and system based on a mobile communication terminal and internet voice interaction that overcomes the technical problem of prior-art voice toys, whose limited storage capacity leads to sparse content and high cost. The invention benefits from the computing power and network communication capability of the mobile communication terminal, which far exceed those of the toy. Using a mobile communication terminal of the kind now widely available on the market, the user can rely on a speech recognition and natural language understanding system on the terminal platform, with stronger intelligent interaction ability and higher recognition accuracy, to interact with the physical toy. This gives the user an interactive experience far beyond that of traditional toy solutions.
The technical scheme of the present invention is: a toy control method based on a mobile communication terminal and internet voice interaction is provided, involving a toy with a communication connection, a mobile communication terminal with voice input and speech recognition, and a network server with speech recognition conversion. The method comprises the following steps:
Input voice: voice is input through the mobile communication terminal;
Upload voice: the mobile communication terminal connects to the internet and uploads the input voice information to the network server over that connection;
Speech recognition conversion: the mobile communication terminal and the network server recognize and convert the received voice in parallel; the speech recognition conversion result takes the form of an instruction, or of an instruction and a parameter;
Execute the recognition conversion result: the speech recognition conversion result is executed jointly by the network server, the mobile communication terminal, and the toy, or by any two of them, or by either the toy or the mobile communication terminal alone.
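The result format and the permitted executor combinations in the steps above can be illustrated with a minimal sketch. The patent does not define a data format, so the names `ConversionResult`, `ALLOWED_EXECUTORS`, and `is_valid_executor_set` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import FrozenSet, Iterable, Optional

PARTIES = ("server", "terminal", "toy")  # hypothetical labels for the three parties


@dataclass(frozen=True)
class ConversionResult:
    """A speech recognition conversion result: an instruction, optionally with a parameter."""
    instruction: str                 # e.g. "play"
    parameter: Optional[str] = None  # e.g. "Little Swallow"; None for a bare instruction


# Combinations named in the execution step: all three parties together,
# any two of them, or the toy or the terminal alone.
ALLOWED_EXECUTORS: FrozenSet[FrozenSet[str]] = frozenset(
    [frozenset(PARTIES)]
    + [frozenset(pair) for pair in combinations(PARTIES, 2)]
    + [frozenset({"toy"}), frozenset({"terminal"})]
)


def is_valid_executor_set(parties: Iterable[str]) -> bool:
    """Return True if this set of executing parties is one the method allows."""
    return frozenset(parties) in ALLOWED_EXECUTORS


result = ConversionResult("play", "Little Swallow")
```

Under this reading, the server never executes a result on its own, because the step lists only the toy and the terminal as single executors.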
A further technical scheme of the present invention is: the method also comprises building a semantic knowledge base according to the recognition scenario, the semantic knowledge base containing the semantic attributes of words. The speech recognition conversion step then also includes semantic recognition conversion, which comprises the following steps:
Word segmentation and semantic disambiguation: the speech recognition result is segmented into words and semantically disambiguated according to the semantic attributes of the words in the knowledge base;
Intent classification and parameter extraction: the result of word segmentation and semantic disambiguation is classified by intent, and parameters are extracted.
A further technical scheme of the present invention is: the speech recognition conversion results of the network server and of the mobile communication terminal each include a confidence level for the speech recognition conversion. The mobile communication terminal sets a confidence threshold for the speech recognition conversion result. When the confidence of the terminal's result is greater than or equal to this threshold, the terminal's result is taken. When the confidence of the terminal's result is less than this threshold, whichever of the server's result and the terminal's result has the higher confidence is taken.
A further technical scheme of the present invention is: the method also comprises waking up the mobile communication terminal. The terminal is woken by any one of a voice instruction, a button, or a wireless signal, and enters the voice input state.
A further technical scheme of the present invention is: the toy and the mobile communication terminal are connected by any one of an infrared communication component, a high-frequency modulation communication component, a Bluetooth communication component, a 2.4G wireless communication component, or an RFID radio-frequency communication component.
A further technical scheme of the present invention is: when the input voice information cannot be recognized or cannot be executed, voice interaction is carried out through the mobile communication terminal or the toy in order to obtain voice information that can be recognized or executed.
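The fallback interaction described above can be sketched as a short loop that keeps prompting until a usable result is obtained. This is only an illustration under assumptions: the function name, the prompt wording, the `max_turns` limit, and the callables `convert`, `can_execute`, and `prompt` are hypothetical and do not appear in the patent.

```python
from typing import Callable, Iterable, Optional, Tuple

Result = Tuple[str, Optional[str]]  # (instruction, parameter)


def interact_until_executable(
    utterances: Iterable[str],
    convert: Callable[[str], Optional[Result]],
    can_execute: Callable[[Result], bool],
    prompt: Callable[[str], None],
    max_turns: int = 3,
) -> Optional[Result]:
    """Ask again until an utterance converts to a result that can be executed."""
    for turn, text in enumerate(utterances):
        if turn >= max_turns:
            break
        result = convert(text)
        if result is None:
            # The voice information could not be recognized: ask for it again.
            prompt("Please say that again.")
            continue
        if not can_execute(result):
            # Recognized but not executable as given: ask for the missing detail.
            prompt("Please give more detail.")
            continue
        return result
    return None


# Usage with a toy lookup table standing in for recognition and conversion.
table = {
    "tell a story": ("tell_story", None),
    "tell story A": ("tell_story", "story A"),
}
prompts = []
outcome = interact_until_executable(
    ["(unclear speech)", "tell a story", "tell story A"],
    table.get,
    lambda r: r[1] is not None,
    prompts.append,
)
```

The prompts would be played through the terminal or the toy; here they are simply collected in a list.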
A further technical scheme of the present invention is: the toy also recognizes and converts the input voice, and the toy executes this speech recognition conversion result.
The technical scheme of the present invention is: a toy control system based on a mobile communication terminal and internet voice interaction is constructed, comprising a toy with a communication connection, a mobile communication terminal with voice input and speech recognition, and a network server with speech recognition conversion. The toy comprises a second wireless communication module that connects to the mobile communication terminal. The mobile communication terminal comprises a voice input unit for inputting voice, a first wireless communication module for wireless connection with the toy, a first speech conversion unit for speech recognition conversion, and a network connection module for internet connection with the network server. The network server has a third speech conversion unit that recognizes and converts the received voice information. The voice input unit inputs voice; the first speech conversion unit of the mobile communication terminal and the third speech conversion unit of the network server recognize and convert the input voice in parallel; and the speech recognition conversion result is executed jointly by the network server, the mobile communication terminal, and the toy, or by any two of them, or by either the toy or the mobile communication terminal alone.
A further technical scheme of the present invention is: the third speech conversion unit comprises a speech recognition module and a semantic recognition module. The semantic recognition module determines the semantics of the voice entered through the voice input unit from the speech recognized by the speech recognition module.
A further technical scheme of the present invention is: the speech recognition conversion results of the network server and of the mobile communication terminal each include a confidence level for the speech recognition conversion. The mobile communication terminal sets a confidence threshold for the speech recognition conversion result. When the confidence of the terminal's result is greater than or equal to this threshold, the terminal's result is taken. When the confidence of the terminal's result is less than this threshold, whichever of the server's result and the terminal's result has the higher confidence is taken.
A further technical scheme of the present invention is: the mobile communication terminal or the network server is provided with a voice interaction library for voice interaction. When the input voice information cannot be recognized or cannot be executed, voice interaction is carried out in order to obtain voice information that can be recognized or executed.
A further technical scheme of the present invention is: the toy has a second speech conversion unit for speech recognition, and the second speech conversion unit recognizes and converts the voice.
A further technical scheme of the present invention is: the mobile communication terminal also comprises a wake-up module that wakes the terminal into the voice input state. The wake-up module is triggered by any one of a voice instruction, a button, or a wireless signal.
The technical effect of the present invention is: a toy control method and system based on a mobile communication terminal and internet voice interaction is provided, involving a toy with a communication connection, a mobile communication terminal with voice input, and a network server with speech recognition conversion. Voice input is recognized through the mobile communication terminal and the internet, and content on the network server is called over the internet, which provides large content storage, good recognition results, and timely content updates. The invention benefits from the computing power and network communication capability of the mobile communication terminal, which far exceed those of the toy. Using a mobile communication terminal of the kind now widely available on the market, the user can rely on a speech recognition and natural language understanding system on the terminal platform, with stronger intelligent interaction ability and higher recognition accuracy, to interact with the physical toy. This gives the user an interactive experience far beyond that of traditional toy solutions, makes the functions of the toy more powerful, and at the same time greatly reduces cost.
Description of drawings
Fig. 1 is a flow chart of the present invention.
Fig. 2 is a structural schematic diagram of the present invention.
Embodiment
The technical solution of the present invention is further described below with reference to specific embodiments.
As shown in Fig. 1 and Fig. 2, the specific embodiment of the present invention is: a toy control method based on a mobile communication terminal and internet voice interaction is constructed, involving a toy 2 with a communication connection, a mobile communication terminal 1 with voice input, and a network server 3 with speech recognition conversion. The method comprises the following steps:
Step 100: input voice, that is, voice is input through the mobile communication terminal 1;
Step 200: upload voice, that is, the mobile communication terminal 1 connects to the internet and uploads the input voice information to the network server 3 over that connection;
Step 300: speech recognition conversion, that is, the mobile communication terminal 1 and the network server 3 recognize and convert the received voice in parallel; the speech recognition conversion result takes the form of an instruction, or of an instruction and a parameter;
Step 400: execute the conversion result, that is, the speech recognition conversion result is executed jointly by the network server 3, the mobile communication terminal 1, and the toy 2, or by any two of them, or by either the toy 2 or the mobile communication terminal 1 alone.
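Steps 100 to 400 can be sketched as a small pipeline in which the terminal-side and server-side recognizers run in parallel. This is a minimal sketch, assuming that recognition on each side is available as a plain callable; `run_pipeline`, `recognize_on_terminal`, `recognize_on_server`, and `execute` are hypothetical names, and the upload of step 200 is represented only by handing the audio to the server callable.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


def run_pipeline(
    audio: bytes,
    recognize_on_terminal: Callable[[bytes], str],
    recognize_on_server: Callable[[bytes], str],
    execute: Callable[[Dict[str, str]], None],
) -> Dict[str, str]:
    """Run steps 100-400 once for a piece of captured audio."""
    # Step 100: `audio` stands for voice already captured by the terminal.
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Step 200: passing the audio to the server callable stands in for the upload.
        server_future = pool.submit(recognize_on_server, audio)
        # Step 300: both sides recognize and convert the same audio in parallel.
        terminal_future = pool.submit(recognize_on_terminal, audio)
        results = {
            "terminal": terminal_future.result(),
            "server": server_future.result(),
        }
    # Step 400: hand both conversion results to the execution stage.
    execute(results)
    return results


# Usage with stub recognizers.
executed = []
results = run_pipeline(
    b"\x00\x01",
    lambda a: "play",
    lambda a: "play Little Swallow",
    executed.append,
)
```

How the execution stage chooses between the two results is deliberately left out of this sketch.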
As shown in Fig. 1 and Fig. 2, the specific implementation process of the present invention is as follows. Voice is input through the mobile communication terminal 1, which uploads the received voice to the network server 3. The mobile communication terminal 1 and the network server 3 recognize the received voice in parallel and then convert the recognition result; the speech recognition conversion result takes the form of an instruction, or of an instruction and a parameter.
When the speech recognition conversion result is executed by the toy 2 alone: the network server 3 sends the converted instruction, or instruction and parameter, to the mobile communication terminal 1; the mobile communication terminal 1 establishes a wireless connection with the toy 2 and sends the speech recognition result to the toy 2, which executes it. If the speech recognition conversion result contains a control instruction for the toy 2, and the toy stores this control instruction together with the content matching the voice instruction, the mobile communication terminal 1 sends the speech recognition result to the toy 2, and the toy 2 executes the instruction and calls the content associated with it.
When the speech recognition conversion result is executed jointly by the network server 3 and the toy 2: if the network server 3 stores the content or interaction information corresponding to the voice instruction, the network server 3 calls that content or interaction information according to the conversion result and sends it through the mobile communication terminal 1 to the toy 2, which executes it.
When the speech recognition conversion result is executed jointly by the mobile communication terminal 1 and the toy 2: if the mobile communication terminal 1 stores the content or interaction information corresponding to the voice instruction, the network server 3 sends the speech recognition conversion result to the mobile communication terminal 1. The mobile communication terminal 1 executes the result, that is, calls the corresponding content or interaction information, and sends it to the toy 2, which carries it out.
When the speech recognition conversion result is executed by the mobile communication terminal 1 alone: if the mobile communication terminal 1 holds the corresponding interaction content, the terminal executes the speech recognition conversion result and plays the content back itself, completing the execution on its own.
Instructions and content for controlling the toy include, for example, playing music, telling a story, taking off, and rotating. The speech recognition conversion result is an instruction, or an instruction and a parameter, and that instruction, or instruction and parameter, is executed. For example, for the command to play "Little Swallow", "play" is the instruction and the audio content "Little Swallow" is the parameter.
When the speech recognition conversion result is executed jointly by the network server 3, the mobile communication terminal 1, and the toy 2: for example, for the voice instruction "What is the weather like in Beijing today?", the network server 3 queries today's weather in Beijing and sends it to the mobile communication terminal 1, which plays it and sends the audio signal to the toy 2 for output. This completes the joint execution by the network server 3, the mobile communication terminal 1, and the toy 2.
In this embodiment, the content comprises one or more of audio content and text content. In this embodiment, the method also comprises a wake-up step that wakes the mobile communication terminal 1 into the state of receiving voice input; the wake-up is triggered by a voice instruction or by a button. The mobile communication terminal 1 also recognizes and converts the input voice: the mobile communication terminal 1 and the network server 3 recognize and convert the input voice in parallel, and the speech recognition conversion result obtained first is used as the result. Because the mobile communication terminal 1 has a relatively large storage capacity, its content library can be larger, and more voice instructions and matching content can be stored on the mobile communication terminal 1.
In this embodiment, the network server 3 and the mobile communication terminal 1 recognize and convert the voice information in parallel. The speech recognition conversion results of the mobile communication terminal and of the network server each include a confidence level for the speech recognition conversion. Confidence is also called credibility. It refers to the degree to which a particular individual believes a particular proposition to be true; in other words, probability is taken as a measure of how reasonable an individual's belief is. Under this degree-of-belief interpretation of probability, an event in itself has no probability; a probability is assigned to it on the basis of the evidence for belief held in the mind of the person assigning it. The confidence level is the probability that the population parameter value falls within a certain interval around the sample statistic, and the confidence interval is the error range between the sample statistic and the population parameter value at a given confidence level. The wider the confidence interval, the higher the confidence level. The confidence of a speech recognition conversion is therefore the degree of belief that the speech recognition conversion result is correct. When the network server 3 and the mobile communication terminal 1 recognize and convert the voice information in parallel, the mobile communication terminal 1 sets a confidence threshold for the speech recognition conversion result. When the confidence of the result from the mobile communication terminal 1 is greater than or equal to this threshold, that result is taken. When the confidence of the result from the mobile communication terminal 1 is less than this threshold, whichever of the results from the mobile communication terminal 1 and the network server 3 has the higher confidence is taken.
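The selection rule can be written down in a few lines. The sketch below reads the rule as keyed on the terminal's confidence, as in the summary of the invention: the terminal's result is kept when it meets the threshold, and otherwise the result with the higher confidence wins. The names `Scored` and `select_result`, the example confidence values, and the choice to keep the terminal's result on an exact tie are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scored:
    """A speech recognition conversion result with its confidence (hypothetical type)."""
    source: str        # "terminal" or "server"
    command: str       # the converted instruction, possibly with a parameter
    confidence: float  # assumed to lie in [0, 1]


def select_result(terminal: Scored, server: Scored, threshold: float) -> Scored:
    """Pick the result to execute according to the confidence threshold rule."""
    if terminal.confidence >= threshold:
        # The terminal's own result is trusted enough: use it directly.
        return terminal
    # Otherwise take whichever result has the higher confidence.
    # Assumption: on an exact tie the terminal's result is kept.
    return server if server.confidence > terminal.confidence else terminal


local = Scored("terminal", "play", 0.55)
remote = Scored("server", "play Little Swallow", 0.90)
chosen = select_result(local, remote, threshold=0.80)
```

With a threshold of 0.80 the terminal's 0.55 falls short, so the server's result is chosen; lowering the threshold to 0.50 would keep the terminal's result without comparing the two.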
In this embodiment, the mobile communication terminal 1 may be a mobile phone, a tablet computer, or a mobile communication entertainment device. The mobile communication terminal 1 and the network server 3 are each provided with a voice interaction library for voice interaction. When the input voice information cannot be recognized or cannot be executed, voice interaction is carried out through the mobile communication terminal 1 or the toy 2 in order to obtain a voice instruction that can be recognized or executed. If the speech recognition result includes interaction information, the corresponding interaction information is called from the voice interaction library and sent through the mobile communication terminal 1 to the toy 2, which plays it to realize the voice interaction. As an example of interaction information: for the spoken query "Are there any songs by Wang Fei?", the network server 3 performs the query and obtains the result "yes" or "no"; this result is the corresponding interaction information.
In addition, when the input voice information cannot be recognized or cannot be executed, voice is input through the mobile communication terminal 1 and voice interaction is carried out in order to obtain voice information that the network server 3, the mobile communication terminal 1, or the toy 2 can execute. For example, when the voice information "start" is input but cannot be recognized, perhaps because the pronunciation is unclear or differs too much from standard speech, the voice interaction library can be called to prompt the user to input the voice again. As another example, when "start telling a story" is input, the network server 3, the mobile communication terminal 1, or the toy 2 may be unable to convert this voice instruction into a control instruction, and supplementary voice information is needed. In that case a prompt such as "Which story would you like to hear?" is called from the interaction library. This kind of voice prompt completes the missing voice instruction information, so that the toy can be controlled with natural speech.
As shown in Fig. 1, the preferred implementation of the present invention is: the method also comprises building a semantic knowledge base according to the recognition scenario, the semantic knowledge base containing the semantic attributes of words. Building the semantic knowledge base is the basic precondition for semantic recognition: a knowledge base entry is built for each word and its semantic attribute is defined. For example, the knowledge base entry for "Liu Dehua" includes: male, Hong Kong native, singer, actor; its semantic attribute is "entertainment figure". "Rain" is a weather condition associated with weather forecasts; its semantic attribute is "weather". The speech conversion step also includes semantic conversion based on the speech conversion result, which comprises:
Step 10: word segmentation and semantic disambiguation, that is, the speech recognition result is segmented into words and semantically disambiguated according to the semantic attributes of the words in the knowledge base. In detail, the speech recognition result is segmented or disambiguated using the semantic attributes of words in the knowledge base. For example, the speech recognition result "Will it rain in Beijing tomorrow?" is segmented according to the semantic attributes of the knowledge base words into "tomorrow", "Beijing", "will", "rain", and the sentence-final question particle. "Tomorrow" has the time attribute, "Beijing" has the place attribute, "will" is a verb, "rain" has the weather attribute, and the final particle marks a question. In some cases disambiguation is needed. For example, "songs by Liu Dehua" may be misrecognized as "clear must be sliding", but with the knowledge base definition of "Liu Dehua" the analysis concludes that "Liu Dehua" was meant. This is disambiguation according to the semantic attributes of the knowledge base words.
Step 20: intent classification and parameter extraction, that is, the result of word segmentation and semantic disambiguation is classified by intent, and parameters are extracted. For example, for the speech recognition result "Will it rain in Beijing tomorrow?", intent classification based on the result of word segmentation and semantic disambiguation yields the intent class "query weather", and the extracted parameters are: place is Beijing, time is tomorrow. This completes the semantic conversion of "Will it rain in Beijing tomorrow?".
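Steps 10 and 20 can be illustrated with a minimal sketch built on a tiny knowledge base. The patent does not name a segmentation algorithm; the sketch uses forward maximum matching on an English rendering of the example sentence, which is an assumption, and it leaves out the disambiguation of misrecognized words. The lexicon contents, function names, and the single "query weather" rule are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical knowledge base: word -> semantic attribute.
LEXICON: Dict[str, str] = {
    "tomorrow": "time",
    "beijing": "place",
    "will": "verb",
    "rain": "weather",
    "liu dehua": "entertainment figure",
}


def segment(text: str, lexicon: Dict[str, str]) -> List[Tuple[str, str]]:
    """Step 10 (segmentation only): forward maximum matching against the lexicon."""
    text = text.lower().strip("?!. ")
    max_len = max(len(word) for word in lexicon)
    tokens: List[Tuple[str, str]] = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        # Try the longest candidate first; a match must end on a word boundary.
        for j in range(min(len(text), i + max_len), i, -1):
            at_boundary = j == len(text) or text[j].isspace()
            if at_boundary and text[i:j] in lexicon:
                tokens.append((text[i:j], lexicon[text[i:j]]))
                i = j
                break
        else:
            # Not in the knowledge base: emit the whole word as "unknown".
            j = text.find(" ", i)
            j = len(text) if j == -1 else j
            tokens.append((text[i:j], "unknown"))
            i = j
    return tokens


def classify(tokens: List[Tuple[str, str]]) -> Tuple[str, Dict[str, Optional[str]]]:
    """Step 20: classify the intent and extract its parameters."""
    by_attribute = {attribute: word for word, attribute in tokens}
    if "weather" in by_attribute:
        return "query weather", {
            "place": by_attribute.get("place"),
            "time": by_attribute.get("time"),
        }
    return "unknown", {}


tokens = segment("Will it rain in Beijing tomorrow?", LEXICON)
intent, params = classify(tokens)
```

For this sentence the sketch yields the intent "query weather" with place "beijing" and time "tomorrow", matching the worked example in step 20.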
The detailed process is as follows. Suppose the input voice is "Is it a lovely day?". Speech recognition is carried out first and outputs the recognition result "Is it a lovely day?". Semantic judgment is then carried out on the speech recognition result, and the judged meaning is: report today's local weather. As another example, suppose the voice input is "I want to listen to Wang Fei's music". Semantic recognition and analysis finally determine that the user's intent is "play songs" and the parameter is "Wang Fei"; according to this analysis result, the song playback function is called and Wang Fei's songs are played directly. Because semantic recognition is used, the user does not need to remember fixed voice control commands and can interact with the toy in whatever wording comes most naturally. For the same intent, the user could also say "Please help me find Wang Fei's songs", "Is Wang Fei's latest album available?", or "Wang Fei perverse". In other words, the user can freely express commands and intentions, and the powerful speech recognition and semantic understanding engine on the mobile terminal can identify the user's real intent very well: play Wang Fei's songs, or play a particular song by Wang Fei. This makes the interaction between the intelligent toy and the user freer and more fun without increasing the direct hardware cost of the toy itself, so that toy manufacturers can achieve high-performance human-machine interaction at lower cost.
As shown in Fig. 1 and Fig. 2, the preferred implementation of the present invention is: in the speech recognition conversion step, the toy 2 also converts the input voice. The toy 2 itself has a speech recognition conversion function and is provided with an instruction and content library. For simple voice input, the speech recognition conversion module of the toy 2 recognizes and converts the input voice in parallel with the mobile communication terminal 1 and the network server 3.
As shown in Fig. 1 and Fig. 2, the preferred implementation of the present invention is: the toy 2 and the mobile communication terminal 1 are connected by any one of an infrared communication component, a high-frequency modulation communication component, a Bluetooth communication component, a 2.4G wireless communication component, or an RFID radio-frequency communication component. A wireless communication receiver is provided on the toy. In the technical scheme of this patent, the second wireless communication module 21 on the toy 2 is one or more of an infrared signal receiver, a Bluetooth communication component, an RFID radio-frequency communication component, and a 2.4G wireless communication component. The first wireless communication module 11 of the mobile communication terminal 1 is any one or more of an infrared signal transmitter, a Bluetooth communication component, an RFID radio-frequency communication component, and a 2.4G wireless communication component. The mobile communication terminal 1 sends the converted instruction, or instruction and parameter, to the toy 2 by wireless communication signal, and the toy 2 executes that instruction, or instruction and parameter.
As shown in Fig. 2, the specific embodiment of the present invention is: a toy voice control system based on the mobile communication terminal 1 and the internet is constructed, comprising a toy 2 with a communication connection, a mobile communication terminal 1 with voice input and speech recognition, and a network server 3 with speech recognition conversion. The toy 2 comprises a second wireless communication module 21 that connects to the mobile communication terminal 1. The mobile communication terminal 1 comprises a voice input unit 15 for inputting voice, a first wireless communication module 11 for wireless connection with the toy 2, a first speech conversion unit 13 for speech recognition conversion, and a network connection module 12 for internet connection with the network server 3. The network server 3 has a third speech conversion unit 31 that recognizes and converts the received voice information. The voice input unit 15 inputs voice; the first speech conversion unit 13 of the mobile communication terminal 1 and the third speech conversion unit 31 of the network server 3 recognize and convert the input voice in parallel; and the speech recognition conversion result is executed jointly by the network server 3, the mobile communication terminal 1, and the toy 2, or by any two of them, or by either the toy 2 or the mobile communication terminal 1 alone.
As shown in Fig. 2, the specific implementation process of the present invention is as follows. Voice is input through the mobile communication terminal 1. The network connection module 12 of the mobile communication terminal 1 connects to the internet and uploads the received voice to the network server 3 over the internet. The first speech conversion unit 13 of the mobile communication terminal 1 and the third speech conversion unit 31 of the network server 3 recognize the received voice in parallel and then convert the recognition result; the speech recognition conversion result takes the form of an instruction, or of an instruction and a parameter.
When the speech recognition conversion result is executed by the toy 2 alone: the network server 3 sends the converted instruction, or instruction and parameter, to the mobile communication terminal 1; the first wireless communication module 11 of the mobile communication terminal 1 establishes a wireless connection with the second wireless communication module 21 of the toy 2; the mobile communication terminal 1 then sends the speech recognition result to the toy 2, which executes it. If the speech recognition conversion result contains a control instruction for the toy 2, and the toy stores this control instruction together with the content matching the voice instruction, the mobile communication terminal 1 sends the speech recognition result to the toy 2, and the toy 2 executes the instruction and calls the content associated with it.
When the speech recognition conversion result is executed jointly by the network server 3 and the toy 2: if the network server 3 stores the content or interaction information corresponding to the voice instruction, the network server 3 calls that content or interaction information according to the conversion result and sends it through the mobile communication terminal 1 to the toy 2, which executes it.
When the speech recognition conversion result is executed jointly by the mobile communication terminal 1 and the toy 2: if the mobile communication terminal 1 stores the content or interaction information corresponding to the voice instruction, the network server 3 sends the speech recognition conversion result to the mobile communication terminal 1. The mobile communication terminal 1 executes the result, that is, calls the corresponding content or interaction information, and sends it to the toy 2, which carries it out.
When the speech recognition conversion result is executed by the mobile communication terminal 1 alone: if the mobile communication terminal 1 holds the corresponding interaction content, the terminal executes the speech recognition conversion result and plays the content back itself, completing the execution on its own.
Instructions and content for controlling the toy include, for example, playing music, telling a story, taking off, and rotating. The speech recognition conversion result is an instruction, or an instruction and a parameter, and that instruction, or instruction and parameter, is executed. For example, for the command to play "Little Swallow", "play" is the instruction and the audio content "Little Swallow" is the parameter.
When the speech recognition conversion result is executed jointly by the network server 3, the mobile communication terminal 1, and the toy 2: for example, for the voice instruction "What is the weather like in Beijing today?", the network server 3 queries today's weather in Beijing and sends it to the mobile communication terminal 1, which plays it and sends the audio signal to the toy 2 for output. This completes the joint execution by the network server 3, the mobile communication terminal 1, and the toy 2.
In this embodiment, the content comprises one or more of audio content and text content. In this embodiment, the mobile communication terminal 1 also comprises a wake-up module 14 that wakes the mobile communication terminal 1 into the state of receiving voice input; the wake-up module 14 is triggered by a voice instruction or by a button. The mobile communication terminal 1 also recognizes and converts the input voice: the mobile communication terminal 1 and the network server 3 recognize and convert the input voice in parallel, and the speech recognition conversion result obtained first is used as the result. Because the mobile communication terminal 1 has a relatively large storage capacity, its content library can be larger, and more voice instructions and matching content can be stored on the mobile communication terminal 1.
In specific embodiments, the network server 3 and the communication terminal 1 recognize and convert the voice information in parallel. The speech recognition conversion results of both the communication terminal and the network server include a confidence level for the speech recognition conversion. Confidence level is also called degree of belief. It refers to the degree to which a particular individual believes that a particular proposition is true; that is, probability is a measure of the rationality of an individual's belief. Under this interpretation of probability, an event has no probability of its own; an event is assigned a probability because of the evidence for belief held in the mind of the person assigning it. In statistics, the confidence level is the probability that the population parameter falls within a given interval around the sample statistic, and the confidence interval is the error range between the sample statistic and the population parameter at a given confidence level. The wider the confidence interval, the higher the confidence level. The confidence level of a speech recognition conversion is thus the degree of trust in the correctness of the speech recognition conversion result. When the network server 3 and the communication terminal 1 recognize and convert the voice information in parallel, the communication terminal 1 sets a confidence threshold for the speech recognition conversion result. When the confidence level of the speech recognition conversion result of the communication terminal 1 is greater than or equal to this threshold, that result is taken. If the confidence level of the speech recognition conversion result of the communication terminal 1 is less than this threshold, the result with the higher confidence level is taken from among the speech recognition conversion result of the communication terminal 1 and that of the network server 3.
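By way of illustration only, the confidence-based selection between the two parallel results, following the rule as stated in claims 3 and 9, might be sketched as below. The names `Recognition` and `select_result`, the representation of confidence as a number between 0 and 1, and the choice to prefer the terminal on a tie are assumptions made for this sketch and are not specified in the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Recognition:
    """One speech recognition conversion result with its confidence level."""
    source: str        # "terminal" or "server" (hypothetical labels)
    text: str          # converted instruction, or instruction and parameter
    confidence: float  # assumed to lie in [0.0, 1.0]


def select_result(terminal: Recognition, server: Recognition,
                  threshold: float) -> Recognition:
    """Pick the result to execute.

    The terminal's result is taken when its confidence reaches the threshold
    set on the terminal; otherwise the result with the higher confidence of
    the two is taken. On a tie, max() returns its first argument, so the
    terminal's result is kept (an assumption of this sketch).
    """
    if terminal.confidence >= threshold:
        return terminal
    return max(terminal, server, key=lambda r: r.confidence)
```

Under these assumptions, a terminal result with confidence 0.9 against a threshold of 0.8 is used directly, whereas a terminal result with confidence 0.5 gives way to a server result with confidence 0.7.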
In specific embodiments, the communication terminal 1 comprises a mobile phone, a mobile tablet computer, or a mobile communication entertainment device. The communication terminal 1 and the network server 3 are each provided with a voice interaction library for carrying out voice interaction. When voice information that cannot be recognized or cannot be executed is input, voice interaction is carried out to obtain voice information that the network server or the communication terminal can execute. If the speech recognition result includes interaction information, the corresponding interaction information in the voice interaction library is called and sent to the toy 2 through the communication terminal 1, and the toy 2 plays this interaction information to realize voice interaction. As an example of interaction information, if the user asks by voice "Are there any songs by Wang Fei?", the network server 3 queries and obtains the result "yes" or "no"; this query result, "yes" or "no", is the corresponding interaction information. In addition, when voice information that cannot be recognized or cannot be executed is input, voice is input through the communication terminal 1 to carry out voice interaction, so as to obtain voice information that the network server 3, the communication terminal 1, or the toy 2 can execute. For example, when the voice information "start" is input but cannot be recognized, perhaps because the pronunciation is unclear or differs too much from standard pronunciation, the voice interaction information library can be called to prompt the user to input the voice again. As another example, when "start the story now" is input, the network server 3, the communication terminal 1, or the toy 2 may be unable to convert this voice instruction into a control command, and supplementary voice information is needed; the interaction information library is then called to ask "Do you want to listen to a story?". Voice prompts of this kind complete the voice instruction information, so that the toy can be controlled with natural speech.
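By way of illustration only, the fallback to voice interaction when an input cannot be recognized or cannot be executed might be sketched as below. The function `respond`, the dictionaries `COMMANDS` and `PROMPTS`, the command code `"START"`, and the use of `None` to represent a failed recognition are assumptions made for this sketch; the prompt wording is taken from the examples in the description.

```python
from typing import Dict, Optional, Tuple

# Hypothetical table of utterances that can be converted to control commands.
COMMANDS: Dict[str, str] = {"start": "START"}

# Hypothetical voice interaction library with the two kinds of prompt
# described in the embodiment.
PROMPTS: Dict[str, str] = {
    "repeat": "Please input the voice again.",
    "supplement": "Do you want to listen to a story?",
}


def respond(recognized: Optional[str],
            commands: Dict[str, str] = COMMANDS,
            prompts: Dict[str, str] = PROMPTS) -> Tuple[str, str]:
    """Return ("execute", command) or ("prompt", message).

    recognized is None when speech recognition failed, for example because
    the pronunciation was unclear. A recognized utterance that cannot be
    converted to a control command triggers a request for more information.
    """
    if recognized is None:
        return ("prompt", prompts["repeat"])
    command = commands.get(recognized.strip().lower())
    if command is None:
        return ("prompt", prompts["supplement"])
    return ("execute", command)
```

In this sketch the prompt text would be sent to the toy for playback, and the user's next utterance would be passed through the same function.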
As shown in Figure 2, in a preferred embodiment of the invention, the speech conversion unit 32 comprises a speech recognition module 321 and further comprises a semantic recognition module 322. The semantic recognition module 322 determines the meaning of the voice input through the voice input unit 15 from the speech recognized by the speech recognition module 321. For example, the voice input unit 15 receives the voice input "It's lovely day?". Speech recognition is carried out first and outputs the recognition result "It's lovely day?"; the semantic recognition module 322 then makes a semantic judgment on the recognition result and determines the meaning to be: broadcast today's local weather. As another example, the voice input is "I want to listen to the music of Wang Fei". Semantic recognition and analysis by the semantic recognition module 322 yields the user's intention "play a song" with the parameter "Wang Fei"; according to this analysis result, the song playback function is called and a song by Wang Fei is played directly. Because semantic recognition is used, the user does not need to memorize fixed voice control commands and can interact with the toy in whatever phrasing comes most naturally. For the intention above, the user could equally say "Please help me look for the song of Wang Fei", "Is there a new album of Wang Fei?", or "Wang Fei perverse". That is, the user can freely express commands and intentions, and the powerful speech recognition and semantic understanding engine on the mobile terminal can reliably identify the user's real intention: play a song by Wang Fei, or play a particular song by Wang Fei. Interaction between the intelligent toy and the user thus becomes freer and more engaging without adding to the hardware cost of the toy itself, so that toy manufacturers can achieve high-performance human-machine interaction at lower cost.
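By way of illustration only, the step from a recognized sentence to an intention and a parameter might be sketched with simple pattern rules, as below. The embodiment does not disclose how the semantic recognition module is implemented; the regular expressions, the intent labels `play_song` and `report_weather`, and the function `classify` are assumptions made for this sketch, and a practical engine would be considerably more elaborate.

```python
import re
from typing import Optional, Tuple

# Hypothetical rules: each pairs an intent label with a pattern. The first
# rule captures the artist name as the parameter.
_RULES = [
    ("play_song",
     re.compile(r"\b(?:music|songs?|album) (?:of|by) (?P<artist>.+?)[?.!]*$",
                re.IGNORECASE)),
    ("report_weather",
     re.compile(r"\b(?:weather|lovely day)\b", re.IGNORECASE)),
]


def classify(utterance: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (intent, parameter) for a recognized sentence.

    Returns (None, None) when no rule matches, in which case the system
    could fall back to voice interaction to ask for more information.
    """
    text = utterance.strip()
    for intent, pattern in _RULES:
        match = pattern.search(text)
        if match:
            # Rules without a named group yield no parameter.
            return intent, match.groupdict().get("artist")
    return None, None
```

With these rules, differently phrased requests such as "I want to listen to the music of Wang Fei" and "Please help me look for the song of Wang Fei" resolve to the same intention and parameter, which is the behavior the embodiment attributes to semantic recognition.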
As shown in Figure 2, in a preferred embodiment of the invention, the toy 2 itself recognizes and converts the input voice and executes the speech recognition conversion result. The toy 2 comprises a second speech conversion unit 23 for speech recognition conversion, and this second speech conversion unit 23 recognizes and converts the voice. The toy 2 is also provided with content matching the voice instructions; simple voice inputs are recognized and converted by the toy 2 itself and then executed by the toy 2. When the communication terminal 1 cannot perform voice input and recognition, the toy 2 takes the voice input, or receives the voice sent by the communication terminal 1, recognizes and converts it, and executes the speech recognition conversion result. This gives the toy 2 a certain ability to work independently, reduces its dependence on the communication terminal, and makes the toy 2 more convenient to use. In specific embodiments, the content matching the voice instructions comprises one or more of audio content and text content.
In a preferred embodiment of the invention, the communication terminal 1 has a storage unit that stores voice instructions and the content matching them. Operating the toy 2 involves an operating instruction, or an instruction and the content the instruction refers to. For example, for the input play "Little Swallow", "play" is the instruction and the audio content "Little Swallow" is the content serving as the parameter. Because the communication terminal 1 has a large storage capacity, its content library can be large, and more voice instructions and matching content can be stored in the communication terminal 1.
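By way of illustration only, the instruction-plus-parameter form of the speech recognition conversion result, and the choice of which devices take part in execution depending on where the matching content is stored, might be sketched as below. The names `ConversionResult`, `parse_result`, and `choose_executors`, the space-delimited text form, the lookup order (toy, then communication terminal, then network server), and the treatment of an instruction without a parameter are assumptions made for this sketch and are not specified in the embodiment.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class ConversionResult:
    """An instruction, optionally accompanied by a parameter."""
    instruction: str
    parameter: Optional[str] = None


def parse_result(text: str) -> ConversionResult:
    """Split text such as 'play Little Swallow' into instruction and parameter."""
    head, _, tail = text.strip().partition(" ")
    return ConversionResult(head.lower(), tail or None)


def choose_executors(result: ConversionResult,
                     toy_content: Set[str],
                     terminal_content: Set[str],
                     server_content: Set[str]) -> Tuple[str, ...]:
    """Name the devices that take part in executing the result.

    The content is looked up on the toy first, then on the communication
    terminal, then on the network server (an assumed order). An empty tuple
    means that no device holds the requested content.
    """
    if result.parameter is None or result.parameter in toy_content:
        return ("toy",)              # executed by the toy alone
    if result.parameter in terminal_content:
        return ("terminal", "toy")   # terminal calls the content, toy outputs it
    if result.parameter in server_content:
        return ("server", "toy")     # server calls the content, relayed by the terminal
    return ()
```

Under these assumptions, "play Little Swallow" parses to the instruction "play" with the parameter "Little Swallow", and the devices involved depend only on which content library holds that title.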
As shown in Figures 1 and 2, in a preferred embodiment of the invention, the toy 2 and the communication terminal 1 are connected by an infrared signal, a high-frequency modulated communication signal, a Bluetooth signal, a 2.4 GHz wireless communication signal, or an RFID radio-frequency signal. A wireless communication receiver is provided on the toy. In the technical solution of this patent, the wireless communication mode comprises one or more of an infrared signal, a high-frequency modulated communication signal, a Bluetooth signal, a 2.4 GHz wireless communication signal, and an RFID radio-frequency signal. Correspondingly, the second wireless communication module 21 on the toy 2 is provided with one or more of an infrared signal receiver, a high-frequency modulated communication signal receiver, a Bluetooth signal receiver, a 2.4 GHz wireless communication signal receiver, and an RFID radio-frequency signal receiving component, and the first wireless communication module 11 of the communication terminal 1 is one or more of an infrared signal, high-frequency modulated communication signal, Bluetooth signal, 2.4 GHz wireless communication signal, or RFID radio-frequency signal transmitting component. The communication terminal 1 sends the converted instruction, or instruction and parameter, to the toy 2 by wireless communication signal, and the toy 2 executes this instruction, or instruction and parameter.
The technical effect of the invention is as follows. The invention provides a toy voice control method and system based on a mobile communication terminal 1 and the Internet, comprising a toy 2 having a communication connection, a communication terminal 1 having voice input, and a network server 3 having speech recognition conversion, the network server 3 having a storage unit 31 that stores voice instructions and the content matching them. By means of the communication terminal 1 and the Internet, the method and system recognize voice input and call the content of the network server 3 over the Internet. This provides large content storage, good recognition results, and timely content updates, making the toy more capable while greatly reducing cost.
The above is a further detailed description of the invention in connection with specific preferred embodiments, and the implementation of the invention shall not be regarded as limited to these descriptions. A person of ordinary skill in the art to which the invention belongs may make simple deductions or substitutions without departing from the concept of the invention, and all of these shall be regarded as falling within the protection scope of the invention.

Claims (11)

1. A toy control method based on a mobile communication terminal and internet voice interaction, characterized in that it involves a toy having a communication connection, a communication terminal having voice input and speech recognition, and a network server having speech recognition conversion, the toy control method based on a mobile communication terminal and internet voice interaction comprising the following steps:
Inputting voice: voice is input through the communication terminal;
Uploading voice: the communication terminal connects to the Internet, and the input voice information is uploaded to the network server over the Internet to which the communication terminal is connected;
Speech recognition conversion: the communication terminal and the network server recognize and convert the received voice in parallel, the speech recognition conversion result being in the form of an instruction or of an instruction and a parameter;
Executing the recognition conversion result: the speech recognition conversion result is executed jointly by the network server, the communication terminal, and the toy, or by any two of them, or by either one of the toy and the communication terminal.
2. The toy control method based on a mobile communication terminal and internet voice interaction according to claim 1, characterized in that it further comprises building a semantic knowledge base according to the recognition scenario, the semantic knowledge base comprising the semantic attributes of words, and in that the speech recognition conversion step further comprises semantic recognition conversion, specifically comprising the following steps:
Word segmentation and semantic disambiguation: the speech recognition result is segmented into words and semantically disambiguated according to the semantic attributes of the words in the knowledge base;
Intent classification and parameter extraction: the result of word segmentation and semantic disambiguation is classified by intent, and parameters are extracted.
3. The toy control method based on a mobile communication terminal and internet voice interaction according to claim 1, characterized in that the speech recognition conversion results of the network server and of the communication terminal each include a confidence level for the speech recognition conversion; the communication terminal sets a confidence threshold for the speech recognition conversion result; when the confidence level of the speech recognition conversion result of the communication terminal is greater than or equal to this confidence threshold, that speech recognition conversion result is taken; and if the confidence level of the speech recognition conversion result of the communication terminal is less than this confidence threshold, the speech recognition conversion result with the higher confidence level is taken from among that of the network server and that of the communication terminal.
4. The toy control method based on a mobile communication terminal and internet voice interaction according to claim 1, characterized in that it further comprises waking up the communication terminal, the communication terminal being woken by any one of a voice instruction, a button, or a wireless signal so that it enters the state of inputting voice.
5. The toy control method based on a mobile communication terminal and internet voice interaction according to claim 1, characterized in that, when voice information that cannot be recognized or cannot be executed is input, voice interaction is carried out through the communication terminal or the toy to obtain voice information that can be recognized or executed.
6. The toy control method based on a mobile communication terminal and internet voice interaction according to claim 1, characterized in that the toy recognizes and converts the input voice, and the toy executes this speech recognition conversion result.
7. A toy control system based on a mobile communication terminal and internet voice interaction, characterized in that it comprises a toy having a communication connection, a communication terminal having voice input and speech recognition, and a network server having speech recognition conversion; the toy comprises a second wireless communication module for connecting to the communication terminal; the communication terminal comprises a voice input unit for inputting voice, a first wireless communication module for wireless connection with the toy, a first speech conversion unit for speech recognition conversion, and a network connection module for connecting to the network server over the internet; the network server has a third speech conversion unit that recognizes and converts the received voice information; the voice input unit inputs voice, the first speech conversion unit of the communication terminal and the third speech conversion unit of the network server recognize and convert the input voice in parallel, and the speech recognition conversion result is executed jointly by the network server, the communication terminal, and the toy, or by any two of them, or by either one of the toy and the communication terminal.
8. The toy control system based on a mobile communication terminal and internet voice interaction according to claim 7, characterized in that the third speech conversion unit comprises a speech recognition module and a semantic recognition module, the semantic recognition module determining the meaning of the voice input through the voice input unit from the speech recognized by the speech recognition module.
9. The toy control system based on a mobile communication terminal and internet voice interaction according to claim 7, characterized in that the speech recognition conversion results of the network server and of the communication terminal each include a confidence level for the speech recognition conversion; the communication terminal sets a confidence threshold for the speech recognition conversion result; when the confidence level of the speech recognition conversion result of the communication terminal is greater than or equal to this confidence threshold, that speech recognition conversion result is taken; and if the confidence level of the speech recognition conversion result of the communication terminal is less than this confidence threshold, the speech recognition conversion result with the higher confidence level is taken from among that of the network server and that of the communication terminal.
10. The toy control system based on a mobile communication terminal and internet voice interaction according to claim 7, characterized in that the toy has a second speech conversion unit for speech recognition, the second speech conversion unit recognizing and converting the voice.
11. The toy control system based on a mobile communication terminal and internet voice interaction according to claim 7, characterized in that the communication terminal further comprises a wake-up module that wakes the communication terminal into the state of inputting voice, the wake-up module using any one of a voice instruction, a button, or a wireless signal.
CN2012103297631A 2012-09-07 2012-09-07 Method and system for controlling toy based on mobile communication terminal and internet voice interaction Pending CN102868740A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012103297631A CN102868740A (en) 2012-09-07 2012-09-07 Method and system for controlling toy based on mobile communication terminal and internet voice interaction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012103297631A CN102868740A (en) 2012-09-07 2012-09-07 Method and system for controlling toy based on mobile communication terminal and internet voice interaction

Publications (1)

Publication Number Publication Date
CN102868740A true CN102868740A (en) 2013-01-09

Family

ID=47447326

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012103297631A Pending CN102868740A (en) 2012-09-07 2012-09-07 Method and system for controlling toy based on mobile communication terminal and internet voice interaction

Country Status (1)

Country Link
CN (1) CN102868740A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20010021669A1 (en) * 1995-11-20 2001-09-13 Creator Ltd. I*doll
CN101604204A (en) * 2009-07-09 2009-12-16 北京科技大学 Distributed cognitive technology for intelligent emotional robot
CN201394359Y (en) * 2009-04-28 2010-02-03 南昌航空大学 Voice controlled toy car based on Bluetooth earphone
CN102196207A (en) * 2011-05-12 2011-09-21 深圳市子栋科技有限公司 Method, device and system for controlling television by using voice
CN102496364A (en) * 2011-11-30 2012-06-13 苏州奇可思信息科技有限公司 Interactive speech recognition method based on cloud network

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104795069A (en) * 2014-01-21 2015-07-22 腾讯科技(深圳)有限公司 Speech recognition method and server
CN103902373A (en) * 2014-04-02 2014-07-02 百度在线网络技术(北京)有限公司 Intelligent terminal control method, server and intelligent terminal
CN103902373B (en) * 2014-04-02 2017-09-29 百度在线网络技术(北京)有限公司 intelligent terminal control method, server and intelligent terminal
WO2016070593A1 (en) * 2014-11-07 2016-05-12 深圳新创客电子科技有限公司 Control method for smart toy and control method and system for electronic device
CN104667530A (en) * 2015-02-10 2015-06-03 深圳分布科技有限公司 Mobile social intelligent toy system and using method thereof
CN104667530B (en) * 2015-02-10 2018-02-06 深圳分布科技有限公司 A kind of mobile social intelligence toy system and application method
CN104965426A (en) * 2015-06-24 2015-10-07 百度在线网络技术(北京)有限公司 Intelligent robot control system, method and device based on artificial intelligence
US10223638B2 (en) 2015-06-24 2019-03-05 Baidu Online Network Technology (Beijing) Co., Ltd. Control system, method and device of intelligent robot based on artificial intelligence
CN105469796A (en) * 2015-12-18 2016-04-06 合肥寰景信息技术有限公司 Control method for network voice input conversion

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20130109