CN108899036A - A kind of processing method and processing device of voice data - Google Patents
A kind of processing method and processing device of voice data
- Publication number: CN108899036A
- Application number: CN201810549538.6A
- Authority
- CN
- China
- Prior art keywords
- user
- information
- tone
- data
- user intent
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification
- G10L17/22—Interactive procedures; Man-machine interfaces
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
- G11B20/10527—Audio or video recording; Data buffering arrangements
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11C—STATIC STORES
- G11C7/00—Arrangements for writing information into, or reading information out from, a digital store
- G11C7/20—Memory cell initialisation circuits, e.g. when powering up or down, memory clear, latent image memory
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
-
- G—PHYSICS
- G11—INFORMATION STORAGE
- G11B—INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
- G11B20/00—Signal processing not specific to the method of recording or reproducing; Circuits therefor
- G11B20/10—Digital recording or reproducing
- G11B20/10527—Audio or video recording; Data buffering arrangements
- G11B2020/10537—Audio or video recording
- G11B2020/10546—Audio or video recording specifically adapted for audio data
Abstract
An embodiment of the present invention provides a processing method and device for voice data, the method including: obtaining operation information from a first user; determining, based on the operation information, user intent information corresponding to the first user; if the user intent information indicates playing a voice message from a second user, obtaining, based on the user intent information, first voice message data to be played that corresponds to the user intent information, wherein the first voice message data is recorded by the second user; and playing the first voice message data. In this way, by identifying user intent information, voice message data to be played is obtained and played, which enriches the functions of an intelligent audio device and improves its degree of intelligence.
Description
Technical field
Embodiments of the present invention relate to the field of intelligent terminal applications, and in particular to a processing method and device for voice data.
Background art
With the rise of smart homes and the Internet of Things, intelligent audio devices such as smart speakers and wearable devices have developed considerably. Intelligent audio devices can not only interact with users but also play voice.
Currently, with the rapid development of the Internet, the voice data playback functions provided by intelligent audio devices mostly collect voice data input by the user, search the Internet for feedback information corresponding to that voice data, such as music on web pages or weather information on the Internet, and play the feedback information once it is obtained. However, the services provided by intelligent audio devices are mostly interaction services between the user and the Internet. Such interaction is rather limited: it cannot provide message recording and playback services between multiple intelligent audio devices, nor can it realize stand-alone message recording and playback.
In the course of using the above intelligent audio devices, the inventor found that existing intelligent audio devices lack a voice message mailbox function and cannot play voice messages that a user has recorded on another device or on the current device, resulting in the technical problem of limited functionality and a low degree of intelligence.
Summary of the invention
In view of this, embodiments of the present invention provide a processing method and device for voice data, the main purpose of which is to play voice messages recorded on other devices or on the current device by identifying user intent information, thereby improving the degree of intelligence of an audio device and enriching its functions.
To achieve the above objectives, embodiments of the present invention mainly provide the following technical solutions:
In a first aspect, an embodiment of the present invention provides a processing method for voice data, the method including: obtaining operation information from a first user; determining, based on the operation information, user intent information corresponding to the first user; if the user intent information indicates playing a voice message from a second user, obtaining, based on the user intent information, first voice message data to be played that corresponds to the user intent information, wherein the first voice message data is recorded by the second user; and playing the first voice message data.
In a second aspect, an embodiment of the present invention provides a processing device for voice data, the device including: an obtaining unit for obtaining operation information from a first user; a first determination unit for determining, based on the operation information, user intent information corresponding to the first user; an acquiring unit for obtaining, if the user intent information indicates playing a voice message from a second user and based on the user intent information, first voice message data to be played that corresponds to the user intent information, wherein the first voice message data is recorded by the second user; and a playback unit for playing the first voice message data.
In a third aspect, an embodiment of the present invention provides a storage medium including a stored program, wherein, when the program runs, the device on which the storage medium resides is controlled to execute the steps of the above processing method for voice data.
In a fourth aspect, an embodiment of the present invention provides an intelligent audio device, the intelligent audio device including: at least one processor; and at least one memory and a bus connected to the processor; wherein the processor and the memory communicate with each other through the bus, and the processor is configured to call program instructions in the memory to execute the steps of the above processing method for voice data.
With the processing method and device for voice data provided by the embodiments of the present invention, after operation information from a first user is obtained, the user intent information corresponding to the first user can be determined according to the operation information. Next, if the user intent information of the first user indicates playing a voice message from a second user, first voice message data to be played that corresponds to the user intent information is obtained based on that intent information, wherein the first voice message data is recorded by the second user. Finally, the first voice message data can be played. In this way, by identifying user intent information and playing voice message data recorded on other audio devices or on the current audio device, message recording and playback services between multiple intelligent audio devices can be realized, as can stand-alone message recording and playback, thereby improving the degree of intelligence of an audio device and enriching its functions.
Brief description of the drawings
By reading the following detailed description of preferred embodiments, various other advantages and benefits will become clear to those of ordinary skill in the art. The drawings are only for the purpose of illustrating preferred embodiments and are not to be considered a limitation of the present invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1 is a structural schematic diagram of the voice search system in Embodiment 1 of the present invention;
Fig. 2 is a first flow diagram of the processing method for voice data in Embodiment 1 of the present invention;
Fig. 3A is a second flow diagram of the processing method for voice data in Embodiment 1 of the present invention;
Fig. 3B is a third flow diagram of the processing method for voice data in Embodiment 1 of the present invention;
Fig. 4 is a structural schematic diagram of the processing device for voice data in Embodiment 2 of the present invention;
Fig. 5 is a structural schematic diagram of the intelligent audio device in Embodiment 3 of the present invention.
Specific embodiment
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present invention are shown in the drawings, it should be understood that the present invention may be embodied in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present invention will be thoroughly understood and so that the scope of the present invention can be fully conveyed to those skilled in the art.
Embodiment one
An embodiment of the present invention provides a voice search system. Fig. 1 is a structural schematic diagram of the voice search system in Embodiment 1 of the present invention. As shown in Fig. 1, the voice search system includes: an overall control center (Controller) 101, an automatic speech recognition (ASR, Automatic Speech Recognition) service module 102, a question-and-answer (QA, Query Answer) service module 103, a dialogue management (DM, Dialogue Management) module 104, a client (Client) 105, and a text-to-speech (TTS, Text to Speech) service module 106;
wherein the overall control center is configured to determine, according to the voice operation information sent by the client and by calling the other service modules of the system, the user intent information corresponding to the operation information, and to search for the voice message data to be played that corresponds to the user intent information.
The ASR service module is configured to perform speech recognition on the voice operation information sent by the overall control center, convert the voice operation information into a text recognition result, and send the text recognition result to the overall control center. The ASR service includes a streaming media service (streaming server) module and a recognizer server (recognition service) module. The streaming server module mainly performs audio processing, such as audio decoding and sample-rate conversion, on the voice operation information sent by the overall control center; the recognizer server module mainly converts the processed voice data into text data while, during the conversion, returning to the overall control center speech characteristic parameter information such as partial results (partial result), short pauses (short pause), silence (silence), and the final result (final result).
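The division of labor described above — a streaming front end that prepares the audio and a recognizer that emits partial results, pause and silence markers, and a final result — can be illustrated with a minimal sketch. The event names and function signatures below are hypothetical; the patent describes the modules' roles, not their API:

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple

# Hypothetical event type mirroring the parameter information the
# recognizer server returns to the overall control center.
@dataclass
class AsrEvent:
    kind: str   # "partial_result", "short_pause", or "final_result"
    text: str = ""

def recognize_stream(chunks: List[Tuple[str, str]]) -> Iterator[AsrEvent]:
    """Simulate the recognizer: decoded audio chunks arrive from the
    streaming server; partial text accumulates until the final result."""
    words: List[str] = []
    for kind, payload in chunks:
        if kind == "audio":
            words.append(payload)
            yield AsrEvent("partial_result", " ".join(words))
        elif kind == "pause":
            yield AsrEvent("short_pause")
        elif kind == "end":
            yield AsrEvent("final_result", " ".join(words))

events = list(recognize_stream(
    [("audio", "play"), ("audio", "my"), ("pause", ""),
     ("audio", "messages"), ("end", "")]
))
print(events[-1].kind, "->", events[-1].text)  # final_result -> play my messages
```

A real recognizer would of course consume audio frames rather than pre-transcribed words; the sketch only shows the event flow between the two sub-modules.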
The QA service module is configured, after receiving the text recognition result sent by the overall control center, to call the DM module through qa-api to perform semantic analysis on the text recognition result; this QA service is the portal service for natural language processing (NLP, Natural Language Processing).
The DM module is configured to perform dialogue logic control: after obtaining the text recognition result sent by the overall control center, it performs semantic analysis on the text recognition result and determines the user intent information. The DM module is realized by a query analysis (query-analysis) service module, a cache service (cache-server) module, and a natural language generation (NLG, Natural Language Generation) service module. The query-analysis service module is mainly used to complete semantic understanding, including the two functions of intent classification and entity word extraction; in practical applications, the query-analysis service module can be realized by natural language understanding (NLU, Natural Language Understanding) technology. The cache-server module is used to query the voice message data required by the user intent information and to store the query result so that the intelligent audio device on which the client resides can play the voice message data; in practical applications, the cache-server module can, on the one hand, pre-store data that changes little in order to improve retrieval speed and, on the other hand, retrieve the required search results by calling an Internet search engine, such as onebox. The NLG service module is used to perform structured analysis, according to NLG technology, on the various pieces of information in the search results found by the cache-server, and to organize them, as the search requires, into concise natural language that is easy for the user to listen to.
The client is configured to initiate, using the NLG data in the search results, a request to the TTS service module to convert the NLG data in text format into voice data, so that it can be played on the intelligent audio device.
The TTS service module is configured to convert text data into voice data.
In practical applications, the client is set in an intelligent audio device, and the intelligent audio device can be implemented in a variety of forms. For example, the intelligent audio device described in the embodiments of the present invention may include smart home devices such as smart speakers, smart televisions, and smart set-top boxes, and portable devices such as smartphones, tablet computers, smartwatches, and smart bracelets. Of course, it may also be another type of audio device; the embodiments of the present invention impose no specific limitation here.
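The overall control flow among the modules of Fig. 1 — Controller orchestrating ASR, DM, search, and TTS — can be sketched as follows. Every function body here is a stand-in under assumed names (the patent specifies only the roles of the modules, not their interfaces):

```python
# Minimal, illustrative control flow of the Fig. 1 voice search system.

def asr(voice_operation: str) -> str:
    # Stand-in ASR: pretend the payload already carries its transcript.
    return voice_operation

def dm_semantic_analysis(text: str) -> dict:
    # Stand-in for the DM module's semantic understanding step.
    if "message" in text:
        return {"intent": "play_voice_message"}
    return {"intent": "unknown"}

def cache_search(intent: dict) -> dict:
    # Stand-in for the cache-server lookup of the message to play.
    return {"nlg_text": "You have one new message.", "audio_id": "msg-001"}

def tts(text: str) -> bytes:
    # Stand-in synthesis: a real TTS module would produce audio.
    return text.encode("utf-8")

def controller(voice_operation: str) -> tuple:
    """Overall control center: orchestrates ASR -> DM -> search -> TTS."""
    text = asr(voice_operation)
    intent = dm_semantic_analysis(text)
    result = cache_search(intent)
    speech = tts(result["nlg_text"])
    return intent["intent"], result["audio_id"], speech

intent, audio_id, speech = controller("play my message")
print(intent, audio_id)  # play_voice_message msg-001
```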
Further, in conjunction with the above voice search system, an embodiment of the present invention provides a processing method for voice data, which is applied to an intelligent audio device.
Fig. 2 is a first flow diagram of the processing method for voice data in Embodiment 1 of the present invention. As shown in Fig. 2, the processing method for voice data includes:
S201: Obtain operation information from a first user;
Specifically, depending on the type of operation performed by the first user, the above operation information may be voice operation information or touch operation information; of course, it may also be another type of operation information, such as fingerprint operation information, and the embodiments of the present invention impose no specific limitation here.
In practical applications, when the first user wants the intelligent audio device to play a voice message left by another user or by the first user himself, or wants to leave a message for another user or for himself, the first user can do so through voice interaction. For example, the first user may ask the intelligent audio device by voice: "Is there a message for me?", "Play the message", "I want to leave a message", "I want to create a voice reminder", and so on; at this point, the intelligent audio device obtains the voice operation information from the first user. Alternatively, the first user can do so through touch operation. For example, the first user may press the play button or record button on the intelligent audio device, or press the play-message function button or record-message button in the user interface of the intelligent audio device, to turn on the voice playback function or message recording function of the intelligent audio device and generate the corresponding operation information; at this point, the intelligent audio device obtains the touch operation information from the first user.
S202: Based on the operation information, determine the user intent information corresponding to the first user;
Specifically, in conjunction with the voice search system and taking the operation information being voice operation information as an example, the following explains how the user intent information corresponding to the first user is determined according to the operation information. After the intelligent audio device obtains the voice operation information from the first user, it sends the voice operation information to the overall control center; the overall control center calls the ASR service module to convert the voice operation information into a text recognition result through speech recognition technology; the overall control center then sends the text recognition result to the QA service module; the QA service module calls the DM module through qa-api to perform semantic understanding on the text recognition result; the DM module uses natural language understanding technology to perform semantic understanding on the text recognition result and determines the user intent information corresponding to the first user. In this way, the user intent information corresponding to the first user is obtained.
Application scenario one: the user intent information indicates playing a voice message from a second user.
For example, when the above operation information is voice operation information: if the text recognition result corresponding to the voice operation information is "What messages do I have?", "Play my messages", or "Are there messages for me?", the corresponding user intent information is "play voice message"; if the text recognition result corresponding to the voice operation information is "Play A1's message for me" or "Is there a message from A1 for me?", the corresponding user intent information is "play the voice message from A1" or "play the voice message left by A2". At this point, it can be determined that the above user intent information indicates playing a voice message from a second user.
Application scenario two: the user intent information indicates recording a voice message for a second user.
For example, when the above operation information is voice operation information: if the text recognition result corresponding to the voice operation information is "I want to leave a message", "I want to record a message", or "Create a voice reminder", the corresponding user intent information is "record voice message"; if the text recognition result corresponding to the voice operation information is "Leave a message for A1", "Record a voice reminder for A1", or "I am B, I want to leave a message for A1", the corresponding user intent information is "record a voice message for A1" or "leave a voice message for A1". If the text recognition result corresponding to the voice operation information is "I am B, I want to leave a message for everyone", the corresponding user intent information is "B wants to leave a message". At this point, it can be determined that the above user intent information indicates recording a voice message for a second user.
In practical applications, when the obtained operation information is touch operation information, the function corresponding to the operation information can be determined as the first user's intent. For example, if the function corresponding to the touch operation information is playing a message, the user intent information of the first user can be determined to be "play voice message"; if the function corresponding to the touch operation information is recording a message, the user intent information of the first user can be determined to be "record voice message".
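The two determination paths above — pattern matching on the recognized text for voice operations, and a direct function-to-intent mapping for touch operations — can be sketched as follows. The keyword patterns paraphrase the example utterances in the text and are illustrative only; a real system would use the NLU-based query-analysis module described earlier:

```python
import re

# Hypothetical pattern rules for the two application scenarios above.
PLAY_PATTERNS = [r"what messages do i have", r"play .*message", r"message for me"]
RECORD_PATTERNS = [r"leave a message", r"record .*message", r"voice reminder"]

def classify_voice_intent(text: str) -> str:
    """Map a text recognition result to user intent information."""
    t = text.lower()
    if any(re.search(p, t) for p in PLAY_PATTERNS):
        return "play_voice_message"
    if any(re.search(p, t) for p in RECORD_PATTERNS):
        return "record_voice_message"
    return "unknown"

# Touch operations map directly: the function bound to the control is the intent.
TOUCH_INTENTS = {"play_button": "play_voice_message",
                 "record_button": "record_voice_message"}

print(classify_voice_intent("Play my messages"))           # play_voice_message
print(classify_voice_intent("I want to leave a message"))  # record_voice_message
print(TOUCH_INTENTS["record_button"])                      # record_voice_message
```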
In the specific implementation process, if it is determined that the user intent information of the first user indicates playing a voice message from a second user, S203 to S204 are executed.
S203: Based on the user intent information, obtain the first voice message data to be played that corresponds to the user intent information;
wherein the first voice message data is recorded by the second user.
In practical applications, one intelligent audio device can be used by multiple users. For example, a family has four members: a mother, a father, an elder daughter, and a younger daughter; the smart speaker in the home then corresponds to four users. Thus, depending on the practical application scenario, the second user may be the same as the first user; for example, after the mother returns home, she can use the smart speaker in the home to play a reminder voice message she recorded for herself the previous day. The second user may also be different from the first user; for example, the mother can also use the smart speaker in the home to play a voice message her younger daughter left for her.
Of course, the second user may be one user or multiple users, such as two users or three users; the embodiments of the present invention impose no specific limitation here.
In the specific implementation process, in order to obtain the first voice message data to be played, the above S203 may include the following steps:
Step 2031: Based on the user intent information, determine the identification information corresponding to the first voice message data;
In practical applications, the identification information corresponding to the above first voice message data may be user identification information, such as the user identification information of the message listener or the user identification information of the message recorder, or recording time information; of course, it may also be other information capable of identifying the voice message data, such as device identification information or a combination of several of the above pieces of information; the embodiments of the present invention impose no specific limitation here.
Specifically, the user identification information may be a user ID, a user nickname, a user name, or the like.
Step 2032: From the voice message data set, determine the voice message data whose label information matches the identification information as the first voice message data.
In practical applications, the voice message data set may be stored in the local storage space of the intelligent audio device, or in a shared storage space associated with multiple intelligent audio devices; of course, it may also be stored in another external storage space, such as that of a voice mailbox server; the embodiments of the present invention impose no specific limitation here.
For example, when the processing method for voice data is applied to a single intelligent audio device, such as multiple users leaving voice messages on one smart speaker, the voice message data set can be stored in the local storage space of that smart speaker. When the processing method for voice data is applied to a voice message system, for example one including a smart speaker and a smartwatch, where the first user uses the smart speaker, the second user uses the smartwatch, and the smart speaker and the smartwatch are each associated with a preset cloud shared storage space, the voice message data set can be stored in the cloud shared storage space.
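The two storage arrangements just described — a local store on a single device and a shared (e.g. cloud) store associated with several devices — can be sketched with a common interface. Class and field names are hypothetical:

```python
# Illustrative abstraction over the storage locations for the voice
# message data set: local single-device storage vs. a shared space.

class MessageStore:
    """Common interface: both backends hold voice message data sets."""
    def __init__(self):
        self._messages = []
    def save(self, message: dict) -> None:
        self._messages.append(message)
    def all(self):
        return list(self._messages)

class LocalStore(MessageStore):
    """Single-device case: messages live in the speaker's own storage."""

class CloudSharedStore(MessageStore):
    """Multi-device case: a store shared by associated devices
    (e.g. a smart speaker and a smartwatch bound to the same account)."""

# Single smart speaker: record and play back on the same device.
local = LocalStore()
local.save({"recorder": "Mom", "listener": "Dad", "audio_id": "m1"})

# Voice message system: the watch records, the speaker plays.
cloud = CloudSharedStore()
cloud.save({"recorder": "watch-user", "listener": "speaker-user", "audio_id": "m2"})
print(len(local.all()), len(cloud.all()))  # 1 1
```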
In practical applications, to make it easy to quickly find the required voice message data, when the voice message data is stored, corresponding label information can be generated according to the recording time of the voice message, the recording device, the message recorder, the message listener, and so on.
In this way, after the identification information corresponding to the first voice message data is obtained through step 2031, the identification information can be matched against the label information of each piece of voice message data in the voice message data set; finally, within the voice message data set, the voice message data whose label information matches the identification information is the required first voice message data to be played.
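The labeling and matching of step 2032 can be sketched as follows: each stored message carries label information (recording time, device, recorder, listener), and the messages whose labels match the identification information derived from the intent are selected. All field names here are hypothetical:

```python
from dataclasses import dataclass, field
from typing import List, Optional
import time

@dataclass
class VoiceMessage:
    audio_id: str
    recorder: str                 # message recorder's user identifier
    listener: str                 # message listener's user identifier
    device: str = "speaker-1"
    recorded_at: float = field(default_factory=time.time)

def find_messages(dataset: List[VoiceMessage],
                  recorder: Optional[str] = None,
                  listener: Optional[str] = None) -> List[VoiceMessage]:
    """Return messages whose label information matches every identifier
    given; identifiers left as None impose no constraint."""
    return [m for m in dataset
            if (recorder is None or m.recorder == recorder)
            and (listener is None or m.listener == listener)]

dataset = [VoiceMessage("m1", recorder="Mom", listener="Dad"),
           VoiceMessage("m2", recorder="Daughter", listener="Mom")]
print([m.audio_id for m in find_messages(dataset, listener="Mom")])      # ['m2']
print([m.audio_id for m in find_messages(dataset, recorder="Mom",
                                         listener="Dad")])               # ['m1']
```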
The following takes at least one of the user identifier of the message listener and the user identifier of the message recorder as the identification information of the first voice message data, to explain how the identification information corresponding to the first voice message data is determined based on the user intent information corresponding to the first user.
In the specific implementation process, the above step 2031 may include the following steps:
Step 2031a: Parse the user intent information, judge whether the user intent information meets a preset condition, and generate a judgment result;
Specifically, after the intent information of the first user is obtained: if the text structure of the user intent information matches "play A1's voice message to B1", it shows that the user intent information indicates both the user identifier A1 of the message recorder and the user identifier B1 of the message listener; if the text structure of the user intent information matches "play the voice message from A2" or "play the voice message left by A2", it shows that the user intent information indicates only the user identifier A2 of the message recorder and does not indicate the user identifier of the message listener; if the text structure of the user intent information matches "play the voice message for B2" or "play the voice message left for B2", it shows that the user intent information indicates only the user identifier B2 of the message listener and does not indicate the user identifier of the message recorder; if the text structure of the user intent information matches "play voice message", it shows that the user intent information indicates neither the user identifier of the message recorder nor the user identifier of the message listener.
Of course, in practical applications, the above preset condition may also take other forms and is not limited to those enumerated above, such as "play A1's voice message to B1", "play the voice message from A2", or "play voice message"; it can be determined by those skilled in the art according to the actual situation in the specific implementation process, and the embodiments of the present invention impose no specific limitation here.
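The four structural cases of step 2031a can be sketched with pattern matching. The English patterns below paraphrase the example text structures above and are illustrative only (an actual system would work on the original-language utterances via NLU):

```python
import re
from typing import Optional, Tuple

def parse_intent(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Match the intent text against preset structural patterns and
    return (recorder_id, listener_id); None marks a missing identifier."""
    m = re.fullmatch(r"play (\w+)'s voice message to (\w+)", text)
    if m:                              # both recorder and listener given
        return m.group(1), m.group(2)
    m = re.fullmatch(r"play the voice message from (\w+)", text)
    if m:                              # only the recorder given
        return m.group(1), None
    m = re.fullmatch(r"play the voice message for (\w+)", text)
    if m:                              # only the listener given
        return None, m.group(1)
    return None, None                  # neither identifier given

print(parse_intent("play A1's voice message to B1"))   # ('A1', 'B1')
print(parse_intent("play the voice message from A2"))  # ('A2', None)
print(parse_intent("play the voice message for B2"))   # (None, 'B2')
print(parse_intent("play voice message"))              # (None, None)
```

The judgment result of step 2031a corresponds to which of the two returned identifiers are None.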
Step 2031b: Based on the judgment result, obtain the user identification information corresponding to the first voice message data according to a preset strategy;
wherein the user identification information is at least one of the first user identification information of the first user and the second user identification information of the second user.
Step 2031c: Determine the user identification information as the identification information.
In practical applications, different judgment results correspond to different preset strategies. Specifically, the above step 2031b may cover, without being limited to, the following three situations.
Situation one: the user intent information explicitly indicates the user identifier of the message recorder, i.e., the second user identification information of the second user; the required user identification information is extracted directly from the user intent information.
Then the above step 2031b may include: if the judgment result shows that the user intent information meets a first preset condition, extracting the second user identification information from the user intent information.
Here, the first preset condition means that the user intent information includes the second user identification information of the second user.
For example, when the user intent information is "play Mom's voice message to Dad", "Mom" is the second user identification information of the second user and "Dad" is the first user identification information of the first user; when the user intent information is "play Zhang San's voice message to me", "Zhang San" is the second user identification information of the second user.
Situation two: the user intent information explicitly indicates the user identifier of the message listener, i.e., the first user identification information of the first user; the required user identification information is extracted directly from the user intent information.
Then the above step 2031b may include: if the judgment result shows that the user intent information meets a second preset condition, extracting the first user identification information from the user intent information.
Here, the second preset condition means that the user intent information includes the first user identification information of the first user.
For example, when the user intent information is "play Zhang San's voice message to Li Si", "Zhang San" is the second user identification information of the second user and "Li Si" is the first user identification information of the first user; when the user intent information is "play the voice message left for Wang Wu", "Wang Wu" is the first user identification information of the first user.
Situation three: the user intent information explicitly indicates neither the user identifier of the message listener nor the user identifier of the message recorder; prompt information is shown to obtain the required user identification information.
Then the above step 2031b may include: if the judgment result shows that the user intent information meets a third preset condition, showing the first user preset prompt information corresponding to the user intent information, receiving response information from the first user, and obtaining the second user identification information and/or the first user identification information based on the response information.
For example, when the user intent information is "play voice message", the required user identifier cannot be extracted from the user intent information; the preset prompt information needs to be shown to the user, and the required identification information is obtained according to the user's response information.
In practical applications, the preset prompt information may be a prompt message for obtaining the user identifier of the message listener, such as "May I ask who you are?", or a prompt message for obtaining the identification information of the message recorder, such as "May I ask whose voice message should be played?"; of course, it may also be a prompt message with other content, such as "May I ask whose message, left for whom, should be played?"; the embodiments of the present invention impose no specific limitation here.
In practical applications, depending on how the intelligent audio device interacts with the user, the preset prompt information may be presented in various ways. For example, it may be announced by voice broadcast, or its content may be displayed directly on a screen; other means are of course also possible, such as showing buttons or drop-down menus on a user interface that let the user select the message listener and the message recorder.
Of course, in practical applications, the above step 2031b may also be implemented in other ways; the embodiment of the present invention is not specifically limited in this regard.
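As a hedged sketch (not the patent's actual implementation), the three preset conditions of step 2031b can be modelled as a simple classifier over the recognized intent text; the user names, the "from"/"to" phrasing patterns and the returned fields below are illustrative assumptions:

```python
def extract_identifiers(intent, known_users):
    """Classify a play-message intent against the three preset conditions.

    Condition 1: the message recorder (second user) is named after "from".
    Condition 2: the message listener (first user) is named after "to".
    Condition 3: neither is named, so the device must show a prompt.
    """
    recorder, listener = None, None
    for user in known_users:                       # enrolled user names
        if user not in intent:
            continue
        before = intent[: intent.index(user)].rstrip()
        if before.endswith("from"):                # "...message from Zhang San..."
            recorder = user
        elif before.endswith("to"):                # "...to Li Si"
            listener = user
    return {
        "recorder": recorder,
        "listener": listener,
        # Condition 3: neither identifier found, fall back to the preset prompt.
        "needs_prompt": recorder is None and listener is None,
    }
```

A real device would run this over a natural-language-understanding result rather than raw text, and would present the preset prompt (e.g. "Whose message, left for whom, should be played?") whenever `needs_prompt` is true.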
S204: play the first voice message data.
Specifically, after the first voice message data corresponding to the user intent information of the first user have been obtained, the first voice message data can be played to the first user.
In practical applications, several users may share the same device; for example, every member of a family of four may use the same smart speaker. To avoid playback errors and ensure effective playback, before a voice message is played it can first be determined whether the retrieved voice message data are addressed to the user currently operating the intelligent audio device.
Accordingly, in a specific implementation, the above S204 may include: when the operation information is voice operation information, performing voiceprint recognition on the voice operation information to obtain the voiceprint feature of the first user; determining, according to the mapping relations between user voiceprint features and user identification information, the first user identification information corresponding to the voiceprint feature of the first user; matching the first user identification information against the message-listener tag of the first voice message data; and, if the match succeeds, playing the first voice message data.
Specifically, when the operation information is voice operation information, it directly carries the user's voiceprint feature, and a voiceprint feature can uniquely identify a user. Voiceprint recognition can therefore be performed directly on the voice operation information to obtain the voiceprint feature of the first user. Next, the first user identification information corresponding to that voiceprint feature can be determined according to the mapping relations between user voiceprint features and user identification information. Finally, the first user identification information is matched against the message-listener tag of the first voice message data, and the matching result determines whether the first voice message data are addressed to the current first user. If the match succeeds, the first voice message data are indeed a message left for the first user, the first user is the intended message listener and may listen to it, and the first voice message data can then be played.
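A minimal sketch of this playback gate, under the assumption that voiceprint features are fixed-length embeddings compared by cosine similarity; the enrolled vectors, the threshold value and the tag field names are invented for illustration and are not the patent's method:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def identify_speaker(embedding, enrolled, threshold=0.8):
    """Map a voiceprint embedding to an enrolled user id, or None
    if no enrolled voiceprint clears the similarity threshold."""
    best_user, best_sim = None, threshold
    for user, ref in enrolled.items():
        sim = cosine(embedding, ref)
        if sim >= best_sim:
            best_user, best_sim = user, sim
    return best_user

def should_play(message, embedding, enrolled):
    """Play only when the identified operator matches the listener tag."""
    return identify_speaker(embedding, enrolled) == message["listener_tag"]
```

The threshold keeps an unrecognized voice from ever matching a listener tag, which is the point of the gate: an unknown speaker simply hears nothing rather than someone else's message.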
In addition, when the operation information is touch operation information, other biometric information of the user, such as a fingerprint feature, can be collected while the user is touching the intelligent audio device, so as to determine the user's identification information. It is, of course, also possible to present preset identity-verification prompt information to the first user before playing the first voice message data, so as to obtain the user identification information of the first user from the response.
In another embodiment of the present invention, Fig. 3A is a second flow diagram of the method for processing voice data in embodiment one of the present invention. Referring to Fig. 3A, after the above steps S201 and S202 have been executed, if it is determined that the user intent information of the first user indicates recording a voice message for the second user, the method for processing voice data may further include:
S301: collecting second voice message data from the first user;
S302: determining, according to the user intent information, the second user identification information corresponding to the second user.
In a specific implementation, similarly to the process of determining the identification information of the first voice message data, determining the second user identification information corresponding to the second user according to the user intent information may take, but is not limited to, the following three modes.
Mode one: if only the user identifier of the message recorder is indicated in the user intent information, i.e., the first user identification information of the first user, the above S302 may include: extracting the first user identification information of the first user from the user intent information; and determining all user identification information in a preset user identification information library, other than the first user identification information, as the second user identification information of the second user.
As an example, assume the preset user identification information library contains Zhang San, Li Si and Wang Wu. If the user intent information is "Zhang San wants to leave a message", Li Si and Wang Wu can be determined as the second user identification information of the second user.
Mode two: if the user identifier of the message listener is explicitly indicated in the user intent information, i.e., the second user identification information of the second user, the above S302 may include: extracting the second user identification information of the second user from the user intent information.
For example, if the user intent information is "leave a message for Li Si", Li Si can be directly determined as the second user identification information of the second user.
Mode three: neither the user identifier of the message listener nor that of the message recorder is explicitly indicated in the user intent information; prompt information can be presented through the intelligent audio device to obtain the required user identification information. The above S302 may include: presenting preset prompt information corresponding to the user intent information to the first user, receiving a response message from the first user, and obtaining the second user identification information and/or the first user identification information based on the response message.
For example, if the user intent information is simply "record a voice message", the required user identifiers cannot be extracted directly from the user intent information; the preset prompt information needs to be presented to the user, and the required identification information is obtained from the user's response.
In practical applications, the preset prompt information may be a prompt for obtaining the user identifier of the message listener, such as "Who would you like to leave a message for?", or a prompt for obtaining the identification information of the message recorder, such as "May I ask who you are?". It may, of course, also be a prompt with other content, such as "Who is leaving a message, and for whom?"; the embodiment of the present invention is not specifically limited here.
In practical applications, depending on how the intelligent audio device interacts with the user, the preset prompt information may be presented in various ways. For example, it may be announced by voice broadcast, or its content may be displayed directly on a screen; other means are of course also possible, such as showing buttons or drop-down menus on a user interface that let the user select the message listener and the message recorder.
Of course, besides the embodiments listed above, in practical applications the above S302 may also be implemented in other ways; the embodiment of the present invention is not specifically limited in this regard.
S303: marking the second user identification information as the message-listener tag corresponding to the second voice message data.
Specifically, so that the required voice message can be found quickly when messages are played, once the second user identification information of the second user has been obtained it can be marked as the message-listener tag corresponding to the second voice message data.
S304: storing the tagged second voice message data.
In another embodiment of the present invention, in order to locate the required voice message more accurately, referring to Fig. 3B, before S304 is executed the above method for processing voice data may further include:
S305: performing voiceprint recognition on the second voice message data to obtain the voiceprint feature of the first user;
S306: determining, according to the mapping relations between user voiceprint features and user identification information, the first user identification information corresponding to the voiceprint feature of the first user;
S307: marking the first user identification information as the message-recorder tag corresponding to the second voice message data.
Specifically, so that the required voice message can be found more accurately when messages are played, once the first user identification information of the first user has been obtained it can be marked as the message-recorder tag corresponding to the second voice message data.
After S307 has been executed, S304 can be executed to store the tagged second voice message data.
It should be noted here that, in practical applications, only the message-listener tag may be marked, only the message-recorder tag may be marked, or both the message-listener tag and the message-recorder tag may be marked at the same time. Of course, besides tagging the second voice message data with user identification information, other information such as recording-time information or device identification information can also be used to mark corresponding tags on the second voice message data.
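The tagging options just listed (listener tag, recorder tag, recording time, device identification) can be pictured as fields on one stored record. This dataclass is an illustrative sketch of such a record, not the patent's storage format; the field and parameter names are assumptions:

```python
import time
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class VoiceMessage:
    audio: bytes
    listener_tag: Optional[str] = None      # S303: who the message is for
    recorder_tag: Optional[str] = None      # S307: who recorded it
    recorded_at: float = field(default_factory=time.time)
    device_id: str = "unknown-device"       # optional device identification

def store(message: VoiceMessage, message_store: List[VoiceMessage]) -> None:
    """S304: persist the tagged message (a list stands in for real storage)."""
    message_store.append(message)
```

Either tag may be left empty, matching the note above that a message can carry only a listener tag, only a recorder tag, or both.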
At this point, the processing of the voice data is complete.
As can be seen from the above, with the technical solution provided by the embodiment of the present invention, after the operation information from the first user is obtained, the user intent information corresponding to the first user can be determined from that operation information. Next, if the user intent information of the first user indicates playing a voice message from the second user, the first voice message data to be played, corresponding to the user intent information, are obtained based on that user intent information, the first voice message data having been recorded by the second user. Finally, the first voice message data can be played. In this way, by recognizing the user intent and playing voice message data recorded on another audio device or on the current audio device, a message recording and playback service can be realized between multiple intelligent audio devices, and a stand-alone message recording and playback service can also be realized, thereby improving the intelligence of the audio device and enriching its functions.
Embodiment two
Based on the same inventive concept, as an implementation of the above method, an embodiment of the present invention provides an apparatus for processing voice data. This apparatus embodiment corresponds to the foregoing method embodiment; for ease of reading, it does not repeat the details of the foregoing method embodiment one by one, but it should be understood that the apparatus in this embodiment can correspondingly implement everything in the foregoing method embodiment.
Fig. 4 is a structural diagram of the apparatus for processing voice data in embodiment two of the present invention. Referring to Fig. 4, the apparatus 40 includes: an obtaining unit 401 for obtaining operation information from a first user; a first determination unit 402 for determining, based on the operation information, the user intent information corresponding to the first user; an acquiring unit 403 for obtaining, if the user intent information indicates playing a voice message from a second user, the first voice message data to be played corresponding to the user intent information, based on the user intent information, the first voice message data having been recorded by the second user; and a broadcast unit 404 for playing the first voice message data.
In the embodiment of the present invention, the acquiring unit is further configured to determine, based on the user intent information, the identification information corresponding to the first voice message data, and to determine, from a voice message data set, the voice message data whose tag information matches the identification information as the first voice message data.
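The acquiring unit's tag-matching step can be sketched as a filter over the voice message data set. The dictionary representation and field names below are assumptions for illustration only:

```python
def match_messages(message_set, identification):
    """Select every message whose tags satisfy the identification info
    resolved from the user intent; an unspecified field matches anything."""
    hits = []
    for message in message_set:
        wanted_listener = identification.get("listener")
        wanted_recorder = identification.get("recorder")
        if wanted_listener and message.get("listener_tag") != wanted_listener:
            continue
        if wanted_recorder and message.get("recorder_tag") != wanted_recorder:
            continue
        hits.append(message)
    return hits
```

Because an unspecified field matches anything, the same filter serves all three situations: a listener-only query, a recorder-only query, or a fully specified one.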
In the embodiment of the present invention, the acquiring unit is further configured to parse the user intent information, judge whether the user intent information satisfies a preset condition, and generate a judging result; to obtain, based on the judging result and according to a preset strategy, the user identification information corresponding to the first voice message data, the user identification information being at least one of the first user identification information of the first user and the second user identification information of the second user; and to determine the user identification information as the identification information.
In the embodiment of the present invention, the acquiring unit is further configured to extract the second user identification information from the user intent information if the judging result shows that the user intent information satisfies the first preset condition; to extract the first user identification information from the user intent information if the judging result shows that the user intent information satisfies the second preset condition; and, if the judging result shows that the user intent information satisfies the third preset condition, to present preset prompt information corresponding to the user intent information to the first user, receive a response message from the first user, and obtain the second user identification information and/or the first user identification information based on the response message.
In the embodiment of the present invention, the broadcast unit is configured, when the operation information is voice operation information, to perform voiceprint recognition on the voice operation information to obtain the voiceprint feature of the first user; to determine, according to the mapping relations between user voiceprint features and user identification information, the first user identification information corresponding to the voiceprint feature of the first user; to match the first user identification information against the message-listener tag of the first voice message data; and, if the match succeeds, to play the first voice message data.
In other embodiments of the present invention, the above apparatus further includes: a collection unit for collecting second voice message data from the first user if the user intent information indicates recording a voice message for the second user; a second determination unit for determining, according to the user intent information, the second user identification information corresponding to the second user; a first marking unit for marking the second user identification information as the message-listener tag corresponding to the second voice message data; and a storage unit for storing the tagged second voice message data.
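How the units of apparatus 40 might be wired together can be pictured with a toy composition; the intent handling below is deliberately naive, and everything beyond the unit names of Fig. 4 is an illustrative assumption:

```python
class VoiceMessageApparatus:
    """Toy composition of the units in Fig. 4: obtaining unit 401,
    first determination unit 402, acquiring unit 403, broadcast unit 404."""

    def __init__(self, message_store):
        self.message_store = message_store

    def obtain(self, event):                    # obtaining unit 401
        return event["utterance"]

    def determine_intent(self, utterance):      # first determination unit 402
        action = "play" if "play" in utterance else "record"
        return {"action": action, "utterance": utterance}

    def acquire(self, intent):                  # acquiring unit 403
        return [m for m in self.message_store
                if m["listener_tag"] in intent["utterance"]]

    def broadcast(self, messages):              # broadcast unit 404
        return [m["audio"] for m in messages]

    def handle(self, event):
        intent = self.determine_intent(self.obtain(event))
        if intent["action"] == "play":
            return self.broadcast(self.acquire(intent))
        return []
```

Each method corresponds to exactly one unit, mirroring how the apparatus embodiment maps one-to-one onto the method steps S201 through S204.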
In another embodiment of the present invention, the above apparatus further includes: a recognition unit for performing voiceprint recognition on the second voice message data to obtain the voiceprint feature of the first user; a third determination unit for determining, according to the mapping relations between user voiceprint features and user identification information, the first user identification information corresponding to the voiceprint feature of the first user; and a second marking unit for marking the first user identification information as the message-recorder tag corresponding to the second voice message data.
The apparatus for processing voice data introduced in this embodiment is an apparatus capable of executing the method for processing voice data in the embodiment of the present invention. Based on the method for processing voice data described in the embodiment of the present invention, those skilled in the art can understand the specific implementations and the various variants of the apparatus of this embodiment, so how the apparatus implements the method of the embodiment of the present invention is not discussed in detail here. Any apparatus adopted by those skilled in the art to implement the method for processing voice data in the embodiment of the present invention falls within the scope to be protected by this application.
In practical applications, the apparatus for processing voice data can be applied in an intelligent audio device, and intelligent audio devices can be implemented in various forms. For example, the intelligent audio device described in the embodiment of the present invention may include smart home devices such as smart speakers, smart televisions and smart set-top boxes, and portable devices such as smartphones, tablet computers, smartwatches and smart wristbands. It may, of course, also be another type of audio device; the embodiment of the present invention is not specifically limited here.
Embodiment three
Based on the same inventive concept, an embodiment of the present invention provides an intelligent audio device. Fig. 5 is a structural diagram of the intelligent audio device in embodiment three of the present invention. Referring to Fig. 5, the intelligent audio device includes: at least one processor 51; and at least one memory 52 and a bus 53 connected to the processor 51. The processor 51 and the memory 52 complete mutual communication through the bus 53, and the processor 51 is configured to call program instructions in the memory 52 to execute the following steps: obtaining operation information from a first user; determining, based on the operation information, the user intent information corresponding to the first user; if the user intent information indicates playing a voice message from a second user, obtaining, based on the user intent information, the first voice message data to be played corresponding to the user intent information, the first voice message data having been recorded by the second user; and playing the first voice message data.
In the embodiment of the present invention, when the above processor calls the program instructions, the following steps can also be executed: determining, based on the user intent information, the identification information corresponding to the first voice message data; and determining, from a voice message data set, the voice message data whose tag information matches the identification information as the first voice message data.
In the embodiment of the present invention, when the above processor calls the program instructions, the following steps can also be executed: parsing the user intent information, judging whether the user intent information satisfies a preset condition, and generating a judging result; obtaining, based on the judging result and according to a preset strategy, the user identification information corresponding to the first voice message data, the user identification information being at least one of the first user identification information of the first user and the second user identification information of the second user; and determining the user identification information as the identification information.
In the embodiment of the present invention, when the above processor calls the program instructions, the following steps can also be executed: if the judging result shows that the user intent information satisfies the first preset condition, extracting the second user identification information from the user intent information; if the judging result shows that the user intent information satisfies the second preset condition, extracting the first user identification information from the user intent information; and, if the judging result shows that the user intent information satisfies the third preset condition, presenting preset prompt information corresponding to the user intent information to the first user, receiving a response message from the first user, and obtaining the second user identification information and/or the first user identification information based on the response message.
In the embodiment of the present invention, when the above processor calls the program instructions, the following steps can also be executed: when the operation information is voice operation information, performing voiceprint recognition on the voice operation information to obtain the voiceprint feature of the first user; determining, according to the mapping relations between user voiceprint features and user identification information, the first user identification information corresponding to the voiceprint feature of the first user; matching the first user identification information against the message-listener tag of the first voice message data; and, if the match succeeds, playing the first voice message data.
In the embodiment of the present invention, when the above processor calls the program instructions, the following steps can also be executed: if the user intent information indicates recording a voice message for the second user, collecting second voice message data from the first user; determining, according to the user intent information, the second user identification information corresponding to the second user; marking the second user identification information as the message-listener tag corresponding to the second voice message data; and storing the tagged second voice message data.
In the embodiment of the present invention, when the above processor calls the program instructions, the following steps can also be executed: performing voiceprint recognition on the second voice message data to obtain the voiceprint feature of the first user; determining, according to the mapping relations between user voiceprint features and user identification information, the first user identification information corresponding to the voiceprint feature of the first user; and marking the first user identification information as the message-recorder tag corresponding to the second voice message data.
An embodiment of the present invention also provides a processor, the processor being configured to run a program, wherein the method for processing voice data in the above embodiment is executed when the program runs.
The above processor can be realized by a central processing unit (Central Processing Unit, CPU), a microprocessor unit (Micro Processor Unit, MPU), a digital signal processor (Digital Signal Processor, DSP), a field-programmable gate array (Field Programmable Gate Array, FPGA) or the like. The memory may include non-volatile storage in a computer-readable medium, in forms such as random access memory (Random Access Memory, RAM) and/or non-volatile memory, e.g. read-only memory (Read Only Memory, ROM) or flash memory (Flash RAM), and the memory includes at least one memory chip.
Embodiment four
Based on the same inventive concept, this embodiment provides a storage medium, the storage medium storing one or more programs, and the one or more programs being executable by one or more processors so as to realize the method for processing voice data in the above embodiment.
It should be understood by those skilled in the art that embodiments of the present invention can be provided as a method, a system or a computer program product. Therefore, the present invention can take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM (Compact Disc Read-Only Memory) and optical memory) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the present invention. It should be understood that every flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be realized by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce a device for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of guiding a computer or another programmable data processing device to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction device, the instruction device realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or another programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for realizing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, a network interface and memory.
The memory may include non-volatile storage in a computer-readable medium, in forms such as RAM and/or non-volatile memory, e.g. ROM or Flash RAM. The memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and can realize information storage by any method or technology. The information can be computer-readable instructions, data structures, program modules or other data. A computer-readable storage medium can be a memory such as ROM, programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), ferroelectric random access memory (FRAM), flash memory (Flash Memory), magnetic surface storage, an optical disc or CD-ROM (Compact Disc Read-Only Memory); it can also be flash memory or another memory technology, CD-ROM, a digital versatile disc (DVD) or other optical storage, a magnetic cassette, magnetic tape or disk storage or another magnetic storage device, or any other non-transmission medium that can be used to store information accessible by a computing device; it can also be one of various electronic devices including one of the above memories or any combination thereof, such as mobile phones, computers, tablet devices and personal digital assistants. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should also be noted that the terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not explicitly listed, or further includes elements intrinsic to such a process, method, article or device. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
It will be understood by those skilled in the art that embodiments of the present invention can be provided as a method, a system or a computer program product. Therefore, the present invention can take the form of a complete hardware embodiment, a complete software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and optical memory) containing computer-usable program code.
The above are only embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may have various modifications and variations. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included within the scope of the claims of the present invention.
Claims (10)
1. A method for processing voice data, wherein the method comprises:
obtaining operation information from a first user;
determining, based on the operation information, user intent information corresponding to the first user;
if the user intent information indicates playing a voice message from a second user, obtaining, based on the user intent information, first voice message data to be played corresponding to the user intent information, wherein the first voice message data is recorded by the second user; and
playing the first voice message data.
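For illustration only, the flow recited in claim 1 can be sketched as follows. This is a minimal Python sketch under assumed data layouts; the helper `parse_intent`, the dictionary-based message store, and all field names are hypothetical and form no part of the claims.

```python
def parse_intent(operation_info):
    # Hypothetical intent parser: maps the first user's operation
    # information (here, a dict with a spoken command) to an intent record.
    cmd = operation_info.get("command", "")
    if "play" in cmd and "message" in cmd:
        return {"action": "play_message", "listener": operation_info.get("listener")}
    return {"action": "unknown"}

def process_voice_data(operation_info, message_store):
    """Claim 1 flow: obtain operation info -> determine user intent ->
    fetch the first voice message data -> play (returned here)."""
    intent = parse_intent(operation_info)
    if intent["action"] == "play_message":
        # First voice message data, recorded earlier by the second user.
        data = message_store.get(intent["listener"])
        return ("played", data)
    return ("ignored", None)
```

The sketch returns the audio instead of driving a speaker, so the gating logic can be exercised in isolation.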
2. The method according to claim 1, wherein the obtaining, based on the user intent information, the first voice message data to be played corresponding to the user intent information comprises:
determining, based on the user intent information, identification information corresponding to the first voice message data; and
determining, from a voice message data set, voice message data whose label information matches the identification information as the first voice message data.
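For illustration only, the matching step of claim 2 amounts to a lookup over the stored message set. The list-of-dicts layout and the field names `label` and `audio` are hypothetical choices for this sketch, not part of the claims.

```python
def find_first_message(message_set, identification_info):
    """Claim 2 sketch: return the first stored voice message whose label
    information matches the identification information, or None."""
    for message in message_set:
        if message["label"] == identification_info:
            return message
    return None
```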
3. The method according to claim 2, wherein the determining, based on the user intent information, the identification information corresponding to the first voice message data comprises:
parsing the user intent information, judging whether the user intent information meets a preset condition, and generating a judgment result;
obtaining, based on the judgment result and according to a preset strategy, user identity information corresponding to the first voice message data, wherein the user identity information is at least one of first user identity information of the first user and second user identity information of the second user; and
determining the user identity information as the identification information.
4. The method according to claim 3, wherein the obtaining, based on the judgment result and according to the preset strategy, the user identity information corresponding to the first voice message data comprises:
if the judgment result shows that the user intent information meets a first preset condition, extracting the second user identity information from the user intent information;
if the judgment result shows that the user intent information meets a second preset condition, extracting the first user identity information from the user intent information; and
if the judgment result shows that the user intent information meets a third preset condition, presenting to the first user preset prompt information corresponding to the user intent information, receiving response information from the first user, and obtaining, based on the response information, the second user identity information and/or the first user identity information.
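For illustration only, the branching strategy of claims 3 and 4 can be sketched as below. The three preset conditions are modeled here hypothetically as: the intent names the recipient, the intent refers to the first user's own messages, or the intent names no one (triggering a prompt). These concrete condition choices are assumptions for the sketch, not the claimed preset conditions themselves.

```python
def resolve_identity(intent, prompt_user=None):
    """Claims 3-4 sketch: judge the intent against preset conditions and
    pick the identity used as identification information."""
    if intent.get("second_user"):
        # First preset condition: the intent names the second user.
        return intent["second_user"]
    if intent.get("for_self"):
        # Second preset condition: the first user asks for their own messages.
        return intent["first_user"]
    # Third preset condition: prompt the first user and use the response.
    if prompt_user is not None:
        return prompt_user("Whose messages should I play?")
    return None
```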
5. The method according to claim 1, wherein the playing the first voice message data comprises:
when the operation information is voice operation information, performing voiceprint recognition on the voice operation information to obtain a voiceprint feature of the first user;
determining, according to a mapping relationship between user voiceprint features and user identity information, first user identity information corresponding to the voiceprint feature of the first user;
matching the first user identity information with a message-listener label of the first voice message data; and
if the matching succeeds, playing the first voice message data.
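For illustration only, the voiceprint gate of claim 5 can be sketched as follows. Real voiceprint extraction is stubbed: the sketch assumes the voiceprint feature is already attached to the voice operation, and the mapping from voiceprint features to identities is a plain dict. These are assumptions for the sketch, not the claimed recognition method.

```python
def play_if_authorized(voice_op, message, voiceprint_map):
    """Claim 5 sketch: gate playback on a voiceprint-derived identity
    matching the message-listener label."""
    feature = voice_op["voiceprint"]       # stand-in for real voiceprint extraction
    user_id = voiceprint_map.get(feature)  # mapping: voiceprint feature -> identity
    if user_id is not None and user_id == message["listener_label"]:
        return message["audio"]            # match succeeded: play the message
    return None                            # no match: withhold the message
```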
6. The method according to claim 1, wherein after the determining, based on the operation information, the user intent information corresponding to the first user, the method further comprises:
if the user intent information indicates recording a voice message for the second user, collecting second voice message data from the first user;
determining, according to the user intent information, second user identity information corresponding to the second user;
labeling the second user identity information as a message-listener label corresponding to the second voice message data; and
storing the labeled second voice message data.
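For illustration only, the recording branch of claim 6 can be sketched as below. The intent field `second_user`, the message layout, and the list-backed store are hypothetical choices for this sketch, not part of the claims.

```python
def record_message(intent, audio, message_store):
    """Claim 6 sketch: collect second voice message data from the first
    user, label it with the intended listener's identity, and store it."""
    listener = intent["second_user"]       # second user identity from the intent
    message = {"listener_label": listener, "audio": audio}
    message_store.append(message)          # store the labeled message data
    return message
```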
7. The method according to claim 6, wherein before the storing the labeled second voice message data, the method further comprises:
performing voiceprint recognition on the second voice message data to obtain a voiceprint feature of the first user;
determining, according to a mapping relationship between user voiceprint features and user identity information, first user identity information corresponding to the voiceprint feature of the first user; and
labeling the first user identity information as a message-recorder label corresponding to the second voice message data.
8. An apparatus for processing voice data, wherein the apparatus comprises:
an obtaining unit, configured to obtain operation information from a first user;
a first determination unit, configured to determine, based on the operation information, user intent information corresponding to the first user;
an acquiring unit, configured to, if the user intent information indicates playing a voice message from a second user, obtain, based on the user intent information, first voice message data to be played corresponding to the user intent information, wherein the first voice message data is recorded by the second user; and
a playback unit, configured to play the first voice message data.
9. A storage medium, wherein the storage medium includes a stored program, and when the program runs, a device on which the storage medium is located is controlled to execute the steps of the method for processing voice data according to any one of claims 1 to 7.
10. An intelligent audio device, wherein the intelligent audio device comprises:
at least one processor; and
at least one memory and a bus connected to the processor;
wherein the processor and the memory communicate with each other via the bus, and the processor is configured to invoke program instructions in the memory to execute the steps of the method for processing voice data according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810549538.6A CN108899036A (en) | 2018-05-31 | 2018-05-31 | A kind of processing method and processing device of voice data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108899036A true CN108899036A (en) | 2018-11-27 |
Family
ID=64344022
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810549538.6A Pending CN108899036A (en) | 2018-05-31 | 2018-05-31 | A kind of processing method and processing device of voice data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108899036A (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101286344A (en) * | 2008-05-27 | 2008-10-15 | 深圳华普数码有限公司 | Word-leaving system |
CN201312322Y (en) * | 2008-11-14 | 2009-09-16 | 孟秀娟 | Network audio sharing system |
CN103916443A (en) * | 2013-01-07 | 2014-07-09 | 上海博路信息技术有限公司 | Method for sharing mobile phone sound recording capacity |
CN106777099A (en) * | 2016-12-14 | 2017-05-31 | 掌阅科技股份有限公司 | The processing method of business speech data, device and terminal device |
CN107644640A (en) * | 2016-07-22 | 2018-01-30 | 佛山市顺德区美的电热电器制造有限公司 | A kind of information processing method and home appliance |
CN107977183A (en) * | 2017-11-16 | 2018-05-01 | 百度在线网络技术(北京)有限公司 | voice interactive method, device and equipment |
CN108023737A (en) * | 2010-04-15 | 2018-05-11 | 三星电子株式会社 | Method and apparatus for transmitting digital content from from computer to mobile hand-held device |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020107360A1 (en) * | 2018-11-30 | 2020-06-04 | 华为技术有限公司 | Voice recognition method, device and system |
CN111339348A (en) * | 2018-12-19 | 2020-06-26 | 北京京东尚科信息技术有限公司 | Information service method, device and system |
CN109889643A (en) * | 2019-03-29 | 2019-06-14 | 广东小天才科技有限公司 | A kind of tone information broadcasting method and device and storage medium |
CN109889644A (en) * | 2019-03-29 | 2019-06-14 | 广东小天才科技有限公司 | A kind of tone information listens to method and apparatus and storage medium |
CN110413250A (en) * | 2019-06-14 | 2019-11-05 | 华为技术有限公司 | A kind of voice interactive method, apparatus and system |
WO2020249091A1 (en) * | 2019-06-14 | 2020-12-17 | 华为技术有限公司 | Voice interaction method, apparatus, and system |
CN112256947A (en) * | 2019-07-05 | 2021-01-22 | 北京猎户星空科技有限公司 | Method, device, system, equipment and medium for determining recommendation information |
CN112256947B (en) * | 2019-07-05 | 2024-01-26 | 北京猎户星空科技有限公司 | Recommendation information determining method, device, system, equipment and medium |
CN113470656A (en) * | 2020-07-09 | 2021-10-01 | 青岛海信电子产业控股股份有限公司 | Intelligent voice interaction device and voice message leaving method under target scene |
CN112087669A (en) * | 2020-08-07 | 2020-12-15 | 广州华多网络科技有限公司 | Method and device for presenting virtual gift and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108899036A (en) | A kind of processing method and processing device of voice data | |
US11564090B1 (en) | Audio verification | |
CN105120304B (en) | Information display method, apparatus and system | |
CN1910654B (en) | Method and system for determining the topic of a conversation and obtaining and presenting related content | |
US20200126566A1 (en) | Method and apparatus for voice interaction | |
CN108846054A (en) | A kind of audio data continuous playing method and device | |
WO2021083071A1 (en) | Method, device, and medium for speech conversion, file generation, broadcasting, and voice processing | |
JP2020525903A (en) | Managing Privilege by Speaking for Voice Assistant System | |
CN107895578A (en) | Voice interactive method and device | |
US11463772B1 (en) | Selecting advertisements for media programs by matching brands to creators | |
US11250857B1 (en) | Polling with a natural language interface | |
JP6783339B2 (en) | Methods and devices for processing audio | |
US11580982B1 (en) | Receiving voice samples from listeners of media programs | |
US11776541B2 (en) | Communicating announcements | |
CN107342088B (en) | Method, device and equipment for converting voice information | |
US10838954B1 (en) | Identifying user content | |
CN107943914A (en) | Voice information processing method and device | |
CN109346057A (en) | A kind of speech processing system of intelligence toy for children | |
JP2000207170A (en) | Device and method for processing information | |
CN112673641B (en) | Inline response to video or voice messages | |
CN110379406A (en) | Voice remark conversion method, system, medium and electronic equipment | |
CN110046242A (en) | A kind of automatic answering device and method | |
CN112599130A (en) | Intelligent conference system based on intelligent screen | |
JP6322125B2 (en) | Speech recognition apparatus, speech recognition method, and speech recognition program | |
US11790913B2 (en) | Information providing method, apparatus, and storage medium, that transmit related information to a remote terminal based on identification information received from the remote terminal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20181127 |