CN106782544A - Voice interaction device and output method thereof - Google Patents
Voice interaction device and output method thereof
- Publication number
- CN106782544A CN106782544A CN201710199965.1A CN201710199965A CN106782544A CN 106782544 A CN106782544 A CN 106782544A CN 201710199965 A CN201710199965 A CN 201710199965A CN 106782544 A CN106782544 A CN 106782544A
- Authority
- CN
- China
- Prior art keywords
- audio
- audio output
- input data
- parameter
- audio input
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/18—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being spectral information of each sub-band
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/225—Feedback of the input speech
Abstract
The present disclosure provides a voice interaction device and an output method therefor. The method includes: collecting audio input data through an audio collection apparatus; obtaining audio output data in response to the audio input data; obtaining a target audio output parameter determined by analysis; and controlling an audio output apparatus to output the audio output data according to the target audio output parameter.
Description
Technical field
The present invention relates generally to the field of electronic devices, and more particularly to a voice interaction device and an output method therefor.
Background

With the development of voice technology, devices with voice interaction functions, such as mobile phones, computers, and smart speakers, have become increasingly common. The output parameters of these devices, such as output volume, are typically user-settable, but problems can still arise. For example, late at night after everyone has fallen asleep and the room is quiet, suppose a user wakes up and asks the smart speaker a question (for example, what the weather will be like). If the speaker answers at the volume set during the daytime, the sound will seem very loud and may wake others up. Relying on the user to turn down the volume before going to sleep does not solve the problem either: if the user forgets to adjust it one day, the same problem occurs.

Accordingly, a mechanism capable of intelligently controlling the output of a voice interaction device is needed.
Summary of the invention

According to a first aspect of the present application, an output method is provided. The method includes: collecting audio input data through an audio collection apparatus; obtaining audio output data in response to the audio input data; obtaining a target audio output parameter determined by analysis; and controlling an audio output apparatus to output the audio output data according to the target audio output parameter.

According to a second aspect of the present application, a processing device is provided. The device includes: an audio collection apparatus; an audio output apparatus; and a processing unit. The processing unit is configured to: collect audio input data through the audio collection apparatus; obtain audio output data in response to the audio input data; obtain a target audio output parameter determined by analysis; and control the audio output apparatus to output the audio output data according to the target audio output parameter.
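The four steps of the claimed method can be sketched as a small Python class. This is an illustrative sketch only; the component names (`collector`, `responder`, `analyzer`, `output`) are hypothetical placeholders, not terms from the patent.

```python
# Illustrative sketch of the claimed output method; all component names
# are hypothetical placeholders, not the patent's own implementation.

class VoiceDevice:
    def __init__(self, collector, responder, analyzer, output):
        self.collector = collector    # audio collection apparatus
        self.responder = responder    # local or external response logic
        self.analyzer = analyzer      # determines target output parameters
        self.output = output          # audio output apparatus

    def run_once(self):
        audio_in = self.collector()               # collect audio input data
        audio_out = self.responder(audio_in)      # obtain responsive output data
        params = self.analyzer(audio_in)          # obtain target output parameter
        self.output(audio_out, params)            # output per target parameters
        return audio_out, params


# Minimal stand-ins to exercise the flow:
device = VoiceDevice(
    collector=lambda: {"volume": 0.2},
    responder=lambda a: "it is sunny tomorrow",
    analyzer=lambda a: {"volume": a["volume"]},
    output=lambda data, p: None,
)
out, params = device.run_once()
```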
Brief description of the drawings
The above and other aspects and advantages of exemplary embodiments of the present invention will become apparent to those of ordinary skill in the art from the following detailed description taken in conjunction with the accompanying drawings, in which:
Fig. 1 shows a flow chart of an output method according to an embodiment of the present invention.
Fig. 2 shows a flow chart of an output method according to one embodiment of the present invention.
Fig. 3 shows a flow chart of an output method according to another embodiment of the present invention.
Fig. 4 shows a flow chart of an output method according to a further embodiment of the present invention.
Fig. 5 shows a block diagram of a voice interaction device according to an embodiment of the present invention.
Fig. 6 shows a block diagram of a processing unit of a voice interaction device according to an embodiment of the present invention.
In the accompanying drawings, like reference numerals indicate identical or similar elements.
Detailed description
Other aspects, advantages, and salient features of the present disclosure will become apparent to those skilled in the art from the following detailed description of exemplary embodiments of the disclosure, taken with reference to the accompanying drawings.
In the present disclosure, the terms "include" and "comprise" and their derivatives are meant to be inclusive and non-limiting; the term "or" is inclusive, meaning and/or.
In this specification, the following descriptions of various embodiments, which explain the principles of the disclosure, are illustrative only and should not be construed in any way as limiting the scope of the disclosure. The following description, made with reference to the drawings, is intended to assist in a comprehensive understanding of exemplary embodiments of the disclosure as defined by the claims and their equivalents. The description includes various specific details to assist understanding, but these details are to be regarded as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications may be made to the embodiments described herein without departing from the scope and spirit of the disclosure. Moreover, descriptions of well-known functions and constructions are omitted for clarity and conciseness. Throughout the drawings, the same reference numerals refer to similar functions and operations.
Fig. 1 shows a flow chart of an output method 100 performed in a device 10 according to an embodiment of the present invention.
The device 10 in embodiments of the present invention may be any of various devices that can provide a voice interaction function (hereinafter also referred to as voice devices), such as a smart speaker, a responsive smart voice toy, a smartphone with a voice assistant, a computer, or a robot with a voice interaction function. Such voice devices generally include an audio collection apparatus, an audio output apparatus, a processing unit, and the like.
The method 100 starts when the user provides an audio input. In some embodiments, the device 10 may remain in a sound monitoring state after being powered on, and the method 100 starts when an audio input is detected. Preferably, in other embodiments, the user may notify the device of a subsequent voice input by means of a specific wake-up word (a preset phrase). In these embodiments, the method 100 starts only when the device 10 detects the wake-up word input by the user.
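The wake-word gating described above can be sketched as a simple state machine. The wake phrase below is a placeholder (the patent's example phrase is garbled in translation), and the text-matching detection is a stand-in for real keyword spotting.

```python
# Hedged sketch of wake-word gating: the method proceeds only for input
# that follows a detected wake word. The phrase and string matching are
# illustrative stand-ins for real keyword spotting.

WAKE_WORD = "hello speaker"  # placeholder wake phrase

def gated_inputs(utterances):
    """Yield only the utterances that follow a detected wake word."""
    awake = False
    for text in utterances:
        if awake:
            yield text       # this utterance is treated as the command
            awake = False    # require the wake word again next time
        elif text.strip().lower() == WAKE_WORD:
            awake = True

commands = list(gated_inputs([
    "background chatter",
    "hello speaker",
    "what is the weather tomorrow",
    "more chatter",
]))
```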
As shown in Fig. 1, in step S110, audio input data is collected through an audio collection apparatus.
According to embodiments of the present invention, the audio collection apparatus may be any of various devices, known or developed in the future, that have an audio collection function, such as a microphone. An audio collection apparatus may also be called a pickup, a monitoring head, or a pickup head, or may be any of various audio capture cards.
Preferably, in the case where the user notifies the device of a subsequent voice input by a specific wake-up word, the audio input data in step S110 generally refers to the audio input data collected by the audio collection apparatus after the wake-up word.
In step S120, audio output data responding to the collected audio input data is obtained.
Specifically, in step S120, the audio input data collected by the audio collection apparatus may be processed, the audio output data responding to the audio input data may be determined according to the processing result, and the determined audio output data may then be obtained.
The processing of the audio input data collected by the audio collection apparatus may be performed locally by the device 10, or may be performed by an external processing device.
Specifically, the device 10 may obtain the audio output data responding to the collected audio input data locally or from an external device. The device 10 may locally process the audio input data collected by the audio collection apparatus. For example, the device 10 may process the audio input data with a local processor, or the audio collection apparatus itself may even integrate a dedicated processor to process the audio input data.
Alternatively, the device 10 may send the audio input data collected by the audio collection apparatus to an external processing device for processing.
Whether performed locally in the device 10 or in an external processing device, the processing of the audio input data may include analysis of the content of the speech contained in the audio input data, and analysis of the audio attributes of the audio input data (such as duration in the time domain and/or spectral characteristics). Specifically, the processing of the audio input data may include, for example, noise filtering, speech recognition, and spectrum analysis.
According to the processing result of the audio input data, the audio output data responding to the audio input data may be determined. As one example, if the processing result of the audio input data indicates that the user is asking for the time, the current time may be used as the audio output data. As another example, when the user asks "What will the weather be like tomorrow?", the processing result obtained by performing speech recognition and semantic analysis on the audio input data indicates that the user wishes to know the weather; weather forecast information may then be searched for using Internet search technology, and the retrieved weather forecast information may be converted into an audio signal as the audio output data. As yet another example, when the user issues a "tell me a story" request, the processing result of the audio input data indicates that the user wishes to hear a story; a story may then be selected according to a predetermined policy, and the audio signal corresponding to the selected story may be used as the audio output data. The policy for selecting a story can take many forms. For example, a story may be randomly selected from a locally stored set of candidate stories and played. Alternatively, different candidate story subsets may be selected according to the gender/age of the speaker as determined by spectrum analysis of the audio input data, and a story may then be randomly selected from the chosen candidate story subset and played.
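The two-stage selection policy described above can be sketched as follows. The speaker-class labels and candidate story sets are illustrative assumptions; the actual gender/age classification from spectrum analysis is outside the scope of this sketch.

```python
# Sketch of the story-selection policy: pick a candidate subset based on
# an inferred speaker class, then choose randomly within that subset.
# The class labels and story titles are illustrative assumptions.
import random

STORY_SETS = {
    "child": ["The Tortoise and the Hare", "Little Red Riding Hood"],
    "adult": ["A detective story", "A history anecdote"],
}

def select_story(speaker_class, rng=None):
    """Random choice within the subset matching the inferred speaker class."""
    rng = rng or random.Random()
    subset = STORY_SETS.get(speaker_class, STORY_SETS["adult"])
    return rng.choice(subset)

story = select_story("child", rng=random.Random(0))
```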
The above operation of determining the audio output data responding to the audio input data may be performed locally by the device 10, or the device 10 may send the collected audio input data to an external processing device, which then performs the determination.
In addition, the device 10 may obtain the audio output data responding to the audio input data from local storage. Alternatively, the device 10 may receive the audio output data responding to the audio input data from an external device.
In the case where the device 10 sends the collected audio input data to an external processing device for processing, the external processing device may send the processing result of the audio input data back to the device 10, and the device 10 may then determine and obtain the audio output data responding to the audio input data. Alternatively, the external processing device may determine and obtain the audio output data responding to the audio input data according to the processing result of the audio input data, and then send the audio output data to the device 10.
In step S130, a target audio output parameter determined by analysis is obtained.
Optionally, the device 10 may determine the target audio output parameter by local analysis, thereby obtaining the target audio output parameter. It should be understood that the local processor that analytically determines the target audio output parameter and the local processor that processes the audio input data as described above may be the same processor, or may be different processors in the device 10.
Alternatively, the target audio output parameter may be determined by analysis in an external processing device. The device 10 may then receive the target audio output parameter from the external device. It should be understood that the external processing device that analytically determines the target audio output parameter and the external processing device that processes the audio input data as described above may be the same device, or may be different devices.
Whether the target audio output parameter is determined locally in the device 10 or in an external processing device, the analysis can be performed in various ways.
In some embodiments, the target audio output parameter matching the audio input data may be determined by analyzing the audio input data.
As an example, the target audio output parameter may be determined according to the audio parameters corresponding to the audio input data. The audio parameters corresponding to the audio input data may include, for example, at least one of the following: a volume parameter, a prosody parameter, and a timbre parameter. In particular, the volume parameter may represent how loud the input is. The prosody parameter may represent how fast the speech is. The timbre parameter may represent, for example, a male/female/child voice. These parameters can generally be obtained by performing power spectrum analysis and spectrum analysis on the audio input data.
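Two of these parameters can be crudely estimated directly from raw samples, as sketched below. This is a simplified stand-in for the power-spectrum and spectrum analysis mentioned in the text, not the patent's actual algorithm: RMS amplitude approximates the volume parameter, and zero-crossing rate serves as a rough frequency proxy relevant to timbre.

```python
# Simplified stand-ins for volume and timbre estimation from raw samples;
# real systems would use proper power-spectrum / spectral analysis.
import math

def rms_volume(samples):
    """Root-mean-square amplitude as a simple volume measure."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def zero_crossing_rate(samples, sample_rate):
    """Crude frequency proxy (Hz); higher values suggest a higher-pitched voice."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return crossings * sample_rate / (2 * len(samples))

# A pure 200 Hz tone sampled at 8 kHz should measure close to 200 Hz
# and have RMS amplitude 1/sqrt(2):
rate = 8000
tone = [math.sin(2 * math.pi * 200 * n / rate) for n in range(rate)]
f_est = zero_crossing_rate(tone, rate)
v = rms_volume(tone)
```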
In such implementations, the target audio output parameter may be consistent with, or opposite to, the audio parameters of the audio input data.
As an example, if analysis shows that the volume of the audio input data is low, the target audio output volume is determined to be low; conversely, if the volume of the audio input data is high, the target audio output volume is determined to be high. This suits scenarios in which the user whispers in a quiet environment and speaks loudly in a noisy one. In such an implementation, in a quiet environment such as late at night, the user whispers and the device outputs at a low volume without disturbing others; during the day, the user speaks loudly and the device outputs at a high volume, so that the user does not miss the answer.
As another example, if analysis shows that the volume of the audio input data is low, the target audio output volume is determined to be high; conversely, if the volume of the audio input data is high, the target audio output volume is determined to be low. This suits situations where a low input volume is caused by distance: a low input volume suggests that the user is far from the device, so a high output volume is needed for the user to hear, while a high input volume suggests that the user is near the device, so a low output volume suffices.
As an example, if analysis shows that the prosody of the audio input data indicates fast speech, the prosody parameter of the target audio output is determined to also indicate fast speech; conversely, if the prosody of the audio input data indicates slow speech, the prosody parameter of the target audio output is determined to also indicate slow speech. Alternatively, if the prosody of the audio input data indicates fast speech, the output duration of the target audio output uses a first duration; otherwise, it uses a second duration, where the first duration is shorter than the second. In this way, rapid input corresponds to rapid output, meeting the needs of a user in a hurry.
As another example, if the prosody of the audio input data indicates fast speech, the prosody parameter of the target audio output is determined to indicate slow speech. In this way, rapid input corresponds to slow output, which can soothe the user's mood.
As another example, if the timbre of the audio input data indicates a male/female voice, the timbre parameter of the target audio output is determined to indicate a female/male voice. Alternatively, if the timbre of the audio input data indicates a male/female/child voice, the timbre parameter of the target audio output is determined to correspondingly indicate a male/female/child voice.
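The matching rules above can be sketched as a small mapping function. The coarse labels ("low"/"high", "fast"/"slow") and the mode switch are illustrative assumptions, not terms from the patent.

```python
# Sketch of the matching rules: output parameters either mirror the input
# (consistency) or invert it (opposition). Labels and thresholds are
# illustrative assumptions.

def match_output_params(input_params, volume_mode="consistent"):
    """input_params: {'volume': 'low'|'high', 'rate': 'fast'|'slow',
    'timbre': 'male'|'female'|'child'} — illustrative labels only."""
    out = {}
    if volume_mode == "consistent":
        out["volume"] = input_params["volume"]      # whisper -> quiet reply
    else:
        # "opposite": low input implies a distant user, so answer loudly
        out["volume"] = "high" if input_params["volume"] == "low" else "low"
    out["rate"] = input_params["rate"]              # hurried user, quick reply
    # Opposite-gender timbre variant described in the text:
    out["timbre"] = {"male": "female", "female": "male"}.get(
        input_params["timbre"], input_params["timbre"])
    return out

params = match_output_params(
    {"volume": "low", "rate": "fast", "timbre": "male"}, volume_mode="opposite")
```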
In a preferred embodiment, the audio input data may be analyzed in combination with the source distance to determine the matching target audio output parameter. This takes into account that the audio parameters of the audio input data collected by the device 10 depend not only on the corresponding audio parameters of the sound source but are also affected by the source distance.
The source distance can be determined in several ways.
In some examples, waveform analysis may be performed on the audio input data collected by the audio collection apparatus to determine the distance of the sound source from the audio collection apparatus. For example, because high-frequency components and low-frequency components attenuate and are delayed differently with distance, the distance of the sound source from the audio collection apparatus may be determined by analyzing the waveform of the audio input data.
In other examples, the source distance may be estimated by a microphone array. In these examples, the audio collection apparatus of the device 10 includes a microphone array, and the source distance may be estimated through it. Preferably, in order for the estimation accuracy to be substantially the same in all directions, an equilateral-triangle microphone arrangement may be used, in which three microphones are placed at the three vertices of an equilateral triangle. Alternatively, a square microphone arrangement may be used, in which four microphones are placed at the four vertices of a square. Using the different delay times produced by the different distances from the sound source to each microphone of the array, the distance of the sound source can be estimated according to the specific arrangement (e.g., equilateral triangle or square) of the microphone array.
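The geometry behind this delay-based estimate can be sketched as a brute-force search: for each candidate source position, predict the inter-microphone arrival-time differences and keep the position that best matches the measured ones. Real systems estimate the delays with cross-correlation techniques; this sketch assumes the delays are already known and only illustrates the triangle-array geometry.

```python
# Hedged sketch of near-field localization with a small microphone array:
# grid-search the candidate position whose predicted inter-mic delays best
# match the measured time differences of arrival (TDOA).
import math

C = 343.0  # approximate speed of sound in air, m/s

def delays_from(pos, mics):
    """Arrival-time differences of each microphone relative to mic 0."""
    t = [math.dist(pos, m) / C for m in mics]
    return [ti - t[0] for ti in t]

def locate(measured_tdoa, mics, extent=3.0, step=0.05):
    """Return the grid position minimizing squared TDOA mismatch."""
    best, best_err = None, float("inf")
    x = -extent
    while x <= extent:
        y = -extent
        while y <= extent:
            pred = delays_from((x, y), mics)
            err = sum((p - m) ** 2 for p, m in zip(pred, measured_tdoa))
            if err < best_err:
                best, best_err = (x, y), err
            y += step
        x += step
    return best

# Equilateral-triangle array (side 0.1 m), as in the preferred arrangement:
mics = [(0.0, 0.0), (0.1, 0.0), (0.05, 0.1 * math.sqrt(3) / 2)]
true_src = (1.0, 0.5)
est = locate(delays_from(true_src, mics), mics)
```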
In still other examples, the source distance may be estimated by an image collection apparatus. In these examples, the device 10 includes an image collection apparatus (such as a camera). Preferably, the device 10 may include more than one camera, for example multiple cameras facing different directions. In some implementations, the source distance may be determined by detecting whether there is a user within the field of view of the image collection apparatus. For example, if a user is detected within the field of view of the image collection apparatus by face detection technology, the source distance is determined to be near; otherwise, it is determined to be far. In other implementations, one or more images may be collected by the image collection apparatus, and the collected images may be analyzed by a local processor or an external processing device to determine the source distance. For example, if the analysis result indicates that the same person appears in multiple images collected consecutively by the same camera, and the opening and closing of that person's mouth differs between the images, the person is regarded as the supplier of the audio input signal (i.e., the sound source), and the source distance is determined to be near; otherwise, the source distance is determined to be far.
In still other examples, the source distance may be estimated by an infrared rangefinder. In these examples, the device 10 includes an infrared rangefinder. The device 10 may first determine the direction of the sound source through the audio collection apparatus (e.g., a microphone), and then measure the distance of the sound source in that direction using the infrared rangefinder.
After the source distance is determined, the audio input data may be analyzed in combination with the source distance to determine the matching target audio output parameter.
The target output volume may simply be determined to be directly proportional to the volume of the audio input data and inversely proportional to the source distance. Sound power W can generally be used as the metric for measuring volume, with the watt (W) as its unit. For example, the target output volume may be determined by the following formula:
W_out = k0 × W_input / r (1),
where W_out represents the target output volume, W_input represents the volume of the audio input data, r represents the distance between the audio collection apparatus and the sound source, and k0 is a constant.
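The proportionality just described (output volume proportional to the measured input volume and inversely proportional to the source distance) can be written directly as code. The constant k0 is a tuning assumption.

```python
# The simple distance-aware volume rule as code: output proportional to
# measured input volume, inversely proportional to source distance.
# k0 is a tuning constant (an assumption, not specified by the text).

def target_volume_simple(w_input, r, k0=1.0):
    if r <= 0:
        raise ValueError("source distance must be positive")
    return k0 * w_input / r

w = target_volume_simple(w_input=0.008, r=2.0)
```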
Alternatively, the audio input data may be analyzed in combination with the source distance to determine the audio parameters of the sound source itself, and the target audio output parameter matching the audio parameters of the sound source may then be determined.
Taking volume as an example, the volume collected by the device 10 (that is, the volume corresponding to the collected audio input data) does not directly characterize the volume of the sound source, but is also affected by the distance of the sound source from the device 10. Sound intensity I can generally be used as the metric for measuring how strong the sound is at a point, with watts per square meter (W/m²) as its unit, and sound power W can be used as the metric for measuring the volume of the source, with the watt (W) as its unit. Considering that a human voice command is a point source, its propagation can be treated as a spherical wave. For a spherical wave, the sound intensity I is directly proportional to the sound power W of the point source and inversely proportional to the square of the distance r, as shown in the following formula:
I = W / (4πr²) (2).
Based on the acoustic inverse square law of formula (2), the sound power W of the sound source can be derived from the sound intensity I actually measured at the measurement point (the device 10) and the source distance r:
W = I × 4πr² (3).
Then, by analyzing the audio input data in combination with the source distance, the source volume W can be determined, and the target audio output volume can be set proportional to the source volume W. When the source volume is high, the target output volume is also high; conversely, when the source volume is low, the target output volume is also low. For example, the target output volume may be determined by the following formula:
W_out = k1 × W (4),
where W_out represents the target output volume, W represents the source volume, r represents the distance between the audio collection apparatus and the sound source, and k1 is a constant.
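The inverse-square-law derivation can be checked numerically: recover the source's sound power from the intensity measured at the device and the source distance, then set the output volume proportional to it. The constant k1 and the sample intensity values are assumptions.

```python
# Formulas (2)-(4) as code: estimate the source's sound power from the
# measured intensity and distance, then scale the output volume by it.
# k1 and the sample values are tuning assumptions.
import math

def source_power(intensity, r):
    """Formula (3): W = I * 4 * pi * r^2 (point source, spherical wave)."""
    return intensity * 4 * math.pi * r ** 2

def target_volume(intensity, r, k1=1.0):
    """Formula (4): output volume proportional to estimated source power."""
    return k1 * source_power(intensity, r)

# The same measured intensity implies a louder source when it is farther away:
near = target_volume(intensity=1e-6, r=0.5)
far = target_volume(intensity=1e-6, r=2.0)
```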
The target audio output parameter that matching is determined by combining source of sound distance analysis audio input data is described above
Some embodiments.These embodiments there may be change in implementing.
For example, in certain embodiments, the determination of target audio output parameter had always both considered the sound of audio input data
Frequency parameter, it is also considered that source of sound distance.
In other embodiments, the source distance is considered only when an audio parameter of the audio input data satisfies a certain condition (for example, when the volume of the audio input data is low). As an example, the audio input data is analyzed first. If the analysis indicates that the volume of the audio input data is high, the target audio output volume is determined to be high, and there is no need to detect the source distance. If the analysis indicates that the volume is low, the source distance is detected, and the target audio output volume is then determined from the combination of the input volume and the source distance (for example, using formula (1) above). With this implementation, the device can distinguish whether the user is whispering or speaking loudly, and in particular whether the user is whispering nearby or speaking loudly from far away, so it can adaptively adjust the output volume and improve the user experience.
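The decision flow just described, using the input volume directly when it is high and falling back to combining volume with the detected source distance only when it is low, can be sketched as follows. This is an illustrative sketch rather than the patented implementation: the threshold, the constants k1 and k2, and the linear combination standing in for formula (1) are all assumptions.

```python
from typing import Optional

def target_output_volume(input_volume: float,
                         source_distance: Optional[float],
                         loud_threshold: float = 60.0,
                         k1: float = 1.0,
                         k2: float = 2.0) -> float:
    """Choose a target output volume from the measured input volume,
    consulting the source distance only when the input is quiet."""
    if input_volume >= loud_threshold:
        # Loud input: output loudly; no need to detect the source distance.
        return k1 * input_volume
    # Quiet input: combine volume and distance, so that whispering far
    # away yields a somewhat higher output than whispering close by.
    distance = source_distance if source_distance is not None else 0.0
    return k1 * input_volume + k2 * distance
```

In this sketch a loud input skips distance detection entirely, mirroring the embodiment in which the distance sensor is consulted only for quiet inputs.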
In still other embodiments, an environment parameter may be analyzed to determine a target audio output parameter matching that environment parameter. The environment parameter may include, for example, the ambient brightness, which can be detected by a photodetector. When the detected ambient brightness is high, the target audio output parameter (such as volume) may be set higher; when the ambient brightness is low, it may be set lower. This implementation can adaptively output a high volume during the day and a low volume late at night, reducing disturbance to others.
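A minimal sketch of this brightness-to-volume mapping might look like the following. The lux scale, the volume steps, and the linear interpolation are illustrative assumptions, since the embodiment specifies only that brighter surroundings yield a higher volume.

```python
def volume_from_brightness(brightness_lux: float,
                           min_volume: int = 2,
                           max_volume: int = 10,
                           bright_lux: float = 500.0) -> int:
    """Map detected ambient brightness to an output volume step:
    brighter surroundings (daytime) -> higher volume,
    darker surroundings (late night) -> lower volume."""
    frac = max(0.0, min(1.0, brightness_lux / bright_lux))
    return round(min_volume + frac * (max_volume - min_volume))
```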
In some other embodiments, the behavior or attributes of the inputter of the audio input data may be analyzed to determine a target audio output parameter matching that behavior or those attributes.
In this example, the device 10 includes an image collection apparatus (such as a camera). The collected images can be analyzed with image recognition technology (such as face recognition) to determine the user's behavior or attributes.
The inputter's behavior may include, for example, gestures or actions indicating a low volume, and/or gestures or actions indicating a high volume. Gestures or actions indicating a low volume may include, for example, the user placing an index finger in front of the mouth, pressing a palm downward along the direction of gravity, or walking closer to the device while speaking. Gestures or actions indicating a high volume may include, for example, raising a palm upward, cupping both hands in front of the mouth like a horn, or holding a palm beside the mouth with the mouth and the device on the same side of the palm.
One or more images of the inputter are collected by the image collection apparatus, and the user's gesture or action is determined by image recognition technology.
For example, if the recognition result shows the user placing a finger (preferably the index finger) in front of the mouth, the volume of the target audio output parameter may be set low.
If the recognition result shows the user pressing a palm downward along the direction of gravity, or raising a palm upward, the volume of the target audio output parameter may be set low or high, respectively.
If the recognition result shows the user walking closer to the device while speaking, the volume of the target audio output parameter may be set low.
If the recognition result shows the user cupping both hands in front of the mouth like a horn, the volume of the target audio output parameter may be set high.
If the recognition result shows the user's mouth and the device on the same side of the palm, the volume of the target audio output parameter may be set high.
These implementations can adaptively adjust the audio output parameters according to the user's body movements, improving the user experience.
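The gesture rules above amount to a lookup table. A sketch follows, with hypothetical gesture labels; the patent does not define a label set for its recognizer, so the names and the fallback behavior are assumptions.

```python
# Hypothetical gesture labels, as an image-recognition module might emit them.
GESTURE_TO_VOLUME = {
    "finger_before_mouth":    "low",   # index finger in front of the mouth
    "palm_pressed_down":      "low",   # palm pressed downward along gravity
    "approach_while_talking": "low",   # walking closer to the device
    "palm_raised_up":         "high",  # raising the palm upward
    "hands_cupped_horn":      "high",  # both hands cupped like a horn
    "palm_beside_mouth":      "high",  # palm beside the mouth, facing the device
}

def volume_for_gesture(gesture: str, default: str = "unchanged") -> str:
    """Resolve a recognized gesture to a target volume level,
    leaving the volume unchanged for unrecognized gestures."""
    return GESTURE_TO_VOLUME.get(gesture, default)
```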
The inputter's attributes may include, for example, parameters indicating the inputter's sex, age, and so on. An image of the inputter can be collected by an image collection apparatus such as a camera, and the inputter's sex and/or age determined by image analysis technology (such as face recognition). The target audio output parameter is then determined according to the inputter's sex and/or age.
For example, if the recognition result shows that the inputter is elderly, the target audio output volume may be determined to be high.
If the recognition result shows that the inputter is male, female, or a child, the target audio output timbre may be determined to be a male, female, or child voice, respectively.
These implementations can adaptively adjust the audio output parameters according to the user's sex, age, and similar attributes, improving the user experience.
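These attribute rules can likewise be sketched as a small mapping. The age cutoffs (under 12 for a child voice, 65 and over for a high volume) and the profile field names are illustrative assumptions, as the patent does not fix numeric thresholds.

```python
from typing import Optional

def output_profile(sex: Optional[str], age: Optional[int]) -> dict:
    """Derive an output timbre and a volume hint from the inputter's
    recognized attributes (sex and/or age)."""
    timbre = {"male": "male_voice", "female": "female_voice"}.get(sex, "default")
    if age is not None and age < 12:
        timbre = "child_voice"  # a child gets a child voice regardless of sex
    # Elderly inputters get a high target output volume.
    volume = "high" if (age is not None and age >= 65) else "normal"
    return {"timbre": timbre, "volume": volume}
```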
It should be understood that the above are only illustrative examples of analyses that determine the target audio output parameter; the invention is not limited to them. For example, two or more of the audio input data, the source distance, the environment parameter, and/or the inputter's behavior or attributes may be considered together to determine the target audio output parameter.
It should also be understood that maximum and/or minimum thresholds may be set so that the target audio output parameter is confined to a threshold range.
In step S140, the audio output device is controlled, according to the target audio output parameter, to output the audio output data obtained in step S120.
The device 10 can adjust the audio output parameters of its audio output device according to the target audio output parameter. If the target audio output parameter indicates an adjustment amount, the current audio output parameter is adjusted accordingly. If the target audio output parameter indicates an absolute value, and the current audio output parameter differs from the target audio output parameter obtained in step S130, the current audio output parameter is set to the target value. Preferably, it should be ensured that the adjusted audio output parameter still lies within the threshold range.
The audio output device then outputs, under the audio output parameters so set, the audio output data obtained in step S120.
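Step S140's handling of the two kinds of target parameter, an adjustment amount versus an absolute value, together with the preferred threshold clamping, might be sketched as follows. The ("delta", x)/("abs", x) tagging and the [vmin, vmax] range are assumed for illustration.

```python
def apply_target(current: float, target,
                 vmin: float = 0.0, vmax: float = 100.0) -> float:
    """Apply a target audio output parameter to the current setting.
    A ("delta", x) target adjusts the current value by x; an ("abs", x)
    target replaces it (a no-op when it already matches). The result is
    clamped so the adjusted parameter stays within [vmin, vmax]."""
    kind, value = target
    new = current + value if kind == "delta" else float(value)
    return max(vmin, min(vmax, new))
```

Clamping after either branch ensures the adjusted parameter never leaves the threshold range, whichever form the target takes.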
Thus, the method according to embodiments of the present invention can adaptively adjust the audio output parameters of the voice interaction device, produce a suitable audio output, and improve the user experience.
Below, taking volume as the example audio parameter, the method 100 of the embodiments of the present invention is described in further detail with reference to Fig. 2 to Fig. 4.
Fig. 2 shows a flowchart of an output method 200 in the voice interaction device 10 according to an embodiment of the present invention. Method 200 is a specific embodiment of method 100. In this embodiment, the audio input data is analyzed in combination with the source distance to determine a target audio output volume matching the input volume.
As shown, method 200 starts at step S210.
In step S210, audio input data is collected by the audio collection device. Step S210 is a specific implementation of step S110. In particular, in this embodiment, the audio collection device includes a PU probe, and step S210 also includes measuring the sound pressure p and the direct sound particle velocity u of the input audio with the PU probe.
In step S212, the distance between the sound source (the inputter providing the audio input) and the device 10, referred to as the source distance r, is detected using a depth camera. The depth camera may be arranged in the device 10 as a part of it, or may be arranged near the device 10.
In step S220, audio output data responding to the collected audio input data is obtained.
In step S230, the target audio output parameter determined by analysis is obtained.
In this embodiment, in particular, the audio input data is analyzed in combination with the source distance to determine the matching target audio output parameter. Specifically, the sound intensity I of the input audio at the device 10 can be calculated from the sound pressure p and the direct sound particle velocity u measured in step S210. Then, according to formula (2), the sound power W of the source can be estimated from the calculated sound intensity I at the device 10 and the source distance r determined in step S212, and used as a measure of the source volume. The target audio output volume can then be set proportional to the source volume: when the source volume is high, the target output volume is also high; when the source volume is low, the target output volume is also low.
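The estimation in steps S210 to S230 can be sketched as follows under the standard free-field assumption that intensity falls off as 1/(4πr²) from a point source; whether formula (2) in the patent uses exactly this spherical-spreading model is an assumption here.

```python
import math

def source_power(p: float, u: float, r: float) -> float:
    """Estimate the sound power W of the source, used as a measure of
    the source volume.
    p: sound pressure [Pa] and u: particle velocity [m/s], both measured
    at the device by the PU probe; r: source distance [m] from the depth
    camera. The sound intensity at the device is I = p * u; assuming
    free-field spherical spreading, the source power is
    W = I * 4 * pi * r**2."""
    intensity = p * u  # sound intensity I at the device
    return intensity * 4.0 * math.pi * r ** 2
```

With W in hand, the target output volume can be set proportional to it, for example via W_out = k1 × W as in formula (4) above.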
As before, the above analysis that determines the target audio output parameter can be performed locally in the device 10, or can be performed by an external processing device. In addition, the depth camera can be communicatively coupled to the device 10 and/or to the external processing device that performs the analysis.
In step S240, the audio output device is controlled, according to the target audio output parameter, to output the audio output data obtained in step S220.
Method 200 can distinguish whether the user is whispering or speaking loudly, and in particular whether the user is whispering nearby or speaking loudly from far away, and can accordingly provide a suitable audio output volume, improving the user experience.
Steps S220 and S240 in method 200 are identical to steps S120 and S140 in method 100. Apart from the special features noted above, steps S210 and S230 in method 200 are similar to steps S110 and S130 in method 100. Parts of method 200 that are similar to method 100 are therefore not repeated here.
Fig. 3 shows a flowchart of an output method 300 in the voice interaction device 10 according to another embodiment of the present invention. Method 300 is a specific embodiment of method 100. In this embodiment, an environment parameter is analyzed to determine a target audio output parameter matching the environment parameter.
As shown, method 300 starts at step S310.
In step S310, audio input data is collected by the audio collection device.
In step S320, audio output data responding to the collected audio input data is obtained.
In step S330a, the environment parameter is detected by a detector such as a sensor; for example, the ambient brightness is detected by a light sensor. The sensor may be arranged in the device 10 as a part of it, or may be arranged near the device 10.
In step S330, the target audio output parameter determined by analysis is obtained.
In this embodiment, in particular, the environment parameter is analyzed to determine the target audio output parameter matching it. For example, if the ambient brightness detected in step S330a is high, the target audio output volume may be set higher; if the ambient brightness is low, it may be set lower.
As before, this analysis can be performed locally in the device 10 or by an external processing device. In addition, the detector that senses the environment can be communicatively coupled to the device 10 and/or to the external processing device that performs the analysis.
In step S340, the audio output device is controlled, according to the target audio output parameter, to output the audio output data obtained in step S320.
Method 300 can adaptively output a high volume during the day and a low volume late at night, reducing disturbance to others at night.
Steps S310, S320, and S340 in method 300 are identical to steps S110, S120, and S140 in method 100. Apart from the special features noted above, step S330 in method 300 is a specific implementation of step S130 in method 100. Parts of method 300 that are similar to method 100 are therefore not repeated here.
Fig. 4 shows a flowchart of an output method 400 in the voice interaction device 10 according to yet another embodiment of the present invention. Method 400 is a specific embodiment of method 100. In this embodiment, the behavior or attributes of the inputter of the audio input data are analyzed to determine a target audio output parameter matching that behavior or those attributes.
As shown, method 400 starts at step S410.
In step S410, audio input data is collected by the audio collection device.
In step S420, audio output data responding to the collected audio input data is obtained.
In step S430a, the gesture and/or attributes of the inputter are detected. The detector for the inputter's gesture and/or attributes may include an optical module (such as a camera) and an image processing module. For example, an image of the inputter can first be collected by an image collection apparatus such as a camera, and the inputter's gesture/action or attributes determined by image analysis technology (such as face recognition). The inputter's behavior may include, for example, gestures or actions indicating that the volume should be turned up (speaking loudly), such as raising a palm upward or speaking with the mouth wide open; or gestures or actions indicating that the volume should be turned down (whispering), such as speaking with barely moving lips, pressing a palm downward, or walking closer to the device while speaking. The inputter's attributes may include, for example, parameters indicating the inputter's sex, age, and so on.
In step S430, the target audio output parameter determined by analysis is obtained.
In this embodiment, in particular, the behavior or attributes of the inputter of the audio input data are analyzed to determine the target audio output parameter matching that behavior or those attributes. For example, if a gesture or action indicating loud speech (turn the volume up) or whispering (turn the volume down) is detected in step S430a, the target audio output volume can be correspondingly set high or low, or set to be turned up or down. Alternatively or additionally, if the inputter's sex detected in step S430a is male, female, or child, the target audio output timbre can be determined to be a male, female, or child voice, respectively. Optionally, if the inputter's age indicates an elderly person, the target audio output volume can be determined to be high (or further turned up).
As before, the above analysis that determines the target audio output parameter can be performed locally in the device 10, or can be performed by an external processing device communicatively coupled to the device 10.
In step S440, the audio output device is controlled, according to the target audio output parameter, to output the audio output data obtained in step S420.
Method 400 can adaptively adjust the audio output parameters according to the user's sex, age, and similar attributes, improving the user experience.
Steps S410, S420, and S440 in method 400 are identical to steps S110, S120, and S140 in method 100. Apart from the special features noted above, step S430 in method 400 is a specific implementation of step S130 in method 100. Parts of method 400 that are similar to method 100 are therefore not repeated here.
Fig. 5 shows a block diagram of an example implementation of the device 10 according to embodiments of the present invention.
The device 10 according to embodiments of the present invention may be any of various devices that can provide a voice interaction function, such as a smart speaker, a responsive intelligent voice toy, a smartphone with a voice assistant, a robot with a voice interaction function, a computer, and so on.
As shown, the device 10 includes an audio collection device 11, an audio output device 12, a processing device 13, and a memory 14.
The audio collection device 11 is configured to collect audio input data. The audio collection device 11 can be any known or future device with an audio collection function, such as a microphone. Optionally, the audio collection device 11 may have a built-in speech processing card that processes the collected audio input data, for example recognizing voice commands in the audio input data, or recognizing natural speech and performing semantic analysis. Alternatively or additionally, the audio collection device 11 may have a built-in PU probe for measuring the sound pressure, the direct sound particle velocity, and similar properties of the input audio.
The audio output device 12 is configured to output audio output data. The audio output device 12 can be any known or future device with an audio output function, such as a loudspeaker. Although the audio collection device 11 and the audio output device 12 are shown as separate devices in this specification, it should be understood that the two can be integrated and implemented as a single audio device with both audio transmitting and receiving functions.
The processing device 13 is configured to control the overall operation of the device 10. For example, the processing device 13 may be configured to: control the audio collection device to collect audio input data; obtain audio output data responding to the audio input data; obtain the target audio output parameter determined by analysis; and control the audio output device, according to the target audio output parameter, to output the audio output data.
Optionally, the processing device 13 may also be configured to process the audio input data collected by the audio collection device.
Optionally, the processing device 13 may also be configured to, in response to the audio input data, generate the corresponding audio output data or retrieve it locally or from outside (such as the Internet, an external cloud, or another device).
Optionally, the processing device 13 may also be configured to perform the analysis that determines the target audio output parameter.
The processing device 13 can be realized, individually or jointly, in hardware, software, firmware, or substantially any combination of them. Optionally, the processing device 13 may include one or more application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), or other integrated devices. Alternatively or additionally, the processing device 13 may include a processor (or microprocessor) and a memory storing one or more computer programs executable by the processor.
In operation, the processing device 13 can perform control, communication, or data-processing operations involving the other components of the device 10 (for example, the audio collection device 11, the audio output device 12, and/or optional other components).
The memory 14 can store data and programs. The memory 14 can include various permanent or temporary storage media.
Optionally, the device 10 may also include a communication device 15. The communication device 15 is configured to communicate with the outside (external devices, including external clouds). For example, the communication device 15 may be configured to send the audio input data collected by the audio collection device 11 to an external device, and/or to receive from the external device the audio output data responding to the audio input data. Alternatively or additionally, the communication device 15 may send data detected by the device 10 to an external device, and/or receive the target audio output parameter from an external device. The communication device 15 can include a wireless or wired communication interface and can support various suitable communication standards; the embodiments of the present invention are not limited in this respect.
In some embodiments, the audio collection device 11 of the device 10 includes a microphone array. The processing device 13 is further configured to: estimate the source distance based on the microphone array, and obtain the target audio output parameter determined by analyzing the audio input data in combination with the source distance.
In some embodiments, the device 10 may also include at least one sensor configured to detect an environment parameter. The processing device is further configured to obtain the target audio output parameter, matching the environment parameter, determined by analyzing the environment parameter.
In some embodiments, the device 10 may also include an image collection apparatus for collecting images of the surroundings of the device 10. For example, the image collection apparatus can collect images of the inputter of the audio input data. The processing device 13 is further configured to obtain the target audio output parameter, matching the inputter's behavior or attributes, determined by analyzing the behavior or attributes of the inputter of the audio input data.
The device 10 can be used to perform the methods according to embodiments of the present invention, such as methods 100 to 400. For the specific operation of the device 10, reference may be made to the above description of methods 100 to 400, which is not repeated here.
Those skilled in the art will understand that Fig. 5 shows only the parts of the device 10 related to the present invention, to avoid obscuring it. However, those skilled in the art will also understand that the device 10 according to embodiments of the present invention may further include other basic units that make up a concrete voice interaction device, even though they are not shown in Fig. 5.
The voice interaction device according to embodiments of the present invention can adaptively adjust its audio output parameters, produce a suitable audio output, improve the user experience, and improve the competitiveness of the product.
Fig. 6 schematically shows a block diagram of an example of the processing device 13 for performing the processing of the methods described with reference to Figs. 1 to 4 according to embodiments of the present application. As shown in Fig. 6, the processing device 13 includes a processing unit or processor 136. The processor 136 can be a single unit or a combination of multiple units for performing the different steps of the methods. The processing device 13 may also include: an input unit 132 for receiving signals from other devices or components (for example, the audio collection device and sensors connected to it); and an output unit 134 for providing signals to other devices or components (for example, the audio output device and communication device connected to it). The input unit and the output unit can also be arranged as a single whole.
In addition, the processing device 13 includes a memory 138, in which a computer program 139 is stored.
The computer program 139 can include code/computer-executable instructions that, when executed by the processor 136, cause the processor 136 to perform, for example, the method flows described above in conjunction with Figs. 1 to 4 and any variations of them.
The computer program 139 can be configured to have computer program code including, for example, computer program modules. For example, in an exemplary embodiment, the code in the computer program 139 can include one or more program modules, for example module 139A, module 139B, and so on. It should be noted that the division and number of modules are not fixed; those skilled in the art can use suitable program modules, or combinations of program modules, according to the actual situation. When such a combination of program modules is executed by the processor 136, the processor 136 can perform, for example, the method flows described above in conjunction with Figs. 1 to 4 and any variations of them.
The present invention has been described above in conjunction with preferred embodiments. Those skilled in the art will understand that the apparatuses and methods shown above are merely exemplary. A device of the present invention may include more or fewer components than those shown. The methods of the present invention are not limited to the steps and order illustrated above. Those skilled in the art can make many changes and modifications according to the teaching of the illustrated embodiments.
The above methods, devices, units, and/or modules according to the embodiments of the present application can be realized by an electronic device with computing capability executing software containing computer instructions. The system can include a storage device to realize the various kinds of storage described above. The electronic device with computing capability can include, but is not limited to, a general-purpose processor, a digital signal processor, a dedicated processor, a reconfigurable processor, or any other device capable of executing computer instructions. Executing such instructions causes the electronic device to be configured to perform the above operations according to the present application. The above devices and/or modules can be realized in one electronic device, or in different electronic devices. The software can be stored in a computer-readable storage medium. The computer-readable storage medium stores one or more programs (software modules) comprising instructions that, when executed by one or more processors in an electronic device, cause the electronic device to perform the methods of the present application.
The software can be stored in the form of volatile memory or a non-volatile storage device (such as a ROM-like storage device), whether erasable or rewritable, or in the form of memory (such as RAM, a memory chip, a device, or an integrated circuit), or on an optically or magnetically readable medium (such as a CD, a DVD, a magnetic disk, or a magnetic tape). It should be understood that such storage devices and storage media are embodiments of machine-readable storage suited to storing one or more programs comprising instructions that, when executed, realize the embodiments of the present application. The embodiments provide programs, and machine-readable storage devices storing such programs, where the programs comprise code for realizing the apparatus or method described in any claim of the present application. Furthermore, these programs can be transmitted electronically via any medium (for example, a communication signal carried over a wired or wireless connection), and the embodiments suitably include them.
The methods, devices, units, and/or modules according to the embodiments of the present application can also be realized in hardware or firmware, for example a field-programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on a substrate, a system in a package, an application-specific integrated circuit (ASIC), or any other reasonable means of integrating or packaging a circuit; or by an appropriate combination of the three implementations of software, hardware, and firmware. The system can include a storage device to realize the storage described above. When realized in these ways, the software, hardware, and/or firmware used is programmed or designed to perform the corresponding methods, steps, and/or functions according to the present application. Those skilled in the art can, according to actual needs, realize one or more of these systems and modules, or one or more parts of them, using different implementations described above. All of these implementations fall within the protection scope of the present application.
Although the present application has been shown and described with reference to certain exemplary embodiments, those skilled in the art should understand that various changes in form and detail can be made without departing from the spirit and scope defined by the appended claims and their equivalents. Therefore, the scope of the present application should not be limited to the above embodiments, but should be determined not only by the appended claims but also by their equivalents.
Claims (10)
1. An output method, comprising:
collecting audio input data by an audio collection device;
obtaining audio output data responding to the audio input data;
obtaining a target audio output parameter determined by analysis; and
controlling an audio output device, according to the target audio output parameter, to output the audio output data.
2. The method according to claim 1, wherein the analysis comprises:
analyzing the audio input data to determine a target audio output parameter matching the audio input data.
3. The method according to claim 2, wherein analyzing the audio input data to determine the target audio output parameter matching the audio input data comprises:
determining the target audio output parameter according to an audio parameter corresponding to the audio input data.
4. The method according to claim 2, wherein analyzing the audio input data to determine the target audio output parameter matching the audio input data comprises:
analyzing the audio input data in combination with a source distance to determine the target audio output parameter.
5. The method according to claim 1, wherein the analysis comprises:
analyzing an environment parameter to determine a target audio output parameter matching the environment parameter.
6. The method according to claim 1, wherein the analysis comprises:
analyzing a behavior or attribute of an inputter of the audio input data to determine a target audio output parameter matching the behavior or attribute of the inputter.
7. A processing device, comprising:
an audio collection device;
an audio output device; and
a processing unit configured to:
collect audio input data by the audio collection device;
obtain audio output data responding to the audio input data;
obtain a target audio output parameter determined by analysis; and
control the audio output device, according to the target audio output parameter, to output the audio output data.
8. The processing device according to claim 7, wherein
the audio collection device comprises a microphone array, and
the processing unit is further configured to: estimate a source distance based on the microphone array, and obtain the target audio output parameter determined by analyzing the audio input data in combination with the source distance.
9. The processing device according to claim 7, further comprising: at least one sensor configured to detect an environment parameter;
wherein the processing unit is further configured to: obtain the target audio output parameter, matching the environment parameter, determined by analyzing the environment parameter.
10. The processing device according to claim 7, further comprising: an image collection apparatus for collecting an image of an inputter of the audio input data;
wherein the processing unit is configured to: obtain the target audio output parameter, matching the behavior or attribute of the inputter, determined by analyzing the behavior or attribute of the inputter of the audio input data.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710199965.1A CN106782544A (en) | 2017-03-29 | 2017-03-29 | Interactive voice equipment and its output intent |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106782544A true CN106782544A (en) | 2017-05-31 |
Family
ID=58967992
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710199965.1A Pending CN106782544A (en) | 2017-03-29 | 2017-03-29 | Interactive voice equipment and its output intent |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106782544A (en) |
Patent Citations (18)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102522961A (en) * | 2005-12-07 | 2012-06-27 | 苹果公司 | Portable audio device providing automated control of audio volume parameters for hearing protection |
CN101951235A (en) * | 2010-09-08 | 2011-01-19 | 成都千帆科技开发有限公司 | Method and device for automatically controlling voice volume of parking lot management system |
CN102413218A (en) * | 2011-08-03 | 2012-04-11 | 宇龙计算机通信科技(深圳)有限公司 | Method, device and communication terminal for automatically adjusting speaking tone |
CN102426838A (en) * | 2011-08-24 | 2012-04-25 | 华为终端有限公司 | Voice signal processing method and user equipment |
CN103543979A (en) * | 2012-07-17 | 2014-01-29 | 联想(北京)有限公司 | Voice outputting method, voice interaction method and electronic device |
CN103714824A (en) * | 2013-12-12 | 2014-04-09 | 小米科技有限责任公司 | Audio processing method, audio processing device and terminal equipment |
CN103731711A (en) * | 2013-12-27 | 2014-04-16 | 乐视网信息技术(北京)股份有限公司 | Method and system for executing operation of smart television |
CN104795067A (en) * | 2014-01-20 | 2015-07-22 | 华为技术有限公司 | Voice interaction method and device |
US20150213800A1 (en) * | 2014-01-28 | 2015-07-30 | Simple Emotion, Inc. | Methods for adaptive voice interaction |
CN103943106A (en) * | 2014-04-01 | 2014-07-23 | 北京豪络科技有限公司 | Intelligent wristband for gesture and voice recognition |
CN104335559A (en) * | 2014-04-04 | 2015-02-04 | 华为终端有限公司 | Method for adjusting volume automatically, volume adjusting apparatus and electronic apparatus |
CN105895095A (en) * | 2015-02-12 | 2016-08-24 | 哈曼国际工业有限公司 | Adaptive interactive voice system |
CN204795452U (en) * | 2015-07-07 | 2015-11-18 | 北京联合大学 | Television system supporting gesture and voice interaction |
CN105282345A (en) * | 2015-11-23 | 2016-01-27 | 小米科技有限责任公司 | Method and device for regulation of conversation volume |
CN105654950A (en) * | 2016-01-28 | 2016-06-08 | 百度在线网络技术(北京)有限公司 | Self-adaptive voice feedback method and device |
CN105895096A (en) * | 2016-03-30 | 2016-08-24 | 乐视控股(北京)有限公司 | Identity identification and voice interaction operating method and device |
CN106358120A (en) * | 2016-09-23 | 2017-01-25 | 成都创慧科达科技有限公司 | Audio play device with various regulation methods |
CN106453946A (en) * | 2016-11-15 | 2017-02-22 | 维沃移动通信有限公司 | Method for regulating output volume and mobile terminal |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107368280A (en) * | 2017-07-06 | 2017-11-21 | 北京小米移动软件有限公司 | Volume control method and device for voice interaction, and voice interaction device |
CN107396176A (en) * | 2017-07-18 | 2017-11-24 | 青岛海信电器股份有限公司 | Method and device for playing audio and video files |
CN107423021A (en) * | 2017-07-28 | 2017-12-01 | 联想(北京)有限公司 | Smart device and intelligent control method |
CN107274900B (en) * | 2017-08-10 | 2020-09-18 | 北京京东尚科信息技术有限公司 | Information processing method and system for a control terminal |
CN107274900A (en) * | 2017-08-10 | 2017-10-20 | 北京灵隆科技有限公司 | Information processing method and system for a control terminal |
CN107610705A (en) * | 2017-10-27 | 2018-01-19 | 成都常明信息技术有限公司 | Intelligent voice robot that adjusts timbre according to the user's age |
CN107621800A (en) * | 2017-10-27 | 2018-01-23 | 成都常明信息技术有限公司 | Intelligent voice robot that adjusts volume according to the user's age |
CN107657954A (en) * | 2017-10-27 | 2018-02-02 | 成都常明信息技术有限公司 | Voice robot with intelligent volume control |
CN108200527A (en) * | 2017-12-29 | 2018-06-22 | Tcl海外电子(惠州)有限公司 | Method and device for measuring sound source loudness, and computer-readable storage medium |
CN107895579A (en) * | 2018-01-02 | 2018-04-10 | 联想(北京)有限公司 | Speech recognition method and system |
CN108335700B (en) * | 2018-01-30 | 2021-07-06 | 重庆与展微电子有限公司 | Voice adjusting method and device, voice interaction equipment and storage medium |
CN109036388A (en) * | 2018-07-25 | 2018-12-18 | 李智彤 | Intelligent voice interaction method based on a conversational device |
CN108899012B (en) * | 2018-07-27 | 2021-04-20 | 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) | Voice interaction equipment evaluation method and system, computer equipment and storage medium |
CN108899012A (en) * | 2018-07-27 | 2018-11-27 | 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) | Voice interaction device evaluation method and system, computer device and storage medium |
CN109166580A (en) * | 2018-09-17 | 2019-01-08 | 珠海格力电器股份有限公司 | Voice feedback prompt control method and system, and air conditioner |
CN109509470A (en) * | 2018-12-11 | 2019-03-22 | 平安科技(深圳)有限公司 | Voice interaction method and device, computer-readable storage medium, and terminal device |
CN109509470B (en) * | 2018-12-11 | 2024-05-07 | 平安科技(深圳)有限公司 | Voice interaction method and device, computer readable storage medium and terminal equipment |
CN112634884A (en) * | 2019-09-23 | 2021-04-09 | 北京声智科技有限公司 | Method of controlling output audio, method of outputting audio, apparatus, electronic device, and computer-readable storage medium |
CN110677776A (en) * | 2019-09-26 | 2020-01-10 | 恒大智慧科技有限公司 | Volume adjusting method and device, intelligent sound box and storage medium |
CN110677776B (en) * | 2019-09-26 | 2021-08-17 | 星络智能科技有限公司 | Volume adjusting method and device, intelligent sound box and storage medium |
CN113763942A (en) * | 2020-06-03 | 2021-12-07 | 广东美的制冷设备有限公司 | Interaction method and interaction system of voice household appliances and computer equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106782544A (en) | Interactive voice equipment and its output intent | |
US9418665B2 (en) | Method for controlling device and device control system | |
CN110826358B (en) | Animal emotion recognition method and device and storage medium | |
CN105556595B (en) | Method and apparatus for adjusting the detection threshold for activating a voice assistant function | |
US6838994B2 (en) | Adaptive alarm system | |
CN110291489A (en) | Computationally efficient human-identifying intelligent assistant computer | |
CN109076310A (en) | Autonomous semantic labeling of physical locations | |
CN109346075A (en) | Method and system for recognizing user speech and controlling electronic devices through body vibration | |
CN109920419B (en) | Voice control method and device, electronic equipment and computer readable medium | |
CN105452822A (en) | Sound event detecting apparatus and operation method thereof | |
US10978093B1 (en) | Computer apparatus and method implementing sound detection to recognize an activity | |
JP2012518828A (en) | System, method and apparatus for placing apparatus in active mode | |
CN111124108B (en) | Model training method, gesture control method, device, medium and electronic equipment | |
KR20140144499A (en) | Method and apparatus for quality measurement of sleep using a portable terminal | |
CN107609501A (en) | Human proximity action recognition method and apparatus, storage medium, and electronic device | |
CN105615839B (en) | Human-body wearable device and detection method thereof | |
CN106464812A (en) | Lifelog camera and method of controlling same according to transitions in activity | |
CN105848061B (en) | Control method and electronic equipment | |
CN106873939A (en) | Electronic device and method of using the same | |
He et al. | An elderly care system based on multiple information fusion | |
CN109831817A (en) | Terminal control method, device, terminal and storage medium | |
CN109257490A (en) | Audio processing method and device, wearable device, and storage medium | |
CN112990429A (en) | Machine learning method, electronic device and related product | |
CN108230312A (en) | Image analysis method, device, and computer-readable storage medium | |
CN209606794U (en) | Wearable device, speaker device, and smart home control system |
Legal Events
Date | Code | Title | Description
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20170531 |