CN108469966A

CN108469966A - Voice broadcast control method and device, intelligent device and medium

Info

Publication number: CN108469966A
Application number: CN201810237857.3A
Authority: CN
Inventors: 褚长森
Original assignee: Beijing Kingsoft Internet Security Software Co Ltd
Current assignee: Beijing Kingsoft Internet Security Software Co Ltd
Priority date: 2018-03-21
Filing date: 2018-03-21
Publication date: 2018-08-31

Abstract

The embodiment of the invention discloses a voice broadcast control method, a voice broadcast control device, an intelligent device and a medium. The method comprises the following steps: the intelligent device receives a voice control instruction of a user, wherein the voice control instruction is used for instructing the intelligent device to broadcast preset voice content; the intelligent device determines a first volume according to preset reference information, wherein the preset reference information is used for determining the volume at which the intelligent device currently broadcasts voice; and the intelligent device broadcasts the preset voice content at the first volume. By implementing the embodiment of the invention, the flexibility with which the intelligent device broadcasts voice is improved, and the user experience of the intelligent device is improved.

Description

Voice broadcast control method, device, smart machine and medium

Technical field

The present invention relates to the field of smart machines, and more particularly to a voice broadcast control method, device, smart machine and medium.

Background technology

A smart machine (such as an intelligent sound box) can interact with a user by voice, and the user can conveniently control the common functions of the smart machine by voice, achieving an intelligent state.

In general, when a user interacts with a smart machine, the smart machine must first be woken up before it can be started. Currently, a smart machine is woken up using a wake-up instruction, which can be, for example, the name of the smart machine. For example, the name of the smart machine is "small leopard", and the user says "small leopard, are you there?" in an attempt to wake up the smart machine. After receiving the user's speech, the smart machine identifies by speech recognition technology that the content the user said is "small leopard, are you there?", and the smart machine answers: "I am here, please speak". In the prior art, after receiving the wake-up instruction of the user (such as "small leopard, are you there?"), the smart machine reports the reply content (such as "I am here, please speak") at the current volume of the smart machine. For example, if the current volume of the smart machine is 50 decibels, the reply "I am here, please speak" is reported at 50 decibels. The voice broadcasting mode of speakers in the prior art is relatively simple and inflexible, so how to design a more flexible voice broadcast method for smart machines is a technical problem that currently needs to be solved.

Invention content

The technical problem to be solved by the embodiments of the present invention is to provide a voice broadcast control method, device, smart machine and medium that improve the flexibility with which a smart machine reports voice and improve the user experience of the smart machine.

In a first aspect, an embodiment of the present invention provides a voice broadcast control method, the method including:

A smart machine receives a phonetic control command of a user, the phonetic control command being used to instruct the smart machine to report preset voice content;

The smart machine determines a first volume according to preset reference information, the preset reference information being used to determine the volume at which the smart machine currently reports voice;

The smart machine reports the preset voice content using the first volume.

With reference to the first aspect, in a first possible implementation of the first aspect, the preset reference information includes at least one of the following: the speech volume corresponding to the phonetic control command, the distance between the user and the smart machine, and the current system time of the smart machine.

With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the smart machine determining the first volume according to the preset reference information includes:

The smart machine identifies the speech volume corresponding to the phonetic control command;

The smart machine determines the speech volume corresponding to the phonetic control command as the first volume;

Alternatively, the smart machine determining the first volume according to the preset reference information includes:

The smart machine identifies the distance between the user and the smart machine;

The smart machine searches a preset mapping relation between distance and volume for the volume corresponding to the distance between the user and the smart machine;

The smart machine determines the found volume corresponding to the distance between the user and the smart machine as the first volume;

Alternatively, the smart machine determining the first volume according to the preset reference information includes:

The smart machine identifies the current system time of the smart machine;

The smart machine searches a preset mapping relation between time and volume for the volume corresponding to the current system time of the smart machine;

The smart machine determines the found volume corresponding to the current system time of the smart machine as the first volume.

With reference to the first aspect, or the first or second possible implementation of the first aspect, in a third possible implementation of the first aspect, after the smart machine receives the phonetic control command of the user, the method further includes:

The smart machine identifies the corresponding word speed of the phonetic control command；

The smart machine determines the first word speed according to the corresponding word speed of the phonetic control command；

The smart machine reports the default voice content using the first volume, including：

The smart machine reports the default voice content using first volume and first word speed.

With reference to the first aspect, or any of the first to third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, after the smart machine receives the phonetic control command of the user, the method further includes:

The smart machine identifies the corresponding tone of the phonetic control command；

The smart machine determines the first tone according to the corresponding tone of the phonetic control command；

The smart machine reports the default voice content using first volume and first tone.

With reference to the first aspect, or any of the first to fourth possible implementations of the first aspect, in a fifth possible implementation of the first aspect, after the smart machine receives the phonetic control command of the user, the method further includes:

The smart machine identifies the corresponding tone color of the phonetic control command；

The smart machine determines the first tone color according to the corresponding tone color of the phonetic control command；

The smart machine reports the default voice content using first volume and first tone color.

With reference to the fifth implementation of the first aspect, in a sixth possible implementation of the first aspect, the smart machine determining the first tone color according to the tone color corresponding to the phonetic control command includes:

The smart machine searches a preset database for a tone color that matches the tone color corresponding to the phonetic control command;

The smart machine determines the tone color that matches the tone color corresponding to the phonetic control command as the first tone color.

With reference to the fifth implementation of the first aspect, in a seventh possible implementation of the first aspect, the smart machine determining the first tone color according to the tone color corresponding to the phonetic control command includes:

The smart machine uses a neural network tone color identification model to generate a tone color that matches the tone color corresponding to the phonetic control command;

The smart machine determines the generated tone color that matches the tone color corresponding to the phonetic control command as the first tone color.

With reference to the seventh implementation of the first aspect, in an eighth possible implementation of the first aspect, before the smart machine uses the neural network tone color identification model to generate the tone color that matches the tone color corresponding to the phonetic control command, the method further includes:

The smart machine obtains sample tone colors, wherein each sample tone color carries a labeled tone color tag;

The smart machine trains a preset neural network tone color identification model using the sample tone colors to obtain the neural network tone color identification model.

In a second aspect, an embodiment of the present invention provides a voice broadcast control device, the device including:

A receiving unit, configured to receive a phonetic control command of a user, the phonetic control command being used to instruct the voice broadcast control device to report preset voice content;

A first determination unit, configured to determine a first volume according to preset reference information, the preset reference information being used to determine the volume at which the voice broadcast control device currently reports voice;

A report unit, configured to report the preset voice content using the first volume.

With reference to the second aspect, in a first possible implementation of the second aspect, the preset reference information includes at least one of the following: the speech volume corresponding to the phonetic control command, the distance between the user and the voice broadcast control device, and the current system time of the voice broadcast control device.

With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the first determination unit includes:

A first recognition unit, configured to identify the speech volume corresponding to the phonetic control command;

A second determination unit, configured to determine the speech volume corresponding to the phonetic control command as the first volume;

Alternatively, the first determination unit includes:

A second recognition unit, configured to identify the distance between the user and the voice broadcast control device;

A first searching unit, configured to search a preset mapping relation between distance and volume for the volume corresponding to the distance between the user and the voice broadcast control device;

A third determination unit, configured to determine the found volume corresponding to the distance between the user and the voice broadcast control device as the first volume;

Alternatively, the first determination unit includes:

A third recognition unit, configured to identify the current system time of the voice broadcast control device;

A second searching unit, configured to search a preset mapping relation between time and volume for the volume corresponding to the current system time of the voice broadcast control device;

A fourth determination unit, configured to determine the found volume corresponding to the current system time of the voice broadcast control device as the first volume.

With reference to the second aspect, or the first or second possible implementation of the second aspect, in a third possible implementation of the second aspect, the device further includes:

A fourth recognition unit, configured to identify the word speed corresponding to the phonetic control command after the receiving unit receives the phonetic control command of the user;

A fifth determination unit, configured to determine a first word speed according to the word speed corresponding to the phonetic control command;

The report unit is specifically configured to report the preset voice content using the first volume and the first word speed.

With reference to the second aspect, or any of the first to third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the device further includes:

A fifth recognition unit, configured to identify the tone corresponding to the phonetic control command;

A sixth determination unit, configured to determine a first tone according to the tone corresponding to the phonetic control command;

The report unit is specifically configured to report the preset voice content using the first volume and the first tone.

With reference to the second aspect, or any of the first to fourth possible implementations of the second aspect, in a fifth possible implementation of the second aspect, the device further includes:

A sixth recognition unit, configured to identify the tone color corresponding to the phonetic control command;

A seventh determination unit, configured to determine a first tone color according to the tone color corresponding to the phonetic control command;

The report unit is specifically configured to report the preset voice content using the first volume and the first tone color.

With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the seventh determination unit includes:

A third searching unit, configured to search a preset database for a tone color that matches the tone color corresponding to the phonetic control command;

An eighth determination unit, configured to determine the tone color that matches the tone color corresponding to the phonetic control command as the first tone color.

With reference to the fifth possible implementation of the second aspect, in a seventh possible implementation of the second aspect, the seventh determination unit includes:

A generation unit, configured to use a neural network tone color identification model to generate a tone color that matches the tone color corresponding to the phonetic control command;

A ninth determination unit, configured to determine the generated tone color that matches the tone color corresponding to the phonetic control command as the first tone color.

With reference to the seventh possible implementation of the second aspect, in an eighth possible implementation of the second aspect, the device further includes:

An acquiring unit, configured to obtain sample tone colors before the generation unit uses the neural network tone color identification model to generate the tone color that matches the tone color corresponding to the phonetic control command, wherein each sample tone color carries a labeled tone color tag;

A training unit, configured to train a preset neural network tone color identification model using the sample tone colors to obtain the neural network tone color identification model.

In a third aspect, an embodiment of the present invention provides a smart machine, including: a processor, a memory, a communication interface and a bus;

The processor, the memory and the communication interface are connected by the bus and communicate with one another; the memory stores executable program code; the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to execute the voice broadcast control method described in the first aspect or any implementation of the first aspect.

In a fourth aspect, an embodiment of the present invention provides a computer storage medium, wherein the computer storage medium is used to store an application program, and the application program is used to execute, at runtime, the voice broadcast control method described in the embodiments of the present invention.

In a fifth aspect, an embodiment of the present invention provides an application program, wherein the application program is used to execute, at runtime, the voice broadcast control method described in the embodiments of the present invention.

Optionally, the above smart machine includes an intelligent sound box.

Implementing the embodiments of the present invention has the following advantages:

After receiving the phonetic control command of the user, the smart machine determines the first volume according to at least one of the speech volume corresponding to the phonetic control command, the distance between the user and the smart machine, and the current system time of the smart machine, and then reports the preset voice content using the first volume. The volume at which the smart machine reports can thus be adjusted flexibly and automatically, so that the playback volume is set reasonably for the user's current usage scenario, which achieves intelligent broadcasting while improving the user experience.

Description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below.

Fig. 1 is a flow diagram of a voice broadcast control method provided by an embodiment of the present invention;

Fig. 2 is a structural schematic diagram of a voice broadcast control device provided by an embodiment of the present invention;

Fig. 3 is a structural schematic diagram of a smart machine provided by an embodiment of the present invention.

Specific implementation mode

The technical solutions in the embodiments of the present invention are described below with reference to the accompanying drawings in the embodiments of the present invention.

Referring to Fig. 1, which is a flow diagram of a voice broadcast control method provided by an embodiment of the present invention, the voice broadcast control method can include but is not limited to the following steps.

S101, the smart machine receives a phonetic control command of a user, the phonetic control command being used to instruct the smart machine to report preset voice content.

Optionally, the smart machine in the embodiment of the present invention can include but is not limited to an intelligent sound box; the smart machine can also be another device that has a voice playback function.

In the embodiment of the present invention, the smart machine has a speech recognition function and can receive and identify the phonetic control command of the user. The phonetic control command is used to instruct the smart machine to play preset voice content. In one implementation, the phonetic control command is a wake-up instruction, which is used to wake up the smart machine so that the smart machine replies with specific voice content. For example, the wake-up instruction can be the name of the smart machine. For example, the name of the smart machine is "small leopard", and the specific voice content replied by the smart machine is: "I am here, please speak". In another implementation, the phonetic control command is used to instruct the smart machine to play specified music. For example, if the phonetic control command is "please play Zhou Jielun's song 《Thousands of miles away》", the specific voice content replied by the smart machine is the song 《Thousands of miles away》.

S102, the smart machine determines a first volume according to preset reference information, the preset reference information being used to determine the volume at which the smart machine currently reports voice.

In the embodiment of the present invention, after receiving the phonetic control command of the user, the smart machine needs to determine the volume of its reply, which is referred to as the first volume in the embodiment of the present invention. Specifically, the smart machine determines the first volume according to the preset reference information.

In a first implementation, the preset reference information includes the speech volume corresponding to the phonetic control command of the user. In this case, the smart machine determines the first volume according to the preset reference information as follows: the smart machine identifies the speech volume corresponding to the phonetic control command, and determines that speech volume as the first volume. That is to say, the volume at which the smart machine replies with the above preset voice content is the same as the volume of the user's phonetic control command. For example, the small leopard speaker receives the user's phonetic control command "small leopard" and identifies that the user said "small leopard" at a volume of 30 decibels. The small leopard speaker then answers "I am here, please speak" at 30 decibels, even though its current volume setting corresponds to 50 decibels. In other words, the volume is temporarily adjusted when replying "I am here, please speak", so that the user is not startled by suddenly hearing reply content at an excessive volume.
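The first implementation can be illustrated with a minimal sketch, which is not the patented implementation. It assumes the command audio is available as normalized samples in the range -1 to 1, and that a device-specific calibration offset (the hypothetical parameter `calibration_offset_db`) maps the measured level onto the decibel scale used for playback. The function names are likewise assumptions made for illustration.

```python
import math


def rms_level_db(samples, calibration_offset_db=0.0):
    """Estimate the level of normalized samples (-1..1) in decibels.

    calibration_offset_db is a hypothetical, device-specific constant that
    shifts the digital level onto the scale used for playback volume.
    """
    if not samples:
        raise ValueError("no audio samples supplied")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # pure silence has no finite level
    return 10.0 * math.log10(mean_square) + calibration_offset_db


def first_volume_from_command(samples, calibration_offset_db):
    """Use the measured level of the command as the first volume."""
    return rms_level_db(samples, calibration_offset_db)


# A command whose samples all have amplitude 0.1, with an assumed
# calibration offset of 50 dB.
command_samples = [0.1] * 100
first_volume_db = first_volume_from_command(command_samples, 50.0)
```

The design choice here is simply to reuse the measured command level as the reply level, which mirrors the "same volume as the user" behavior described in the text.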

In a second implementation, the preset reference information includes the distance between the user and the smart machine. In this case, the smart machine determines the first volume according to the preset reference information as follows: the smart machine identifies the distance between the user and the smart machine; searches a preset mapping relation between distance and volume for the volume corresponding to that distance; and determines the found volume as the first volume. In the embodiment of the present invention, the smart machine can be configured with sensors such as a range sensor and a camera, through which it can sense the distance between the user and the smart machine. For example, after receiving the phonetic control command of the user, the smart machine identifies the user by the camera and then detects the distance from the user to the smart machine using the range sensor. Specifically, the distance from the user to the smart machine can be the distance from a certain body part of the user to the smart machine, and the body part can include but is not limited to the face, an arm, a leg or the head. In the embodiment of the present invention, the preset mapping relation between distance and volume includes multiple distance values and the volume corresponding to each distance value. Specifically, the presentation and content of the preset mapping relation between distance and volume can be, for example but not limited to, as shown in Table 1 below.

Table 1

Distance (unit: meter)	Volume (unit: decibel)
1	10
2	20
3	30
5	40
10	50

After detecting the distance between the user and the smart machine, the smart machine can determine the volume corresponding to that distance by querying Table 1 above. For example, if the distance between the user and the smart machine is 2 meters, the corresponding first volume is 20 decibels.

Alternatively, the above preset mapping relation between distance and volume includes multiple distance intervals and the volume corresponding to each distance interval. Specifically, the presentation and content of the preset mapping relation between distance and volume can be, for example but not limited to, as shown in Table 2 below.

Table 2

Distance (unit: meter)	Volume (unit: decibel)
0-1	10
1-2	20
2-3	30
3-5	40
5-10	50

As shown in Table 2, the shorter the distance, the lower the volume in decibels. After detecting the distance between the user and the smart machine, the smart machine can determine the volume corresponding to that distance by querying Table 2 above. For example, if the distance between the user and the smart machine is 2.5 meters, the corresponding first volume is 30 decibels.
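The interval lookup of Table 2 can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which interval a boundary value such as exactly 2 meters belongs to, so the sketch treats each upper bound as inclusive (which agrees with the 2 meter, 20 decibel example of Table 1), and it clamps distances beyond 10 meters to the last row. The names `UPPER_BOUNDS_M`, `VOLUMES_DB` and `volume_for_distance` are hypothetical.

```python
from bisect import bisect_left

# Upper bound of each distance interval in Table 2 (meters) and the
# volume assigned to that interval (decibels).
UPPER_BOUNDS_M = [1, 2, 3, 5, 10]
VOLUMES_DB = [10, 20, 30, 40, 50]


def volume_for_distance(distance_m):
    """Return the Table 2 volume for a measured user distance.

    Assumptions: upper bounds are inclusive, and distances beyond the
    last interval reuse the last volume.
    """
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    index = bisect_left(UPPER_BOUNDS_M, distance_m)
    index = min(index, len(VOLUMES_DB) - 1)  # clamp beyond 10 m
    return VOLUMES_DB[index]


first_volume_db = volume_for_distance(2.5)  # the example from the text
```

Keeping the bounds in a sorted list lets the lookup use binary search, although for a five-row table a linear scan would work equally well.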

In a third implementation, the preset reference information includes the current system time of the smart machine. In this case, the smart machine determines the first volume according to the preset reference information as follows: the smart machine identifies its current system time; searches a preset mapping relation between time and volume for the volume corresponding to the current system time; and determines the found volume as the first volume. In the embodiment of the present invention, the preset mapping relation between time and volume includes multiple time values and the volume corresponding to each time value. Specifically, the presentation and content of the preset mapping relation between time and volume can be, for example but not limited to, as shown in Table 3 below.

Table 3

Time (unit: hour, 24-hour clock)	Volume (unit: decibel)
10	20
12	30
16	50
19	20
23	10

After detecting the current system time, the smart machine can determine the volume corresponding to the current system time by querying Table 3 above. For example, if the current system time is 12 o'clock, the corresponding first volume is 30 decibels.

Alternatively, the above preset mapping relation between time and volume includes multiple time intervals and the volume corresponding to each time interval. Specifically, the presentation and content of the preset mapping relation between time and volume can be, for example but not limited to, as shown in Table 4 below.

Table 4

Time (unit: hour, 24-hour clock)	Volume (unit: decibel)
7-12	30
12-19	40
19-22	30
22-24	20
24-7	10

After detecting the current system time, the smart machine can determine the volume corresponding to the current system time by querying Table 4 above. For example, if the current system time is 20 o'clock in the evening, the corresponding first volume is 30 decibels.
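The time-interval lookup of Table 4 can be sketched in the same spirit. The sketch assumes that each interval includes its start hour and excludes its end hour, which the text does not specify, and it writes the overnight "24-7" row as hours 0 to 7. The names `TIME_BANDS` and `volume_for_hour` are hypothetical.

```python
# (start hour inclusive, end hour exclusive, volume in decibels),
# following Table 4. The overnight row "24-7" is expressed as 0-7.
TIME_BANDS = [
    (7, 12, 30),
    (12, 19, 40),
    (19, 22, 30),
    (22, 24, 20),
    (0, 7, 10),
]


def volume_for_hour(hour):
    """Return the Table 4 volume for an hour on the 24-hour clock."""
    if not 0 <= hour < 24:
        raise ValueError("hour must be in the range 0..23")
    for start, end, volume_db in TIME_BANDS:
        if start <= hour < end:
            return volume_db
    raise AssertionError("the bands cover the whole day")


first_volume_db = volume_for_hour(20)  # the example from the text
```

On a real device the hour could be taken from the system clock, for example with `datetime.datetime.now().hour`.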

Table 4 above can also simply be divided into two parts, namely daytime and evening. Specifically, the presentation and content of the preset mapping relation between time and volume can be, for example but not limited to, as shown in Table 5 below.

Table 5

Time (unit: hour, 24-hour clock)	Volume (unit: decibel)
7-19 (daytime)	30
19-7 (evening)	10

As shown in Table 5, the volume in the evening period is lower than the volume in the daytime period. After detecting the current system time, the smart machine can determine the volume corresponding to the current system time by querying Table 5 above. For example, if the current system time is 20 o'clock in the evening, the corresponding first volume is 10 decibels.

In other possible implementations, the first volume can also be determined by combining any two of the above three kinds of reference information (the speech volume corresponding to the phonetic control command, the distance between the user and the smart machine, and the current system time of the smart machine). For example, the smart machine can determine the first volume according to the distance between the user and the smart machine and the current system time. In this case, the smart machine determines the first volume according to the preset reference information as follows: the smart machine identifies the distance between the user and the smart machine and the current system time of the smart machine; searches a preset mapping relation among distance, time and volume for the volume corresponding to that distance and that system time; and determines the found volume as the first volume. In the embodiment of the present invention, the preset mapping relation among distance, time and volume includes multiple distance values, multiple time values, and the volume corresponding to each distance value and time value. Specifically, the presentation and content of the preset mapping relation among distance, time and volume can be, for example but not limited to, as shown in Table 6 below.

Table 6

After detecting the distance between the user and the smart machine and the current system time, the smart machine can determine the corresponding volume by querying Table 6 above. For example, if the distance between the user and the smart machine is 2.5 meters and the current system time is 12 o'clock, the corresponding first volume is 30 decibels.

Alternatively, the first volume can also be determined by combining all three of the above kinds of reference information (the speech volume corresponding to the phonetic control command, the distance between the user and the smart machine, and the current system time of the smart machine). For the specific implementation, refer to the above embodiments; details are not repeated here.

S103, the smart machine reports the preset voice content using the first volume.

In the embodiment of the present invention, after determining the first volume, the smart machine uses the first volume to play the preset voice content requested by the user, rather than playing it at the current system volume. In other words, if the first volume differs from the current default system volume of the smart machine, the smart machine needs to adjust the volume temporarily and play the preset voice content at the determined first volume. Here, the preset voice content can be the content with which the smart machine replies to the user's wake-up instruction, or it can be a piece of music, a song or other voice content. For example, the smart machine receives the user's command "please play Zhou Jielun's song 《Thousands of miles away》", the first volume determined in step S102 is 30 decibels, and the current default system volume is 50 decibels; the smart machine then plays the song 《Thousands of miles away》 at 30 decibels.
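The temporary volume adjustment in step S103 can be sketched with a context manager. This is a minimal illustration, not the device's actual audio interface: the `Speaker` class is a hypothetical stand-in for the audio output, and restoring the previous system volume after playback is an assumption drawn from the word "temporarily" rather than something the text states explicitly.

```python
from contextlib import contextmanager


class Speaker:
    """Hypothetical stand-in for the audio output of the smart machine."""

    def __init__(self, system_volume_db):
        self.volume_db = system_volume_db
        self.played = []  # records (content, volume) for inspection

    def play(self, content):
        self.played.append((content, self.volume_db))


@contextmanager
def temporary_volume(speaker, first_volume_db):
    """Play at the first volume, then restore the previous system volume."""
    previous_db = speaker.volume_db
    speaker.volume_db = first_volume_db
    try:
        yield speaker
    finally:
        speaker.volume_db = previous_db  # assumed: the override is temporary


# System default is 50 dB; step S102 is assumed to have produced 30 dB.
speaker = Speaker(system_volume_db=50)
with temporary_volume(speaker, 30):
    speaker.play("Thousands of miles away")
```

Using `try`/`finally` means the system volume is restored even if playback raises an error.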

By implementing the embodiment of the present invention, after receiving the phonetic control command of the user, the smart machine determines the first volume according to at least one of the speech volume corresponding to the phonetic control command, the distance between the user and the smart machine, and the current system time of the smart machine, and then reports the preset voice content using the first volume. The volume at which the smart machine reports can thus be adjusted flexibly and automatically, so that the playback volume is set reasonably for the user's current usage scenario, which achieves intelligent broadcasting while improving the user experience.

Optionally, after the smart machine receives the phonetic control command of the user, the method further includes: the smart machine identifies the word speed corresponding to the phonetic control command, and determines a first word speed according to the word speed corresponding to the phonetic control command. The smart machine reporting the preset voice content using the first volume is then specifically: the smart machine reports the preset voice content using the first volume and the first word speed.

In the embodiment of the present invention, the smart machine has a speech recognition function, and after receiving the phonetic control command of the user, the smart machine can identify the word speed of the phonetic control command and then determine the first word speed according to that word speed. Specifically, the smart machine can directly set the word speed of the phonetic control command as the first word speed; in other words, the first word speed is equal to the word speed of the phonetic control command. For example, if the word speed of the phonetic control command is 100 words per minute, the first word speed is also 100 words per minute. Alternatively, the smart machine can determine the first word speed from the word speed of the phonetic control command combined with a difference, which can be defined in advance. For example, if the word speed of the phonetic control command is 100 words per minute and the difference is 20 words per minute, the first word speed is 120 words per minute. Then, when reporting the above preset voice content, the smart machine uses the first volume and the first word speed.
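The word speed rule above reduces to simple arithmetic, sketched below. The way the command's word speed is measured (recognized word count divided by the duration of the utterance) is an assumption for illustration, since the text does not say how the word speed is identified. The function names are hypothetical.

```python
def estimate_word_speed(word_count, duration_s):
    """Estimate word speed in words per minute.

    Assumes the recognizer supplies the number of recognized words and
    the duration of the utterance in seconds.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return word_count * 60 / duration_s


def first_word_speed(command_wpm, difference_wpm=0):
    """Apply the optional predefined difference to the command's word speed.

    With difference_wpm == 0 the first word speed equals the command's.
    """
    result = command_wpm + difference_wpm
    if result <= 0:
        raise ValueError("resulting word speed must be positive")
    return result


command_wpm = estimate_word_speed(word_count=5, duration_s=3.0)
same_speed = first_word_speed(command_wpm)        # mirror the user
faster_speed = first_word_speed(command_wpm, 20)  # predefined difference
```

With the values from the text, a command spoken at 100 words per minute and a difference of 20 yields a first word speed of 120 words per minute.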

By implementing this embodiment of the present invention, after receiving the user's voice control instruction, the smart device determines the first volume from at least one of the speech volume of the voice control instruction, the distance between the user and the smart device, and the current system time of the smart device, determines the first speech rate from the speech rate of the voice control instruction, and then broadcasts the preset voice content at the first volume and the first speech rate. The broadcast can thus be adjusted flexibly and automatically to suit the user's current usage scenario, which makes the broadcast more intelligent and improves the user experience.

Optionally, after the smart device receives the user's voice control instruction, the method further includes: the smart device identifies the pitch of the voice control instruction, and determines a first pitch according to that pitch. In this case, the smart device broadcasting the preset voice content at the first volume is specifically: the smart device broadcasts the preset voice content at the first volume and the first pitch.

In this embodiment of the present invention, the smart device has a speech recognition function. After receiving the user's voice control instruction, the smart device can identify the pitch of that instruction and then determine the first pitch from it. Pitch is determined by the frequency of the sound vibration: different vibration frequencies give different pitches. Pitch is usually divided into two classes, high and low. In this embodiment, pitch may also be divided more finely into several classes by frequency range, which this embodiment does not limit. Specifically, the smart device may directly set the first pitch to the pitch of the voice control instruction; in other words, the first pitch is the same as the pitch of the voice control instruction. For example, if the pitch of the voice control instruction is low, the first pitch is also low. The smart device then broadcasts the preset voice content at the first volume and the first pitch.
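As a rough sketch of dividing pitch into classes by frequency range, the snippet below maps a measured fundamental frequency to a named class and reuses that class as the first pitch. The band edge of 165 Hz, the class names, and both function names are hypothetical; the source gives no concrete values.

```python
from bisect import bisect_right

# Hypothetical band edges in Hz. One edge gives two classes (low/high);
# adding more edges and names yields the finer division mentioned in the text.
BAND_EDGES_HZ = [165.0]
BAND_NAMES = ["low", "high"]


def classify_pitch(fundamental_hz: float) -> str:
    """Return the pitch class whose frequency range contains fundamental_hz."""
    if fundamental_hz <= 0:
        raise ValueError("frequency must be positive")
    return BAND_NAMES[bisect_right(BAND_EDGES_HZ, fundamental_hz)]


def first_pitch(command_fundamental_hz: float) -> str:
    """The first pitch simply mirrors the pitch class of the voice control instruction."""
    return classify_pitch(command_fundamental_hz)
```

Keeping the edges in a sorted list means a finer classification only requires extending `BAND_EDGES_HZ` and `BAND_NAMES` together.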

In this embodiment of the present invention, the smart device determining the first pitch according to the pitch of the voice control instruction includes: the smart device searches a preset database for a pitch that matches the pitch of the voice control instruction, and determines the matching pitch as the first pitch. In other words, the smart device stores in advance the sound feature information corresponding to different types of pitch. After identifying the pitch of the user's voice control instruction, the smart device looks up the sound feature information for that pitch in the pre-stored database and uses it to synthesize the pitch; the synthesized pitch is the first pitch. For example, if the smart device identifies that the pitch of the user's voice control instruction is low, it obtains the sound feature information for low pitch from the pre-stored database and uses that information to synthesize a low pitch. Alternatively, the smart device has a machine learning function: after identifying the pitch of the user's voice control instruction, the smart device uses a neural network pitch recognition model to generate a pitch that matches the pitch of the voice control instruction, and determines the generated pitch as the first pitch. Optionally, before the smart device uses the neural network pitch recognition model to generate the matching pitch, the method further includes: the smart device obtains sample pitches, where the sample pitches carry annotated pitch labels, and trains a preset neural network pitch recognition model with the sample pitches to obtain the neural network pitch recognition model. The pitch labels may include, but are not limited to, sound frequency.

By implementing this embodiment of the present invention, after receiving the user's voice control instruction, the smart device determines the first volume from at least one of the speech volume of the voice control instruction, the distance between the user and the smart device, and the current system time of the smart device, determines the first pitch from the pitch of the voice control instruction, and then broadcasts the preset voice content at the first volume and the first pitch. The broadcast can thus be adjusted flexibly and automatically to suit the user's current usage scenario, which makes the broadcast more intelligent and improves the user experience.

Optionally, after the smart device receives the user's voice control instruction, the method further includes: the smart device identifies the timbre of the voice control instruction, and determines a first timbre according to that timbre. In this case, the smart device broadcasting the preset voice content at the first volume is specifically: the smart device broadcasts the preset voice content at the first volume and the first timbre.

Specifically, timbre may be divided into four types: male, female, elderly and young. Alternatively, timbre may be distinguished by dialect. Alternatively, timbre may be distinguished by voice characteristics, such as electronic, magnetic, free and natural, or doll-like voices.

In this embodiment of the present invention, the smart device determining the first timbre according to the timbre of the voice control instruction includes: the smart device searches a preset database for a timbre that matches the timbre of the voice control instruction, and determines the matching timbre as the first timbre. In other words, the smart device stores in advance the sound feature information corresponding to different types of timbre. After identifying the timbre of the user's voice control instruction, the smart device looks up the sound feature information for that timbre in the pre-stored database and uses it to synthesize the timbre; the synthesized timbre is the first timbre. For example, if the smart device identifies that the timbre of the user's voice control instruction is a male voice, it obtains the sound feature information for a male voice from the pre-stored database and uses that information to synthesize a male voice. Alternatively, the smart device has a machine learning function: after identifying the timbre of the user's voice control instruction, the smart device uses a neural network timbre recognition model to generate a timbre that matches the timbre of the voice control instruction, and determines the generated timbre as the first timbre. Optionally, before the smart device uses the neural network timbre recognition model to generate the matching timbre, the method further includes: the smart device obtains sample timbres, where the sample timbres carry annotated timbre labels, and trains a preset neural network timbre recognition model with the sample timbres to obtain the neural network timbre recognition model. The timbre labels may include, but are not limited to, sound frequency.
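The database lookup branch described above might look like the following sketch. The `VoiceFeatures` fields, the entries in `PRESET_TIMBRES`, and the fallback to a default timbre are all assumptions made for illustration; the source does not specify what the stored sound feature information contains or what happens when no match is found.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceFeatures:
    """Hypothetical sound feature information stored for one timbre type."""
    base_pitch_hz: float
    voice_id: str


# Stand-in for the preset database: timbre type -> stored feature information.
PRESET_TIMBRES = {
    "male": VoiceFeatures(base_pitch_hz=120.0, voice_id="voice_male_01"),
    "female": VoiceFeatures(base_pitch_hz=210.0, voice_id="voice_female_01"),
}


def first_timbre(command_timbre: str, default: str = "female") -> VoiceFeatures:
    """Look up the features matching the recognized timbre, else fall back to a default."""
    return PRESET_TIMBRES.get(command_timbre, PRESET_TIMBRES[default])
```

The returned features would then be handed to whatever speech synthesizer the device uses, which is outside the scope of this sketch.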

By implementing this embodiment of the present invention, after receiving the user's voice control instruction, the smart device determines the first volume from at least one of the speech volume of the voice control instruction, the distance between the user and the smart device, and the current system time of the smart device, determines the first timbre from the timbre of the voice control instruction, and then broadcasts the preset voice content at the first volume and the first timbre. The broadcast can thus be adjusted flexibly and automatically to suit the user's current usage scenario, which makes the broadcast more intelligent and improves the user experience.

Optionally, the above implementations may be combined with one another. When broadcasting the preset voice content, the smart device may use at least one of the first volume, the first speech rate, the first pitch and the first timbre. For example, the smart device may broadcast the preset voice content using the first volume, the first speech rate, the first pitch and the first timbre together.
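One way to model the "at least one of" combination is a parameter record in which every field is optional. The sketch below is an assumption-laden illustration: `BroadcastParams`, its field names, and the string returned by `broadcast` (a stand-in for real audio output) are not from the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BroadcastParams:
    """Hypothetical container for whichever broadcast parameters were determined."""
    volume_db: Optional[float] = None
    speech_rate_wpm: Optional[float] = None
    pitch: Optional[str] = None
    timbre: Optional[str] = None

    def active(self) -> dict:
        """Return only the parameters that were actually set."""
        return {name: value for name, value in vars(self).items() if value is not None}


def broadcast(text: str, params: BroadcastParams) -> str:
    """Describe the broadcast that would be played; requires at least one parameter."""
    settings = params.active()
    if not settings:
        raise ValueError("at least one broadcast parameter is required")
    return f"{text} | " + ", ".join(f"{k}={v}" for k, v in settings.items())
```

A call such as `broadcast("hello", BroadcastParams(volume_db=50, pitch="low"))` uses only the first volume and the first pitch, while setting all four fields corresponds to the full combination in the example above.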

By implementing this embodiment of the present invention, after receiving the user's voice control instruction, the smart device determines the first volume from at least one of the speech volume of the voice control instruction, the distance between the user and the smart device, and the current system time of the smart device, determines the first speech rate from the speech rate of the voice control instruction, determines the first pitch from the pitch of the voice control instruction, determines the first timbre from the timbre of the voice control instruction, and then broadcasts the preset voice content using at least one of the first volume, the first speech rate, the first pitch and the first timbre. The broadcast can thus be adjusted flexibly and automatically to suit the user's current usage scenario, which makes the broadcast more intelligent and improves the user experience.

The method of the embodiments of the present invention has been described above. To facilitate better implementation of the above solution, a related device for implementing the solution is provided below.

Referring to Fig. 2, which is a structural schematic diagram of a voice broadcast control device provided by an embodiment of the present invention, the voice broadcast control device 200 includes: a receiving unit 201, a first determination unit 202 and a broadcast unit 203, wherein:

The receiving unit 201 is configured to receive a voice control instruction of a user, the voice control instruction being used to instruct the voice broadcast control device 200 to broadcast preset voice content;

The first determination unit 202 is configured to determine a first volume according to preset reference information, the preset reference information being used to determine the volume at which the voice broadcast control device 200 currently broadcasts voice;

The broadcast unit 203 is configured to broadcast the preset voice content at the first volume.

Optionally, the preset reference information includes at least one of the following: the speech volume of the voice control instruction, the distance between the user and the voice broadcast control device 200, and the current system time of the voice broadcast control device 200.

Optionally, first determination unit 202 includes：

Alternatively, first determination unit 202 includes：

A second recognition unit, configured to identify the distance between the user and the voice broadcast control device 200;

A first search unit, configured to look up, in a preset mapping between distance and volume, the volume corresponding to the distance between the user and the voice broadcast control device 200;

A third determination unit, configured to determine the volume found for the distance between the user and the voice broadcast control device 200 as the first volume;

Alternatively, first determination unit 202 includes：

A third recognition unit, configured to identify the current system time of the voice broadcast control device 200;

A second search unit, configured to look up, in a preset mapping between time and volume, the volume corresponding to the current system time of the voice broadcast control device 200;

A fourth determination unit, configured to determine the volume found for the current system time of the voice broadcast control device 200 as the first volume.
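The two search units above both perform a lookup in a preset mapping. The following sketch shows one plausible shape for such mappings; every threshold and volume figure is hypothetical, as are the names `volume_for_distance` and `volume_for_time`.

```python
from bisect import bisect_left
from datetime import time

# Hypothetical distance mapping: (maximum distance in metres, volume in dB).
DISTANCE_TO_VOLUME = [(1.0, 40.0), (3.0, 50.0), (5.0, 60.0)]
FAR_VOLUME_DB = 70.0  # used beyond the last listed distance

# Hypothetical time mapping: (start, end, volume in dB); other times count as night.
TIME_TO_VOLUME = [(time(7, 0), time(22, 0), 55.0)]
NIGHT_VOLUME_DB = 35.0


def volume_for_distance(distance_m: float) -> float:
    """Return the volume of the first distance band that contains distance_m."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    limits = [limit for limit, _ in DISTANCE_TO_VOLUME]
    index = bisect_left(limits, distance_m)
    return DISTANCE_TO_VOLUME[index][1] if index < len(limits) else FAR_VOLUME_DB


def volume_for_time(now: time) -> float:
    """Return the volume of the time window containing now, else the night volume."""
    for start, end, volume in TIME_TO_VOLUME:
        if start <= now < end:
            return volume
    return NIGHT_VOLUME_DB
```

Either result can then be taken directly as the first volume, which is the role of the third and fourth determination units.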

Optionally, voice broadcast control device 200 further includes：

The broadcast unit 203 is specifically configured to: broadcast the preset voice content at the first volume and the first speech rate.

Optionally, voice broadcast control device 200 further includes：

The broadcast unit 203 is specifically configured to: broadcast the preset voice content at the first volume and the first pitch.

Optionally, voice broadcast control device 200 further includes：

Optionally, the 7th determination unit includes：

Optionally, voice broadcast control device 200 further includes：

It should be noted that the voice broadcast control device 200 of this embodiment may correspond to the smart device in the foregoing method embodiment. For the functions of each functional module of the voice broadcast control device 200, refer to the specific implementation of the smart device in the method embodiment of Fig. 1; details are not repeated here.

Referring to Fig. 3, which is a structural schematic diagram of a smart device provided by the present invention. As shown in Fig. 3, the smart device 300 may include: at least one processor 301, such as a central processing unit (CPU), at least one communication bus 302, at least one input device 303, at least one output device 304, and a memory 305. The communication bus 302 implements the communication connections among these components. The input device 303 may specifically be a keyboard, a microphone, or the like, and is configured to capture or detect speech uttered by the user. The output device 304 may be a display screen, a loudspeaker, or the like, and is configured to play audio. The memory 305 may be a high-speed RAM or a non-volatile memory, such as a magnetic disk memory. Optionally, the memory 305 may also be a storage device separate from the processor 301. In addition, the smart device 300 may include a communication interface 306, which may optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface). The smart device 300 may further include a camera 307 and a distance sensor 308; the camera 307 is used to identify the user, and the distance sensor 308 is used to detect the distance between the user and the smart device 300.

Those skilled in the art will understand that the structure shown in Fig. 3 does not limit the smart device 300, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components.

As shown in Fig. 3, the memory 305, as a storage medium, may include an operating system, a data storage module, a network communication module, a user interface module, and a voice broadcast control program of the smart device.

The smart device 300 may be a speaker device capable of voice capture or detection, program execution and audio playback, which this embodiment does not limit.

In the smart device 300 shown in Fig. 3, the processor 301 may be configured to run the voice broadcast control program stored in the memory 305 and perform the following operations:

Receive a voice control instruction of a user through the input device 303, the voice control instruction being used to instruct the smart device to broadcast preset voice content;

Determine a first volume according to preset reference information, the preset reference information being used to determine the volume at which the smart device currently broadcasts voice;

Broadcast the preset voice content at the first volume through the output device 304.
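The three operations above can be sketched as a single handler with the volume rule and the audio output passed in as callables. The handler name, its parameters, and the reply text are hypothetical; the callables stand in for the processing behind input device 303 and output device 304.

```python
from typing import Callable


def handle_voice_command(
    command_volume_db: float,
    determine_first_volume: Callable[[float], float],
    play: Callable[[str, float], None],
    preset_reply: str = "I'm here, go ahead",
) -> float:
    """Receive the measured command volume, determine the first volume, then broadcast."""
    first_volume = determine_first_volume(command_volume_db)
    play(preset_reply, first_volume)
    return first_volume


# Usage: mirror the user's own speech volume and record what would be played.
played = []
handle_voice_command(42.0, lambda v: v, lambda text, vol: played.append((text, vol)))
```

Passing the rule as a callable keeps the handler unchanged whether the first volume comes from the speech volume, a distance mapping, or a time mapping.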

Optionally, the preset reference information includes at least one of the following: the speech volume of the voice control instruction, the distance between the user and the smart device, and the current system time of the smart device.

Further, the processor 301 determining the first volume according to the preset reference information includes:

Identifying the speech volume of the voice control instruction;

Determining the speech volume of the voice control instruction as the first volume;

Alternatively, the processor 301 determining the first volume according to the preset reference information includes:

Identifying the distance between the user and the smart device;

Looking up, in a preset mapping between distance and volume, the volume corresponding to the distance between the user and the smart device;

Determining the volume found for the distance between the user and the smart device as the first volume;

Identifying the current system time of the smart device;

Looking up, in a preset mapping between time and volume, the volume corresponding to the current system time of the smart device;

Determining the volume found for the current system time of the smart device as the first volume.

Optionally, after receiving the user's voice control instruction through the input device 303, the processor 301 is further configured to:

Identify the speech rate of the voice control instruction;

Determine a first speech rate according to the speech rate of the voice control instruction;

The processor 301 broadcasting the preset voice content at the first volume through the output device 304 includes:

Broadcasting the preset voice content at the first volume and the first speech rate.

Identify the pitch of the voice control instruction;

Determine a first pitch according to the pitch of the voice control instruction;

Broadcast the preset voice content at the first volume and the first pitch.

Optionally, after receiving the user's voice control instruction through the input device 303, the processor is further configured to:

Identify the timbre of the voice control instruction;

Determine a first timbre according to the timbre of the voice control instruction;

Broadcast the preset voice content at the first volume and the first timbre.

Optionally, the processor 301 determining the first timbre according to the timbre of the voice control instruction includes:

Searching a preset database for a timbre that matches the timbre of the voice control instruction;

Determining the timbre that matches the timbre of the voice control instruction as the first timbre.

Generating, with a neural network timbre recognition model, a timbre that matches the timbre of the voice control instruction;

Determining the generated timbre that matches the timbre of the voice control instruction as the first timbre.

Optionally, before the processor 301 uses the neural network timbre recognition model to generate the timbre that matches the timbre of the voice control instruction, the processor 301 is further configured to:

Obtain sample timbres, where the sample timbres carry annotated timbre labels;

Train a preset neural network timbre recognition model with the sample timbres to obtain the neural network timbre recognition model.

It will be appreciated that, for the steps executed by the processor 301, reference may be made to the content described in the embodiment of Fig. 1; details are not repeated here.

Based on the same inventive concept, an embodiment of the present invention further provides a computer storage medium for storing an application program, the application program being used to execute, at runtime, the voice broadcast control method described in the embodiments of the present invention.

Based on the same inventive concept, an embodiment of the present invention further provides an application program, the application program being used to execute, at runtime, the voice broadcast control method described in the embodiments of the present invention.

In conclusion, by implementing the embodiments of the present invention, after receiving the user's voice control instruction, the smart device determines the first volume from at least one of the speech volume of the voice control instruction, the distance between the user and the smart device, and the current system time of the smart device, and broadcasts the preset voice content at that first volume. The broadcast volume can thus be adjusted flexibly and automatically to suit the user's current usage scenario, which makes the broadcast more intelligent and improves the user experience.

Those of ordinary skill in the art will understand that all or part of the processes in the above method embodiments may be completed by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.

The above discloses only a preferred embodiment of the present invention, which of course cannot limit the scope of the claims. Those skilled in the art will understand that implementing all or part of the processes of the above embodiment, and making equivalent variations in accordance with the claims of the present invention, still falls within the scope covered by the invention.

Claims

1. A voice broadcast control method, characterized in that the method includes:

A smart device receives a voice control instruction of a user, the voice control instruction being used to instruct the smart device to broadcast preset voice content;

The smart device determines a first volume according to preset reference information, the preset reference information being used to determine the volume at which the smart device currently broadcasts voice;

The smart device broadcasts the preset voice content at the first volume.

2. The method according to claim 1, characterized in that the preset reference information includes at least one of the following: the speech volume of the voice control instruction, the distance between the user and the smart device, and the current system time of the smart device.

3. The method according to claim 1 or 2, characterized in that the smart device determining the first volume according to the preset reference information includes:

The smart device looks up, in a preset mapping between distance and volume, the volume corresponding to the distance between the user and the smart device;

The smart device determines the volume found for the distance between the user and the smart device as the first volume;

The smart device identifies the current system time of the smart device;

The smart device looks up, in a preset mapping between time and volume, the volume corresponding to the current system time of the smart device;

The smart device determines the volume found for the current system time of the smart device as the first volume.

4. The method according to any one of claims 1 to 3, characterized in that, after the smart device receives the user's voice control instruction, the method further includes:

5. The method according to any one of claims 1 to 4, characterized in that, after the smart device receives the user's voice control instruction, the method further includes:

6. The method according to any one of claims 1 to 5, characterized in that, after the smart device receives the user's voice control instruction, the method further includes:

7. The method according to claim 6, characterized in that the smart device determining the first timbre according to the timbre of the voice control instruction includes:

8. A voice broadcast control device, characterized by including:

A receiving unit, configured to receive a voice control instruction of a user, the voice control instruction being used to instruct the voice broadcast control device to broadcast preset voice content;

A first determination unit, configured to determine a first volume according to preset reference information, the preset reference information being used to determine the volume at which the voice broadcast control device currently broadcasts voice;

9. A smart device, characterized by including: a processor, a memory, a communication interface and a bus; the processor, the memory and the communication interface are connected through the bus and communicate with one another; the memory stores executable program code; the processor runs a program corresponding to the executable program code by reading the executable program code stored in the memory, so as to execute the voice broadcast control method according to any one of claims 1 to 7.

10. A computer storage medium, characterized in that the computer storage medium stores a computer program, the computer program includes program instructions, and the program instructions, when executed by a processor, cause the processor to execute the voice broadcast control method according to any one of claims 1 to 7.