CN109686366A

CN109686366A - Voice broadcast method and device

Info

Publication number: CN109686366A
Application number: CN201811517453.6A
Authority: CN
Inventors: 张新; 毛跃辉; 廖湖锋; 王慧君; 廖海霖; 韩雪; 郑文成; 李保水; 汪进
Original assignee: Gree Electric Appliances Inc of Zhuhai
Current assignee: Gree Electric Appliances Inc of Zhuhai
Priority date: 2018-12-12
Filing date: 2018-12-12
Publication date: 2019-04-26

Abstract

The present invention proposes a voice broadcast method and device. The method comprises: determining the broadcast text of a voice broadcast; determining background music according to the broadcast text; and playing the broadcast text while simultaneously playing the background music. This helps the user understand and absorb the content of the broadcast text and improves the user experience, solving the prior-art problem that voice broadcasts lack emotion, sound overly stiff, and give a poor user experience.

Description

Voice broadcast method and device

Technical field

The present invention relates to the field of voice broadcasting, and in particular to a voice broadcast method and device.

Background art

As artificial intelligence has grown in popularity, numerous intelligent voice devices have emerged, and intelligent voice assistants have become a part of daily life. Intelligent voice assistants are convenient to use and offer functions such as looking up songs, stories, weather, and itineraries, as well as translation.

When existing voice broadcast devices perform a voice broadcast, the broadcast voice is stiff and lacks emotion, so the message is not conveyed effectively and the user experience is poor.

Therefore, improving the user experience during voice broadcasts is a problem that urgently needs to be solved in this field.

Summary of the invention

The present invention provides a voice broadcast method and device for improving the user experience during voice broadcasts.

To solve the above problem, as one aspect of the present invention, a voice broadcast method is provided, comprising:

Determining the broadcast text of a voice broadcast;

Determining background music according to the broadcast text;

Playing the broadcast text while simultaneously playing the background music.

Optionally, determining the voice broadcast text comprises:

Obtaining a voice instruction and/or a text instruction;

Performing semantic parsing on the voice instruction and/or text instruction;

Determining the broadcast text according to the semantic parsing result.

Optionally, determining the broadcast text according to the semantic parsing result comprises:

Determining, according to the semantic parsing result, whether the voice instruction contains a query instruction;

If so, performing a query according to the query instruction and using the query result as the voice broadcast text.

Optionally, determining the background music according to the broadcast text comprises:

Obtaining the text type and/or text keyword of the broadcast text;

Obtaining corresponding candidate music according to the text type and/or text keyword;

Choosing the background music from the candidate music according to a preset rule.

Optionally, each piece of music corresponds to at least one text type and/or at least one text keyword;

The broadcast text has at least one text type and/or at least one text keyword;

The candidate music corresponds to one or more text types of the broadcast text;

And/or the candidate music corresponds to one or more text keywords of the broadcast text.

Optionally, the text type comprises: genre, target readership, and text mood.

Optionally, after the background music is determined according to the broadcast text and before the broadcast text and the background music are played simultaneously, the method further comprises:

Setting the volume for playing the broadcast text;

And/or setting the volume for playing the background music.

The present application also proposes a voice broadcast device, comprising:

A text identification unit, for determining the broadcast text of a voice broadcast;

A music acquiring unit, for determining background music according to the broadcast text;

A voice broadcast unit, for playing the broadcast text while simultaneously playing the background music.

Optionally, the text identification unit determining the voice broadcast text comprises:

Obtaining a voice instruction and/or a text instruction;

Performing semantic parsing on the voice instruction and/or text instruction;

Determining the broadcast text according to the semantic parsing result.

Optionally, the text identification unit determining the broadcast text according to the semantic parsing result comprises:

Optionally, the music acquiring unit determining the background music according to the broadcast text comprises:

Obtaining the text type and/or text keyword of the broadcast text;

Choosing the background music from the candidate music according to a preset rule.

Optionally, the text type comprises: genre, target readership, and text mood.

Optionally, the device further comprises a volume adjusting unit;

After the music acquiring unit determines the background music according to the broadcast text and before the voice broadcast unit plays the broadcast text and the background music simultaneously, the volume adjusting unit is used for:

Setting the volume for playing the broadcast text;

And/or setting the volume for playing the background music.

The present invention proposes a voice broadcast method and device in which background music is determined according to the broadcast text and is played while the broadcast text is played. This helps the user understand and absorb the content of the broadcast text and improves the user experience, solving the prior-art problem that voice broadcasts lack emotion, sound overly stiff, and give a poor user experience.

Brief description of the drawings

Fig. 1 is a flow chart of a voice broadcast method in an embodiment of the present invention;

Fig. 2 is a structural diagram of a voice broadcast device in an embodiment of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to specific embodiments of the invention and the corresponding drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.

It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described herein. In addition, the terms "comprise" and "have" and any variations thereof are intended to cover non-exclusive inclusion. For example, a process, method, device, product, or appliance that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or appliance.

In the prior art, a voice broadcast often simply plays the broadcast text without any emotion. The delivery is extremely stiff, which hinders the user from absorbing and understanding the content of the broadcast text and results in a poor user experience. To improve the user experience during voice broadcasts and help the user understand and absorb the content of the broadcast text, the present application proposes a voice broadcast method. As shown in Fig. 1, the method proposed by the present application comprises:

S11: determining the broadcast text of the voice broadcast;

S12: determining background music according to the broadcast text;

S13: playing the broadcast text while simultaneously playing the background music.
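Steps S11 to S13 can be pictured as a three-stage pipeline. The following is a minimal sketch only: the function names, the keyword table, and the track file names are hypothetical and do not come from the patent, and the two helper functions are stand-ins for the semantic parsing and music matching described later.

```python
from dataclasses import dataclass


@dataclass
class BroadcastPlan:
    text: str    # broadcast text to be spoken (S11)
    music: str   # background track chosen for it (S12)


def determine_broadcast_text(instruction: str) -> str:
    # S11 (hypothetical stand-in): a real device would run semantic parsing here.
    return f"Result for: {instruction}"


def determine_background_music(text: str) -> str:
    # S12 (hypothetical stand-in): pick a track from a tiny keyword table.
    table = {"weather": "light_piano.mp3", "story": "gentle_strings.mp3"}
    lowered = text.lower()
    for keyword, track in table.items():
        if keyword in lowered:
            return track
    return "default_ambient.mp3"


def plan_broadcast(instruction: str) -> BroadcastPlan:
    text = determine_broadcast_text(instruction)
    music = determine_background_music(text)
    # S13: a player would start speech and music together;
    # this sketch only returns the plan for that playback.
    return BroadcastPlan(text=text, music=music)
```

A caller would hand the returned plan to whatever audio player the device uses, starting the speech and the music at the same time.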

Specifically, the method proposed by the present application can be used in smart devices, especially intelligent voice devices such as voice assistants, household appliances with voice functions, smart watches, and smart bracelets. To determine the broadcast text of the voice broadcast, the microphone of the intelligent voice device receives speech uttered by the user to obtain a voice instruction; alternatively, a smart device such as a smartphone detects the text entered by the user to determine the user's intention. The broadcast text of the voice broadcast is then determined according to the user's intention. For example, if the user enters "search for a profile of Steve Jobs" in the phone's search box, the phone automatically searches for a profile of Jobs, and the text of that profile is the broadcast text of this voice broadcast. Background music is then determined according to the broadcast text. The broadcast text is played as a voice broadcast while the background music is played at the same time.

It should be noted that a broadcast text may correspond to more than one piece of background music, that is, one broadcast text may correspond to multiple pieces of background music, but only one of them is played at any given moment during the broadcast. For example, when the broadcast text is a profile of Jobs, upbeat music is played for the passages in which his career went smoothly, and somber music is played for the passages in which his career suffered setbacks, so that the background music changes with the content of the text as the broadcast text is played, thereby improving the user experience. The voice broadcast method proposed by the present application determines the corresponding background music according to the particular broadcast text, so that the user can understand the broadcast text more accurately while listening to the voice broadcast. This improves the user experience and solves the prior-art problem that voice broadcasts lack emotion, making the content of the broadcast text hard to absorb and grasp. Optionally, if the obtained broadcast text already has music embedded in it, the embedded music is either retained or replaced with the background music.
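The idea that one broadcast text can map to several tracks while only one plays at any moment can be illustrated with a per-segment timeline. This is a sketch under assumptions: the patent does not say how the text is split into passages or how their mood is detected, so the segments and mood labels are supplied by hand here, and the track names in `MOOD_TRACKS` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical mood-to-track table; the patent does not name concrete tracks.
MOOD_TRACKS = {
    "positive": "uplifting.mp3",
    "negative": "somber.mp3",
    "neutral": "calm.mp3",
}


def build_music_timeline(segments: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Map (segment_text, mood) pairs to (segment_text, track) pairs.

    Exactly one track is assigned per segment, so only one background
    track is active at any moment of the broadcast.
    """
    timeline = []
    for text, mood in segments:
        track = MOOD_TRACKS.get(mood, MOOD_TRACKS["neutral"])
        timeline.append((text, track))
    return timeline


def track_changes(timeline: List[Tuple[str, str]]) -> List[int]:
    """Return the segment indices at which the background track switches."""
    return [
        i for i in range(1, len(timeline))
        if timeline[i][1] != timeline[i - 1][1]
    ]
```

A player would cross over to the new track at each index returned by `track_changes`, which corresponds to the switch from upbeat to somber music in the profile example.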

Preferably, determining the voice broadcast text comprises:

Obtaining a voice instruction and/or a text instruction;

Performing semantic parsing on the voice instruction and/or text instruction;

Determining the broadcast text according to the semantic parsing result.

Specifically, the voice broadcast method proposed by the present application is preferably used in an intelligent terminal, for example an intelligent voice-controlled device such as a mobile phone with a voice control function. The microphone on the phone receives the user's spoken command to obtain a voice instruction, and semantic parsing is then performed on the voice instruction; alternatively, the input interface on the phone detects that the user has entered a text instruction, and the text instruction is parsed. The semantic parsing may be performed by the phone itself, or the voice instruction may be uploaded to a server, which performs the semantic parsing. The semantic parsing result obtained after parsing is used to determine the broadcast text. Here, the semantic parsing result is what the user wants, as determined by analyzing the voice instruction. For example, if the voice instruction is "How is the weather tomorrow?", the semantic parsing result is that the user wants to query tomorrow's weather, and the corresponding broadcast text is tomorrow's weather, for example "Tomorrow: 26-30 °C, calm, clear to cloudy". As another example, if the voice instruction is "Query this year's GDP", the semantic parsing result is that the user wants to query the GDP, and the corresponding broadcast text is this year's gross domestic product.
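The mapping from instruction to parsing result to broadcast text can be sketched as follows. The patent does not specify a parsing technique, so this example swaps in a trivial substring match as a placeholder for real semantic parsing; the intent names and the `ANSWER_SOURCES` table are hypothetical, and the answers are canned strings rather than live data.

```python
from typing import Callable, Dict, Optional


def parse_intent(instruction: str) -> Optional[str]:
    """Toy stand-in for semantic parsing: return what the user wants."""
    text = instruction.lower()
    if "weather" in text:
        return "query_weather"
    if "gdp" in text:
        return "query_gdp"
    return None


# Hypothetical answer sources; a real device would call a weather
# service or a statistics service instead of returning fixed strings.
ANSWER_SOURCES: Dict[str, Callable[[], str]] = {
    "query_weather": lambda: "Tomorrow: 26-30 degrees C, calm, clear to cloudy",
    "query_gdp": lambda: "<this year's GDP figure, supplied by a data service>",
}


def broadcast_text_for(instruction: str) -> Optional[str]:
    """Determine the broadcast text from the semantic parsing result."""
    intent = parse_intent(instruction)
    if intent is None:
        return None  # nothing recognizable to broadcast
    return ANSWER_SOURCES[intent]()
```

The same function could run on the phone or on a server, matching the two deployment options described above.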

Specifically, after the voice instruction or text instruction is parsed, it is determined whether the instruction contains a query instruction. For example, the voice instruction or text instruction can be analyzed to see whether it contains keywords such as "look up", "search", or "query". If a query instruction is present, a query is performed according to it, and the query result is used as the voice broadcast text to meet the user's needs.
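The keyword check described above is straightforward to sketch. The keyword tuple uses English renderings of the example keywords, and `run_query` is a hypothetical callback standing in for whatever search backend the device uses.

```python
from typing import Callable, Optional

# English renderings of the example keywords; a deployed system would
# use keywords in the user's language.
QUERY_KEYWORDS = ("look up", "search", "query")


def contains_query_instruction(instruction: str) -> bool:
    """Return True if the instruction contains any query keyword."""
    text = instruction.lower()
    return any(keyword in text for keyword in QUERY_KEYWORDS)


def broadcast_text_from_query(
    instruction: str, run_query: Callable[[str], str]
) -> Optional[str]:
    """Run the query and use its result as the voice broadcast text."""
    if not contains_query_instruction(instruction):
        return None
    return run_query(instruction)
```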

Obtaining the text type and/or text keyword of the broadcast text;

Choosing the background music from the candidate music according to a preset rule.

Specifically, a broadcast text may correspond to several text types, that is, one broadcast text may correspond to multiple text types, and the same broadcast text may also correspond to multiple text keywords. The text type may include: genre, target readership, and text mood. For example, if the user asks by voice command for a biography of Hawking, a short biography of Hawking is found on the network; in this case the genre is biography, the corresponding target readership is readers older than 7, and the text mood is neutral. The genre refers to the subject matter and literary form of the text, for example autobiography, novel, prose, or fairy tale. The target readership refers to the group of people suited to reading the broadcast text, for example children, the elderly, young people, everyone, or men and women. The text mood is the category of emotion contained in the broadcast text, for example positive, negative, or neutral. A text keyword may be a specific word in the broadcast text or a label assigned to the text; that is, a text keyword is a label that allows the text to be located quickly. It may be a general label such as science and technology, people, or nature, or it may be a specific technology, a specific person, or a specific natural landscape. Text types and text keywords are set so that the text can be matched quickly against candidate music. For example, when the broadcast text is a biography of Hawking, the text keywords may be: people, Hawking. When the broadcast text is the Andersen fairy tale "lobo", the text keywords may be: Andersen fairy tales, lobo; the genre is fairy tale, the corresponding target readership is children, and the corresponding text mood is neutral.
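One way to hold these labels is a small record per broadcast text. The class and field names below (`TextFeatures`, `genre`, `audience`, `mood`, `keywords`) are hypothetical choices for illustration; the patent only names the three text-type dimensions and the keywords. The values follow the biography example above.

```python
from dataclasses import dataclass, field
from typing import FrozenSet


@dataclass(frozen=True)
class TextFeatures:
    genre: str      # genre of the text, e.g. biography or fairy tale
    audience: str   # target readership
    mood: str       # positive, negative, or neutral
    keywords: FrozenSet[str] = field(default_factory=frozenset)

    def feature_set(self) -> FrozenSet[str]:
        """All text types and keywords as one set, ready for matching."""
        return frozenset({self.genre, self.audience, self.mood}) | self.keywords


# The biography example from the description.
hawking_bio = TextFeatures(
    genre="biography",
    audience="over 7",
    mood="neutral",
    keywords=frozenset({"people", "Hawking"}),
)
```

Flattening everything into one set keeps the later matching step a plain set intersection against each track's labels.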

Specifically, in the method proposed by the present application, a music library is established in advance and stores multiple pieces of music. Corresponding text types and/or text keywords are set in advance for each piece of music in the library. The set of all text types and all text keywords corresponding to a piece of music can be regarded as the music feature set of that piece, and the set of all text types and text keywords of a broadcast text can be regarded as the broadcast feature set of that text. The music feature set of a candidate piece intersects the broadcast feature set of the broadcast text. For example, the text types corresponding to a candidate piece are: children, positive, fairy tale, and the corresponding keywords are: Andersen, Snow White. At least one of the text types of the broadcast text is a text type corresponding to the candidate music, and/or at least one of the text keywords of the broadcast text is a text keyword corresponding to the candidate music. Setting text types and text keywords for both the broadcast text and the music allows the broadcast text to be matched quickly against candidate music.

After the candidate music is obtained, the most suitable piece must be selected from it as the background music. The preset rule in the present application may be, for example, a weighting algorithm: a weight is set for each text type and each text keyword of the broadcast text, a weighted value is calculated for each candidate piece from the text types and text keywords of the broadcast text that it corresponds to, and the candidate piece with the highest weighted value is selected as the background music. For example, the user issues a voice instruction asking for an Andersen fairy tale to be played, so the broadcast text is a fairy tale. The text types of the broadcast text are: children (weight 5), positive (weight 6), and fairy tale (weight 4); the text keywords of the broadcast text are: Andersen (weight 2) and Snow White (weight 3). If a candidate piece corresponds to children and Andersen, its weighted value is the weight of children plus the weight of Andersen, which equals 7. The specific weights of the text types and keywords can be set as needed. Preferably, the weights of text types are greater than the weights of text keywords; text keywords exist to make the music screening more precise.

The preset rule may also be to select according to the ratings that users on the network have given each piece of music: after ratings from a large number of users have been collected, the selection is made by rating, and preferably the highest-rated candidate piece is used as the background music. Alternatively, the usage habits of the current user are collected in advance, the user's preferences are determined from those habits, and the background music is determined according to the preferences. For example, if information collected in advance shows that the user likes children's music, likes positive music, and likes Andersen's fairy tales, then candidate background music corresponding to children, positive, Andersen, and fairy tale is preferentially recommended to the user.
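The candidate filtering and the weighting rule can be sketched with plain sets and a weight table. The weights are the ones given in the example; the library contents and track names are hypothetical, and ties between equally scored tracks are resolved arbitrarily here because the patent does not define a tie-break.

```python
from typing import Dict, Optional, Set

# Weights of the broadcast text's types and keywords, from the example.
TEXT_WEIGHTS: Dict[str, int] = {
    "children": 5, "positive": 6, "fairy tale": 4,  # text types
    "Andersen": 2, "Snow White": 3,                 # text keywords
}


def weighted_score(music_features: Set[str], text_weights: Dict[str, int]) -> int:
    """Sum the weights of the broadcast-text features the track also has."""
    return sum(w for feature, w in text_weights.items() if feature in music_features)


def choose_background_music(
    library: Dict[str, Set[str]], text_weights: Dict[str, int]
) -> Optional[str]:
    """Keep tracks whose feature set intersects the text's, then take the top score."""
    candidates = {
        name: features
        for name, features in library.items()
        if not features.isdisjoint(text_weights)  # non-empty intersection
    }
    if not candidates:
        return None
    return max(candidates, key=lambda name: weighted_score(candidates[name], text_weights))


# Hypothetical music library: track name -> labels set in advance.
LIBRARY = {
    "lullaby.mp3": {"children", "Andersen"},
    "march.mp3": {"positive"},
    "jazz.mp3": {"adult"},
}
```

With this library, `lullaby.mp3` scores 5 + 2 = 7 as in the worked example, `march.mp3` scores 6, and `jazz.mp3` is never a candidate because it shares no feature with the broadcast text.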

Setting the volume for playing the broadcast text;

And/or setting the volume for playing the background music.

Specifically, when the volume of the broadcast text and the volume of the background music are set, the current ambient volume must be taken into account: the current ambient volume is obtained first, and the volume for playing the broadcast text and the volume for playing the background music are then adjusted according to it. When the ambient volume is high, the volumes of the broadcast text and the background music need to be increased accordingly, for example so that neither is lower than the ambient volume, to prevent an unsuitable volume from keeping the user from hearing the content of the broadcast text. Preferably, the volume of the background music is set to be no greater than the volume of the broadcast text at any moment.
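The two volume constraints (neither volume below the ambient volume, music never louder than speech) can be sketched as a small function. The 0 to 100 scale, the margins `voice_margin` and `music_offset`, and the cap `max_volume` are assumptions for illustration; the patent gives no units or concrete values.

```python
from typing import Tuple


def set_volumes(
    ambient: float,
    voice_margin: float = 10.0,   # assumed headroom of speech over ambient
    music_offset: float = 6.0,    # assumed gap between speech and music
    max_volume: float = 100.0,    # assumed device maximum
) -> Tuple[float, float]:
    """Return (voice_volume, music_volume) for the measured ambient volume."""
    # Speech sits above the ambient level, capped at the device maximum.
    voice = min(max_volume, ambient + voice_margin)
    # Music stays at or above ambient where possible, but never above speech.
    music = min(voice, max(ambient, voice - music_offset))
    return voice, music
```

For an ambient level of 40 this yields a speech volume of 50 and a music volume of 44; as the ambient level approaches the device maximum, the music is clamped to the speech volume rather than allowed to exceed it.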

The present application also proposes a voice broadcast device, comprising:

A text identification unit 10, for determining the broadcast text of a voice broadcast;

A music acquiring unit 20, for determining background music according to the broadcast text;

A voice broadcast unit 30, for playing the broadcast text while simultaneously playing the background music.

Specifically, the device proposed by the present application may be a smart device, especially an intelligent voice device such as a voice assistant, a household appliance with voice functions, a smart watch, or a smart bracelet. Taking a smartphone as an example of the voice broadcast device, the text identification unit 10 may be speech processing software on the smartphone. To determine the broadcast text of the voice broadcast, the microphone of the smartphone receives speech uttered by the user to obtain a voice instruction; alternatively, semantic recognition is performed on the text entered by the user to determine the user's intention. The broadcast text of the voice broadcast is then determined according to the user's intention. For example, if the user enters "search for a profile of Steve Jobs" in the phone's search box, the phone automatically searches for a profile of Jobs, and the text of that profile is the broadcast text of this voice broadcast. Background music is then determined according to the broadcast text. The broadcast text is played as a voice broadcast while the background music is played at the same time.

It should be noted that a broadcast text may correspond to more than one piece of background music, that is, one broadcast text may correspond to multiple pieces of background music, but only one of them is played at any given moment during the broadcast. For example, when the broadcast text is a profile of Jobs, upbeat music is played for the passages in which his career was successful, and somber music is played for the passages in which his career suffered setbacks, so that the voice broadcast unit 30 changes the background music according to changes in the content of the text while playing the broadcast text, thereby improving the user experience. The voice broadcast method proposed by the present application determines the corresponding background music according to the particular broadcast text, so that the user can understand the broadcast text more accurately while listening to the voice broadcast. This improves the user experience and solves the prior-art problem that voice broadcasts lack emotion, making the content of the broadcast text hard to absorb and grasp.

The device proposed by the present application may also consist of two parts, a server and an intelligent terminal. The intelligent terminal receives the user's instruction and sends it to the server. The server, acting as the text identification unit, determines the broadcast text corresponding to the user's instruction; at the same time the server, acting as the music acquiring unit, queries a database for suitable background music. The server then sends the broadcast text and the background music back to the intelligent terminal, which, acting as the voice broadcast unit 30, plays the broadcast text and the background music.
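The server and terminal split can be sketched with two small classes. Everything here is an illustrative assumption: the class names, the in-process method call standing in for the network round trip, the keyword-to-track table standing in for the database, and the `played` list standing in for actual audio playback.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Server:
    music_db: Dict[str, str]  # hypothetical keyword -> track table

    def handle(self, instruction: str) -> Tuple[str, str]:
        # Text identification unit (stub): derive the broadcast text.
        text = f"Answer to: {instruction}"
        # Music acquiring unit (stub): look up a matching track.
        lowered = instruction.lower()
        music = next(
            (track for key, track in self.music_db.items() if key in lowered),
            "default.mp3",
        )
        return text, music


@dataclass
class Terminal:
    server: Server
    played: List[Tuple[str, str]] = field(default_factory=list)

    def on_user_instruction(self, instruction: str) -> Tuple[str, str]:
        # Send the instruction to the server and receive text plus music.
        text, music = self.server.handle(instruction)
        # Voice broadcast unit (stub): record what would be played together.
        self.played.append((text, music))
        return text, music
```

In a real deployment the call to `server.handle` would be a network request, and the terminal would start speech synthesis and music playback at the same time instead of appending to a list.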

Optionally, the text identification unit 10 determining the voice broadcast text comprises:

Obtaining a voice instruction and/or a text instruction;

Performing semantic parsing on the voice instruction and/or text instruction;

Determining the broadcast text according to the semantic parsing result.

Specifically, the voice broadcast device proposed by the present application preferably includes an intelligent terminal and a server, for example a mobile phone with a voice control function and a server. The microphone on the phone receives the user's spoken command to obtain a voice instruction, and semantic parsing is then performed on the voice instruction; alternatively, the input interface on the phone detects that the user has entered a text instruction, and the text instruction is parsed. The semantic parsing may be performed by the phone itself, or the voice instruction may be uploaded to the server, which performs the semantic parsing. The semantic parsing result obtained after parsing is used to determine the broadcast text. Here, the semantic parsing result is what the user wants, as determined by analyzing the voice instruction. For example, if the voice instruction is "How is the weather tomorrow?", the semantic parsing result is that the user wants to query tomorrow's weather, and the corresponding broadcast text is tomorrow's weather, for example "Tomorrow: 26-30 °C, calm, clear to cloudy". As another example, if the voice instruction is "Query this year's GDP", the semantic parsing result is that the user wants to query the GDP, and the corresponding broadcast text is this year's gross domestic product.

Optionally, the text identification unit 10 determining the broadcast text according to the semantic parsing result comprises:

Optionally, the music acquiring unit 20 determining the background music according to the broadcast text comprises:

Obtaining the text type and/or text keyword of the broadcast text;

Choosing the background music from the candidate music according to a preset rule.

Specifically, a broadcast text may correspond to several text types, that is, one broadcast text may correspond to multiple text types, and the same broadcast text may also correspond to multiple text keywords. The text type may include: genre, target readership, and text mood. For example, if the user asks by voice command for a biography of Hawking, a short biography of Hawking is found on the network; in this case the genre is biography, the corresponding target readership is readers older than 7, and the text mood is neutral. The genre refers to the subject matter and literary form of the text, for example autobiography, novel, prose, or fairy tale. The target readership refers to the group of people suited to reading the broadcast text, for example children, the elderly, young people, everyone, or men and women. The text mood is the category of emotion contained in the broadcast text, for example positive, negative, or neutral. Text keywords are labels that allow the text to be located quickly, for example science and technology, people, or nature. Text types and text keywords are set so that the text can be matched quickly against candidate music.

Specifically, corresponding text types and text keywords are set in advance for each piece of music in the music library. The set of all text types and all text keywords corresponding to a piece of music can be regarded as the music feature set of that piece, and the set of all text types and text keywords of a broadcast text can be regarded as the broadcast feature set of that text. The music feature set of a candidate piece intersects the broadcast feature set of the broadcast text. For example, the text types corresponding to a candidate piece are: children, positive, fairy tale, and the corresponding keywords are: Andersen, Snow White. At least one of the text types of the broadcast text is a text type corresponding to the candidate music, and/or at least one of the text keywords of the broadcast text is a text keyword corresponding to the candidate music. Setting text types and text keywords for both the broadcast text and the music allows the broadcast text to be matched quickly against candidate music.

After the candidate music is obtained, the most suitable piece must be selected from it as the background music. The preset rule in the present application may be, for example, a weighting algorithm: a weight is set for each text type and each text keyword of the broadcast text, a weighted value is calculated for each candidate piece from the text types and text keywords of the broadcast text that it corresponds to, and the candidate piece with the highest weighted value is selected as the background music. For example, the text types of the broadcast text are: children (weight 5), positive (weight 6), and fairy tale (weight 4); the text keywords of the broadcast text are: Andersen (weight 2) and Snow White (weight 3). If a candidate piece corresponds to children and Andersen, its weighted value is the weight of children plus the weight of Andersen, which equals 7. The specific weights of the text types and keywords can be set as needed. Preferably, the weights of text types are greater than the weights of text keywords; text keywords exist to make the music screening more precise.

Optionally, the device further comprises a volume adjusting unit 40. After the music acquiring unit 20 determines the background music according to the broadcast text and before the voice broadcast unit 30 plays the broadcast text and the background music simultaneously, the volume adjusting unit 40 is used for:

Setting the volume for playing the broadcast text;

And/or setting the volume for playing the background music.

Specifically, when the volume adjusting unit 40 sets the volume of the broadcast text and the volume of the background music, the current ambient volume must be taken into account: the current ambient volume is obtained first, and the volume for playing the broadcast text and the volume for playing the background music are then adjusted according to it. This prevents an unsuitable volume from keeping the user from hearing the content of the broadcast text. Preferably, the volume of the background music is set to be no greater than the volume of the broadcast text at any moment.

The above are only preferred embodiments of the present invention and are not intended to limit the present invention. For those skilled in the art, the present invention may be modified and varied in various ways. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. A voice broadcast method, characterized by comprising:

Determining the broadcast text of a voice broadcast;

Determining background music according to the broadcast text;

Playing the broadcast text while simultaneously playing the background music.

2. The voice broadcast method according to claim 1, characterized in that determining the voice broadcast text comprises:

Obtaining a voice instruction and/or a text instruction;

Performing semantic parsing on the voice instruction and/or text instruction;

Determining the broadcast text according to the semantic parsing result.

3. The voice broadcast method according to claim 2, characterized in that determining the broadcast text according to the semantic parsing result comprises:

Determining, according to the semantic parsing result, whether the voice instruction contains a query instruction;

If so, performing a query according to the query instruction and using the query result as the voice broadcast text.

4. The voice broadcast method according to any one of claims 1-3, characterized in that determining the background music according to the broadcast text comprises:

Obtaining the text type and/or text keyword of the broadcast text;

Obtaining corresponding candidate music according to the text type and/or text keyword;

Choosing the background music from the candidate music according to a preset rule.

5. The voice broadcast method according to claim 4, characterized in that

Each piece of music corresponds to at least one said text type and/or at least one said text keyword;

The broadcast text has at least one said text type and/or at least one said text keyword;

The candidate music corresponds to one or more text types of the broadcast text;

And/or the candidate music corresponds to one or more text keywords of the broadcast text.

6. The voice broadcast method according to any one of claims 4-5, characterized in that the text type comprises: genre, target readership, and text mood.

7. The voice broadcast method according to claim 6, characterized in that, after the background music is determined according to the broadcast text and before the broadcast text and the background music are played simultaneously, the method further comprises:

Setting the volume for playing the broadcast text;

And/or setting the volume for playing the background music.

8. A voice broadcast device, characterized by comprising:

A music acquiring unit, for determining background music according to the broadcast text;

A voice broadcast unit, for playing the broadcast text while simultaneously playing the background music.

9. The voice broadcast device according to claim 8, characterized in that the text identification unit determining the voice broadcast text comprises:

Obtaining a voice instruction and/or a text instruction;

Determining the broadcast text according to the semantic parsing result.

10. The voice broadcast device according to claim 9, characterized in that the text identification unit determining the broadcast text according to the semantic parsing result comprises:

11. The voice broadcast device according to any one of claims 8-10, characterized in that the music acquiring unit determining the background music according to the broadcast text comprises:

Obtaining the text type and/or text keyword of the broadcast text;

12. The voice broadcast device according to claim 11, characterized in that

13. The voice broadcast device according to any one of claims 11-12, characterized in that the text type comprises: genre, target readership, and text mood.

14. The voice broadcast device according to claim 13, characterized by further comprising a volume adjusting unit;

After the music acquiring unit determines the background music according to the broadcast text and before the voice broadcast unit plays the broadcast text and the background music simultaneously, the volume adjusting unit is used for:

Setting the volume for playing the broadcast text;

And/or setting the volume for playing the background music.