CN105304082B - A kind of speech output method and device - Google Patents



Publication number
CN105304082B
CN105304082B (Application CN201510568430.8A)
Authority
CN
China
Prior art keywords
voice input
user
input content
degree
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510568430.8A
Other languages
Chinese (zh)
Other versions
CN105304082A (en)
Inventor
王天
王天一
刘升平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Intelligent Technology Co Ltd
Original Assignee
Beijing Yunzhisheng Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Yunzhisheng Information Technology Co Ltd
Priority to CN201510568430.8A priority Critical patent/CN105304082B/en
Publication of CN105304082A publication Critical patent/CN105304082A/en
Priority to PCT/CN2016/082427 priority patent/WO2017041510A1/en
Priority to CN201680002958.1A priority patent/CN107077845B/en
Application granted granted Critical
Publication of CN105304082B publication Critical patent/CN105304082B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)
  • Telephone Function (AREA)

Abstract

The invention discloses a speech output method and device. The method includes: receiving voice input content from a user; determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs, where the cognition degree is the extent of the user's professional knowledge of that category; and obtaining and outputting, from at least one voice output content corresponding to the voice input content, the voice output content that matches the cognition degree. This technical solution can select, according to the user's cognition degree of the category of the input, the voice output content matching that degree, so that the output better fits the user's needs. It thereby provides a more personalized voice output function, improves the accuracy of voice output, enables the user to obtain the maximum amount of information from the voice output content, and improves the user experience.

Description

A kind of speech output method and device
Technical field
The present invention relates to the technical field of information processing, and more particularly to a speech output method and device.
Background technique
Currently, with the development of electronic technology, voice input is increasingly popular. Voice input is an input mode in which speech recognition converts what a person says into text. As intelligent terminals become widespread in daily life, more and more of them offer voice services. For example, a user can pose a question by voice, and voice software on the terminal analyzes the user's speech and answers the question in voice form, providing a help service. Although this brings great convenience, sparing the user a cumbersome online search for an answer, current voice service software has only one answering mode: different users who ask the same question (with the same main content) receive identical help information. Yet users differ in technical level and professional ability, and different users may need different help information or different answering styles. The above method therefore cannot distinguish different technical needs when providing voice help for users, and lacks specificity.
Summary of the invention
The embodiments of the present invention provide a speech output method and device. The technical solution is as follows:
A speech output method includes the following steps:
Receiving voice input content from a user;
Determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs, where the cognition degree is the extent of the user's professional knowledge of that category;
Obtaining and outputting, from at least one voice output content corresponding to the voice input content, the voice output content that matches the cognition degree.
Some beneficial effects of the embodiment of the present invention may include:
The above technical solution can select for the user, according to the user's cognition degree of the category of the input voice input content, the voice output content matching that degree, so that the output better fits the user's needs. This provides a more personalized voice output function, improves the accuracy of voice output, enables the user to obtain the maximum amount of information from the output, and improves the user experience.
In one embodiment, determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs includes:
Identifying the voiceprint of the user;
Judging, according to the voiceprint, whether this is the first time voice input content has been received from the user;
When the voice input content is received from the user for the first time, setting the user's cognition degree of the category to a preset minimum cognition degree.
In this embodiment, matching voice output content is selected according to whether the user's voice input content is being received for the first time, so that the output better fits the user's needs, providing a more personalized voice output function while improving the accuracy of voice output, enabling the user to obtain the maximum amount of information from the output, and improving the user experience.
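The first-time-user rule above could be sketched as a dictionary lookup over stored voiceprints. This is a minimal illustration, not the patent's implementation; the names `known_voiceprints` and `MIN_COGNITION` are assumptions made for the example.

```python
# Minimal sketch of the first-time-user rule: if the speaker's voiceprint is
# not among the stored ones, fall back to a preset minimum cognition degree.
# All names here are illustrative, not taken from the patent.

MIN_COGNITION = 0.0  # hypothetical "preset minimum cognition degree"

def cognition_for_speaker(voiceprint_id, known_voiceprints):
    """Return the stored cognition degree for a known speaker, or the
    preset minimum for a speaker seen for the first time."""
    if voiceprint_id not in known_voiceprints:
        # First time this voiceprint is received: assume no domain expertise.
        return MIN_COGNITION
    return known_voiceprints[voiceprint_id]

known = {"user_a": 0.7}
print(cognition_for_speaker("user_a", known))  # 0.7
print(cognition_for_speaker("user_b", known))  # 0.0
```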
In one embodiment, the method further includes:
Recording the input time and the use duration of the voice input content, where the use duration is the time between receiving the voice input content and outputting the voice output content.
In this embodiment, recording the input time and use duration of voice input content gives a richer basis for later determining the user's cognition degree when outputting voice content for the user, so the cognition degree can be determined more accurately and more accurate, personalized voice output content can be produced.
In one embodiment, determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs includes:
Identifying the voiceprint of the user;
Judging, according to the voiceprint of the user, whether two adjacent received voice input contents were input by the same user;
When the two adjacent voice input contents were input by the same user, calculating the time interval between them from their input times and use durations;
Determining the user's cognition degree of the category according to the time interval, where a longer time interval indicates a lower cognition degree.
In this embodiment, calculating the time interval between two adjacent voice input contents from the same user gives a richer basis for determining the user's cognition degree, so the cognition degree can be determined more accurately and more accurate, personalized voice output content can be produced for the user.
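One way to realize the "longer interval, lower cognition degree" rule is a simple linear mapping. The sketch below is illustrative only; the 600-second cap is an assumed parameter, and the patent does not specify a formula.

```python
def interval_between(first_input_time, first_use_duration, second_input_time):
    # The first exchange ends once its answer is output, so the idle gap
    # runs from (input time + use duration) to the next input time.
    return second_input_time - (first_input_time + first_use_duration)

def cognition_from_interval(interval_s, max_interval_s=600.0):
    """Map a gap between consecutive inputs to a degree in [0, 1];
    longer gaps give lower degrees (the cap is an assumed parameter)."""
    clamped = min(max(interval_s, 0.0), max_interval_s)
    return 1.0 - clamped / max_interval_s

gap = interval_between(0.0, 5.0, 65.0)   # 60.0 seconds
print(cognition_from_interval(gap))      # 0.9
```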
In one embodiment, determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs includes:
Identifying the voiceprint of the user;
Obtaining, according to the voiceprint of the user, history input record information corresponding to the user, where the history input record information includes at least one of the cumulative use time, the cumulative input count, and the input frequency;
Determining the user's cognition degree of the category according to the history input record information, where a longer cumulative use time, a larger cumulative input count, and a higher input frequency each indicate a higher cognition degree.
In this embodiment, the user's cognition degree is determined from the user's history input record information, so the terminal can determine the cognition degree more accurately and produce more accurate, personalized voice output content for the user.
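The three monotone relations above (more cumulative use time, more inputs, higher frequency each raise the degree) could be combined as a capped average. The caps below are invented for illustration; the patent only states the monotonicity.

```python
def cognition_from_history(total_use_s, input_count, inputs_per_day,
                           caps=(3600.0, 50.0, 10.0)):
    """Average of three capped ratios, each in [0, 1]; every factor the
    patent names (use time, input count, frequency) raises the result."""
    parts = (min(total_use_s / caps[0], 1.0),
             min(input_count / caps[1], 1.0),
             min(inputs_per_day / caps[2], 1.0))
    return sum(parts) / len(parts)

print(cognition_from_history(1800.0, 25, 5.0))  # 0.5
```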
In one embodiment, determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs includes:
Extracting the keywords in the voice input content;
Determining the matching degree between the keywords in the voice input content and preset keywords;
Determining the user's cognition degree of the category according to that matching degree, where a higher matching degree with the professional keywords among the preset keywords indicates a higher cognition degree, and a higher matching degree with the non-professional keywords among the preset keywords indicates a lower cognition degree.
In this embodiment, the user's cognition degree is determined from the matching degree between keywords in the voice input content and preset keywords, so the cognition degree is determined more accurately and in a more personalized way, and more accurate, personalized voice output content is produced for the user.
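A plausible sketch of the keyword rule: overlap with a professional vocabulary pushes the degree up, overlap with a lay vocabulary pushes it down. The vocabularies, the neutral default of 0.5, and the scoring formula are all assumptions for illustration.

```python
def cognition_from_keywords(keywords, professional_kw, amateur_kw):
    """Score in [0, 1]: overlap with professional vocabulary raises it,
    overlap with non-professional vocabulary lowers it."""
    kw = set(keywords)
    if not kw:
        return 0.5  # neutral when no keywords were extracted (assumed default)
    pro = len(kw & professional_kw) / len(kw)
    ama = len(kw & amateur_kw) / len(kw)
    return max(0.0, min(1.0, 0.5 + 0.5 * (pro - ama)))

pro_terms = {"acetylsalicylic", "cox-1", "nsaid"}
lay_terms = {"medicine", "pill", "headache"}
print(cognition_from_keywords(["nsaid", "cox-1"], pro_terms, lay_terms))       # 1.0
print(cognition_from_keywords(["medicine", "headache"], pro_terms, lay_terms)) # 0.0
```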
In one embodiment, determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs includes:
Determining the sentence structure type of the voice input content, where the sentence structure type is either a professional sentence structure type or a non-professional sentence structure type;
Determining the user's cognition degree of the category according to the sentence structure type, where the cognition degree for voice input content of the professional sentence structure type is higher than that for voice input content of the non-professional sentence structure type.
In this embodiment, the user's cognition degree is determined from the sentence structure type of the voice input content, so the cognition degree is determined more accurately and in a more personalized way, and more accurate, personalized voice output content is produced for the user.
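The patent only defines the two structure-type labels; a concrete classifier and the two degree values are left open. The sketch below invents simple regex patterns and degree values purely to show the shape of such a rule.

```python
import re

# Hypothetical classifier: treat terse imperative/technical phrasings as
# "professional" and conversational question forms as "non-professional".
# The patterns and degree values are invented examples, not from the patent.
PROFESSIONAL_PATTERNS = [r"^set\b", r"^configure\b", r"\bparameters?\b"]

def structure_type(sentence):
    s = sentence.lower()
    if any(re.search(p, s) for p in PROFESSIONAL_PATTERNS):
        return "professional"
    return "non-professional"

def cognition_from_structure(stype):
    # Professional structure maps to the higher of two assumed degrees.
    return {"professional": 0.8, "non-professional": 0.3}[stype]

print(cognition_from_structure(structure_type("Set compressor parameters")))  # 0.8
print(cognition_from_structure(structure_type("How do I make it colder?")))   # 0.3
```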
In one embodiment, determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs includes:
When two adjacent received voice input contents are determined to have been input by the same user, determining the degree of association between them according to the keywords in the two contents;
Determining the user's cognition degree of the category according to that degree of association, where a higher degree of association indicates a lower cognition degree.
In this embodiment, the user's cognition degree is determined from the degree of association between two adjacent voice input contents from the same user, so the cognition degree is determined more accurately and in a more personalized way, and more accurate, personalized voice output content is produced for the user.
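The patent computes the association degree from the keywords of the two inputs but does not fix a formula; Jaccard overlap of the keyword sets is one plausible choice, sketched here with an assumed linear inversion to a cognition degree.

```python
def association_degree(keywords_a, keywords_b):
    """Jaccard overlap of the two inputs' keyword sets, as one plausible
    measure of how related two consecutive questions are."""
    a, b = set(keywords_a), set(keywords_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cognition_from_association(assoc):
    # Repeatedly asking closely related questions suggests the user did not
    # fully absorb the first answer, so higher association lowers the degree.
    return 1.0 - assoc

q1 = ["aspirin", "dosage"]
q2 = ["aspirin", "dosage", "children"]
print(cognition_from_association(association_degree(q1, q2)))  # 0.333...
```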
In one embodiment, determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs includes:
Determining, from the voice input content, at least two voice input parameters of the voice input content, the voice input parameters including: the voiceprint of the user, the time interval between two adjacent voice input contents from the same user, the history input record information corresponding to the user, the matching degree between keywords in the voice input content and preset keywords, the sentence structure type of the voice input content, and the degree of association between two adjacent voice input contents from the same user;
Calculating the user's cognition degree of the category according to a preset weight for each voice input parameter.
In this embodiment, the user's cognition degree of the category is calculated from several voice input parameters with different preset weights, so the cognition degree is determined more accurately and in a more personalized way, and more accurate, personalized voice output content is produced for the user.
In one embodiment, determining, from the voice input content, the user's cognition degree of the category to which the voice input content belongs includes:
When no voice input parameter of the voice input content can be determined, setting the user's cognition degree of the category to a preset minimum cognition degree.
In this embodiment, for voice input content whose voice input parameters cannot be determined, voice output content matching the preset minimum cognition degree is output, providing a more accurate and personalized voice output function so that the user can obtain more useful information from the output, improving the user experience.
In one embodiment, obtaining and outputting, from at least one voice output content corresponding to the voice input content, the voice output content matching the cognition degree includes:
Determining the cognition level corresponding to the cognition degree according to the correspondence between cognition degrees and cognition levels;
Obtaining the voice output content corresponding to that cognition level according to the correspondence between cognition levels and voice output contents;
Outputting that voice output content.
In this embodiment, matching voice output content is selected for the user according to the correspondence between cognition levels and voice output contents, so that the voice output content matching the user's cognition degree is output, the output better fits the user's needs, the accuracy of voice output improves, the user can obtain the maximum amount of information from the output, and the user experience improves.
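The two correspondence tables (degree to level, level to content) can be sketched as a threshold lookup plus a dictionary. The three-level split, the thresholds, and the answer texts below are assumptions made for illustration.

```python
import bisect

# Assumed three-level scheme: degree thresholds split [0, 1] into levels 0-2,
# and each level has its own pre-written answer. Thresholds and texts are
# illustrative; the patent only requires the two correspondence tables.
LEVEL_BOUNDS = [0.33, 0.66]
RESPONSES = {
    0: "Aspirin is a common over-the-counter pain reliever.",
    1: "Aspirin is an NSAID; it reduces pain, fever, and inflammation.",
    2: "Acetylsalicylic acid irreversibly acetylates COX-1/COX-2.",
}

def select_response(cognition_degree):
    level = bisect.bisect(LEVEL_BOUNDS, cognition_degree)  # degree -> level
    return RESPONSES[level]                                # level -> content

print(select_response(0.1))  # beginner-level answer
print(select_response(0.9))  # expert-level answer
```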
In one embodiment, the method further includes:
Updating the history input record information according to the input time and use duration of the voice input content.
In this embodiment, updating the history input record information means that the next time voice output content is produced for the user, the cognition degree can be determined from an accurate history input record, so more accurate voice output content is produced for the user.
In one embodiment, the method further includes:
Storing the user's cognition degree of the category to which the voice input content belongs;
and determining, from the voice input content, the user's cognition degree of that category includes:
Identifying the voiceprint of the user;
Querying, according to the voiceprint of the user, the user's stored cognition degree of the category to which the voice input content belongs.
In this embodiment, querying the stored cognition degree makes it possible to determine the user's cognition degree of the category more conveniently and quickly, so matching voice output content can be selected and output faster and more accurately.
A speech output device includes:
A receiving module, configured to receive voice input content from a user;
A determining module, configured to determine, from the voice input content, the user's cognition degree of the category to which the voice input content belongs, where the cognition degree is the extent of the user's professional knowledge of that category;
An output module, configured to obtain and output, from at least one voice output content corresponding to the voice input content, the voice output content matching the cognition degree.
In one embodiment, the determining module includes:
A first identifying submodule, configured to identify the voiceprint of the user;
A first judging submodule, configured to judge, according to the voiceprint, whether this is the first time voice input content has been received from the user;
A second determining submodule, configured to set the user's cognition degree of the category to a preset minimum cognition degree when the voice input content is received from the user for the first time.
In one embodiment, the device further includes:
A logging module, configured to record the input time and the use duration of the voice input content, where the use duration is the time between receiving the voice input content and outputting the voice output content.
In one embodiment, the determining module includes:
A second identifying submodule, configured to identify the voiceprint of the user;
A second judging submodule, configured to judge, according to the voiceprint of the user, whether two adjacent received voice input contents were input by the same user;
A first calculating submodule, configured to calculate, when the two adjacent voice input contents were input by the same user, the time interval between them from their input times and use durations;
A third determining submodule, configured to determine the user's cognition degree of the category according to the time interval, where a longer time interval indicates a lower cognition degree.
In one embodiment, the determining module includes:
A third identifying submodule, configured to identify the voiceprint of the user;
A first obtaining submodule, configured to obtain, according to the voiceprint of the user, history input record information corresponding to the user, where the history input record information includes at least one of the cumulative use time, the cumulative input count, and the input frequency;
A fourth determining submodule, configured to determine the user's cognition degree of the category according to the history input record information, where a longer cumulative use time, a larger cumulative input count, and a higher input frequency each indicate a higher cognition degree.
In one embodiment, the determining module includes:
An extracting submodule, configured to extract the keywords in the voice input content;
A fifth determining submodule, configured to determine the matching degree between the keywords in the voice input content and preset keywords;
A sixth determining submodule, configured to determine the user's cognition degree of the category according to that matching degree, where a higher matching degree with the professional keywords among the preset keywords indicates a higher cognition degree, and a higher matching degree with the non-professional keywords among the preset keywords indicates a lower cognition degree.
In one embodiment, the determining module includes:
A seventh determining submodule, configured to determine the sentence structure type of the voice input content, where the sentence structure type is either a professional sentence structure type or a non-professional sentence structure type;
An eighth determining submodule, configured to determine the user's cognition degree of the category according to the sentence structure type, where the cognition degree for voice input content of the professional sentence structure type is higher than that for voice input content of the non-professional sentence structure type.
In one embodiment, the determining module includes:
A ninth determining submodule, configured to determine, when two adjacent received voice input contents are judged to have been input by the same user, the degree of association between them according to the keywords in the two contents;
A tenth determining submodule, configured to determine the user's cognition degree of the category according to that degree of association, where a higher degree of association indicates a lower cognition degree.
In one embodiment, the determining module includes:
An eleventh determining submodule, configured to determine, from the voice input content, at least two voice input parameters of the voice input content, the voice input parameters including: the voiceprint of the user, the time interval between two adjacent voice input contents from the same user, the history input record information corresponding to the user, the matching degree between keywords in the voice input content and preset keywords, the sentence structure type of the voice input content, and the degree of association between two adjacent voice input contents from the same user;
A calculating submodule, configured to calculate the user's cognition degree of the category according to a preset weight for each voice input parameter.
In one embodiment, the determining module includes:
A twelfth determining submodule, configured to set the user's cognition degree of the category to a preset minimum cognition degree when no voice input parameter of the voice input content can be determined.
In one embodiment, the output module includes:
A thirteenth determining submodule, configured to determine the cognition level corresponding to the cognition degree according to the correspondence between cognition degrees and cognition levels;
A second obtaining submodule, configured to obtain the voice output content corresponding to that cognition level according to the correspondence between cognition levels and voice output contents;
An output submodule, configured to output that voice output content.
In one embodiment, the device further includes:
An updating module, configured to update the history input record information according to the input time and use duration of the voice input content.
In one embodiment, the device further includes:
A storage module, configured to store the user's cognition degree of the category to which the voice input content belongs;
and the determining module includes:
A fourth identifying submodule, configured to identify the voiceprint of the user;
A querying submodule, configured to query, according to the voiceprint of the user, the user's stored cognition degree of the category to which the voice input content belongs.
Some beneficial effects of the embodiment of the present invention may include:
The above device can select for the user, according to the user's cognition degree of the category of the input voice input content, the voice output content matching that degree, so that the output better fits the user's needs, providing a more personalized voice output function while improving the accuracy of voice output, enabling the user to obtain the maximum amount of information from the output, and improving the user experience.
Other features and advantages of the present invention will be set forth in the following description, and will in part become apparent from the description or be understood by practicing the invention. The objectives and other advantages of the invention can be realized and obtained by the structure particularly pointed out in the written description, the claims, and the accompanying drawings.
The technical solution of the present invention is described in further detail below through the drawings and embodiments.
Brief description of the drawings
The accompanying drawings are provided to give further understanding of the invention and constitute part of the specification; together with the embodiments of the invention they serve to explain the invention and are not to be construed as limiting it. In the drawings:
Fig. 1 is a flowchart of a speech output method in an embodiment of the present invention;
Fig. 2 is a flowchart of step S12 in a speech output method in an embodiment of the present invention;
Fig. 3 is a flowchart of step S12 in a speech output method in an embodiment of the present invention;
Fig. 4 is a flowchart of step S12 in a speech output method in an embodiment of the present invention;
Fig. 5 is a flowchart of step S12 in a speech output method in an embodiment of the present invention;
Fig. 6 is a flowchart of step S13 in a speech output method in an embodiment of the present invention;
Fig. 7 is a block diagram of a speech output device in an embodiment of the present invention;
Fig. 8 is a block diagram of a determining module in a speech output device in an embodiment of the present invention;
Fig. 9 is a block diagram of a determining module in a speech output device in an embodiment of the present invention;
Fig. 10 is a block diagram of a determining module in a speech output device in an embodiment of the present invention;
Fig. 11 is a block diagram of a determining module in a speech output device in an embodiment of the present invention;
Fig. 12 is a block diagram of an output module in a speech output device in an embodiment of the present invention;
Fig. 13 is a block diagram of a speech output device in an embodiment of the present invention.
Specific embodiment
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here are only for illustrating and explaining the present invention and are not intended to limit it.
Fig. 1 is a flowchart of a speech output method in an embodiment of the present invention. As shown in Fig. 1, the method is used in a terminal; the terminal can be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, and the like. The method includes the following steps S11-S13:
Step S11: receive voice input content from a user.
In this step, the user can input voice input content by recording sound.
Step S12: determine, from the voice input content, the user's cognition degree of the category to which the voice input content belongs; the cognition degree is the extent of the user's professional knowledge of that category.
For example, if the user inputs the voice input content "how do I set the air conditioner temperature", the user's cognition degree of the category is the extent of the user's professional knowledge of air conditioners; if the user inputs "what kind of medicine is aspirin", the cognition degree is the extent of the user's professional knowledge of pharmaceuticals. The terminal can determine the category to which the voice input content belongs by extracting its keywords.
Step S13: obtain and output, from at least one voice output content corresponding to the voice input content, the voice output content matching the cognition degree.
With the technical solution provided by the embodiments of the present invention, voice output content matching the user's cognition degree of the category of the input can be selected and output for the user, so that the output better fits the user's needs, providing a more personalized voice output function while improving the accuracy of voice output, enabling the user to obtain the maximum amount of information from the output, and improving the user experience.
In step S12, the user's cognition degree of the category to which the voice input content belongs can be determined in several ways. The voice input parameters of the voice input content can first be determined from the voice input content, and the user's cognition degree of the category can then be determined from those parameters. The method of determining the cognition degree varies with the voice input parameter used; the voice input parameters may include the voiceprint of the user, the time interval between two adjacent voice input contents from the same user, the history input record information corresponding to the user, the matching degree between keywords in the voice input content and preset keywords, the sentence structure type of the voice input content, the degree of association between two adjacent voice input contents from the same user, and so on. Implementations of step S12 are illustrated below through different embodiments.
In one embodiment, as shown in Fig. 2, step S12 may be implemented as the following steps S21-S23:
Step S21: identify the voiceprint of the user.
Step S22: judge, according to the voiceprint, whether this is the first time voice input content has been received from the user.
Step S23: when the voice input content is received from the user for the first time, set the user's cognition degree of the category to a preset minimum cognition degree.
In the present embodiment, voiceprint information corresponding to different users is stored in the terminal. When a user inputs a voice input content, if the terminal can find the user's voiceprint information among the pre-stored voiceprint information, this is not the first time a voice input content has been received from the user; if the terminal fails to find it, the terminal is receiving a voice input content from the user for the first time. When it is not the first time, the terminal continues to determine other voice input parameters from the voice input content and performs step S12 according to those parameters. The terminal stores in advance the correspondence between cognition degrees and voice output contents, including the voice output content corresponding to the preset minimum cognition degree.
In one embodiment, the above method further includes the following step: record the input time and the use duration of the voice input content, where the use duration is the duration between receiving the voice input content and outputting the voice output content. Accordingly, as shown in Fig. 3, step S12 may be implemented as the following steps S31-S34:
Step S31 identifies the voiceprint of user.
Step S32 judges whether the adjacent voice input content received twice is same according to the voiceprint of user User is inputted.
Step S33, when the adjacent voice input content received twice is inputted by same user, according to it is adjacent twice The input time of the voice input content received and duration is used, calculated between the adjacent voice input content received twice Time interval.
Step S34 determines user to the cognition degree of voice input content generic according to time interval;Wherein, the time Interval is longer, and cognition degree is lower.
In the present embodiment, when the two adjacent received voice input contents were input by the same user, the time interval between them reflects how long the user took to react to the previous voice output content output by the terminal. Alternatively, that reaction duration can be characterized by the time interval between outputting the previous voice output content and receiving the current voice input content. For example, suppose the previous voice input content received by the terminal is "how to set the air-conditioner temperature", for which the terminal outputs the corresponding voice output content "first enter the temperature adjustment mode, then change the temperature", and the current voice input content received by the terminal is "how to enter the temperature adjustment mode". When the terminal determines that the two adjacent received voice input contents were input by the same user, the time interval between receiving "how to set the air-conditioner temperature" and receiving "how to enter the temperature adjustment mode" can characterize the user's reaction duration to the previous voice output content "first enter the temperature adjustment mode, then change the temperature", from which the user's cognition degree of the category to which the voice input content belongs is determined. Alternatively, the time interval between outputting "first enter the temperature adjustment mode, then change the temperature" and receiving "how to enter the temperature adjustment mode" can be used for the same purpose. The longer the time interval, the longer the user's reaction to the previous voice output content, and the lower the cognition degree.
In addition, a preset time interval may be configured in advance. When the two adjacent received voice input contents were input by the same user and the time interval between them exceeds the preset time interval, the terminal may directly determine the user's cognition degree of the category to which the voice input content belongs to be the preset minimum cognition degree, and obtain and output the voice output content matching the preset minimum cognition degree.
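Steps S31-S34 and the preset-threshold rule can be sketched as a small function. The linear mapping from interval to cognition degree and the 30-second threshold are assumptions chosen for illustration; the patent only requires that a longer interval yield a lower degree and that exceeding the threshold yield the minimum.

```python
# Illustrative sketch: map the interval between two adjacent inputs from the
# same user to a cognition degree in [0, 1].
PRESET_MAX_INTERVAL = 30.0   # seconds; assumed preset time interval
MIN_COGNITION_DEGREE = 0.0   # assumed preset minimum cognition degree

def cognition_from_interval(prev_input_time, this_input_time):
    interval = this_input_time - prev_input_time
    if interval >= PRESET_MAX_INTERVAL:
        return MIN_COGNITION_DEGREE         # threshold exceeded: minimum
    # longer interval -> lower cognition degree (linear, illustrative)
    return 1.0 - interval / PRESET_MAX_INTERVAL

assert cognition_from_interval(0.0, 45.0) == MIN_COGNITION_DEGREE
assert cognition_from_interval(0.0, 15.0) == 0.5
```

As the text notes, the interval could equally be measured from the end of the previous voice output rather than from the previous input; only the start point of the subtraction changes.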
In one embodiment, as shown in Fig. 4, step S12 may be implemented as the following steps S41-S43:
Step S41 identifies the voiceprint of user.
Step S42 obtains history input record information corresponding to the user according to the voiceprint of user;History input Record information includes that history is accumulative using at least one of time, the accumulative input number of history and history input frequency information.
Step S43 determines user to the cognition degree of voice input content generic according to history input record information; Wherein, history is accumulative longer using the time, and cognition degree is higher;The accumulative input number of history is more, and cognition degree is higher;History input Frequency is higher, and cognition degree is higher.
In the present embodiment, each time the terminal receives a voice input content from the user, it records the input time and the use duration of the voice input content, the use duration being the duration between receiving the voice input content and outputting the voice output content. The terminal can compile the history input record information corresponding to the user from the recorded input times and use durations, where the accumulated history use time is the sum of the recorded use durations. The above method further includes the following step: update the history input record information according to the input time and use duration of the voice input content. In this way, when the terminal determines the user's cognition degree of the category to which the voice input content belongs from the user's history input record information, the history input record information it relies on is richer and more accurate, so the terminal can select and output a more accurate and personalized voice output content for the user.
In one embodiment, as shown in Fig. 5, step S12 may be implemented as the following steps S51-S53:
Step S51 extracts the keyword in voice input content.
Step S52 determines the matching degree of the keyword and predetermined keyword in voice input content.
Step S53 determines user to voice according to the matching degree of keyword and predetermined keyword in voice input content The cognition degree of input content generic;Wherein, the profession in the keyword and predetermined keyword in voice input content is crucial The matching degree of word is higher, and cognition degree is higher;Amateur keyword in keyword and predetermined keyword in voice input content Matching degree it is higher, cognition degree is lower.
In the present embodiment, the preset keywords pre-stored in the terminal fall into two types: professional keywords and non-professional keywords. When performing step S52, the matching degree between the keywords in the voice input content and the professional keywords, and the matching degree between those keywords and the non-professional keywords, are determined separately. For example, suppose the professional keywords include "setting path" and the non-professional keywords include "how to use". If the voice input content received by the terminal is "... setting path", the matching degree between its keywords and the professional keywords is high, so the user's cognition degree of the category to which the voice input content belongs is also high. If the voice input content received is "... how to use", the matching degree between its keywords and the non-professional keywords is high, so the user's cognition degree of the category is low.
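Steps S51-S53 can be sketched with exact-match keyword lists. The keyword sets, the neutral fallback value, and the hit-ratio scoring rule are assumptions; a production system would use fuzzier matching than set membership.

```python
# Illustrative keyword-matching sketch: professional hits raise the cognition
# degree, non-professional hits lower it.
PROFESSIONAL = {"setting path", "temperature adjustment mode"}       # assumed
NON_PROFESSIONAL = {"how to use", "what is"}                         # assumed

def cognition_from_keywords(keywords):
    pro_hits = sum(1 for k in keywords if k in PROFESSIONAL)
    amateur_hits = sum(1 for k in keywords if k in NON_PROFESSIONAL)
    total = pro_hits + amateur_hits
    if total == 0:
        return 0.5   # no preset keyword matched: neutral (assumption)
    # fraction of matched keywords that are professional
    return pro_hits / total

assert cognition_from_keywords(["setting path"]) == 1.0   # professional query
assert cognition_from_keywords(["how to use"]) == 0.0     # non-professional query
```

The example queries mirror the "setting path" / "how to use" pair used in the text above.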
In one embodiment, step S12 may be implemented as the following steps A1-A2:
Step A1: determine the sentence structure type of the voice input content, the sentence structure type being either a professional sentence structure type or a non-professional sentence structure type.
Step A2: determine the user's cognition degree of the category to which the voice input content belongs according to the sentence structure type of the voice input content; the user's cognition degree of the category of a voice input content of the professional sentence structure type is higher than that of a voice input content of the non-professional sentence structure type.
In the present embodiment, sentence structure types are pre-stored in the terminal and can be expressed as regular expressions. For example, the regular expression of a professional sentence structure type may be "adjective + noun + verb", and that of a non-professional sentence structure type may be "pronoun + verb". It should be noted that the representation of sentence structure types is not limited to regular expressions; any other way of expressing sentence structure may also be used. For example, suppose the voice input content received by the terminal is "what are the power-on steps". By analyzing it, the terminal determines its sentence structure type to be "adjective + noun + verb + pronoun", which is a professional sentence structure type, so the user's cognition degree of the category to which the voice input content belongs is high. As another example, suppose the voice input content received is "how is this thing used". The terminal determines its sentence structure type to be "pronoun + verb", which is a non-professional sentence structure type, so the user's cognition degree of the category is low.
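Steps A1-A2 can be sketched by running the preset regular expressions over a part-of-speech tag sequence. Representing sentences as space-separated POS tags, and the specific degree values returned, are assumptions for illustration; the patent only fixes the ordering between the two types.

```python
# Hypothetical structure classifier: preset patterns are regular expressions
# over POS tags, matching the "adjective+noun+verb" / "pronoun+verb" examples.
import re

PROFESSIONAL_PATTERN = re.compile(r"^adj noun verb")      # professional type
NON_PROFESSIONAL_PATTERN = re.compile(r"^pron verb")      # non-professional type

def cognition_from_structure(pos_sequence):
    """pos_sequence: space-separated POS tags of the recognized sentence."""
    if PROFESSIONAL_PATTERN.match(pos_sequence):
        return 0.9   # professional structure -> higher cognition degree
    if NON_PROFESSIONAL_PATTERN.match(pos_sequence):
        return 0.2   # non-professional structure -> lower cognition degree
    return 0.5       # unknown structure: neutral (assumption)

assert cognition_from_structure("adj noun verb pron") == 0.9  # "what are the power-on steps"
assert cognition_from_structure("pron verb") == 0.2           # "how is this thing used"
```

A real system would first run a POS tagger on the recognized text; only the pattern-matching stage is shown here.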
In one embodiment, step S12 may be implemented as the following steps B1-B2:
Step B1: when it is determined that two adjacent received voice input contents were input by the same user, determine the association degree between them according to the keywords in the two adjacent received voice input contents.
Step B2: determine the user's cognition degree of the category to which the voice input content belongs according to the association degree between the two adjacent received voice input contents; the higher the association degree, the lower the cognition degree.
In the present embodiment, when two adjacent received voice input contents were input by the same user, the association degree between them reflects the user's degree of understanding of the previous voice output content. The higher the association degree between the two adjacent received voice input contents, the lower the user's understanding of the previous voice output content, and the lower the user's cognition degree of the category to which the voice input content belongs; the lower the association degree, the higher the user's understanding of the previous voice output content, and the higher the cognition degree. For example, suppose the previous voice input content received by the terminal is "how to set the air-conditioner temperature" and the current one is "how to enter the temperature adjustment mode". When the terminal determines that the two were input by the same user, it can extract their keywords, such as "air-conditioner temperature" and "temperature adjustment mode", and determine the association degree between the two voice input contents from the association degree between these keywords. Since "air-conditioner temperature" and "temperature adjustment mode" are both temperature-related keywords, the association degree between them is high. As another example, suppose the previous voice input content is "how to set the air-conditioner temperature" and the current one is "what are the power-on steps". When the terminal determines that the two were input by the same user, it extracts the keywords "air-conditioner temperature" and "power on". Since these two keywords belong to unrelated types, the association degree between them is close to zero; that is, the association degree between the two adjacent received voice input contents is very low, the user's understanding of the previous voice output content is high, and the user's cognition degree of the category to which the voice input content belongs is also high.
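Steps B1-B2 can be sketched with keyword overlap as the association measure. Using Jaccard similarity over keyword sets is an assumption; the patent leaves the association measure open (a real system might use semantic relatedness, which would also catch "air-conditioner temperature" vs "temperature adjustment mode" sharing a topic rather than a token).

```python
# Illustrative association-degree sketch: Jaccard overlap between the keyword
# sets of two adjacent inputs, inverted into a cognition degree.
def association_degree(keywords_a, keywords_b):
    a, b = set(keywords_a), set(keywords_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cognition_from_association(keywords_a, keywords_b):
    # higher association -> lower cognition degree (step B2)
    return 1.0 - association_degree(keywords_a, keywords_b)

related = association_degree(["air-conditioner", "temperature"],
                             ["temperature", "adjustment mode"])
unrelated = association_degree(["air-conditioner temperature"], ["power on"])
assert related > unrelated   # follow-up question vs topic change
assert cognition_from_association(["air-conditioner temperature"],
                                  ["power on"]) == 1.0
```

Splitting the keywords into tokens, as in `related` above, lets token-level overlap stand in for the topical relatedness the text describes.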
In one embodiment, the various implementations of step S12 in the above embodiments can also be combined: multiple voice input parameters are combined, and the user's cognition degree of the category to which the voice input content belongs is computed according to preset weights. Accordingly, step S12 may also be implemented as the following steps: determine, from the voice input content, at least two voice input parameters of the voice input content, where the voice input parameters include the user's voiceprint information, the time interval between two adjacent voice input contents input by the same user, the history input record information corresponding to the user, the matching degree between keywords in the voice input content and preset keywords, the sentence structure type of the voice input content, and the association degree between two adjacent voice input contents input by the same user; then compute the user's cognition degree of the category to which the voice input content belongs according to the preset weight of each voice input parameter.
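The weighted combination can be sketched as follows. The weight values, the parameter names, and the renormalization over whichever parameters are available are all assumptions for illustration; the patent only requires a preset weight per parameter.

```python
# Illustrative weighted blend of per-parameter cognition estimates.
import math

WEIGHTS = {"interval": 0.3, "history": 0.3, "keywords": 0.4}  # assumed presets

def combined_cognition(estimates):
    """estimates: dict mapping parameter name -> cognition degree in [0, 1].
    Weights are renormalized over the parameters actually provided."""
    total_weight = sum(WEIGHTS[name] for name in estimates)
    weighted = sum(WEIGHTS[name] * value for name, value in estimates.items())
    return weighted / total_weight

assert math.isclose(
    combined_cognition({"interval": 1.0, "history": 1.0, "keywords": 1.0}), 1.0)
assert math.isclose(combined_cognition({"interval": 0.5, "keywords": 0.5}), 0.5)
```

Renormalizing over the available parameters keeps the result in [0, 1] even when, per the text, only "at least two" of the parameters can be determined for a given input.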
In one embodiment, the above method further includes the following step: when no voice input parameter of the voice input content can be determined, determine the user's cognition degree of the category to which the voice input content belongs to be the preset minimum cognition degree. In the present embodiment, when a voice input content is received for which no voice input parameter can be determined, the terminal can directly determine the user's cognition degree of its category to be the preset minimum cognition degree. Thus, even for a voice input content whose voice input parameters cannot be determined, the user still obtains a matching voice output content, improving the user experience.
In one embodiment, the above method further includes the following step: store the user's cognition degree of the category to which the voice input content belongs. In that case, step S12 may be implemented as the following steps: identify the user's voiceprint information; query the user's cognition degree of the category to which the voice input content belongs according to the user's voiceprint information. In the present embodiment, querying the stored cognition degree makes it faster and easier to determine the user's cognition degree of the category, so the matching voice output content can be selected and output more quickly and accurately.
In one embodiment, as shown in Fig. 6, step S13 may be implemented as the following steps S61-S63:
Step S61: determine the cognition grade corresponding to the cognition degree according to the correspondence between cognition degrees and cognition grades.
Step S62: obtain the voice output content corresponding to the cognition grade according to the correspondence between cognition grades and voice output contents.
Step S63: output the voice output content.
In the present embodiment, the terminal pre-stores the correspondence between cognition degrees and cognition grades, and the correspondence between cognition grades and voice output contents. For example, cognition grades may be divided as needed into three grades: low, middle, and high. A cognition degree between 0% and 30% corresponds to the low grade, between 31% and 70% to the middle grade, and between 71% and 100% to the high grade. The voice output content corresponding to the low grade is a detailed version, that corresponding to the middle grade is a standard version, and that corresponding to the high grade is a concise version; for each voice input content, the terminal stores all three corresponding versions. For example, for the voice input content "how to set the air-conditioner temperature", the corresponding voice output contents include: the detailed version, "click the mode button in the middle of the first row, click twice to enter the temperature adjustment mode, then click the '+/-' buttons on the left of the second row to change the temperature; each click of '+/-' changes the temperature by 1 degree"; the standard version, "click the mode button to enter the temperature adjustment mode, then click the '+/-' buttons to change the temperature"; and the concise version, "first enter the temperature adjustment mode, then change the temperature". In addition, the cognition grade corresponding to the preset minimum cognition degree may be the low grade; therefore, for a voice input content whose voice input parameters cannot be determined, or a voice input content received from a user for the first time, the terminal can directly output the detailed version of the voice output content. It can be seen that with the technical solution of the present embodiment, when outputting a voice output content for the user, the terminal analyzes the user's current need by determining the user's cognition degree of the category to which the voice input content belongs, and outputs the voice output content matching that need, enabling the user to obtain more, and more accurate, information from the voice output content.
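Steps S61-S63 can be sketched directly from the example bands above. The grade boundaries follow the 0-30% / 31-70% / 71-100% split given in the text; the short reply strings are stand-ins for the stored detailed, standard, and concise versions.

```python
# Illustrative grade lookup: cognition degree -> grade -> output version.
OUTPUTS = {   # stand-in strings for the three stored versions per input
    "low": "detailed version",
    "middle": "standard version",
    "high": "concise version",
}

def grade_for(degree):
    """Map a cognition degree in [0, 1] to a grade per the example bands."""
    if degree <= 0.30:
        return "low"
    if degree <= 0.70:
        return "middle"
    return "high"

def select_output(degree):
    return OUTPUTS[grade_for(degree)]

assert select_output(0.10) == "detailed version"   # low grade
assert select_output(0.50) == "standard version"   # middle grade
assert select_output(0.90) == "concise version"    # high grade
```

A first-time user, or one whose parameters cannot be determined, gets the preset minimum degree and therefore falls into the low grade, yielding the detailed version as the text describes.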
Corresponding to the above speech output method, an embodiment of the present invention further provides a speech output device for performing the above method.
Fig. 7 is a block diagram of a speech output device in an embodiment of the present invention. As shown in Fig. 7, the device includes:
a receiving module 71, configured to receive a voice input content input by a user;
a determining module 72, configured to determine, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs, the cognition degree being the user's degree of awareness of the professional knowledge of the category; and
an output module 73, configured to obtain, from at least one voice output content corresponding to the voice input content, and output the voice output content that matches the cognition degree.
In one embodiment, as shown in Fig. 8, the determining module 72 includes:
a first identification submodule 721, configured to identify the voiceprint information of the user;
a first judgment submodule 722, configured to judge, according to the voiceprint information, whether a voice input content is being received from the user for the first time; and
a second determination submodule 723, configured to determine, when a voice input content is received from the user for the first time, the user's cognition degree of the category to which the voice input content belongs to be a preset minimum cognition degree.
In one embodiment, the above device further includes:
a recording module, configured to record the input time and the use duration of the voice input content, the use duration being the duration between receiving the voice input content and outputting the voice output content.
In one embodiment, as shown in Fig. 9, the determining module 72 includes:
a second identification submodule 724, configured to identify the voiceprint information of the user;
a second judgment submodule 725, configured to judge, according to the user's voiceprint information, whether two adjacent received voice input contents were input by the same user;
a first computation submodule 726, configured to calculate, when the two adjacent received voice input contents were input by the same user, the time interval between them according to their input times and use durations; and
a third determination submodule 727, configured to determine the user's cognition degree of the category to which the voice input content belongs according to the time interval, where the longer the time interval, the lower the cognition degree.
In one embodiment, as shown in Fig. 10, the determining module 72 includes:
a third identification submodule 728, configured to identify the voiceprint information of the user;
a first acquisition submodule 729, configured to obtain, according to the user's voiceprint information, history input record information corresponding to the user, the history input record information including at least one of the accumulated history use time, the accumulated history input count, and the history input frequency; and
a fourth determination submodule 7210, configured to determine the user's cognition degree of the category to which the voice input content belongs according to the history input record information, where the longer the accumulated history use time, the higher the cognition degree; the larger the accumulated history input count, the higher the cognition degree; and the higher the history input frequency, the higher the cognition degree.
In one embodiment, as shown in Fig. 11, the determining module 72 includes:
an extraction submodule 7211, configured to extract the keywords from the voice input content;
a fifth determination submodule 7212, configured to determine the matching degree between the keywords in the voice input content and preset keywords; and
a sixth determination submodule 7213, configured to determine the user's cognition degree of the category to which the voice input content belongs according to the matching degree between the keywords in the voice input content and the preset keywords, where the higher the matching degree between the keywords in the voice input content and the professional keywords among the preset keywords, the higher the cognition degree; and the higher the matching degree between the keywords in the voice input content and the non-professional keywords among the preset keywords, the lower the cognition degree.
In one embodiment, the determining module 72 includes:
a seventh determination submodule, configured to determine the sentence structure type of the voice input content, the sentence structure type being either a professional sentence structure type or a non-professional sentence structure type; and
an eighth determination submodule, configured to determine the user's cognition degree of the category to which the voice input content belongs according to the sentence structure type of the voice input content, where the user's cognition degree of the category of a voice input content of the professional sentence structure type is higher than that of a voice input content of the non-professional sentence structure type.
In one embodiment, the determining module 72 includes:
a ninth determination submodule, configured to determine, when it is determined that two adjacent received voice input contents were input by the same user, the association degree between them according to the keywords in the two adjacent received voice input contents; and
a tenth determination submodule, configured to determine the user's cognition degree of the category to which the voice input content belongs according to the association degree between the two adjacent received voice input contents, where the higher the association degree, the lower the cognition degree.
In one embodiment, the determining module 72 includes:
an eleventh determination submodule, configured to determine, from the voice input content, at least two voice input parameters of the voice input content, the voice input parameters including the user's voiceprint information, the time interval between two adjacent voice input contents input by the same user, the history input record information corresponding to the user, the matching degree between keywords in the voice input content and preset keywords, the sentence structure type of the voice input content, and the association degree between two adjacent voice input contents input by the same user; and
a computation submodule, configured to compute the user's cognition degree of the category to which the voice input content belongs according to the preset weight of each voice input parameter.
In one embodiment, the determining module 72 includes:
a twelfth determination submodule, configured to determine, when no voice input parameter of the voice input content can be determined, the user's cognition degree of the category to which the voice input content belongs to be a preset minimum cognition degree.
In one embodiment, as shown in Fig. 12, the output module 73 includes:
a thirteenth determination submodule 731, configured to determine the cognition grade corresponding to the cognition degree according to the correspondence between cognition degrees and cognition grades;
a second acquisition submodule 732, configured to obtain the voice output content corresponding to the cognition grade according to the correspondence between cognition grades and voice output contents; and
an output submodule 733, configured to output the voice output content.
In one embodiment, as shown in Fig. 13, the above device further includes:
an update module 74, configured to update the history input record information according to the input time and use duration of the voice input content; and
a storage module 75, configured to store the user's cognition degree of the category to which the voice input content belongs.
In one embodiment, the determining module 72 includes:
a fourth identification submodule, configured to identify the voiceprint information of the user; and
a query submodule, configured to query the user's cognition degree of the category to which the voice input content belongs according to the user's voiceprint information.
With the device provided by the embodiments of the present invention, the voice output content that matches the user's cognition degree of the category to which the input voice input content belongs can be selected and output for the user. The voice output content thus better meets the user's needs, providing a more personalized voice output function while improving the accuracy of the voice output, enabling the user to obtain the maximum amount of information from the voice output content and improving the user experience.
Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including, but not limited to, disk storage and optical storage) containing computer-usable program code.

The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems), and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks therein, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or another programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions may also be loaded onto a computer or another programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include them.

Claims (24)

1. A speech output method, comprising:
receiving a voice input content input by a user;
determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs, the cognition degree being the user's degree of awareness of the professional knowledge of the category; and
obtaining, from at least one voice output content corresponding to the voice input content, and outputting the voice output content that matches the cognition degree;
the method further comprising:
storing the user's cognition degree of the category to which the voice input content belongs;
wherein determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs comprises:
identifying the voiceprint information of the user; and
querying the user's cognition degree of the category to which the voice input content belongs according to the voiceprint information of the user.
2. The method according to claim 1, wherein determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs comprises:
identifying the voiceprint information of the user;
judging, according to the voiceprint information, whether a voice input content is being received from the user for the first time; and
when a voice input content is received from the user for the first time, determining the user's cognition degree of the category to which the voice input content belongs to be a preset minimum cognition degree.
3. The method according to claim 1, further comprising:
recording the input time and the use duration of the voice input content, the use duration being the duration between receiving the voice input content and outputting the voice output content.
4. The method according to claim 3, wherein determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs comprises:
identifying the voiceprint of the user;
judging, according to the voiceprint of the user, whether two adjacently received voice input contents were input by the same user;
when the two adjacently received voice input contents were input by the same user, calculating the time interval between the two adjacently received voice input contents according to their input times and usage durations;
determining, according to the time interval, the user's cognition degree of the category to which the voice input content belongs; wherein the longer the time interval, the lower the cognition degree.
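The interval logic recited in claim 4 can be sketched as follows. This is an illustrative Python sketch, not part of the patent: the exponential decay and the 30-second scale constant are assumptions chosen only to realize "the longer the interval, the lower the cognition degree".

```python
import math
from datetime import datetime, timedelta

def time_interval(prev_input_time: datetime, prev_usage: timedelta,
                  next_input_time: datetime) -> timedelta:
    # The "usage duration" ends when the previous output finished,
    # so the gap runs from that point to the next input.
    return next_input_time - (prev_input_time + prev_usage)

def cognition_from_interval(interval: timedelta, decay_s: float = 30.0) -> float:
    # Longer interval -> lower cognition degree, here via exponential decay.
    return math.exp(-interval.total_seconds() / decay_s)
```

A 15-second gap between two questions would then score higher than a two-minute gap, as the claim requires.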
5. The method according to claim 3, wherein determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs comprises:
identifying the voiceprint of the user;
obtaining, according to the voiceprint of the user, history input record information corresponding to the user, the history input record information comprising at least one of cumulative historical usage time, cumulative historical input count, and historical input frequency;
determining, according to the history input record information, the user's cognition degree of the category to which the voice input content belongs; wherein the longer the cumulative historical usage time, the higher the cognition degree; the greater the cumulative historical input count, the higher the cognition degree; and the higher the historical input frequency, the higher the cognition degree.
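One way to realize claim 5's three monotonic history signals is a bounded weighted sum. The weights and scale constants below are illustrative assumptions, not from the patent:

```python
def cognition_from_history(cum_usage_s: float, input_count: int,
                           freq_per_day: float,
                           weights=(0.4, 0.3, 0.3)) -> float:
    # Each signal is squashed into [0, 1) so the result stays bounded;
    # all three raise the degree monotonically, as the claim requires.
    def squash(x, scale):
        return x / (x + scale)
    return (weights[0] * squash(cum_usage_s, 3600.0)   # cumulative usage time
            + weights[1] * squash(input_count, 50)     # cumulative input count
            + weights[2] * squash(freq_per_day, 5.0))  # input frequency
```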
6. The method according to claim 1, wherein determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs comprises:
extracting keywords from the voice input content;
determining the matching degree between the keywords in the voice input content and predetermined keywords;
determining, according to the matching degree between the keywords in the voice input content and the predetermined keywords, the user's cognition degree of the category to which the voice input content belongs; wherein the higher the matching degree between the keywords in the voice input content and the professional keywords among the predetermined keywords, the higher the cognition degree; and the higher the matching degree between the keywords in the voice input content and the non-professional keywords among the predetermined keywords, the lower the cognition degree.
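Claim 6's two opposing matching degrees can be collapsed into one score, e.g. the fraction of matched keywords that come from the professional list. A minimal sketch under that assumption (keyword lists and the tie-breaking value are illustrative):

```python
def keyword_cognition(keywords, professional_kws, amateur_kws) -> float:
    # Fraction of matched keywords that are professional: more professional
    # matches push the degree up, more amateur matches push it down.
    kws = set(keywords)
    pro = len(kws & set(professional_kws))
    ama = len(kws & set(amateur_kws))
    if pro + ama == 0:
        return 0.0  # no match against either list: no evidence
    return pro / (pro + ama)
```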
7. The method according to claim 1, wherein determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs comprises:
determining the sentence structure type of the voice input content, the sentence structure type comprising a professional sentence structure type or a non-professional sentence structure type;
determining, according to the sentence structure type of the voice input content, the user's cognition degree of the category to which the voice input content belongs; wherein the user's cognition degree of the category to which a voice input content of the professional sentence structure type belongs is higher than that of the category to which a voice input content of the non-professional sentence structure type belongs.
8. The method according to claim 1, wherein determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs comprises:
when it is determined that two adjacently received voice input contents were input by the same user, determining the degree of association between the two adjacently received voice input contents according to the keywords in them;
determining, according to the degree of association between the two adjacently received voice input contents, the user's cognition degree of the category to which the voice input content belongs; wherein the higher the degree of association, the lower the cognition degree.
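The keyword-based degree of association in claim 8 could, for instance, be the Jaccard overlap of the two inputs' keyword sets. This sketch is an assumption for illustration; the patent does not specify the association measure:

```python
def association_degree(kws_a, kws_b) -> float:
    # Jaccard overlap between the keyword sets of two consecutive inputs.
    a, b = set(kws_a), set(kws_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)

def cognition_from_association(assoc: float) -> float:
    # Re-asking about the same topic suggests the earlier answer was not
    # understood, so higher association means lower cognition degree.
    return 1.0 - assoc
```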
9. The method according to any one of claims 1 to 8, wherein determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs comprises:
determining, according to the voice input content, at least two voice input parameters of the voice input content, the voice input parameters comprising: the voiceprint of the user, the time interval between two voice input contents adjacently input by the same user, history input record information corresponding to the user, the matching degree between keywords in the voice input content and predetermined keywords, the sentence structure type of the voice input content, and the degree of association between two voice input contents adjacently input by the same user;
calculating the user's cognition degree of the category to which the voice input content belongs according to a preset weight of each voice input parameter.
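The preset-weight combination in claim 9 (with claim 10's fallback when parameters cannot be determined) can be sketched as a weighted average over the available per-parameter scores. The dictionary keys and the zero fallback value are illustrative assumptions:

```python
def combined_cognition(scores: dict, weights: dict) -> float:
    # Weighted average over the parameters that could actually be determined.
    usable = [k for k, v in scores.items() if v is not None]
    if len(usable) < 2:  # claim 9 requires at least two parameters
        return 0.0       # fall back to a preset minimum, as in claim 10
    total_w = sum(weights[k] for k in usable)
    return sum(weights[k] * scores[k] for k in usable) / total_w
```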
10. The method according to claim 9, wherein determining, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs comprises:
when a voice input parameter of the voice input content cannot be determined, determining the user's cognition degree of the category to which the voice input content belongs to be a preset minimum cognition degree.
11. The method according to claim 1, wherein obtaining and outputting, from at least one voice output content corresponding to the voice input content, the voice output content that matches the cognition degree comprises:
determining, according to a correspondence between cognition degrees and cognition levels, the cognition level corresponding to the cognition degree;
obtaining, according to a correspondence between cognition levels and voice output contents, the voice output content corresponding to the cognition level;
outputting the voice output content.
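The two correspondences in claim 11 (degree → level, level → output content) map naturally onto a threshold function and a lookup table. The level names, thresholds, and response strings below are hypothetical placeholders, not from the patent:

```python
def degree_to_level(degree: float, thresholds=(0.33, 0.66)) -> str:
    # First correspondence: cognition degree -> cognition level.
    if degree < thresholds[0]:
        return "novice"
    if degree < thresholds[1]:
        return "intermediate"
    return "expert"

# Second correspondence: cognition level -> voice output content.
RESPONSES = {
    "novice": "plain-language explanation",
    "intermediate": "standard explanation",
    "expert": "terse technical explanation",
}

def select_output(degree: float) -> str:
    return RESPONSES[degree_to_level(degree)]
```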
12. The method according to claim 5, further comprising:
updating the history input record information according to the input time and the usage duration of the voice input content.
13. A speech output device, characterized by comprising:
a receiving module, configured to receive voice input content input by a user;
a determining module, configured to determine, according to the voice input content, the user's cognition degree of the category to which the voice input content belongs, the cognition degree being the user's degree of professional knowledge of the category;
an output module, configured to obtain and output, from at least one voice output content corresponding to the voice input content, the voice output content that matches the cognition degree;
the device further comprising:
a storage module, configured to store the user's cognition degree of the category to which the voice input content belongs;
wherein the determining module comprises:
a fourth identifying submodule, configured to identify the voiceprint of the user;
a querying submodule, configured to query, according to the voiceprint of the user, the user's cognition degree of the category to which the voice input content belongs.
14. The device according to claim 13, wherein the determining module comprises:
a first identifying submodule, configured to identify the voiceprint of the user;
a first judging submodule, configured to judge, according to the voiceprint, whether the voice input content of the user is received for the first time;
a second determining submodule, configured to determine, when the voice input content of the user is received for the first time, the user's cognition degree of the category to which the voice input content belongs to be a preset minimum cognition degree.
15. The device according to claim 13, further comprising:
a recording module, configured to record the input time and the usage duration of the voice input content, the usage duration being the duration between receiving the voice input content and outputting the voice output content.
16. The device according to claim 15, wherein the determining module comprises:
a second identifying submodule, configured to identify the voiceprint of the user;
a second judging submodule, configured to judge, according to the voiceprint of the user, whether two adjacently received voice input contents were input by the same user;
a first calculating submodule, configured to calculate, when the two adjacently received voice input contents were input by the same user, the time interval between the two adjacently received voice input contents according to their input times and usage durations;
a third determining submodule, configured to determine, according to the time interval, the user's cognition degree of the category to which the voice input content belongs; wherein the longer the time interval, the lower the cognition degree.
17. The device according to claim 13, wherein the determining module comprises:
a third identifying submodule, configured to identify the voiceprint of the user;
a first obtaining submodule, configured to obtain, according to the voiceprint of the user, history input record information corresponding to the user, the history input record information comprising at least one of cumulative historical usage time, cumulative historical input count, and historical input frequency;
a fourth determining submodule, configured to determine, according to the history input record information, the user's cognition degree of the category to which the voice input content belongs; wherein the longer the cumulative historical usage time, the higher the cognition degree; the greater the cumulative historical input count, the higher the cognition degree; and the higher the historical input frequency, the higher the cognition degree.
18. The device according to claim 13, wherein the determining module comprises:
an extracting submodule, configured to extract keywords from the voice input content;
a fifth determining submodule, configured to determine the matching degree between the keywords in the voice input content and predetermined keywords;
a sixth determining submodule, configured to determine, according to the matching degree between the keywords in the voice input content and the predetermined keywords, the user's cognition degree of the category to which the voice input content belongs; wherein the higher the matching degree between the keywords in the voice input content and the professional keywords among the predetermined keywords, the higher the cognition degree; and the higher the matching degree between the keywords in the voice input content and the non-professional keywords among the predetermined keywords, the lower the cognition degree.
19. The device according to claim 13, wherein the determining module comprises:
a seventh determining submodule, configured to determine the sentence structure type of the voice input content, the sentence structure type comprising a professional sentence structure type or a non-professional sentence structure type;
an eighth determining submodule, configured to determine, according to the sentence structure type of the voice input content, the user's cognition degree of the category to which the voice input content belongs; wherein the user's cognition degree of the category to which a voice input content of the professional sentence structure type belongs is higher than that of the category to which a voice input content of the non-professional sentence structure type belongs.
20. The device according to claim 13, wherein the determining module comprises:
a ninth determining submodule, configured to determine, when it is determined that two adjacently received voice input contents were input by the same user, the degree of association between the two adjacently received voice input contents according to the keywords in them;
a tenth determining submodule, configured to determine, according to the degree of association between the two adjacently received voice input contents, the user's cognition degree of the category to which the voice input content belongs; wherein the higher the degree of association, the lower the cognition degree.
21. The device according to any one of claims 13 to 19, wherein the determining module comprises:
an eleventh determining submodule, configured to determine, according to the voice input content, at least two voice input parameters of the voice input content, the voice input parameters comprising: the voiceprint of the user, the time interval between two voice input contents adjacently input by the same user, history input record information corresponding to the user, the matching degree between keywords in the voice input content and predetermined keywords, the sentence structure type of the voice input content, and the degree of association between two voice input contents adjacently input by the same user;
a calculating submodule, configured to calculate the user's cognition degree of the category to which the voice input content belongs according to a preset weight of each voice input parameter.
22. The device according to claim 21, wherein the determining module comprises:
a twelfth determining submodule, configured to determine, when a voice input parameter of the voice input content cannot be determined, the user's cognition degree of the category to which the voice input content belongs to be a preset minimum cognition degree.
23. The device according to claim 13, wherein the output module comprises:
a thirteenth determining submodule, configured to determine, according to a correspondence between cognition degrees and cognition levels, the cognition level corresponding to the cognition degree;
a second obtaining submodule, configured to obtain, according to a correspondence between cognition levels and voice output contents, the voice output content corresponding to the cognition level;
an output submodule, configured to output the voice output content.
24. The device according to claim 17, further comprising:
an updating module, configured to update the history input record information according to the input time and the usage duration of the voice input content.
CN201510568430.8A 2015-09-08 2015-09-08 A kind of speech output method and device Active CN105304082B (en)

Priority Applications (3)

Application Number Priority Date Filing Date Title
CN201510568430.8A CN105304082B (en) 2015-09-08 2015-09-08 A kind of speech output method and device
PCT/CN2016/082427 WO2017041510A1 (en) 2015-09-08 2016-05-18 Voice output method and device
CN201680002958.1A CN107077845B (en) 2015-09-08 2016-05-18 Voice output method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510568430.8A CN105304082B (en) 2015-09-08 2015-09-08 A kind of speech output method and device

Publications (2)

Publication Number Publication Date
CN105304082A CN105304082A (en) 2016-02-03
CN105304082B true CN105304082B (en) 2018-12-28

Family

ID=55201255

Family Applications (2)

Application Number Title Priority Date Filing Date
CN201510568430.8A Active CN105304082B (en) 2015-09-08 2015-09-08 A kind of speech output method and device
CN201680002958.1A Active CN107077845B (en) 2015-09-08 2016-05-18 Voice output method and device

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN201680002958.1A Active CN107077845B (en) 2015-09-08 2016-05-18 Voice output method and device

Country Status (2)

Country Link
CN (2) CN105304082B (en)
WO (1) WO2017041510A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105304082B (en) * 2015-09-08 2018-12-28 北京云知声信息技术有限公司 A kind of speech output method and device
CN106251862A (en) * 2016-07-19 2016-12-21 东莞市优陌儿智护电子科技有限公司 The implementation method of complete semantic intelligence intercommunication and system thereof
CN106649698B (en) * 2016-12-19 2020-12-22 宇龙计算机通信科技(深圳)有限公司 Information processing method and information processing device
CN107767869B (en) * 2017-09-26 2021-03-12 百度在线网络技术(北京)有限公司 Method and apparatus for providing voice service
CN107863108B (en) * 2017-11-16 2021-03-23 百度在线网络技术(北京)有限公司 Information output method and device
CN110018843B (en) * 2018-01-09 2022-08-30 北京小度互娱科技有限公司 Method and device for testing application program operation strategy
CN110619870B (en) * 2018-06-04 2022-05-06 佛山市顺德区美的电热电器制造有限公司 Man-machine conversation method and device, household appliance and computer storage medium
CN109035896B (en) * 2018-08-13 2021-11-05 广东小天才科技有限公司 Oral training method and learning equipment
CN109036386B (en) * 2018-09-14 2021-03-16 北京网众共创科技有限公司 Voice processing method and device
CN109766411A (en) * 2019-01-14 2019-05-17 广东小天才科技有限公司 A kind of method and system of the parsing of search problem
CN111782782B (en) * 2020-06-09 2023-04-18 苏宁金融科技(南京)有限公司 Consultation reply method and device for intelligent customer service, computer equipment and storage medium
CN114398514B (en) * 2021-12-24 2022-11-22 北京达佳互联信息技术有限公司 Video display method and device and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006208483A (en) * 2005-01-25 2006-08-10 Sony Corp Device, method, and program for assisting survey of interesting matter of listener, and recording medium
CN101304457A (en) * 2007-05-10 2008-11-12 许罗迈 Method and apparatus for implementing automatic spoken language training based on voice telephone
CN101616221A (en) * 2008-06-25 2009-12-30 富士通株式会社 Guidance information display device and guidance information display method
CN103594086A (en) * 2013-10-25 2014-02-19 鸿富锦精密工业(深圳)有限公司 Voice processing system, device and method
CN103680222A (en) * 2012-09-19 2014-03-26 镇江诺尼基智能技术有限公司 Question-answer interaction method for children stories

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1553381A (en) * 2003-05-26 2004-12-08 杨宏惠 Multi-language correspondent list style language database and synchronous computer inter-translation and communication
US7542971B2 (en) * 2004-02-02 2009-06-02 Fuji Xerox Co., Ltd. Systems and methods for collaborative note-taking
CN1838159B (en) * 2006-02-14 2010-08-11 北京未名博思生物智能科技开发有限公司 Cognition logic machine and its information processing method
US7725308B2 (en) * 2006-06-07 2010-05-25 Motorola, Inc. Interactive tool for semi-automatic generation of a natural language grammar from a device descriptor
US20090216757A1 (en) * 2008-02-27 2009-08-27 Robi Sen System and Method for Performing Frictionless Collaboration for Criteria Search
EP2707872A2 (en) * 2011-05-12 2014-03-19 Johnson Controls Technology Company Adaptive voice recognition systems and methods
KR101307578B1 (en) * 2012-07-18 2013-09-12 티더블유모바일 주식회사 System for supplying a representative phone number information with a search function
CN103578469A (en) * 2012-08-08 2014-02-12 百度在线网络技术(北京)有限公司 Method and device for showing voice recognition result
CN103000173B (en) * 2012-12-11 2015-06-17 优视科技有限公司 Voice interaction method and device
US9269354B2 (en) * 2013-03-11 2016-02-23 Nuance Communications, Inc. Semantic re-ranking of NLU results in conversational dialogue applications
CN104637007A (en) * 2013-11-07 2015-05-20 大连东方之星信息技术有限公司 Statistical analysis system employing degree-of-cognition system
CN104408099B (en) * 2014-11-18 2019-03-12 百度在线网络技术(北京)有限公司 Searching method and device
CN104574251A (en) * 2015-01-06 2015-04-29 熊国顺 Intelligent public safety information system and application method
CN105304082B (en) * 2015-09-08 2018-12-28 北京云知声信息技术有限公司 A kind of speech output method and device

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2006208483A (en) * 2005-01-25 2006-08-10 Sony Corp Device, method, and program for assisting survey of interesting matter of listener, and recording medium
CN101304457A (en) * 2007-05-10 2008-11-12 许罗迈 Method and apparatus for implementing automatic spoken language training based on voice telephone
CN101616221A (en) * 2008-06-25 2009-12-30 富士通株式会社 Guidance information display device and guidance information display method
CN103680222A (en) * 2012-09-19 2014-03-26 镇江诺尼基智能技术有限公司 Question-answer interaction method for children stories
CN103594086A (en) * 2013-10-25 2014-02-19 鸿富锦精密工业(深圳)有限公司 Voice processing system, device and method

Also Published As

Publication number Publication date
WO2017041510A1 (en) 2017-03-16
CN107077845A (en) 2017-08-18
CN107077845B (en) 2020-07-17
CN105304082A (en) 2016-02-03

Similar Documents

Publication Publication Date Title
CN105304082B (en) A kind of speech output method and device
KR102649208B1 (en) Apparatus and method for qusetion-answering
US10013977B2 (en) Smart home control method based on emotion recognition and the system thereof
CN108509619B (en) Voice interaction method and device
CN109783642A (en) Structured content processing method, device, equipment and the medium of multi-person conference scene
CN107818781A (en) Intelligent interactive method, equipment and storage medium
JP5281659B2 (en) Spoken dialogue apparatus, dialogue control method, and dialogue control program
CN105895103A (en) Speech recognition method and device
CN110288995B (en) Interaction method and device based on voice recognition, storage medium and electronic equipment
CN107613400A (en) A kind of implementation method and device of voice barrage
CN109937447A (en) Speech recognition equipment, speech recognition system
CN104778230B (en) A kind of training of video data segmentation model, video data cutting method and device
CN109215638B (en) Voice learning method and device, voice equipment and storage medium
CN106875939A (en) Chinese dialect speech recognition processing method for speech with wide fluctuations, and intelligent robot
CN111159364A (en) Dialogue system, dialogue device, dialogue method, and storage medium
CN106205622A (en) Information processing method and electronic equipment
WO2019045816A1 (en) Graphical data selection and presentation of digital content
CN109324515A (en) A kind of method and controlling terminal controlling intelligent electric appliance
CN111540355A (en) Personalized setting method and device based on voice assistant
CN110910898B (en) Voice information processing method and device
CN109545202A (en) A kind of method and system for the corpus adjusting semantic logic confusion
CN111933107A (en) Speech recognition method, speech recognition device, storage medium and processor
US11514178B2 (en) Method, system, and computer program product for role- and skill-based privileges for an intelligent industrial assistant
CN109273004A (en) Predictive audio recognition method and device based on big data
KR20210042520A (en) An electronic apparatus and Method for controlling the electronic apparatus thereof

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP01 Change in the name or title of a patent holder
CP01 Change in the name or title of a patent holder

Address after: 100191 Beijing, Huayuan Road, Haidian District No. 2 peony technology building, five floor, A503

Patentee after: Yunzhisheng Intelligent Technology Co., Ltd.

Address before: 100191 Beijing, Huayuan Road, Haidian District No. 2 peony technology building, five floor, A503

Patentee before: Beijing Yunzhisheng Information Technology Co., Ltd.