CN109086455A

CN109086455A - Method for constructing a speech recognition library, and learning device

Info

Publication number: CN109086455A
Application number: CN201811002956.XA
Authority: CN
Inventors: 徐杨
Original assignee: Guangdong Genius Technology Co Ltd
Current assignee: Guangdong Genius Technology Co Ltd
Priority date: 2018-08-30
Filing date: 2018-08-30
Publication date: 2018-12-25
Anticipated expiration: 2038-08-30
Also published as: CN109086455B

Abstract

The present invention relates to the technical field of electronic devices and discloses a method for constructing a speech recognition library, and a learning device. The method comprises: recognizing acquired voice information that matches the identity information of a user of the learning device to obtain a number of words; calculating the usage frequency of each word, and determining the words whose usage frequency is greater than a preset frequency to be common words; and storing each common word in association with its corresponding related information in a speech recognition library matched with the user. By implementing the embodiments of the present invention, the pronunciation and meaning of the common words can be stored in a user-specific speech recognition library, so that when the learning device subsequently recognizes voice information input by the user, it can recognize that voice information according to the meanings of the common words in the speech recognition library. This reduces the difficulty of recognizing voice information and can also improve the accuracy, and thereby the efficiency, of speech recognition by the learning device.

Description

Method for constructing a speech recognition library, and learning device

Technical field

The present invention relates to the technical field of electronic devices, and in particular to a method for constructing a speech recognition library and a learning device.

Background

With the rapid development of learning devices such as home tutoring machines and learning computers, more and more young children are using them. Because young users have limited ability to operate a device by hand, they usually control the learning device by voice. The learning device acquires the voice input by the young user, recognizes its content, and performs the operation corresponding to that content. In practice, however, it has been found that different users have different speech habits, and a conventional learning device cannot take a user's particular speech habits into account when recognizing the voice information that the user inputs, so the efficiency of speech recognition by the learning device is low.

Summary of the invention

Embodiments of the present invention disclose a method for constructing a speech recognition library, and a learning device, which can improve the efficiency of speech recognition by the learning device.

A first aspect of the embodiments of the present invention discloses a method for constructing a speech recognition library, the method comprising:

acquiring several pieces of pre-stored voice information that match the identity information of a user of a learning device;

recognizing the several pieces of pre-stored voice information to obtain the words contained in them;

calculating the usage frequency of each word, determining from the words the target words whose usage frequency is greater than a preset frequency, and determining the target words to be common words;

identifying the related information corresponding to each common word, and storing the common word in association with its corresponding related information in a speech recognition library matched with the identity information of the user, wherein each common word corresponds to one piece of related information, and the related information includes at least the meaning of the common word and the pronunciation of the common word.

As an optional implementation, in the first aspect of the embodiments of the present invention, acquiring the several pieces of pre-stored voice information that match the identity information of the user comprises:

when a target instruction for constructing the speech recognition library is detected, acquiring the identity information of the user of the learning device;

determining the voice key factor of the user from the identity information;

acquiring, from a database, several pieces of pre-stored voice information that match the voice key factor.

As an optional implementation, in the first aspect of the embodiments of the present invention, before acquiring the identity information of the user of the learning device when the target instruction for constructing the speech recognition library is detected, the method further comprises:

when the microphone of the learning device receives a target voice input by the user, judging whether the identity information of the user contains a voice key factor;

if not, identifying the voiceprint of the target voice by means of voiceprint recognition technology;

extracting several voiceprint nodes from the voiceprint;

generating the voice key factor of the target voice by performing a calculation on the several voiceprint nodes, and storing the voice key factor in the identity information of the user.

As an optional implementation, in the first aspect of the embodiments of the present invention, calculating the usage frequency of each word, determining from the words the target words whose usage frequency is greater than the preset frequency, and determining the target words to be common words comprises:

calculating the number of uses of each word on the basis of the several pieces of pre-stored voice information;

summing the numbers of uses of the individual words to obtain the total number of uses of all the words;

calculating the usage frequency of each word from its number of uses and the total number of uses, wherein each word corresponds to one usage frequency;

determining from the words the target words whose usage frequency is greater than the preset frequency, and determining the target words to be common words.
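The four sub-steps above can be sketched in Python as follows. This is only an illustrative sketch: the function name `common_words`, the list-of-word-lists input and the example threshold are assumptions, not part of the disclosure.

```python
from collections import Counter


def common_words(utterance_words, preset_frequency):
    """Return {word: usage_frequency} for words above the preset frequency.

    utterance_words: one list of recognized words per piece of
    pre-stored voice information (hypothetical input format).
    """
    # Number of uses of each word across all pre-stored voice information.
    use_counts = Counter(word for words in utterance_words for word in words)
    # Total number of uses of all words.
    total_uses = sum(use_counts.values())
    if total_uses == 0:
        return {}
    # Usage frequency = uses of this word / total uses of all words.
    frequencies = {word: n / total_uses for word, n in use_counts.items()}
    # Keep only the target words whose frequency exceeds the preset frequency.
    return {w: f for w, f in frequencies.items() if f > preset_frequency}


utterances = [["call", "mom"], ["call", "dad"], ["play", "music"], ["call", "mom"]]
result = common_words(utterances, preset_frequency=0.2)
```

With this sample input, "call" accounts for 3 of 8 uses and "mom" for 2 of 8, so only those two words exceed the assumed threshold of 0.2.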

As an optional implementation, in the first aspect of the embodiments of the present invention, after identifying the related information corresponding to the common words and storing each common word in association with its corresponding related information in the speech recognition library matched with the identity information of the user, the method further comprises:

when a current voice input by the user is detected, detecting whether the current voice contains a target word voice that matches the pronunciation of any common word in the speech recognition library;

if so, acquiring from the speech recognition library the target related information of the common word corresponding to the target word voice;

performing semantic recognition on the current voice according to the target related information to obtain the semantics corresponding to the current voice.
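The lookup described above might be sketched as follows. The text does not say how pronunciations are represented or compared, so this sketch assumes that each pronunciation is a list of phonetic tokens and that a match means the stored tokens occur contiguously in the current voice input; the names `contains_sequence` and `match_common_words` and the sample library are hypothetical.

```python
def contains_sequence(tokens, pattern):
    """True if `pattern` occurs as a contiguous run inside `tokens`."""
    n = len(pattern)
    return n > 0 and any(
        tokens[i:i + n] == pattern for i in range(len(tokens) - n + 1)
    )


def match_common_words(current_tokens, library):
    """Return {word: related_info} for every common word whose stored
    pronunciation is found in the current voice input."""
    return {
        word: info
        for word, info in library.items()
        if contains_sequence(current_tokens, info["pronunciation"])
    }


# Hypothetical per-user library: common word -> related information.
library = {
    "ring": {"meaning": "make a phone call", "pronunciation": ["r", "ih", "ng"]},
    "tunes": {"meaning": "play music", "pronunciation": ["t", "uw", "n", "z"]},
}
# Hypothetical phonetic tokens of the current voice input.
current = ["r", "ih", "ng", "m", "aa", "m"]
matches = match_common_words(current, library)
```

The meanings returned in `matches` would then be handed to whatever semantic recognition step the device uses; that step is outside the scope of this sketch.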

A second aspect of the embodiments of the present invention discloses a learning device, comprising:

a first acquisition unit, configured to acquire several pieces of pre-stored voice information that match the identity information of a user of the learning device;

a first recognition unit, configured to recognize the several pieces of pre-stored voice information to obtain the words contained in them;

a computing unit, configured to calculate the usage frequency of each word, determine from the words the target words whose usage frequency is greater than a preset frequency, and determine the target words to be common words;

a storage unit, configured to identify the related information corresponding to each common word, and to store the common word in association with its corresponding related information in a speech recognition library matched with the identity information of the user, wherein each common word corresponds to one piece of related information, and the related information includes at least the meaning of the common word and the pronunciation of the common word.

As an optional implementation, in the second aspect of the embodiments of the present invention, the first acquisition unit comprises:

a first acquisition subunit, configured to acquire the identity information of the user of the learning device when a target instruction for constructing the speech recognition library is detected;

a first determination subunit, configured to determine the voice key factor of the user from the identity information;

a second acquisition subunit, configured to acquire, from a database, several pieces of pre-stored voice information that match the voice key factor.

As an optional implementation, in the second aspect of the embodiments of the present invention, the learning device further comprises:

a judging unit, configured to judge, before the first acquisition subunit acquires the identity information of the user of the learning device upon detecting the target instruction for constructing the speech recognition library, and when the microphone of the learning device receives a target voice input by the user, whether the identity information of the user contains a voice key factor;

a second recognition unit, configured to identify the voiceprint of the target voice by means of voiceprint recognition technology when the result of the judging unit is negative;

an extraction unit, configured to extract several voiceprint nodes from the voiceprint;

a generation unit, configured to generate the voice key factor of the target voice by performing a calculation on the several voiceprint nodes, and to store the voice key factor in the identity information of the user.

As an optional implementation, in the second aspect of the embodiments of the present invention, the computing unit comprises:

a first computation subunit, configured to calculate the number of uses of each word on the basis of the several pieces of pre-stored voice information;

the first computation subunit being further configured to sum the numbers of uses of the individual words to obtain the total number of uses of all the words;

a second computation subunit, configured to calculate the usage frequency of each word from its number of uses and the total number of uses, wherein each word corresponds to one usage frequency;

a second determination subunit, configured to determine from the words the target words whose usage frequency is greater than the preset frequency, and to determine the target words to be common words.

As an optional implementation, in the second aspect of the embodiments of the present invention, the learning device further comprises:

a detection unit, configured to detect, after the storage unit has identified the related information corresponding to the common words and stored each common word in association with its corresponding related information in the speech recognition library matched with the identity information of the user, and when a current voice input by the user is detected, whether the current voice contains a target word voice that matches the pronunciation of any common word in the speech recognition library;

a second acquisition unit, configured to acquire from the speech recognition library, when the result of the detection unit is affirmative, the target related information of the common word corresponding to the target word voice;

a third recognition unit, configured to perform semantic recognition on the current voice according to the target related information to obtain the semantics corresponding to the current voice.

A third aspect of the embodiments of the present invention discloses an electronic device, comprising:

a memory storing executable program code;

a processor coupled to the memory;

wherein the processor calls the executable program code stored in the memory to perform some or all of the steps of any one of the methods of the first aspect.

A fourth aspect of the embodiments of the present invention discloses a computer-readable storage medium storing program code, wherein the program code includes instructions for performing some or all of the steps of any one of the methods of the first aspect.

A fifth aspect of the embodiments of the present invention discloses a computer program product which, when run on a computer, causes the computer to perform some or all of the steps of any one of the methods of the first aspect.

A sixth aspect of the embodiments of the present invention discloses an application publishing platform for publishing a computer program product, wherein, when the computer program product is run on a computer, it causes the computer to perform some or all of the steps of any one of the methods of the first aspect.

Compared with the prior art, the embodiments of the present invention have the following advantages:

In the embodiments of the present invention, several pieces of pre-stored voice information that match the identity information of the user of the learning device are acquired; the several pieces of pre-stored voice information are recognized to obtain the words contained in them; the usage frequency of each word is calculated, and the words whose usage frequency is greater than a preset frequency are determined to be common words; the related information corresponding to each common word is identified, and the common word is stored in association with its corresponding related information in a speech recognition library matched with the identity information of the user, the related information including at least the meaning and pronunciation of the common word. It can be seen that, by implementing the embodiments of the present invention, the words that the user uses frequently can be determined from the user's pre-stored voice information, and the pronunciation and meaning of those common words can be stored in a user-specific speech recognition library. When the learning device subsequently recognizes voice information input by the user, it can therefore recognize that voice information according to the meanings of the common words in the speech recognition library, which reduces the difficulty of recognizing voice information and can also improve the accuracy, and thereby the efficiency, of speech recognition by the learning device.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a method for constructing a speech recognition library disclosed in an embodiment of the present invention;

Fig. 2 is a schematic flowchart of another method for constructing a speech recognition library disclosed in an embodiment of the present invention;

Fig. 3 is a schematic flowchart of yet another method for constructing a speech recognition library disclosed in an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a learning device disclosed in an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of another learning device disclosed in an embodiment of the present invention;

Fig. 6 is a schematic structural diagram of yet another learning device disclosed in an embodiment of the present invention;

Fig. 7 is a schematic structural diagram of an electronic device disclosed in an embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. The described embodiments are obviously only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

It should be noted that the terms "include" and "have", and any variants of them, in the embodiments and drawings of the present invention are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to such a process, method, product, or device.

Embodiments of the present invention disclose a method for constructing a speech recognition library, and a learning device, which can improve the efficiency of speech recognition by the learning device. Each is described in detail below.

Embodiment one

Referring to Fig. 1, Fig. 1 is a schematic flowchart of a method for constructing a speech recognition library disclosed in an embodiment of the present invention. As shown in Fig. 1, the method for constructing the speech recognition library may include the following steps:

101. The learning device acquires several pieces of pre-stored voice information that match the identity information of the user of the learning device.

In the embodiments of the present invention, the learning device may be an electronic device such as a learning tablet, a home tutoring machine, or a laptop, which is not limited in the embodiments of the present invention. One learning device may correspond to a single user or to multiple users, which is likewise not limited. The identity information of the user may be the account information with which the user uses the learning device (such as the unique account code with which the user uses the learning device, or the user's identity card number), or may be physiological characteristic information of the user (such as the user's fingerprint information, face information, or voice key factor), which is not limited in the embodiments of the present invention. The learning device may acquire all the voice information that the user inputs through the microphone of the learning device, and pre-store all of it in the database or memory of the learning device.

As an optional implementation, the following steps may also be performed before the learning device performs step 101:

when it is detected that any instruction has been triggered on the learning device, the learning device acquires the identity information of the current user;

the learning device judges whether a speech recognition library matching the identity information of the user is stored;

if a speech recognition library matching the identity information of the user is stored, the learning device acquires the current date and the construction date of that speech recognition library;

the learning device calculates, from the current date and the construction date, how long ago the speech recognition library was constructed (the construction duration);

the learning device judges whether the construction duration is greater than a preset duration;

if it is greater than the preset duration, the learning device acquires several pieces of pre-stored voice information that match the identity information of the user of the learning device.

By implementing this implementation, the speech recognition library can be updated at a preset time interval. Because the speech habits of the user change over time, updating the speech recognition library on schedule can make the speech recognition of the learning device more accurate.
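The age check in these steps can be sketched with Python's standard `datetime` module. The function name `library_needs_rebuild` and the 30-day preset duration are illustrative assumptions; the text leaves the preset duration open.

```python
from datetime import date, timedelta


def library_needs_rebuild(construction_date, current_date, preset_duration):
    """True if the library was constructed longer ago than the preset duration."""
    # Construction duration = current date - construction date.
    elapsed = current_date - construction_date
    # The text requires "greater than", so an exactly equal age does not trigger.
    return elapsed > preset_duration


stale = library_needs_rebuild(
    construction_date=date(2018, 8, 30),
    current_date=date(2018, 9, 30),
    preset_duration=timedelta(days=30),
)
```

Here 31 days have elapsed, which exceeds the assumed 30-day preset duration, so the device would go on to acquire the pre-stored voice information and rebuild the library.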

102. The learning device recognizes the several pieces of pre-stored voice information to obtain the words contained in them.

In the embodiments of the present invention, a piece of pre-stored voice information may be a single word, a sentence, or a passage of speech, and the words obtained from it may contain, or be contained in, one another. When the pre-stored voice information is a single word, the learning device may determine that word to be the word contained in the pre-stored voice information. When the pre-stored voice information is a sentence, the learning device may identify all the words contained in the sentence and determine them to be the words contained in the pre-stored voice information. For example, if the pre-stored voice information is "make a phone call to Xiaobu", the words it contains may include "to", "Xiaobu", "make", and "phone call"; because the words obtained may contain one another, the words contained in the pre-stored voice information may also include "make a phone call". In summary, the words contained in this pre-stored voice information may be "to", "Xiaobu", "make", "phone call", and "make a phone call". When the pre-stored voice information is a passage of speech, the learning device may identify all the words contained in the passage and determine them to be the words contained in the pre-stored voice information.

As an optional implementation, the way in which the learning device recognizes the several pieces of pre-stored voice information to obtain the words contained in them may include the following steps:

the learning device performs speech recognition on the several pieces of pre-stored voice information to obtain the text information corresponding to each piece, wherein each piece of pre-stored voice information corresponds to one piece of text information;

the learning device performs semantic analysis on each piece of text information and divides it into several target words;

the learning device collects the target words corresponding to each piece of text information to generate a target word set;

the learning device merges identical target words in the target word set, and determines the target words in the merged set to be the words contained in the several pieces of pre-stored voice information.

By implementing this implementation, the voice information can be converted into text information, and the text information can then be analyzed to obtain the words it contains, which are the words contained in the voice information. In this way the words contained in the voice information can be identified more accurately.
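The collect-and-merge part of these steps might look like the sketch below. Speech recognition and semantic analysis are not specified in the text, so the sketch starts from already recognized text and uses whitespace splitting as a stand-in segmenter; the function name `extract_words` and the `segment` parameter are assumptions.

```python
def extract_words(texts, segment=str.split):
    """Collect the target words of every text and merge identical ones.

    texts:   recognized text, one string per piece of pre-stored voice
             information (speech recognition itself is out of scope here).
    segment: callable that divides one text into target words; whitespace
             splitting is only a placeholder for real semantic analysis.
    """
    target_words = []
    for text in texts:
        # Divide each piece of text information into target words.
        target_words.extend(segment(text))
    # Merge identical target words while keeping first-seen order.
    return list(dict.fromkeys(target_words))


words = extract_words(["call mom", "call dad", "play music"])
```

Note that the merged set discards how often each word occurred, so the number of uses needed for the usage frequency in step 103 has to be counted from the unmerged target words.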

103. The learning device calculates the usage frequency of each word, determines from the words the target words whose usage frequency is greater than a preset frequency, and determines the target words to be common words.

In the embodiments of the present invention, the usage frequency may be determined as the ratio of the number of times a word appears in all the pre-stored voice information to the total number of times all words appear in all the pre-stored voice information. The preset frequency may be set in advance by the learning device or set by the user of the learning device, which is not limited in the embodiments of the present invention.

104. The learning device identifies the related information corresponding to each common word, and stores the common word in association with its corresponding related information in the speech recognition library matched with the identity information of the user, wherein each common word corresponds to one piece of related information, and the related information includes at least the meaning of the common word and the pronunciation of the common word.

In the embodiments of the present invention, the related information corresponding to a common word may also include information such as the time period in which the common word is used most frequently, which is not limited in the embodiments of the present invention. The identity information of each user can match only one speech recognition library; this makes the summary of the user's common words more accurate and avoids duplicated or omitted entries. The speech recognition library may be stored in the memory of the learning device, or in a server (such as a cloud server) with which the learning device has established a connection in advance, which is not limited in the embodiments of the present invention.
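One possible in-memory layout for the associated storage in step 104 is sketched below. The class and field names (`RelatedInfo`, `LibraryStore`, `peak_usage_period`) and the sample values are hypothetical; the text only requires one library per identity and one related-information record per common word.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class RelatedInfo:
    meaning: str
    pronunciation: str
    # Optional extra mentioned in the text: period of most frequent use.
    peak_usage_period: Optional[str] = None


class LibraryStore:
    """Keeps exactly one speech recognition library per user identity."""

    def __init__(self) -> None:
        self._libraries: Dict[str, Dict[str, RelatedInfo]] = {}

    def store(self, identity: str, word: str, info: RelatedInfo) -> None:
        # One record per common word: storing the same word again overwrites.
        self._libraries.setdefault(identity, {})[word] = info

    def library_for(self, identity: str) -> Dict[str, RelatedInfo]:
        return self._libraries.get(identity, {})


store = LibraryStore()
store.store("u-001", "ring", RelatedInfo("make a phone call", "r ih ng"))
store.store("u-001", "ring", RelatedInfo("make a phone call", "r ih ng", "evening"))
```

Keying the outer dictionary by identity and the inner one by word gives the "no duplicates, no omissions" property by construction; a server-side store would need the same two keys.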

In the method described in Fig. 1, voice information can be recognized according to the meanings of the common words in the speech recognition library, which reduces the difficulty of recognizing voice information and can also improve the accuracy, and thereby the efficiency, of speech recognition by the learning device. Updating the speech recognition library on schedule also makes the speech recognition of the learning device more accurate. In addition, the words contained in the voice information can be identified more accurately.

Embodiment two

Referring to Fig. 2, Fig. 2 is a schematic flowchart of another method for constructing a speech recognition library disclosed in an embodiment of the present invention. As shown in Fig. 2, the method for constructing the speech recognition library may include the following steps:

201. When the microphone of the learning device receives a target voice input by the user, the learning device judges whether the identity information of the user contains a voice key factor; if so, this process ends; if not, steps 202 to 210 are performed.

In the embodiments of the present invention, when the learning device receives a target voice input by the user through the microphone, if this is the first time the learning device has received a voice from that user, the identity information corresponding to the user in the learning device does not yet contain a voice key factor. The learning device therefore needs to analyze the target voice input by the user to obtain the user's voice key factor.

202. The learning device identifies the voiceprint of the target voice by means of voiceprint recognition technology.

In the embodiments of the present invention, a voiceprint is a sound wave spectrum that carries speech information. A voiceprint is both specific to the speaker and relatively stable, so the identity information of the user can be determined from the voice key factor in the user's voiceprint. The learning device may use voiceprint recognition technology to extract the speech features of the target voice and identify the voiceprint of the target voice from those features.

203. The learning device extracts several voiceprint nodes from the voiceprint.

In the embodiments of the present invention, a voiceprint node may be a node that clearly exhibits a feature of the user's voiceprint; the number of voiceprint nodes contained in the user's voiceprint is not limited.

204. The learning device generates the voice key factor of the target voice by performing a calculation on the several voiceprint nodes, and stores the voice key factor in the identity information of the user.

In the embodiments of the present invention, the learning device may analyze the several voiceprint nodes together to derive the voice key factor specific to the user. Using the voice key factor, the learning device can acquire the several pieces of pre-stored voice information corresponding to the user, and can also perform operations related to the user's own identity information, which is not limited in the embodiments of the present invention.

In the embodiments of the present invention, by implementing steps 201 to 204 above, the learning device can analyze the user's voice the first time the user uses speech recognition, obtain and store the user's voice key factor, and thereafter quickly determine the information matching the user from that voice key factor.
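The control flow of steps 201 to 204 can be sketched as below. The text does not disclose how voiceprint nodes are extracted or how the voice key factor is calculated from them, so both are passed in as callables; the toy stand-ins in the usage lines are purely illustrative and are not a voiceprint algorithm.

```python
def ensure_voice_key_factor(identity, target_voice, extract_nodes, compute_factor):
    """Store a voice key factor in `identity` if it does not have one yet.

    identity:       dict holding the user's identity information.
    extract_nodes:  callable standing in for steps 202-203 (voiceprint
                    recognition and voiceprint node extraction).
    compute_factor: callable standing in for the calculation in step 204.
    """
    # Step 201: if a voice key factor is already present, nothing to do.
    if "voice_key_factor" in identity:
        return identity["voice_key_factor"]
    # Steps 202-203: identify the voiceprint and extract its nodes.
    nodes = extract_nodes(target_voice)
    # Step 204: calculate the factor and store it in the identity information.
    identity["voice_key_factor"] = compute_factor(nodes)
    return identity["voice_key_factor"]


# Toy stand-ins (assumptions, not a real voiceprint method).
extract = lambda voice: [max(voice), min(voice)]
compute = lambda nodes: round(sum(nodes) / len(nodes), 3)

user = {"account": "u-001"}
factor = ensure_voice_key_factor(user, [0.2, 0.8, 0.5], extract, compute)
```

A second call for the same user returns the stored factor without re-analyzing the voice, which corresponds to the "if so, this process ends" branch of step 201.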

205, when detecting the target instruction target word for constructing speech recognition library, facility for study obtains facility for study user's Identity information.

In the embodiment of the present invention, the target instruction target word mode triggered for constructing speech recognition library can be by facility for study certainly Dynamic triggering (such as facility for study can trigger once every scheduled duration for constructing the target instruction target word of speech recognition library), may be used also With actively triggered by the user of facility for study (as user can by triggering facility for study with building speech recognition library The target instruction target word of the corresponding case triggering building speech recognition library of target instruction target word).When facility for study automatic trigger is for constructing language When sound identifies the target instruction target word in library, the identity information of the available user currently logged on facility for study of facility for study； When facility for study triggers the target instruction target word for constructing speech recognition library by the user of facility for study, facility for study can be obtained The biological informations such as fingerprint, voice key factor, iris or the facial image of user of current triggering target instruction target word are taken to obtain The identity information of active user.

206, the facility for study determines the voice key factor of the user from the identity information.

In this embodiment of the present invention, the identity information may include several items, such as the user's name, age, gender, identity card number, and voice key factor. If the facility for study has previously stored the voice key factor of the user, it can obtain that voice key factor from the identity information of the user.

207, the facility for study obtains from a database several pieces of pre-stored voice messaging that match the voice key factor.

In this embodiment of the present invention, when voice messaging is pre-stored in the facility for study, it can be tagged with the identity information of the user, so that the facility for study can retrieve the voice messaging matching the user's identity information by any one item of that identity information.

In this embodiment of the present invention, implementing steps 205 to 207 above makes it possible to obtain several pieces of pre-stored voice messaging that match the voice key factor of the user. Because the voice key factor is unique, the voice messaging corresponding to the user can be obtained accurately through the voice key factor.
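The lookup in steps 206 and 207 can be sketched as follows. This is only an illustration under stated assumptions: the record layout `StoredVoice`, the dictionary key `voice_key_factor`, and the in-memory list standing in for the database are hypothetical and are not specified by the embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoredVoice:
    """One piece of pre-stored voice messaging (hypothetical record layout)."""
    voice_key_factor: str  # tag written when the recording was stored (assumed)
    audio_id: str          # handle to the stored audio (assumed)


def voices_for_user(identity: dict, database: list) -> list:
    """Return the stored recordings whose tag equals the user's voice key factor."""
    factor = identity.get("voice_key_factor")
    if factor is None:
        # No voice key factor in the identity information: nothing can be matched.
        return []
    return [voice for voice in database if voice.voice_key_factor == factor]


identity = {"name": "user-a", "voice_key_factor": "vkf-001"}
database = [
    StoredVoice("vkf-001", "rec-1"),
    StoredVoice("vkf-002", "rec-2"),
    StoredVoice("vkf-001", "rec-3"),
]
matched = voices_for_user(identity, database)
print([voice.audio_id for voice in matched])
```

Because the match is an exact comparison on a unique key, recordings that belong to other users are never returned, which is the property the embodiment relies on.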

208, the facility for study recognizes the several pieces of pre-stored voice messaging to obtain the words they contain.

209, the facility for study calculates the frequency of use of each word, determines from the words the target words whose frequency of use is greater than a preset frequency, and determines the target words as everyday expressions.

210, the facility for study identifies the relevant information corresponding to the everyday expressions, and stores the everyday expressions in association with their corresponding relevant information in the speech recognition library matching the identity information of the user, wherein one everyday expression corresponds to one piece of relevant information, and the relevant information includes at least the meaning and the pronunciation of the everyday expression.

As an optional embodiment, the manner in which the facility for study identifies the relevant information corresponding to the everyday expressions, and stores the everyday expressions in association with their corresponding relevant information in the speech recognition library matching the identity information of the user, may include the following steps:

The facility for study identifies several pronunciations of the everyday expression from the several pieces of pre-stored voice messaging, and identifies the several meanings that the everyday expression has in that pre-stored voice messaging;

The facility for study determines, from the several pronunciations, the target pronunciation that is used the most times, and determines, from the several meanings, the target meaning that is used the most times;

The facility for study combines the target pronunciation and the target meaning to generate the relevant information corresponding to the everyday expression;

The facility for study associates the everyday expression with the relevant information to generate an everyday expression information set, and stores the set in the speech recognition library matching the identity information of the user.

By implementing this embodiment, since the same everyday expression may have different pronunciations and different meanings when the user uses it in different scenarios, the facility for study can combine the most frequently used pronunciation and meaning of the everyday expression to generate its relevant information, and store that information in association with the everyday expression in the speech recognition library specific to the user, so that the facility for study can subsequently determine the meaning of the everyday expression more quickly when recognizing the user's voice messaging.
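The selection of the target pronunciation and target meaning described above can be sketched as follows. The data shapes are assumptions made for illustration: each occurrence of an everyday expression is represented as a hypothetical `(pronunciation, meaning)` pair, and the speech recognition library is a nested dictionary keyed by a hypothetical user identifier.

```python
from collections import Counter


def build_relevant_info(observations):
    """Combine the most-used pronunciation and the most-used meaning.

    observations: list of (pronunciation, meaning) pairs, one per occurrence
    of the everyday expression in the pre-stored voice messaging (assumed shape).
    """
    if not observations:
        raise ValueError("at least one occurrence is required")
    pronunciations = Counter(p for p, _ in observations)
    meanings = Counter(m for _, m in observations)
    return {
        "pronunciation": pronunciations.most_common(1)[0][0],
        "meaning": meanings.most_common(1)[0][0],
    }


def store(library, user_id, word, info):
    """Associate the word with its relevant information in the user's own library."""
    library.setdefault(user_id, {})[word] = info


# Illustrative data only: one word heard four times with two pronunciations.
observations = [
    ("hang2", "row"),
    ("xing2", "to walk"),
    ("xing2", "acceptable"),
    ("xing2", "to walk"),
]
library = {}
store(library, "user-a", "example-word", build_relevant_info(observations))
print(library)
```

As in the text, the pronunciation and the meaning are each chosen by their own usage count, so the stored pair is not necessarily one that was observed together.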

In the method described in Fig. 2, voice messaging can be recognized according to the meanings of the everyday expressions in the speech recognition library, which reduces the difficulty of recognizing voice messaging and can also improve the accuracy of speech recognition by the facility for study, thereby improving the efficiency of its speech recognition. The identity information of the user can also be determined through the voice key factor in the user's voiceprint, which improves the accuracy of determining the user's identity information. In addition, the target instruction for constructing the speech recognition library can be triggered in several ways, which makes the construction of the speech recognition library more flexible.

Embodiment three

Referring to Fig. 3, Fig. 3 is a schematic flowchart of another construction method of a speech recognition library disclosed in an embodiment of the present invention. As shown in Fig. 3, the construction method of the speech recognition library may include the following steps:

301, the facility for study obtains several pieces of pre-stored voice messaging that match the identity information of the facility for study user.

302, the facility for study recognizes the several pieces of pre-stored voice messaging to obtain the words they contain.

303, the facility for study calculates the number of times each word is used, based on the several pieces of pre-stored voice messaging.

In this embodiment of the present invention, several words are recognized from the several pieces of pre-stored voice messaging, and each of these words occurs at least once in that voice messaging. The facility for study can count the number of times each word occurs in the several pieces of pre-stored voice messaging, and take that count as the number of times the word is used.

304, the facility for study combines the number of times each word is used to calculate the total number of times all words are used.

In this embodiment of the present invention, the numbers of times the individual words are used are added together to obtain a total, and this total is the total number of times all words are used in the several pieces of pre-stored voice messaging.

305, the facility for study calculates the frequency of use corresponding to each word from the number of times the word is used and the total number of times all words are used, wherein one word corresponds to one frequency of use.

In this embodiment of the present invention, the facility for study can take the number of times any one word is used, calculate the ratio of that number to the total number of times all words are used, and determine the ratio as the frequency of use of the word. The facility for study can perform this calculation for each word to obtain the frequency of use corresponding to each word.

306, the facility for study determines from the words the target words whose frequency of use is greater than the preset frequency and determines the target words as everyday expressions.

In this embodiment of the present invention, implementing steps 303 to 306 above determines the user's everyday expressions by calculating the ratio of the number of times any word is used in the user's voice input to the total number of times all words are used in the user's input. This ties the everyday expressions generated by the facility for study to the user's actual speech habits, which in turn makes subsequent recognition of the user's voice messaging by the facility for study more accurate.
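Steps 303 to 306 amount to a relative-frequency calculation followed by a threshold. A minimal sketch, assuming the recognized words of each piece of pre-stored voice messaging are already available as lists of strings (the function name, the sample words, and the threshold value 0.2 are illustrative and not taken from the embodiment):

```python
from collections import Counter


def everyday_expressions(recognized_words, preset_frequency):
    """Return {word: frequency of use} for words above the preset frequency.

    recognized_words: one list of words per piece of pre-stored voice messaging.
    """
    # Step 303: number of times each word is used.
    counts = Counter(word for words in recognized_words for word in words)
    # Step 304: total number of times all words are used.
    total = sum(counts.values())
    if total == 0:
        return {}
    # Steps 305 and 306: ratio per word, then keep those above the threshold.
    return {
        word: count / total
        for word, count in counts.items()
        if count / total > preset_frequency
    }


recognized_words = [
    ["open", "math", "book"],
    ["open", "english", "book"],
    ["open", "dictionary"],
]
result = everyday_expressions(recognized_words, preset_frequency=0.2)
print(result)
```

With these eight word occurrences, "open" has a frequency of use of 3/8 and "book" of 2/8, so both exceed the assumed threshold, while the words used once (1/8) do not.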

307, the facility for study identifies the relevant information corresponding to the everyday expressions, and stores the everyday expressions in association with their corresponding relevant information in the speech recognition library matching the identity information of the user, wherein one everyday expression corresponds to one piece of relevant information, and the relevant information includes at least the meaning and the pronunciation of the everyday expression.

308, when the current speech input by the user is detected, the facility for study detects whether the current speech contains a target word voice that matches the pronunciation of any one everyday expression in the speech recognition library; if so, steps 309 to 310 are executed; if not, this process ends.

In this embodiment of the present invention, after obtaining the current speech input by the user, the facility for study can compare the current speech with the pronunciations of all everyday expressions in the speech recognition library. If the pronunciation of any one everyday expression matches a part of the current speech, the facility for study can determine that part as a target word voice and regard the word corresponding to the target word voice as an everyday expression in the speech recognition library. The facility for study can therefore extract the target word voice from the current speech and search the speech recognition library for the everyday expression that matches it.

309, the facility for study obtains from the speech recognition library the target relevant information of the everyday expression corresponding to the target word voice.

In this embodiment of the present invention, the speech recognition library may contain one or more pronunciations that match some part of the current speech, so the facility for study can extract several target word voices from the current speech. The facility for study can obtain from the speech recognition library the target relevant information of the everyday expression corresponding to each target word voice, and one target word voice corresponds to one piece of target relevant information.

310, the facility for study performs semantic recognition on the current speech according to the target relevant information to obtain the semantics corresponding to the current speech.

In this embodiment of the present invention, the relevant information of an everyday expression in the speech recognition library may include the meaning of the everyday expression. From the several pieces of target relevant information obtained from the speech recognition library, the facility for study can obtain the meanings of the everyday expressions they contain, and can use those meanings to assist in recognizing the semantics corresponding to the current speech, thereby improving the accuracy of semantic recognition of the current speech.

In this embodiment of the present invention, implementing steps 308 to 310 above makes it possible to use the voice messaging currently input by the user to obtain from the speech recognition library the everyday expressions contained in that voice messaging, and to assist recognition of the user's voice messaging with information such as the meanings of the everyday expressions. The facility for study does not need to perform semantic recognition on every word in the voice messaging, which improves the efficiency with which it recognizes voice messaging input by the user and also makes its speech recognition more accurate.
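The matching in steps 308 and 309 can be sketched as a search for stored pronunciations inside the current speech. This is a simplification under loud assumptions: real matching would operate on audio features, whereas here both the current speech and each stored pronunciation are represented as hypothetical tuples of syllable labels, and the final semantic recognition of step 310 is not modeled.

```python
def find_target_words(current_speech, library):
    """Return {everyday expression: relevant information} for every stored
    pronunciation that occurs as a contiguous part of the current speech."""
    hits = {}
    for word, info in library.items():
        pronunciation = info["pronunciation"]
        n = len(pronunciation)
        # Slide a window of the pronunciation's length over the current speech.
        if n and any(
            current_speech[i:i + n] == pronunciation
            for i in range(len(current_speech) - n + 1)
        ):
            hits[word] = info
    return hits


# Illustrative library contents (not from the embodiment).
library = {
    "homework": {"pronunciation": ("zuo4", "ye4"), "meaning": "assigned exercises"},
    "dictation": {"pronunciation": ("ting1", "xie3"), "meaning": "writing down spoken words"},
}
current_speech = ("wo3", "yao4", "xie3", "zuo4", "ye4")
hits = find_target_words(current_speech, library)
print(hits)
```

An empty result corresponds to the "if not, this process ends" branch of step 308; a non-empty result carries the target relevant information that step 310 would use to assist semantic recognition.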

In the method described in Fig. 3, voice messaging can be recognized according to the meanings of the everyday expressions in the speech recognition library, which reduces the difficulty of recognizing voice messaging and can also improve the accuracy of speech recognition by the facility for study, thereby improving the efficiency of its speech recognition. The number of times each word is used and the total number of times all words are used can also be obtained accurately, so that the calculated frequency of use of each word is more accurate. In addition, the everyday expressions generated by the facility for study are tied to the user's actual speech habits, which makes subsequent recognition of the user's voice messaging by the facility for study more accurate.

Embodiment four

Referring to Fig. 4, Fig. 4 is a schematic structural diagram of a facility for study disclosed in an embodiment of the present invention. As shown in Fig. 4, the facility for study may include:

A first acquisition unit 401, configured to obtain several pieces of pre-stored voice messaging that match the identity information of the facility for study user.

As an optional embodiment, the first acquisition unit 401 may also be configured to:

When it is detected that any instruction is triggered on the facility for study, obtain the identity information of the current user;

Judge whether a speech recognition library matching the identity information of the user is stored;

If a speech recognition library matching the identity information of the user is stored, obtain the current date and the construction date of the speech recognition library;

Calculate the construction duration of the speech recognition library from the current date and the construction date;

Judge whether the construction duration is greater than a preset duration;

If it is greater than the preset duration, obtain several pieces of pre-stored voice messaging that match the identity information of the facility for study user.
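The update check performed by the first acquisition unit 401 can be sketched as a date comparison. The names below (`should_refresh`, `build_dates`) and the example dates are hypothetical; the passage only covers the case where a library is already stored, so the sketch returns `False` for a user without one, which is an assumption rather than something the embodiment states.

```python
from datetime import date, timedelta


def should_refresh(user_id, build_dates, today, preset_duration):
    """Return True when the user's library is older than the preset duration.

    build_dates: {user_id: construction date of that user's library} (assumed).
    """
    build_date = build_dates.get(user_id)
    if build_date is None:
        # No stored library: this branch is not described in the passage.
        return False
    construction_duration = today - build_date
    return construction_duration > preset_duration


build_dates = {"user-a": date(2018, 8, 1)}
today = date(2018, 8, 30)
print(should_refresh("user-a", build_dates, today, timedelta(days=7)))
print(should_refresh("user-a", build_dates, today, timedelta(days=60)))
```

When the function returns `True`, the unit would go on to obtain the pre-stored voice messaging again so that the library can be rebuilt from more recent usage.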

A first recognition unit 402, configured to recognize the several pieces of pre-stored voice messaging obtained by the first acquisition unit 401 to obtain the words they contain.

As an optional embodiment, the manner in which the first recognition unit 402 recognizes the several pieces of pre-stored voice messaging to obtain the words they contain may specifically be:

Perform speech recognition on the several pieces of pre-stored voice messaging to obtain the text information corresponding to each piece, wherein one piece of pre-stored voice messaging corresponds to one piece of text information;

Perform semantic analysis on each piece of text information and divide it into several target words;

Combine the target words corresponding to each piece of text information to generate a target word repertoire;

Merge identical target words in the target word repertoire, and determine the target words in the merged target word repertoire as the words contained in the several pieces of pre-stored voice messaging.
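The last three operations of the first recognition unit 402 can be sketched as follows. Speech recognition is assumed to have already produced one text per piece of voice messaging, and lower-casing plus whitespace splitting is used as a stand-in for the semantic analysis that divides text into target words; both are simplifications, not the method of the embodiment.

```python
def words_in_voice_messaging(texts):
    """Split each text into target words, pool them, and merge duplicates."""
    target_word_repertoire = []
    for text in texts:
        # Stand-in for semantic analysis: lower-case and split on whitespace.
        target_word_repertoire.extend(text.lower().split())
    # Merge identical target words while keeping first-seen order.
    return list(dict.fromkeys(target_word_repertoire))


texts = ["open the math book", "Open the dictionary"]
words = words_in_voice_messaging(texts)
print(words)
```

`dict.fromkeys` is used for the merge because it removes duplicates without reordering, so the resulting word list is deterministic.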

A computing unit 403, configured to calculate the frequency of use of each word recognized by the first recognition unit 402, determine from the words the target words whose frequency of use is greater than the preset frequency, and determine the target words as everyday expressions.

A storage unit 404, configured to identify the relevant information corresponding to the everyday expressions determined by the computing unit 403, and to store the everyday expressions in association with their corresponding relevant information in the speech recognition library matching the identity information of the user, wherein one everyday expression corresponds to one piece of relevant information, and the relevant information includes at least the meaning and the pronunciation of the everyday expression.

As an optional embodiment, the manner in which the storage unit 404 identifies the relevant information corresponding to the everyday expressions, and stores the everyday expressions in association with their corresponding relevant information in the speech recognition library matching the identity information of the user, may specifically be:

Identify several pronunciations of the everyday expression from the several pieces of pre-stored voice messaging, and identify the several meanings that the everyday expression has in that pre-stored voice messaging;

Determine, from the several pronunciations, the target pronunciation that is used the most times, and determine, from the several meanings, the target meaning that is used the most times;

Combine the target pronunciation and the target meaning to generate the relevant information corresponding to the everyday expression;

Associate the everyday expression with the relevant information to generate an everyday expression information set, and store the set in the speech recognition library matching the identity information of the user.

In the facility for study described in Fig. 4, voice messaging can be recognized according to the meanings of the everyday expressions in the speech recognition library, which reduces the difficulty of recognizing voice messaging and can also improve the accuracy of speech recognition by the facility for study, thereby improving the efficiency of its speech recognition. The speech recognition library can also be updated on schedule, which makes the speech recognition of the facility for study more accurate. In addition, the words contained in the voice messaging can be recognized more accurately.

Embodiment five

Referring to Fig. 5, Fig. 5 is a schematic structural diagram of another facility for study disclosed in an embodiment of the present invention. The facility for study shown in Fig. 5 is obtained by optimizing the facility for study shown in Fig. 4. Compared with the facility for study shown in Fig. 4, the first acquisition unit 401 of the facility for study shown in Fig. 5 may include:

A first obtaining subunit 4011, configured to obtain the identity information of the facility for study user when the target instruction for constructing the speech recognition library is detected.

A first determining subunit 4012, configured to determine the voice key factor of the user from the identity information obtained by the first obtaining subunit 4011.

A second obtaining subunit 4013, configured to obtain from the database several pieces of pre-stored voice messaging that match the voice key factor determined by the first determining subunit 4012.

In this embodiment of the present invention, several pieces of pre-stored voice messaging that match the voice key factor of the user can be obtained. Because the voice key factor is unique, the voice messaging corresponding to the user can be obtained accurately through the voice key factor.

As an optional embodiment, the facility for study shown in Fig. 5 may further include:

A judging unit 405, configured to judge whether the identity information of the user includes a voice key factor when the microphone of the facility for study receives a target voice input by the user, before the first obtaining subunit 4011 obtains the identity information of the facility for study user upon detecting the target instruction for constructing the speech recognition library;

A second recognition unit 406, configured to recognize the voiceprint of the target voice through voiceprint recognition technology when the judgment result of the judging unit 405 is no;

An extraction unit 407, configured to extract several voiceprint nodes from the voiceprint recognized by the second recognition unit 406;

A generation unit 408, configured to generate the voice key factor of the target voice by performing a calculation on the several voiceprint nodes extracted by the extraction unit 407, and to store the voice key factor in the identity information of the user.

By implementing this embodiment, when the user uses the speech recognition technology of the facility for study for the first time, the user's voice can be analyzed to obtain and store the user's voice key factor, so that the facility for study can quickly determine the information matching the user according to the user's voice key factor.

In the facility for study described in Fig. 5, voice messaging can be recognized according to the meanings of the everyday expressions in the speech recognition library, which reduces the difficulty of recognizing voice messaging and can also improve the accuracy of speech recognition by the facility for study, thereby improving the efficiency of its speech recognition. The identity information of the user can also be determined through the voice key factor in the user's voiceprint, which improves the accuracy of determining the user's identity information. In addition, the target instruction for constructing the speech recognition library can be triggered in several ways, which makes the construction of the speech recognition library more flexible.

Embodiment six

Referring to Fig. 6, Fig. 6 is a schematic structural diagram of another facility for study disclosed in an embodiment of the present invention. The facility for study shown in Fig. 6 is obtained by optimizing the facility for study shown in Fig. 5. Compared with the facility for study shown in Fig. 5, the computing unit 403 of the facility for study shown in Fig. 6 may include:

A first computation subunit 4031, configured to calculate, based on the several pieces of pre-stored voice messaging obtained by the first acquisition unit 401, the number of times each word recognized by the first recognition unit 402 is used.

The first computation subunit 4031 is further configured to combine the number of times each word is used to calculate the total number of times all words are used.

A second computation subunit 4032, configured to calculate the frequency of use corresponding to each word from the number of times each word is used and the total number of times all words are used, as calculated by the first computation subunit 4031, wherein one word corresponds to one frequency of use.

A second determining subunit 4033, configured to determine from the words the target words whose frequency of use, as obtained by the second computation subunit 4032, is greater than the preset frequency, and to determine the target words as everyday expressions.

In this embodiment of the present invention, the user's everyday expressions are determined by calculating the ratio of the number of times any word is used in the user's voice input to the total number of times all words are used in the user's input. This ties the everyday expressions generated by the facility for study to the user's actual speech habits, which in turn makes subsequent recognition of the user's voice messaging by the facility for study more accurate.

A detection unit 409, configured to detect, after the storage unit 404 identifies the relevant information corresponding to the everyday expressions and stores the everyday expressions in association with their corresponding relevant information in the speech recognition library matching the identity information of the user, and when the current speech input by the user is detected, whether the current speech contains a target word voice that matches the pronunciation of any one everyday expression in the speech recognition library;

A second acquisition unit 410, configured to obtain from the speech recognition library the target relevant information of the everyday expression corresponding to the target word voice when the detection result of the detection unit 409 is yes;

A third recognition unit 411, configured to perform semantic recognition on the current speech according to the target relevant information obtained by the second acquisition unit 410 to obtain the semantics corresponding to the current speech.

By implementing this embodiment, the voice messaging currently input by the user can be used to obtain from the speech recognition library the everyday expressions contained in that voice messaging, and recognition of the user's voice messaging can be assisted with information such as the meanings of the everyday expressions. The facility for study does not need to perform semantic recognition on every word in the voice messaging, which improves the efficiency with which it recognizes voice messaging input by the user and also makes its speech recognition more accurate.

In the facility for study described in Fig. 6, voice messaging can be recognized according to the meanings of the everyday expressions in the speech recognition library, which reduces the difficulty of recognizing voice messaging and can also improve the accuracy of speech recognition by the facility for study, thereby improving the efficiency of its speech recognition. The number of times each word is used and the total number of times all words are used can also be obtained accurately, so that the calculated frequency of use of each word is more accurate. In addition, the everyday expressions generated by the facility for study are tied to the user's actual speech habits, which makes subsequent recognition of the user's voice messaging by the facility for study more accurate.

Embodiment seven

Referring to Fig. 7, Fig. 7 is a schematic structural diagram of an electronic device disclosed in an embodiment of the present invention. As shown in Fig. 7, the electronic device may include:

A memory 701 storing executable program code;

A processor 702 coupled to the memory 701;

Wherein the processor 702 calls the executable program code stored in the memory 701 to execute some or all of the steps of the methods in the above method embodiments.

An embodiment of the present invention also discloses a computer-readable storage medium that stores program code, wherein the program code includes instructions for executing some or all of the steps of the methods in the above method embodiments.

An embodiment of the present invention also discloses a computer program product, wherein, when the computer program product runs on a computer, the computer is caused to execute some or all of the steps of the methods in the above method embodiments.

An embodiment of the present invention also discloses an application distribution platform, wherein the application distribution platform is used to publish a computer program product, and when the computer program product runs on a computer, the computer is caused to execute some or all of the steps of the methods in the above method embodiments.

It should be understood that "an embodiment of the present invention" mentioned throughout the specification means that a particular feature, structure, or characteristic related to the embodiment is included in at least one embodiment of the present invention. Therefore, "in an embodiment of the present invention" appearing throughout the specification does not necessarily refer to the same embodiment. In addition, these particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. Those skilled in the art should also understand that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily required by the present invention.

In the various embodiments of the present invention, it should be understood that the sequence numbers of the above processes do not imply a necessary order of execution. The execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present invention.

In addition, the terms "system" and "network" are often used interchangeably herein. It should be understood that the term "and/or" herein merely describes an association between associated objects and indicates that three relationships are possible; for example, A and/or B can mean: A alone, both A and B, or B alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.

In the embodiments provided by the present invention, it should be understood that "B corresponding to A" means that B is associated with A and that B can be determined from A. It should also be understood that determining B from A does not mean determining B from A alone; B can also be determined from A and/or other information.

Those of ordinary skill in the art will appreciate that all or some of the steps in the various methods of the above embodiments can be completed by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, and the storage medium includes read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), programmable read-only memory (Programmable Read-Only Memory, PROM), erasable programmable read-only memory (Erasable Programmable Read-Only Memory, EPROM), one-time programmable read-only memory (One-Time Programmable Read-Only Memory, OTPROM), electrically erasable programmable read-only memory (Electrically-Erasable Programmable Read-Only Memory, EEPROM), compact disc read-only memory (Compact Disc Read-Only Memory, CD-ROM) or other optical disc storage, magnetic disk storage, magnetic tape storage, or any other computer-readable medium that can be used to carry or store data.

The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the various embodiments of the present invention may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit can be implemented in the form of hardware or in the form of a software functional unit.

If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-accessible memory. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a memory and includes several requests to cause a computer device (which may be a personal computer, a server, a network device, or the like, and specifically may be a processor in the computer device) to execute some or all of the steps of the above methods of the embodiments of the present invention.

The construction method of a speech recognition library and the facility for study disclosed in the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, for those of ordinary skill in the art, there will be changes in the specific implementation and scope of application in accordance with the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims

1. A construction method of a speech recognition library, characterized in that the method comprises:

Recognizing the several pieces of pre-stored voice messaging to obtain the words contained in the several pieces of pre-stored voice messaging;

Calculating the frequency of use of each word, determining from the words the target words whose frequency of use is greater than a preset frequency, and determining the target words as everyday expressions;

Identifying the relevant information corresponding to the everyday expressions, and storing the everyday expressions in association with the relevant information corresponding to the everyday expressions in the speech recognition library matching the identity information of the user, wherein one everyday expression corresponds to one piece of relevant information, and the relevant information includes at least the meaning of the everyday expression and the pronunciation of the everyday expression.

2. The method according to claim 1, characterized in that the obtaining of several pieces of pre-stored voice messaging that match the identity information of the user comprises:

When a target instruction for constructing the speech recognition library is detected, obtaining the identity information of the facility for study user;

Determining the voice key factor of the user from the identity information;

3. The method according to claim 2, characterized in that, before the obtaining of the identity information of the user of the learning device when the target instruction for constructing the speech recognition library is detected, the method further comprises:

when the microphone of the learning device receives a target voice input by the user, judging whether the identity information of the user includes a voice key factor;

extracting several voiceprint nodes from the voiceprint;

performing a calculation on the several voiceprint nodes to generate the voice key factor of the target voice, and storing the voice key factor in the identity information of the user.

4. The method according to any one of claims 1 to 3, characterized in that the calculating of the usage frequency of each of the words, the determining from the words of a target word whose usage frequency is greater than a preset frequency, and the determining of the target word to be a common word comprise:

calculating the usage frequency corresponding to each of the words according to the number of times that word is used and the total number of times used, wherein each word corresponds to one usage frequency;

determining from the words the target word whose usage frequency is greater than the preset frequency, and determining the target word to be a common word.

5. The method according to any one of claims 1 to 4, characterized in that, after the identifying of the related information corresponding to the common word and the storing of the common word in association with its corresponding related information in the speech recognition library matched with the identity information of the user, the method further comprises:

when a current voice input by the user is detected, detecting whether the current voice contains a target word voice that matches the pronunciation of any one of the common words in the speech recognition library;

if so, obtaining from the speech recognition library the target related information of the common word corresponding to the target word voice;

performing semantic recognition on the current voice according to the target related information to obtain the semantics corresponding to the current voice.

6. A learning device, characterized by comprising:

a first acquisition unit, configured to obtain several pieces of pre-stored voice information matched with the identity information of the user of the learning device;

a first recognition unit, configured to recognize the several pieces of pre-stored voice information to obtain the words contained in the several pieces of pre-stored voice information;

a calculation unit, configured to calculate the usage frequency of each of the words, determine from the words a target word whose usage frequency is greater than a preset frequency, and determine the target word to be a common word;

a storage unit, configured to identify related information corresponding to the common word, and to store the common word in association with its corresponding related information in a speech recognition library matched with the identity information of the user, wherein each common word corresponds to one piece of related information, and the related information includes at least the meaning of the common word and the pronunciation of the common word.

7. The learning device according to claim 6, characterized in that the first acquisition unit comprises:

a first acquisition subunit, configured to obtain the identity information of the user of the learning device when a target instruction for constructing the speech recognition library is detected;

a second acquisition subunit, configured to obtain from a database several pieces of pre-stored voice information matched with the voice key factor.

8. The learning device according to claim 7, characterized in that the learning device further comprises:

a judging unit, configured to judge whether the identity information of the user includes a voice key factor, this judgment being made before the first acquisition subunit obtains the identity information of the user of the learning device upon detecting the target instruction for constructing the speech recognition library, and when the microphone of the learning device receives a target voice input by the user;

a second recognition unit, configured to identify the voiceprint of the target voice by voiceprint recognition technology when the result of the judging unit is negative;

a generation unit, configured to generate the voice key factor of the target voice by performing a calculation on the several voiceprint nodes, and to store the voice key factor in the identity information of the user.

9. The learning device according to any one of claims 6 to 8, characterized in that the calculation unit comprises:

a first calculation subunit, configured to calculate, on the basis of the several pieces of pre-stored voice information, the number of times each of the words is used;

the first calculation subunit being further configured to aggregate the numbers of times the individual words are used, so as to obtain the total number of times all the words are used;

a second calculation subunit, configured to calculate the usage frequency corresponding to each of the words according to the number of times that word is used and the total number of times used, wherein each word corresponds to one usage frequency;

a second determination subunit, configured to determine from the words the target word whose usage frequency is greater than the preset frequency, and to determine the target word to be a common word.

10. The learning device according to any one of claims 6 to 9, characterized in that the learning device further comprises:

a detection unit, configured to detect whether a current voice contains a target word voice that matches the pronunciation of any one of the common words in the speech recognition library, this detection being made after the storage unit identifies the related information corresponding to the common word and stores the common word in association with its corresponding related information in the speech recognition library matched with the identity information of the user, and when the current voice input by the user is detected;

a second acquisition unit, configured to obtain from the speech recognition library, when the result of the detection unit is affirmative, the target related information of the common word corresponding to the target word voice;

a third recognition unit, configured to perform semantic recognition on the current voice according to the target related information to obtain the semantics corresponding to the current voice.
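
By way of illustration only, and forming no part of the claims, the frequency calculation, library storage, and lookup steps recited in claims 1, 4 and 5 could be sketched in Python as follows. The sketch rests on several assumptions that the claims do not state: the pre-stored voice information is assumed to have already been recognized into text, words are split on whitespace, the usage frequency is taken to be a word's count divided by the total count of all words, and a pronunciation match is taken to be exact string equality. The names `RelatedInfo`, `usage_frequencies`, `build_library`, `find_common_words` and `lookup_info` are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class RelatedInfo:
    """Related information for one common word (hypothetical structure)."""
    meaning: str
    pronunciation: str


def usage_frequencies(transcripts):
    """Map each word to its share of all word occurrences.

    Assumption: frequency = times the word is used / total times all
    words are used. The claims only say the frequency is computed
    "according to" these two quantities.
    """
    counts = Counter(word for text in transcripts for word in text.split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def build_library(transcripts, preset_frequency, lookup_info):
    """Store every word whose frequency exceeds the preset frequency.

    lookup_info is a caller-supplied (hypothetical) function returning
    the RelatedInfo, i.e. meaning and pronunciation, for a word.
    """
    freqs = usage_frequencies(transcripts)
    return {
        word: lookup_info(word)
        for word, freq in freqs.items()
        if freq > preset_frequency
    }


def find_common_words(library, current_pronunciations):
    """Return (word, RelatedInfo) pairs whose stored pronunciation
    appears among the pronunciations heard in the current voice."""
    by_pronunciation = {
        info.pronunciation: (word, info) for word, info in library.items()
    }
    return [
        by_pronunciation[p]
        for p in current_pronunciations
        if p in by_pronunciation
    ]


if __name__ == "__main__":
    # One library per user, keyed by a hypothetical user identifier.
    transcripts = ["open the book", "read the book", "close the door"]
    libraries = {
        "user-1": build_library(
            transcripts,
            preset_frequency=0.2,
            lookup_info=lambda w: RelatedInfo(
                meaning=f"meaning of {w}", pronunciation=f"/{w}/"
            ),
        )
    }
    for word, info in find_common_words(libraries["user-1"], ["/book/", "/pen/"]):
        print(word, info.meaning)
```

Keying the lookup by pronunciation mirrors claim 5, in which the current voice is first matched against stored pronunciations and the associated meaning is then used for semantic recognition. A real device would match acoustic features rather than strings.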